CN120434153A - Communication method, communication device and communication equipment - Google Patents

Communication method, communication device and communication equipment

Info

Publication number
CN120434153A
Authority
CN
China
Prior art keywords
message
sequence number
packet
number information
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410165492.3A
Other languages
Chinese (zh)
Inventor
李�杰
冀智刚
刘世兴
周道龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202410165492.3A (CN120434153A)
Priority to PCT/CN2025/070717 (WO2025161854A1)
Publication of CN120434153A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06Management of faults, events, alarms or notifications
    • H04L41/0677Localisation of faults
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0805Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability
    • H04L43/0811Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters by checking availability by checking connectivity

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present application provides a communication method, a communication apparatus, and communication equipment. The communication method includes: sending a first message, the first message including first sequence number information; obtaining second sequence number information of a second message, wherein the first message and the second message are from a target service flow; and determining, based on the first sequence number information and the second sequence number information, that a fault has occurred in the target service flow.

Description

Communication method, communication device and communication equipment
Technical Field
The present application relates to the field of communications technologies, and in particular, to a communication method, a communication apparatus, and communication equipment.
Background
Bidirectional Forwarding Detection (BFD) is a network protocol for rapidly detecting and monitoring the forwarding connectivity status of links or Internet Protocol (IP) routes in a network. By quickly detecting communication faults, BFD allows network devices to establish a standby channel sooner and restore communication, which improves the performance of the existing network.
BFD messages are periodically transmitted over the link connecting the network devices at the two ends. If a network device does not receive a BFD message within a preset period of time, it may determine that the link to the peer network device has failed.
However, BFD can only detect the integrity of the link (i.e., whether the link is down) and cannot perceive the failure of an individual data flow. In view of this, a fault detection scheme for data flows is needed.
Disclosure of Invention
The application provides a communication method, a communication device and communication equipment, which are used for detecting service flow faults.
In a first aspect, the present application provides a communication method. After a message in the target service flow is sent from the source device, it is forwarded by the forwarding device and can then reach the destination device. Each message of the target service flow includes sequence number information corresponding to that message. In the embodiment of the application, the sequence number information of a message is used to implement functions such as ordered transmission, retransmission on packet loss, and error recovery, thereby ensuring reliable transmission of the service flow in the network.
In the embodiment of the application, forwarding equipment receives a first message in a target service flow, wherein the first message comprises first sequence number information corresponding to the first message. The forwarding device may obtain second sequence number information of the second packet, where the first packet and the second packet are both from the target service flow, that is, the first packet and the second packet are different packets in the same service flow.
In the embodiment of the application, the forwarding device can determine that the target service flow fails according to the first sequence number information and the second sequence number information. In practical application, the target service flow includes a plurality of messages that continuously pass through the forwarding device, so the forwarding device can execute the communication method of the embodiment of the application on these messages, determine whether the target service flow has a fault, and perceive the fault in a timely manner. In addition, the communication method of the embodiment of the application does not need to modify or extend the original message format, and therefore has lower implementation complexity and higher fault detection efficiency.
Based on the first aspect, in an alternative embodiment, the sequence number information of a message may take one of several forms. It may be the sequence number (Sequence Number) in the TCP header, which indicates the location of the data content of the message within the complete data content of the target traffic flow. It may be the acknowledgement number (Acknowledgment Number) in the TCP header, which indicates the location, in the target traffic flow, of the message that the destination device has successfully received. It may be both the sequence number and the acknowledgement number in the TCP header. It may also be the packet sequence number (Packet Sequence Number, PSN) in a remote direct memory access (Remote Direct Memory Access, RDMA) message, which indicates the order of RDMA messages sent or received through a queue pair (Queue Pair, QP).
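As a non-limiting illustration, the sequence number and the acknowledgement number can be read from the fixed part of a TCP header as sketched below. The function name `tcp_seq_ack` is hypothetical and does not come from the application, and the sketch assumes that the input buffer starts at the first byte of the TCP header.

```python
import struct


def tcp_seq_ack(tcp_header: bytes) -> tuple:
    """Return (sequence number, acknowledgement number) from a raw TCP header.

    Assumption: tcp_header begins at the TCP source-port field.
    """
    if len(tcp_header) < 12:
        raise ValueError("need at least 12 bytes of TCP header")
    # Network byte order: source port (16 bits), destination port (16 bits),
    # sequence number (32 bits), acknowledgement number (32 bits).
    _src_port, _dst_port, seq, ack = struct.unpack("!HHII", tcp_header[:12])
    return seq, ack
```

Only the first 12 bytes are inspected, so TCP options do not affect the result.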
Based on the first aspect, in an alternative implementation manner, in a transmission control protocol (Transmission Control Protocol, TCP) scenario, a source IP address of the first packet is the same as a destination IP address of the second packet, a destination IP address of the first packet is the same as a source IP address of the second packet, a source port of the first packet is the same as a destination port of the second packet, and a destination port of the first packet is the same as a source port of the second packet. The first Sequence Number information of the first message includes at least a Sequence Number (Sequence Number) in a TCP header of the first message, and the second Sequence Number information of the second message includes at least an acknowledgement Number (Acknowledgment Number) in a TCP header of the second message. Optionally, in practical application, the transport layer protocols of the first packet and the second packet are the same.
Optionally, the first message and the second message are respectively the latest messages acquired by the forwarding device in two opposite transmission directions. Assuming that the target traffic flow includes two messages in opposite transmission directions, if the first message is the latest message acquired by the forwarding device in one transmission direction, the second message is the latest message acquired by the forwarding device (or other devices that establish a peer link with the forwarding device) in the other transmission direction.
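The address and port relationship described above for messages in opposite transmission directions can be sketched as follows. The `FiveTuple` type and the function `is_reverse_direction` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical flow descriptor; field names are illustrative."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def is_reverse_direction(first: FiveTuple, second: FiveTuple) -> bool:
    """True if the two packets are opposite directions of the same flow."""
    return (
        first.src_ip == second.dst_ip
        and first.dst_ip == second.src_ip
        and first.src_port == second.dst_port
        and first.dst_port == second.src_port
        and first.protocol == second.protocol  # same transport layer protocol
    )
```

A forwarding device could use such a check to pair the latest message seen in one direction with the latest message seen in the other.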
The forwarding device obtains the first sequence number information (namely the sequence number of the first message) and the second sequence number information (namely the acknowledgement number of the second message). If the sequence number in the TCP header of the first message is larger than the acknowledgement number in the TCP header of the second message, and the difference between the current time and the first arrival time of the second message is larger than a preset threshold, the forwarding device determines that the target service flow fails. The current time is the time at which the forwarding device executes the fault determination for the target service flow. In practical application, the frequency, period and timing of the fault determination, as well as the size of the preset threshold, can be configured according to the service requirement and the network environment, which improves the flexibility of the scheme. Specifically, for the two opposite delivery directions of the target traffic flow, the sequence number of a message in one direction corresponds to the acknowledgement number of a message in the other direction. If the difference between the current time and the first arrival time of the second message is greater than the preset threshold, the forwarding device has not received any message of the target service flow in the transmission direction of the second message for a long time. If, in addition, the sequence number in the TCP header of the first message is larger than the acknowledgement number in the TCP header of the second message, the target traffic flow is still being transmitted in the transmission direction of the first message. This rules out the possibility that the forwarding device has received no message in the transmission direction of the second message simply because the target traffic flow has stopped. Thus, the forwarding device may determine that the target traffic flow has failed.
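The determination described above can be sketched as follows, under stated assumptions. The record types and the function `tcp_reverse_stall_fault` are hypothetical, times are in seconds, and the plain integer comparison ignores wraparound of the 32-bit TCP sequence space, which the application does not discuss.

```python
from dataclasses import dataclass


@dataclass
class FirstPacket:
    seq: int  # sequence number in the TCP header of the first message


@dataclass
class SecondPacket:
    ack: int             # acknowledgement number in the TCP header of the second message
    arrival_time: float  # "first arrival time" of the second message, in seconds


def tcp_reverse_stall_fault(first: FirstPacket, second: SecondPacket,
                            now: float, threshold: float) -> bool:
    """Fault if data is still being sent but the reverse direction has gone quiet."""
    still_sending = first.seq > second.ack
    reverse_silent = (now - second.arrival_time) > threshold
    return still_sending and reverse_silent
```

For example, with `first.seq = 5000`, `second.ack = 3000`, a second message that arrived at 10.0 s, a current time of 12.5 s and a threshold of 2.0 s, both conditions hold and the flow is reported as faulty.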
Based on the first aspect, in an alternative implementation manner, in the TCP scenario, a source IP address of the first packet is the same as a source IP address of the second packet, a destination IP address of the first packet is the same as a destination IP address of the second packet, a source port of the first packet is the same as a source port of the second packet, and a destination port of the first packet is the same as a destination port of the second packet. The first sequence number information of the first message comprises a sequence number and an acknowledgement number in a TCP header of the first message, and the second sequence number information of the second message comprises a sequence number and an acknowledgement number in a TCP header of the second message. Optionally, in practical application, the transport layer protocols of the first packet and the second packet are the same.
The first message is the latest message currently acquired by the forwarding device, and the second message is the message immediately preceding the first message in the same transmission direction. That is, the forwarding device receives the second message first and then the first message. After the first message arrives at the forwarding device, the forwarding device judges whether the target service flow fails, taking the time at which the first message arrived at the forwarding device as the current time. If the sequence number in the TCP header of the first message is smaller than or equal to the sequence number in the TCP header of the second message, the acknowledgement number in the TCP header of the first message is smaller than or equal to the acknowledgement number in the TCP header of the second message, and the difference between the current time and the first arrival time of the second message is larger than a preset threshold, this indicates that the first message received by the forwarding device is a retransmitted message, and the forwarding device can determine that the target service flow fails.
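A minimal sketch of this same-direction check is given below. The `TcpSample` type and the function name are hypothetical, and sequence-number wraparound is again ignored as a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class TcpSample:
    seq: int             # sequence number in the TCP header
    ack: int             # acknowledgement number in the TCP header
    arrival_time: float  # time the message reached the forwarding device, in seconds


def tcp_retransmission_fault(first: TcpSample, second: TcpSample,
                             threshold: float) -> bool:
    """first is the newest message; second is the previous one in the same direction."""
    now = first.arrival_time  # the current time is the arrival time of the first message
    no_progress = first.seq <= second.seq and first.ack <= second.ack
    return no_progress and (now - second.arrival_time) > threshold
```

The time condition distinguishes a late retransmission from closely spaced messages that merely carry the same numbers.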
Based on the first aspect, in an alternative implementation manner, in the TCP scenario, a source IP address of the first packet is the same as a destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, a source port of the first packet is the same as a destination port of the second packet, and a destination port of the first packet is the same as a source port of the second packet. The first Sequence Number information of the first message includes a Sequence Number (Sequence Number) and an acknowledgement Number (Acknowledgment Number) in a TCP header of the first message, and the second Sequence Number information of the second message includes at least a Sequence Number (Sequence Number) and an acknowledgement Number (Acknowledgment Number) in a TCP header of the second message.
In one possible implementation, the first message and the second message are respectively the latest messages acquired by the forwarding device in two opposite delivery directions. Taking the first message as an uplink message and the second message as a downlink message as an example, the first message is the latest uplink message acquired by the forwarding device in the uplink direction, and the second message is the latest downlink message acquired by the forwarding device (or other devices establishing peer link with the forwarding device) in the downlink direction.
The forwarding device obtains the first sequence number information (i.e. the sequence number and the acknowledgement number of the first message) and the second sequence number information (i.e. the sequence number and the acknowledgement number of the second message). If the acknowledgement number in the TCP header of the first message is smaller than the sequence number in the TCP header of the second message, and the difference between the current time and the second arrival time of the first message is greater than a preset threshold, this indicates that the forwarding device can receive the target traffic flow in the direction of the second message but has not received the target traffic flow in the direction of the first message for a long time (exceeding the preset duration), so that the acknowledgement number of the first message has stopped growing. Therefore, the forwarding device determines that the target traffic flow fails and that the upstream path of the second message is normal, where the upstream path of the second message refers to the path that the second message passes through while being transmitted from the transmitting end to the forwarding device. In this way, the forwarding device can identify a normal path traversed by the target service flow while perceiving the fault of the target service flow, which facilitates subsequent fault location.
Based on the first aspect, in an optional implementation manner, the forwarding device is further capable of determining that the upstream path of the second packet is normal and has not failed. This indicates that the cause of the fault is unrelated to the devices on the upstream path of the second packet (i.e., the upstream devices of the second packet). The forwarding device may therefore send a fault notification to an upstream device of the second packet, where the fault notification indicates that the upstream path of the second packet is normal, so that the upstream device of the second packet can avoid performing an unnecessary path switch.
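The check and the resulting notification can be sketched as follows. The application does not define a notification format, so `FaultNotification` and its single field are purely hypothetical, as are the other names in this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TcpSample:
    seq: int
    ack: int
    arrival_time: float  # seconds


@dataclass
class FaultNotification:
    # Hypothetical payload sent to an upstream device of the second message.
    upstream_path_normal: bool


def check_first_direction_stall(first: TcpSample, second: TcpSample,
                                now: float, threshold: float) -> Optional[FaultNotification]:
    """Return a notification if the flow is faulty but the second message's upstream path is normal."""
    ack_lagging = first.ack < second.seq
    first_direction_silent = (now - first.arrival_time) > threshold
    if ack_lagging and first_direction_silent:
        return FaultNotification(upstream_path_normal=True)
    return None
```

Returning `None` when the condition is not met keeps the caller from sending a notification for a healthy flow.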
Based on the first aspect, in an optional implementation manner, the first packet and the second packet are packets in the RDMA protocol, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the queue pair (QP) of the first packet is different from the QP of the second packet, the first sequence number information includes the PSN of the first packet, and the second sequence number information includes the PSN of the second packet. Specifically, the first packet and the second packet are packets in different transmission directions in the same RDMA flow. In other words, the first packet and the second packet have the same queue pair context (Queue Pair Context, QPC) and queue pair key (QKey), but the destination QP (Destination QP) of the first packet differs from the destination QP of the second packet. Optionally, the first packet may be a request message in the RDMA service flow and the second packet a response message, or the first packet may be a response message and the second packet a request message. If the PSN of the first packet is larger than the PSN of the second packet and the difference between the current time and the first arrival time of the second packet is larger than a preset threshold, the forwarding device determines that the target service flow fails. In particular, the RDMA protocol provides a response mechanism that is used to ensure that RDMA messages can be reliably received.
In an RDMA scenario with the response mechanism enabled, after receiving a request message from the source device, the destination device of the target service flow determines the PSN of the response message according to the PSN of the request message. The PSN of the response message indicates the position of the data content of the response message within the complete data content of the target service flow, and also indicates the position, in the target service flow, of the request message that the destination device has successfully received. After receiving the response message, the source device determines the PSN of a new request message according to the PSN in the response message and continues sending. Thus, in an RDMA scenario, for the two opposite delivery directions of the target traffic flow, the PSN of a message in one direction corresponds to the PSN of a message in the other direction. If the difference between the current time and the first arrival time of the second packet is greater than the preset threshold, the forwarding device has not received any message of the target service flow in the transmission direction of the second packet for a long time. If, in addition, the PSN of the first packet is greater than the PSN of the second packet, the target service flow is still being transmitted in the transmission direction of the first packet, which rules out the possibility that the forwarding device has received no message in the transmission direction of the second packet simply because the target service flow has stopped. Thus, the forwarding device may determine that the target traffic flow has failed.
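Under the same reasoning, the RDMA check for opposite directions reduces to a PSN comparison plus a timeout. The function below is a hypothetical sketch; it assumes times in seconds and does not handle wraparound of the PSN counter, which the application does not address.

```python
def rdma_reverse_stall_fault(first_psn: int, second_psn: int,
                             second_arrival_time: float,
                             now: float, threshold: float) -> bool:
    """Fault if requests keep advancing while the opposite direction stays silent."""
    still_sending = first_psn > second_psn
    reverse_silent = (now - second_arrival_time) > threshold
    return still_sending and reverse_silent
```

For instance, a first packet with PSN 120 and a second packet with PSN 100 that arrived 3 seconds ago would be flagged with a 1 second threshold.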
Based on the first aspect, in an alternative implementation manner, the first packet and the second packet are packets of the same type in the RDMA protocol, a source IP address of the first packet is the same as a source IP address of the second packet, a destination IP address of the first packet is the same as a destination IP address of the second packet, a Queue Pair (QP) of the first packet is the same as a Queue Pair (QP) of the second packet, the first sequence number information includes a PSN of the first packet, and the second sequence number information includes a PSN of the second packet.
The first message is the latest RDMA message currently acquired by the forwarding device, and the second message is the RDMA message immediately preceding the first message in the same transmission direction. That is, the forwarding device receives the second message first and then the first message. After the first message arrives at the forwarding device, the forwarding device judges whether the target service flow fails, taking the time at which the first message arrived at the forwarding device as the current time. In the RDMA protocol, the PSN of an RDMA message indicates the location of the data content of the RDMA message within the full data content of the target traffic flow. Therefore, if the PSN of the first packet is smaller than or equal to the PSN of the second packet, and the difference between the current time and the first arrival time of the second packet is greater than the preset threshold, this indicates that the first packet received by the forwarding device is a retransmitted packet, and the forwarding device can determine that the target service flow fails.
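One way a forwarding device might apply this check to a stream of packets is to remember, per flow, the PSN and arrival time of the previous packet. The class below is a sketch under that assumption; the class name, the method name and the use of a (source IP, destination IP, QP) tuple as the dictionary key are illustrative choices rather than details taken from the application.

```python
class RdmaRetransmissionDetector:
    """Tracks the previous packet per flow and flags late retransmissions."""

    def __init__(self, threshold: float):
        self.threshold = threshold  # preset threshold, in seconds
        self._last = {}             # flow key -> (psn, arrival_time)

    def observe(self, src_ip: str, dst_ip: str, qp: int,
                psn: int, arrival_time: float) -> bool:
        """Record a packet; return True if it indicates a fault."""
        key = (src_ip, dst_ip, qp)
        previous = self._last.get(key)
        # The newest packet becomes the "second message" for the next call.
        self._last[key] = (psn, arrival_time)
        if previous is None:
            return False  # nothing to compare against yet
        prev_psn, prev_time = previous
        return psn <= prev_psn and (arrival_time - prev_time) > self.threshold
```

Each call treats the new packet as the first message and the stored one as the second message, so the current time is the arrival time of the new packet.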
Based on the first aspect, in an optional implementation manner, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the QP of the first packet is different from the QP of the second packet, the first sequence number information includes the PSN of the first packet, and the second sequence number information includes the PSN of the second packet. Specifically, the first packet and the second packet are packets in different transmission directions in the same RDMA flow. In other words, the first packet and the second packet have the same queue pair context (Queue Pair Context, QPC) and queue pair key (QKey), but the destination QP (Destination QP) of the first packet differs from the destination QP of the second packet. Optionally, the first packet may be a request message in the RDMA service flow and the second packet a response message, or the first packet may be a response message and the second packet a request message. As described above, in an RDMA scenario where the response mechanism is enabled, the PSN of an RDMA message indicates the location of the data content of the RDMA message within the complete data content of the target traffic flow, and also indicates the location, in the target traffic flow, of the message that the destination device has successfully received. That is, for the two opposite delivery directions of the target traffic flow, the PSN of a message in one direction corresponds to the PSN of a message in the other direction.
If the PSN of the first packet is smaller than the PSN of the second packet and the difference between the current time and the second arrival time of the first packet is greater than the preset threshold, this indicates that the forwarding device can receive the target traffic flow in the direction of the second packet but has not received the target traffic flow in the direction of the first packet for a long time (exceeding the preset duration), so that the PSN of the first packet has stopped increasing. Therefore, the forwarding device determines that the target traffic flow fails and that the upstream path of the second packet is normal, where the upstream path of the second packet refers to the path that the second packet passes through while being transmitted to the forwarding device.
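The outcome of this check carries two pieces of information: that the flow is faulty, and that the upstream path of the second packet is normal. A small sketch that makes both explicit is shown below; the enumeration and function names are hypothetical.

```python
from enum import Enum


class RdmaVerdict(Enum):
    NO_FAULT = "no fault detected"
    FAULT_SECOND_UPSTREAM_NORMAL = "fault; upstream path of second packet normal"


def rdma_localise(first_psn: int, first_arrival_time: float,
                  second_psn: int, now: float, threshold: float) -> RdmaVerdict:
    """Compare the stalled first direction against the still-active second direction."""
    psn_lagging = first_psn < second_psn
    first_direction_silent = (now - first_arrival_time) > threshold
    if psn_lagging and first_direction_silent:
        return RdmaVerdict.FAULT_SECOND_UPSTREAM_NORMAL
    return RdmaVerdict.NO_FAULT
```

Note that the timeout here is measured against the arrival time of the first packet, not the second.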
Based on the first aspect, in an optional implementation manner, a second message in the target service flow is transferred to the forwarding device, where the second message includes second sequence number information. After the forwarding device receives the second message, the forwarding device can acquire second sequence number information of the second message. On the other hand, the forwarding device records the time when the second message arrives at the forwarding device, so as to obtain the first arrival time of the second message, in other words, the first arrival time of the second message is the time when the second message arrives at the forwarding device.
Based on the first aspect, in an alternative implementation manner, the forwarding device establishes peer links (peer link) with a plurality of other forwarding devices. The second message in the target traffic flow is not delivered to the forwarding device itself, but to another forwarding device with which a peer link is established. The forwarding device may receive the second sequence number information of the second packet through synchronization over the peer link. In this way, the forwarding device can perform fault determination on the target service flow based on messages of the target service flow received by other forwarding devices, so that the communication method of the embodiment of the application remains applicable when the first message and the second message take different transmission paths. This improves the flexibility of the scheme and widens the scenarios to which it applies. The first arrival time of the second message may be the time when the second message arrives at the peer forwarding device, or may be the time when the forwarding device receives the second sequence number information, which is not limited herein.
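The two ways of learning the second sequence number information, locally or over a peer link, can be sketched with a shared per-flow table. The class, method and parameter names below are hypothetical, and the `use_peer_time` flag simply selects between the two arrival-time options mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class SeqRecord:
    seq_info: int        # second sequence number information
    arrival_time: float  # "first arrival time" of the second message, in seconds


@dataclass
class ForwardingDevice:
    records: dict = field(default_factory=dict)  # flow key -> SeqRecord

    def on_local_packet(self, flow_key, seq_info: int, arrival_time: float) -> None:
        """The second message was received by this device directly."""
        self.records[flow_key] = SeqRecord(seq_info, arrival_time)

    def on_peer_sync(self, flow_key, seq_info: int, peer_arrival_time: float,
                     local_receive_time: float, use_peer_time: bool = True) -> None:
        """The second message was received by a peer and synchronized over the peer link."""
        chosen = peer_arrival_time if use_peer_time else local_receive_time
        self.records[flow_key] = SeqRecord(seq_info, chosen)
```

Because both paths write to the same table, the fault determination logic does not need to know where the second message was actually received.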
Based on the first aspect, in an optional implementation manner, after determining that the target traffic flow fails, the forwarding device generates failure information for the target traffic flow, where the failure information indicates that the target traffic flow fails.
Optionally, the fault information includes, but is not limited to, a time when the target traffic flow is determined to be faulty, first sequence number information of the first message, second sequence number information of the second message, and a quintuple (or triplet) of the target traffic flow. Illustratively, the quintuple can be a source IP address, a destination IP address, a source port, a destination port, and a transport layer protocol of the target traffic in a TCP scenario, and the triplet can be a source IP address, a destination IP address, and a QP of the target traffic in an RDMA scenario.
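A fault information record carrying the fields listed above might look like the following sketch. The `FaultInfo` type and its field names are hypothetical, and the record is not an exhaustive or required format.

```python
from dataclasses import dataclass, asdict
from typing import Tuple, Union


@dataclass(frozen=True)
class FaultInfo:
    detected_at: float                     # time the fault was determined, in seconds
    first_seq_info: Tuple[int, ...]        # e.g. (seq, ack) for TCP or (psn,) for RDMA
    second_seq_info: Tuple[int, ...]
    flow_key: Tuple[Union[str, int], ...]  # quintuple (TCP) or triplet (RDMA)


# TCP scenario: source IP, destination IP, source port, destination port, protocol.
tcp_fault = FaultInfo(12.5, (5000, 800), (700, 3000),
                      ("10.0.0.1", "10.0.0.2", 40000, 443, "TCP"))

# RDMA scenario: source IP, destination IP, QP.
rdma_fault = FaultInfo(20.0, (120,), (100,), ("10.0.0.1", "10.0.0.2", 7))

# A plain dictionary form could be logged locally or reported to a controller.
report = asdict(tcp_fault)
```

Using one record type with a variable-length flow key lets the same structure serve both the quintuple and the triplet case.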
Optionally, the fault information of the target service flow may be queried by a manager in real time, or the forwarding device may further send fault information for the target service flow to the controller, where the fault information is used to indicate that the target service flow has a fault. After the controller receives the fault information for the target traffic, the fault of the target traffic can be analyzed.
In a second aspect, the present application provides a communication apparatus comprising:
The receiving and transmitting unit is used for receiving a first message, and the first message comprises first sequence number information;
The receiving and transmitting unit is further used for acquiring second sequence number information of a second message, wherein the first message and the second message correspond to the target service flow;
And the processing unit is used for determining that the target service flow fails according to the first sequence number information and the second sequence number information.
Based on the second aspect, in an alternative implementation manner, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the source port of the first packet is the same as the destination port of the second packet, the destination port of the first packet is the same as the source port of the second packet, the first sequence number information includes a sequence number in a TCP header of the first packet, and the second sequence number information includes an acknowledgement number in a TCP header of the second packet;
The processing unit is specifically configured to determine that the target traffic flow fails when a sequence number in a TCP header of the first packet is greater than an acknowledgement number in a TCP header of the second packet, and a difference between a current time and a first arrival time of the second packet is greater than a preset threshold.
Based on the second aspect, in an alternative implementation manner, the source IP address of the first packet is the same as the source IP address of the second packet, the destination IP address of the first packet is the same as the destination IP address of the second packet, the source port of the first packet is the same as the source port of the second packet, the destination port of the first packet is the same as the destination port of the second packet, the first sequence number information includes a sequence number and an acknowledgement number in a TCP header of the first packet, and the second sequence number information includes a sequence number and an acknowledgement number in a TCP header of the second packet;
The processing unit is specifically configured to determine that the target traffic flow fails when a sequence number in a TCP header of the first packet is less than or equal to a sequence number in a TCP header of the second packet, an acknowledgement number in the TCP header of the first packet is less than or equal to an acknowledgement number in the TCP header of the second packet, and a difference between a current time and a first arrival time of the second packet is greater than a preset threshold.
Based on the second aspect, in an alternative implementation manner, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the source port of the first packet is the same as the destination port of the second packet, the destination port of the first packet is the same as the source port of the second packet, the first sequence number information includes a sequence number and an acknowledgement number in a TCP header of the first packet, and the second sequence number information includes a sequence number and an acknowledgement number in a TCP header of the second packet;
The processing unit is specifically configured to determine that an upstream path of the second packet is normal when an acknowledgement number in a TCP header of the first packet is smaller than a sequence number in a TCP header of the second packet, and a difference between a current time and a second arrival time of the first packet is greater than a preset threshold.
Based on the second aspect, in an optional implementation manner, the transceiver unit is further configured to send a fault notification to an upstream device of the second packet, where the fault notification is used to indicate that an upstream path of the second packet is normal.
Based on the second aspect, in an alternative implementation manner, the first packet and the second packet are packets in the remote direct memory access RDMA protocol, the source IP address of the first packet is the same as the source IP address of the second packet, the destination IP address of the first packet is the same as the destination IP address of the second packet, the queue pair QP of the first packet is the same as the QP of the second packet, the first sequence number information includes the packet sequence number PSN of the first packet, and the second sequence number information includes the PSN of the second packet;
The processing unit is specifically configured to determine that the target service flow fails when the PSN of the first packet is less than or equal to the PSN of the second packet, and a difference between the current time and the first arrival time of the second packet is greater than a preset threshold.
Based on the second aspect, in an alternative implementation manner, the first packet and the second packet are packets in a remote direct memory access RDMA protocol, a source IP address of the first packet is the same as a destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, QP of the first packet is different from QP of the second packet, the first sequence number information includes PSN of the first packet, and the second sequence number information includes PSN of the second packet;
The processing unit is specifically configured to determine that the target service flow fails when the PSN of the first packet is greater than the PSN of the second packet and the difference between the current time and the first arrival time of the second packet is greater than a preset threshold.
Based on the second aspect, in an alternative implementation manner, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the QP of the first packet is different from the QP of the second packet, the first sequence number information includes the PSN of the first packet, and the second sequence number information includes the PSN of the second packet;
The processing unit is specifically configured to determine that an upstream path of the second packet is normal when a PSN of the first packet is smaller than a PSN of the second packet and a difference between a current time and a second arrival time of the first packet is greater than a preset threshold.
Based on the second aspect, in an optional implementation manner, the transceiver unit is specifically configured to receive a second packet, where the second packet includes second sequence number information.
Based on the second aspect, in an alternative embodiment, the second sequence number information is received through a peer link.
Based on the second aspect, in an optional implementation manner, the transceiver unit is further configured to send fault information for the target traffic flow to the controller, where the fault information is used to indicate that the target traffic flow fails.
The information exchange and execution process of the implementations in this aspect are based on the same concept as those of the first aspect. For the beneficial effects of this aspect, refer to the description of the first aspect; details are not repeated here.
In a third aspect, the application provides a communications device comprising a processor coupled to a memory for storing instructions that, when executed by the processor, cause the communications device to implement the method of the first aspect, or any of the possible implementations of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon instructions that, when executed, cause a computer to perform the method of the first aspect, or any of the possible implementation manners of the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product having computer readable instructions stored therein, which when executed by a processor, implement the method of the first aspect, or any of the possible implementations of the first aspect.
In a sixth aspect, an embodiment of the application provides a chip comprising a processor coupled to a memory for storing instructions which, when executed by the processor, cause the chip to implement the method of the first aspect, or any one of the possible implementation manners of the first aspect.
The technical effects caused by any implementation manner of the third aspect to the sixth aspect may be referred to the technical effects caused by the implementation manner of the first aspect, which are not described herein.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a BFD scenario;
FIG. 2 is a schematic diagram of one possible non-limiting network architecture of a communication method according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of a communication method according to an embodiment of the application;
FIG. 4 is a schematic diagram of one possible scenario in which a forwarding device obtains second sequence number information according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a possible scenario of a communication method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of another possible scenario of a communication method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of another possible scenario of a communication method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a communication method applied to an RDMA scene in an embodiment of the present application;
FIG. 9 is a schematic diagram of a possible scenario of a communication method according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a communication device according to an embodiment of the present application;
FIG. 11 is a schematic logic structure diagram of a communication device according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a communication method, a communication device and communication equipment, which are used for detecting service flow faults.
Embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application. The terminology used in the description of the embodiments of the application is for the purpose of describing particular embodiments of the application only and is not intended to be limiting of embodiments of the application. As one of ordinary skill in the art can know, with the development of technology and the appearance of new scenes, the technical scheme provided by the embodiment of the application is also applicable to similar technical problems.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean that A exists alone, that both A and B exist, or that B exists alone, where A and B may each be singular or plural. "At least one of" the following items, or a similar expression, means any combination of these items, including any combination of a single item or plural items. For example, "at least one of a, b, or c" may represent a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or plural.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented, for example, in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The following description is given of some terms or terminology used in connection with the embodiments of the present application, which also form part of the description.
Transmission control protocol (Transmission Control Protocol, TCP): an important transport layer protocol of the Internet. TCP provides a connection-oriented, reliable, ordered, byte-stream transport service. Before using TCP, an application must first establish a TCP connection. TCP achieves reliable transmission through mechanisms such as checksums, sequence numbers, acknowledgements, retransmission control, connection management, and window control.
Remote direct memory access (Remote Direct Memory Access, RDMA): RDMA is a high-performance network communication technology with advantages such as high bandwidth, low latency, no central processing unit (CPU) overhead, and zero copy. RDMA transfers data over the network directly into a memory area of a computer, moving data quickly from one system into the memory of a remote system without involving the operating system, so that few of the computer's processing resources are needed. RDMA eliminates the overhead of external memory copying and context switching, thereby freeing memory bandwidth and CPU cycles and improving application system performance.
The Sequence Number field in the TCP header is a 32-bit unsigned integer, and plays a very critical role in the TCP data transmission process. The sequence number is used to identify the unique location of each byte in the data stream transmitted from the transmitting end to the receiving end. The sequence numbers of the TCP messages are accumulated every time data is sent, so that the receiving end can reorganize the received data according to the correct sequence. The sequence number of the TCP header is necessary because network transmissions may disrupt the order of the data. By using the sequence numbers, the receiving end can determine the correct position of each data packet in the data stream and reassemble them to recover the original data.
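Because the field is a 32-bit unsigned integer, its value wraps around to 0 after reaching 2^32 - 1. The application does not discuss wrap-around; the following minimal sketch shows one common way a comparison such as "greater than" could be made robust to it. The function name and the half-range convention are assumptions for illustration.

```python
SEQ_MOD = 1 << 32  # size of the 32-bit sequence number space

def seq_gt(a: int, b: int) -> bool:
    """Hypothetical helper: True if sequence number a is 'after' b.

    The comparison is done modulo 2**32: a is treated as later than b
    when the forward distance from b to a is less than half the space.
    """
    return a != b and ((a - b) % SEQ_MOD) < (SEQ_MOD >> 1)
```

With this convention, a small value observed shortly after a value near 2^32 - 1 is still treated as the later one, for example `seq_gt(5, 0xFFFFFFF0)` is true.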
The acknowledgement number (Acknowledgment Number) field in the TCP header is also a 32-bit unsigned integer. In TCP communication, the acknowledgement number is used to indicate the position in the data stream of the message that the receiving end has successfully received. After the receiving end correctly receives the data, a TCP message with an acknowledgement number is sent as an acknowledgement response to the received data, so that the sending end knows which data have been correctly received by the receiving end, and the sending end can continue to send the subsequent data.
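For illustration, the sequence number and acknowledgement number can be read from the first 12 bytes of a TCP header, which carry the source port, destination port, sequence number, and acknowledgement number in network byte order. The sketch below uses Python's standard `struct` module; the function name is an assumption, and it is not how the application's forwarding device is stated to be implemented.

```python
import struct

def parse_seq_ack(tcp_header: bytes):
    """Hypothetical helper: return (src_port, dst_port, seq, ack).

    Layout of the first 12 bytes of a TCP header:
    16-bit source port, 16-bit destination port,
    32-bit sequence number, 32-bit acknowledgement number.
    """
    if len(tcp_header) < 12:
        raise ValueError("need at least 12 bytes of TCP header")
    # "!" selects network (big-endian) byte order.
    return struct.unpack("!HHII", tcp_header[:12])

# Build a synthetic 20-byte header for demonstration purposes only.
demo_header = struct.pack("!HHII", 40000, 80, 1000, 2000) + bytes(8)
demo_fields = parse_seq_ack(demo_header)
```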
A gray fault is a fault that is not easily detected or located. Such faults typically do not immediately cause the system to crash or go completely out of service, but they can degrade system performance, stability, and availability, and can cause problems such as data inconsistency or reduced quality of service. Examples include performance degradation, random packet loss, memory jitter, and non-fatal anomalies.
Next, description is made of possible application scenarios according to the embodiments of the present application.
In a real network environment, gray faults such as silent faults, misconfigurations, routing black holes, hardware failures, or dying physical ports often occur. Several definitions of a gray fault currently exist. Under one definition, a system experiences a gray fault when at least one application observes that the system is unhealthy while the system's fault detection tool observes that the system is healthy. Under another definition, a gray fault is a hardware fault that causes non-transient packet loss in traffic forwarded by a forwarding device (switch), and congestion is not a gray fault. Common to both definitions is that link failures (e.g., port failure or link disconnection) are excluded from gray faults.
In a real network environment, gray faults are a major category of network faults and manifest in various forms, such as performance degradation, random packet loss, and memory jitter. There is currently no effective scheme for sensing these gray faults.
Bidirectional forwarding detection (Bidirectional Forwarding Detection, BFD) is a network protocol for rapidly detecting and monitoring forwarding connectivity status of links or internet protocol (Internet Protocol, IP) routes in a network. BFD can improve the performance of the existing network, and can more quickly establish a standby channel to restore communication between network devices by quickly detecting communication faults.
BFD messages are periodically transmitted over the connected links between the network devices at the two ends. If a network device does not receive a BFD message within a preset period of time, the network device may determine that the link connecting it to the peer network device has failed. Referring to FIG. 1, FIG. 1 is a schematic diagram of a BFD scenario. As shown in FIG. 1, there are two reachable links (link 1 and link 2) between network device A and network device B, and network device A periodically sends BFD messages to network device B via link 1 and link 2, respectively. If network device A does not receive a BFD message from network device B through link 1 within the preset duration, network device A determines that link 1 to network device B has failed; if network device A receives a BFD message from network device B through link 2, network device A determines that link 2 to network device B is normal.
BFD can only detect the integrity of a link (i.e., whether the link is down). However, gray faults arise primarily from causes such as configuration errors in the routing tables of network devices, which make the actual forwarding path inconsistent with the intended one; that is, the occurrence of a gray fault is independent of link integrity. Therefore, BFD cannot perceive faults of a data flow.
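The timeout-based judgment in the FIG. 1 example can be sketched as follows. This is a simplified illustration only, not the BFD state machine; the function name, the dictionary of last-receive times, and the use of seconds are assumptions.

```python
def bfd_link_states(last_rx: dict, now: float, detect_time: float) -> dict:
    """Hypothetical helper: classify each link by BFD message freshness.

    last_rx maps a link name to the time (seconds) at which the most
    recent BFD message was received over that link.
    """
    states = {}
    for link, rx_time in last_rx.items():
        # No BFD message within the preset duration: treat the link as failed.
        states[link] = "down" if (now - rx_time) > detect_time else "up"
    return states

# Link 1 has been silent for 1.0 s, link 2 received a message 0.05 s ago.
demo_states = bfd_link_states({"link1": 0.0, "link2": 0.95}, now=1.0, detect_time=0.3)
```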
On the other hand, BFD failure detection requires network devices to periodically generate BFD messages, which requires a certain resource overhead. In practical applications, the BFD failure detection and convergence time generally takes about 200 milliseconds. BFD schemes are not satisfactory for low latency requirements (e.g., latency within 10 milliseconds) for applications such as games, autopilot, or live news.
In view of the above, the embodiments of the present application provide a communication method, a communication device, and a communication apparatus for detecting a traffic flow failure. For ease of understanding, one possible, non-limiting network architecture of a communication method in an embodiment of the present application is described first. Referring to FIG. 2, FIG. 2 is a schematic diagram of a possible, non-limiting network architecture of a communication method according to an embodiment of the application. In the scenario illustrated in FIG. 2, the network architecture includes a server 1, a server 2, a switch 1, a switch 2, a switch 3, a switch 4, a controller, and a network. The server 1 and the server 2 communicate through a plurality of forwarding devices (e.g., switch 1, switch 2, switch 3, and switch 4 shown in FIG. 2). As shown in FIG. 2, a message sent by the server 1 reaches the server 2 through switch 1 and switch 3, and a message sent by the server 2 reaches the server 1 through switch 4 and switch 2. The controller is an analyzer configured to receive failure information of traffic flows from the respective forwarding devices so as to analyze the failed traffic flows.
It should be understood that the network architecture shown in FIG. 2 is merely exemplary; in practical applications, a failure may occur on any forwarding path in any network architecture. The communication method of the embodiment of the application is applicable to at least one forwarding device on the transmission path of a service flow (message), so as to detect service flow faults. The forwarding device is a communication entity that transmits signals, receives signals, or both. Optionally, the communication entity may be a router, a switch, a virtual router, or an intelligent network card, which is not limited herein.
Next, a communication method in the embodiment of the present application will be described. Referring to FIG. 3, FIG. 3 is a flow chart of a communication method according to an embodiment of the application. The method is described with the forwarding device as the execution body, but the present application does not limit the execution body of the method, nor the specific hardware form or software form of the forwarding device. For example, the forwarding device in FIG. 3 may be a chip, a chip system, or a processor that supports the forwarding device in implementing the method; a logic node, a logic module, or software that implements all or part of the functions of the forwarding device; or a collective name for multiple logic nodes, logic modules, or pieces of software that implement the communication method. As shown in FIG. 3, the communication method in the embodiment of the present application includes, but is not limited to, steps 101 to 103.
101. The forwarding device receives the first message.
The forwarding device is a forwarding device on the path of the target service flow from the source device to the destination device. After a message in the target service flow is sent from the source device, it is forwarded by the forwarding device so that it can reach the destination device. Each message of the target service flow includes sequence number information corresponding to the message. In the embodiment of the application, the sequence number information of a message is used to implement important functions such as ordered transmission, retransmission upon packet loss, and error recovery, thereby ensuring reliable transmission of the service flow in the network. In one possible implementation, the sequence number information of the message may be any of the following: the sequence number (Sequence Number) in the TCP header, which indicates the location of the data content of the message within the complete data content of the target traffic flow; the acknowledgement number (Acknowledgment Number) in the TCP header, which indicates the location in the target traffic flow up to which the destination device has successfully received messages; both the sequence number and the acknowledgement number in the TCP header; or the packet sequence number (Packet Sequence Number, PSN) in a remote direct memory access (Remote Direct Memory Access, RDMA) message, which indicates the order of RDMA messages sent or received through a queue pair (Queue Pair, QP).
In the embodiment of the application, a first message in a target service flow is transmitted to forwarding equipment, and the first message comprises first sequence number information corresponding to the first message. And after the forwarding equipment receives the first message, the forwarding equipment sends the first message to the next hop equipment.
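For the RDMA case, the PSN and destination QP can be read from the 12-byte base transport header (BTH) that RDMA transports such as RoCEv2 carry. The sketch below is illustrative only: the function name is an assumption, the application does not prescribe a parsing method, and the byte offsets reflect the BTH layout as commonly documented (24-bit destination QP in bytes 5 to 7, 24-bit PSN in bytes 9 to 11).

```python
def parse_bth(bth: bytes):
    """Hypothetical helper: return (dest_qp, psn) from a 12-byte BTH.

    Assumed layout: byte 0 opcode, byte 1 flags, bytes 2-3 partition key,
    byte 4 reserved, bytes 5-7 destination QP, byte 8 ack-request bit plus
    reserved bits, bytes 9-11 packet sequence number.
    """
    if len(bth) < 12:
        raise ValueError("need at least 12 bytes of BTH")
    dest_qp = int.from_bytes(bth[5:8], "big")
    psn = int.from_bytes(bth[9:12], "big")
    return dest_qp, psn

# Synthetic header for demonstration: destination QP 0x12, PSN 0xABC.
demo_bth = (bytes([0x04, 0x00, 0xFF, 0xFF, 0x00])
            + (0x12).to_bytes(3, "big")
            + bytes([0x80])
            + (0xABC).to_bytes(3, "big"))
demo_qp_psn = parse_bth(demo_bth)
```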
102. The forwarding device obtains second sequence number information of the second message.
The forwarding device obtains second sequence number information of a second message, wherein the first message and the second message are both from a target service flow, namely, the first message and the second message are different messages in the same service flow. The first message and the second message may have the same transmission direction, that is, the source IP address of the first message is the same as the source IP address of the second message, and the destination IP address of the first message is the same as the destination IP address of the second message, or may have different transmission directions, that is, the source IP address of the first message is the same as the destination IP address of the second message, and the destination IP address of the first message is the same as the source IP address of the second message.
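The relationship between the transmission directions of the two messages can be sketched as a comparison of their address fields. The `FlowKey` type and the `direction` function are hypothetical names introduced for illustration; ports are included here as in the TCP case and would be replaced by the QP in an RDMA scenario.

```python
from typing import NamedTuple

class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int

def direction(first: FlowKey, second: FlowKey) -> str:
    """Hypothetical helper: relate the second message to the first."""
    if first == second:
        return "same"       # same source and destination
    reversed_second = FlowKey(second.dst_ip, second.src_ip,
                              second.dst_port, second.src_port)
    if first == reversed_second:
        return "reverse"    # source and destination swapped
    return "unrelated"      # not the same service flow

demo_a = FlowKey("10.0.0.1", "10.0.0.2", 40000, 80)
demo_b = FlowKey("10.0.0.2", "10.0.0.1", 80, 40000)
```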
In the embodiment of the present application, the execution sequence of step 101 and step 102 is not limited. For example, step 101 may be performed first and then step 102 may be performed, or step 102 may be performed first and then step 101 may be performed, or step 101 and step 102 may be performed simultaneously, and the specific embodiments are not limited herein.
In one possible implementation, a second message in the target traffic flow is delivered to the forwarding device, where the second message includes second sequence number information. After the forwarding device receives the second message, the forwarding device can acquire second sequence number information of the second message. On the other hand, the forwarding device records the time when the second message arrives at the forwarding device, so as to obtain the first arrival time of the second message, in other words, the first arrival time of the second message is the time when the second message arrives at the forwarding device.
In one possible implementation, the forwarding device establishes a peer link with several other forwarding devices. The second message in the target traffic is not transferred to the forwarding device, but to other forwarding devices with which the peer link is established. The forwarding device may synchronously receive the second sequence number information of the second packet through the peer link. Therefore, the forwarding device can perform fault judgment on the target service flow based on the messages in the target service flow received by other forwarding devices, so that the communication method of the embodiment of the application is still applicable to the scene of inconsistent transmission paths of the first message and the second message, the flexibility of the scheme is improved, and the applicable scene of the scheme is increased. The first arrival time of the second message may be a time when the second message is transmitted to the peer link of the forwarding device, or may be a time when the forwarding device receives the second sequence number information, which is not limited herein.
For ease of understanding, referring to fig. 4, fig. 4 is a schematic diagram of one possible scenario in which the forwarding device obtains the second sequence number information in an embodiment of the present application. As shown in fig. 4, the forwarding device a establishes a peer link with the forwarding device B. The first message is transmitted to the forwarding device a, and the second message is transmitted to the forwarding device B. The forwarding device B may synchronize the second sequence number information of the second message to the forwarding device a through the peer link, so that the forwarding device a obtains the second sequence number information of the second message, or may synchronize the first sequence number information of the first message to the forwarding device B through the peer link, so that the forwarding device B obtains the first sequence number information of the first message.
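One way a forwarding device might keep the latest sequence number information and arrival time per flow direction, whether the information comes from a locally received message or from a peer link, is sketched below. The class and field names are assumptions; for information synchronized over the peer link, the sketch takes the arrival time as the time the information is received, which is one of the two options the description allows.

```python
from dataclasses import dataclass

@dataclass
class SeqRecord:
    seq: int
    ack: int
    arrival: float        # time the information became available (seconds)
    via_peer_link: bool   # True if synchronized from another forwarding device

class FlowTable:
    """Hypothetical per-direction store of the latest sequence number info."""

    def __init__(self):
        self._records = {}

    def update(self, key, seq, ack, now, via_peer_link=False):
        # Keep only the most recent record for this flow direction.
        self._records[key] = SeqRecord(seq, ack, now, via_peer_link)

    def latest(self, key):
        # Returns None when nothing has been seen for this direction.
        return self._records.get(key)

demo_table = FlowTable()
demo_key = ("10.0.0.2", "10.0.0.1", 80, 40000)
demo_table.update(demo_key, seq=100, ack=50, now=1.0)
demo_table.update(demo_key, seq=200, ack=60, now=2.0, via_peer_link=True)
```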
103. The forwarding device determines, according to the first sequence number information and the second sequence number information, that the target service flow fails.
In the embodiment of the application, the forwarding device can determine that the target service flow fails according to the first sequence number information and the second sequence number information. In practical application, the target service flow includes a plurality of messages, and the messages continuously pass through the forwarding device, so that the forwarding device can execute the communication method in the embodiment of the application based on the messages, thereby determining whether the target service flow has a fault or not, and timely sensing the fault of the target service flow. On the other hand, the communication method of the embodiment of the application does not need to modify and expand the original message format, has lower implementation complexity and higher fault detection efficiency.
In one possible implementation, after determining that the target traffic flow fails, the forwarding device generates failure information for the target traffic flow, where the failure information indicates that the target traffic flow fails. Specifically, the fault information may include, but is not limited to, a time when the target traffic flow is determined to be faulty, first sequence number information of the first message, second sequence number information of the second message, and a quintuple (or triplet) of the target traffic flow. Illustratively, the quintuple can be a source IP address, a destination IP address, a source port, a destination port, and a transport layer protocol of the target traffic in a TCP scenario, and the triplet can be a source IP address, a destination IP address, and a QP of the target traffic in an RDMA scenario. Optionally, the fault information of the target service flow may be queried by a manager in real time, or the forwarding device may further send fault information for the target service flow to the controller, where the fault information is used to indicate that the target service flow has a fault. After the controller receives the fault information for the target traffic, the fault of the target traffic can be analyzed.
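The fault information described above could, for example, be represented as a small record and serialized before being sent to the controller. The field names and the JSON encoding are assumptions made for illustration; the application does not specify a report format.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class FaultInfo:
    detected_at: float       # time the fault was determined (seconds)
    first_seq_info: dict     # e.g. {"seq": ...} or {"psn": ...}
    second_seq_info: dict    # e.g. {"ack": ...} or {"psn": ...}
    flow_id: tuple           # quintuple (TCP) or triplet (RDMA)

def to_report(info: FaultInfo) -> str:
    """Hypothetical helper: serialize fault information as JSON text."""
    return json.dumps(asdict(info), sort_keys=True)

demo_report = to_report(FaultInfo(
    detected_at=12.5,
    first_seq_info={"seq": 5000},
    second_seq_info={"ack": 4000},
    flow_id=("10.0.0.1", "10.0.0.2", 40000, 80, "TCP"),
))
```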
In the embodiment of the application, the forwarding device acquires the first sequence number information and the second sequence number information and can determine whether the target service flow fails or not through various judging logics. The following describes various decision logic in the embodiment of the present application.
First decision logic: the forwarding device determines faults of the target service flow based on two messages in different transmission directions in the target service flow.
In the TCP scene, the source IP address of the first message is the same as the destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, the source port of the first message is the same as the destination port of the second message, and the destination port of the first message is the same as the source port of the second message. The first Sequence Number information of the first message includes at least a Sequence Number (Sequence Number) in a TCP header of the first message, and the second Sequence Number information of the second message includes at least an acknowledgement Number (Acknowledgment Number) in a TCP header of the second message. Optionally, in practical application, the transport layer protocols of the first packet and the second packet are the same.
In one possible implementation, the first message and the second message are respectively the latest messages acquired by the forwarding device in two opposite delivery directions. Assuming that the target traffic flow includes two messages in opposite transmission directions, if the first message is the latest message acquired by the forwarding device in one transmission direction, the second message is the latest message acquired by the forwarding device (or other devices that establish a peer link with the forwarding device) in the other transmission direction.
The forwarding device obtains the first sequence number information (namely, the sequence number of the first message) and the second sequence number information (namely, the acknowledgement number of the second message). If the sequence number in the TCP header of the first message is greater than the acknowledgement number in the TCP header of the second message, and the difference between the current time and the first arrival time of the second message is greater than a preset threshold, the forwarding device determines that the target service flow fails. The current time is the time at which the forwarding device performs fault determination for the target service flow. In practical applications, the frequency, period, and timing of fault determination and the size of the preset threshold can be configured according to service requirements and the network environment, which improves the flexibility of the scheme. Specifically, for the two opposite transmission directions of the target traffic flow, the sequence number of a message in one direction corresponds to the acknowledgement number of a message in the other direction. If the difference between the current time and the first arrival time of the second message is greater than the preset threshold, the forwarding device has not received any message of the target service flow in the transmission direction of the second message for a long time. If, in addition, the sequence number in the TCP header of the first message is greater than the acknowledgement number in the TCP header of the second message, the target traffic flow is still being transmitted in the transmission direction of the first message. This rules out the possibility that the forwarding device has received no message in the transmission direction of the second message simply because the target traffic flow has stopped.
Thus, the forwarding device may determine that the target traffic flow is malfunctioning.
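As a minimal sketch of the first decision logic in the TCP scene, the two conditions can be written as one predicate. The function and parameter names are illustrative and do not come from the application, and wraparound of the 32-bit TCP sequence space is ignored for brevity.

```python
def first_logic_tcp_fault(first_seq: int, second_ack: int,
                          second_arrival: float, now: float,
                          threshold: float) -> bool:
    """Opposite-direction check: True when the target flow is judged faulty.

    All names are hypothetical; times are in the same unit as `threshold`.
    """
    # The first message's direction carries data beyond what the other
    # direction has acknowledged, so the flow has not simply stopped.
    still_sending = first_seq > second_ack
    # Nothing has arrived in the second message's direction for too long.
    reverse_silent = (now - second_arrival) > threshold
    return still_sending and reverse_silent
```

For example, `first_logic_tcp_fault(2000, 1500, 10.0, 15.0, 3.0)` flags a fault, whereas the same call with `first_seq=1500` does not, because all data sent so far has already been acknowledged.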
The first decision logic described above also applies to RDMA scenarios. Specifically, the first message and the second message are messages in the RDMA protocol, the source IP address of the first message is the same as the destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, and the Queue Pairs (QP) of the first message and the second message are different. That is, the first packet and the second packet are packets in different transmission directions of the same RDMA stream; in other words, they have the same Queue Pair Context (QPC) and Queue Pair Key (QKey), but the destination QP (Destination QP) of the first packet differs from the destination QP of the second packet. Optionally, the first message may be a request message in an RDMA service flow and the second message a response message in that flow, or the first message may be a response message in an RDMA service flow and the second message a request message in that flow. The first sequence number information includes the PSN of the first message, and the second sequence number information includes the PSN of the second message. If the PSN of the first message is greater than the PSN of the second message, and the difference between the current time and the first arrival time of the second message is greater than a preset threshold, the forwarding device determines that the target service flow fails. In particular, the RDMA protocol provides a response mechanism to ensure that RDMA messages are reliably received.
In the RDMA scene with the response mechanism, after receiving a request message from a source device, a destination device of a target service flow determines a PSN of a response message of the request message according to the PSN of the request message, wherein the PSN of the response message indicates the position of the data content of the response message in the complete data content of the target service flow, and also indicates the position of the request message in the target service flow, which the destination device has successfully received. After receiving the response message, the source device continues to determine the PSN of the new request message according to the PSN in the response message so as to continue to send the new request message. Thus, in an RDMA scenario, the PSN of a message in one direction corresponds to the PSN of a message in the other direction for two opposite delivery directions of the target traffic. If the difference between the current time and the first arrival time of the second message is greater than a preset threshold, the fact that the message of the target service flow is not received for a long time in the transmission direction of the second message is indicated. And if the PSN of the first message is greater than that of the second message, it indicates that the target service flow is continuously transmitted in the transmission direction of the first message, that is, the message that the target service flow is not received for a long time in the transmission direction of the second message of the forwarding device due to the stop of the target service flow can be eliminated. Thus, the forwarding device may determine that the target traffic flow is malfunctioning.
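The RDMA variant of the first decision logic can be sketched as follows, including the address and QP conditions that identify two messages as opposite directions of one RDMA stream. The `RdmaObs` record and its field names are assumptions made for illustration, and PSN wraparound is not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RdmaObs:
    """Hypothetical record of one observed RDMA message."""
    src_ip: str
    dst_ip: str
    dest_qp: int
    psn: int
    arrival: float


def opposite_directions(a: RdmaObs, b: RdmaObs) -> bool:
    # Addresses are mirrored and the destination QPs differ.
    return (a.src_ip == b.dst_ip and a.dst_ip == b.src_ip
            and a.dest_qp != b.dest_qp)


def first_logic_rdma_fault(first: RdmaObs, second: RdmaObs,
                           now: float, threshold: float) -> bool:
    if not opposite_directions(first, second):
        return False  # not a request/response pair of the same stream
    # PSN still advancing in one direction while the other has gone silent.
    return first.psn > second.psn and (now - second.arrival) > threshold
```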
In practical application, the forwarding device may establish a service flow table for the service flows arriving at the forwarding device, where sequence number information and arrival time of the messages from each service flow are recorded in the service flow table, so that the forwarding device may perform fault determination based on the first message and the second message of each service flow.
Referring to fig. 5, fig. 5 is a schematic diagram of a possible scenario of a communication method according to an embodiment of the application. As shown in fig. 5, both forwarding device A and forwarding device B may establish a service flow table for service flows arriving at the forwarding device, where the service flow table is used to record sequence number information and arrival time of messages from each service flow. It is assumed that the uplink message and the downlink message are messages in two opposite directions of the target service flow. Uplink and downlink are relative terms; in the example of fig. 5, a message from forwarding device A to forwarding device B is taken as an uplink message, and a message from forwarding device B to forwarding device A is taken as a downlink message. The KEY field in each entry of the service flow table records an identifier of the service flow received by the forwarding device, for example, a five-tuple or a triplet of the service flow. The Additional Data (AD) field in the entry corresponding to the service flow records the sequence number of the uplink message (l2r seq in the table), the acknowledgement number of the downlink message (r2l ack seq in the table), the arrival time of the downlink message (r2l timestamp in the table), and the like.
In the scenario shown in fig. 5, the uplink and downlink messages of the target service flow share the same entry in the service flow table. The forwarding device may build the table according to the five-tuple or the triplet of the uplink message, or according to the five-tuple or the triplet of the downlink message. In the following, forwarding device A and forwarding device B are both described taking the five-tuple of the uplink message as an example. When the initial message of the target service flow (whether an uplink message or a downlink message) arrives at the forwarding device, the forwarding device identifies whether the message is an uplink message or a downlink message, and then establishes an entry for the target service flow in the service flow table according to the five-tuple of the message. If the message is an uplink message, the forwarding device records the sequence number of the uplink message; when the forwarding device subsequently receives a new uplink message of the target service flow, it updates the sequence number of the uplink message in the entry of the target service flow. If the forwarding device subsequently receives a downlink message of the target service flow, it exchanges the source IP address with the destination IP address and the source port with the destination port of the downlink message to obtain the five-tuple that matches the entry of the target service flow, and then updates the acknowledgement number and the arrival time of the downlink message in the entry. Next, the forwarding device periodically queries the entries of the service flows in the service flow table, and takes the time at which a given service flow is queried as the current time (curTS), which triggers the forwarding device to execute the first decision logic provided by the embodiment of the present application.
Specifically, when both l2r seq > r2l ack seq and curTS - r2l timestamp > the preset threshold (Threshold) are satisfied, the forwarding device determines that the service flow corresponding to the currently queried entry fails. After determining that the service flow fails, the forwarding device generates fault information for the service flow, where the fault information indicates that the service flow fails. In particular, the fault information may include, but is not limited to, the current time (curTS), the sequence number of the first message (l2r seq), the acknowledgement number of the second message (r2l ack seq), and the five-tuple (or triplet) of the service flow. Optionally, the fault information of the target service flow may be queried by an administrator in real time, or the forwarding device may send the fault information for the service flow to the controller. After the controller receives the fault information for the service flow, the controller can analyze the fault of the service flow.
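The shared-entry table of fig. 5 and the periodic query can be sketched as below. This is an illustration under stated assumptions, not the applicant's implementation: the class and method names are invented, the uplink/downlink classification is passed in by the caller, and entries for which only one direction has been seen are skipped during the scan.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# (src_ip, dst_ip, src_port, dst_port, protocol)
FiveTuple = Tuple[str, str, int, int, str]


@dataclass
class SharedEntry:
    l2r_seq: Optional[int] = None          # sequence number, uplink
    r2l_ack_seq: Optional[int] = None      # acknowledgement number, downlink
    r2l_timestamp: Optional[float] = None  # arrival time, downlink


def reverse_key(key: FiveTuple) -> FiveTuple:
    src_ip, dst_ip, sport, dport, proto = key
    return (dst_ip, src_ip, dport, sport, proto)


class SharedFlowTable:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.entries: Dict[FiveTuple, SharedEntry] = {}

    def on_packet(self, key: FiveTuple, seq: int, ack: int,
                  now: float, uplink: bool) -> None:
        # Downlink packets are mapped onto the uplink key so that both
        # directions share one entry.
        table_key = key if uplink else reverse_key(key)
        entry = self.entries.setdefault(table_key, SharedEntry())
        if uplink:
            entry.l2r_seq = seq
        else:
            entry.r2l_ack_seq = ack
            entry.r2l_timestamp = now

    def scan(self, cur_ts: float) -> List[dict]:
        """Periodic query; returns one fault record per failing flow."""
        faults = []
        for key, e in self.entries.items():
            if e.l2r_seq is None or e.r2l_ack_seq is None:
                continue  # assumption: need both directions to judge
            if (e.l2r_seq > e.r2l_ack_seq
                    and cur_ts - e.r2l_timestamp > self.threshold):
                faults.append({"curTS": cur_ts, "l2r_seq": e.l2r_seq,
                               "r2l_ack_seq": e.r2l_ack_seq,
                               "five_tuple": key})
        return faults
```

The records returned by `scan` carry the fields listed above for the fault information, so they could be exposed for querying or forwarded to a controller.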
Referring to fig. 6, fig. 6 is a schematic diagram of another possible scenario of a communication method according to an embodiment of the application. As shown in fig. 6, both forwarding device A and forwarding device B may build a service flow table for service flows arriving at the forwarding device. It is assumed that the uplink message and the downlink message are messages in two opposite directions of the target service flow. Uplink and downlink are relative terms; in the example of fig. 6, a message from forwarding device A to forwarding device B is taken as an uplink message, and a message from forwarding device B to forwarding device A is taken as a downlink message. Here, the uplink message and the downlink message of the service flow each occupy their own entry in the service flow table. Each entry records the sequence number of the uplink or downlink message (l2r seq in the table), its acknowledgement number (r2l ack seq in the table), and its arrival time at the forwarding device (l2r timestamp in the table).
When the initial message of the target service flow (whether an uplink message or a downlink message) arrives at the forwarding device, the forwarding device identifies whether the message is an uplink message or a downlink message, and then establishes an entry for the target service flow in the service flow table according to the five-tuple of the message. If the message is an uplink message, the forwarding device creates an entry corresponding to the uplink message, which records the sequence number (l2r seq in the table), the acknowledgement number (r2l ack seq in the table), and the arrival time at the forwarding device (l2r timestamp in the table) of the uplink message. When the forwarding device subsequently receives a new uplink message of the target service flow, it updates the sequence number, the acknowledgement number, and the arrival time in that entry. If the message is a downlink message, the forwarding device records the sequence number (l2r seq in the table), the acknowledgement number (r2l ack seq in the table), and the arrival time (l2r timestamp in the table) of the downlink message in the entry corresponding to the downlink message. When the forwarding device subsequently receives a new downlink message of the target service flow, it updates the sequence number, the acknowledgement number, and the arrival time in the entry of the downlink message.
Next, the forwarding device periodically queries the entries of the service flows in the service flow table. The time at which a given entry is queried is taken as the current time (curTS). The forwarding device swaps the KEY of that entry (exchanging the source IP address with the destination IP address, and the source port with the destination port) to find the entry that belongs to the same service flow but has the opposite transmission direction. When both sequence 1 > sequence 4 and curTS - ts2 > the preset threshold (Threshold) are satisfied, the forwarding device determines that the service flow corresponding to the currently queried entry fails.
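The key swap used to locate the opposite-direction entry can be sketched as follows. Because the values labelled in fig. 6 are not reproduced in the text, the mapping of fields is an assumption: the sketch compares the queried entry's sequence number with the reverse entry's acknowledgement number and uses the reverse entry's arrival time, following the first decision logic. All names are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple

Key = Tuple[str, str, int, int]  # (src_ip, dst_ip, src_port, dst_port)


class DirEntry(NamedTuple):
    seq: int
    ack_seq: int
    timestamp: float


def swap_key(key: Key) -> Key:
    src_ip, dst_ip, sport, dport = key
    return (dst_ip, src_ip, dport, sport)


def check_entry(table: Dict[Key, DirEntry], key: Key,
                cur_ts: float, threshold: float) -> bool:
    entry = table[key]
    reverse = table.get(swap_key(key))
    if reverse is None:
        return False  # assumption: no verdict without the reverse entry
    return (entry.seq > reverse.ack_seq
            and cur_ts - reverse.timestamp > threshold)
```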
The second decision logic is that the forwarding device determines a fault of the target service flow based on two messages in the same transmission direction of the target service flow.
In the TCP scene, the source IP address of the first message is the same as the source IP address of the second message, the destination IP address of the first message is the same as the destination IP address of the second message, the source port of the first message is the same as the source port of the second message, and the destination port of the first message is the same as the destination port of the second message. The first sequence number information of the first message comprises a sequence number and an acknowledgement number in a TCP header of the first message, and the second sequence number information of the second message comprises a sequence number and an acknowledgement number in a TCP header of the second message. Optionally, in practical application, the transport layer protocols of the first packet and the second packet are the same.
The first message is the latest message currently acquired by the forwarding device, and the second message is the previous message of the first message in the same transmission direction. I.e. the forwarding device receives the second message first and then the first message. And after the first message arrives at the forwarding equipment, the forwarding equipment judges whether the target service flow fails or not by taking the time of the first message arriving at the forwarding equipment as the current time. If the sequence number in the TCP header of the first message is smaller than or equal to the sequence number in the TCP header of the second message, and the acknowledgement number in the TCP header of the first message is smaller than or equal to the acknowledgement number in the TCP header of the second message, and the difference between the current time and the first arrival time of the second message is larger than a preset threshold value, it is indicated that the first message received by the forwarding device belongs to a retransmission message, and the forwarding device can determine that the target service flow fails. The current time is the time when the first message arrives at the forwarding device.
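The same-direction check can be sketched as a single predicate. The names are illustrative, and the sketch assumes the caller supplies the arrival time of the first message as `now`, as described above.

```python
def second_logic_tcp_fault(first_seq: int, first_ack: int,
                           second_seq: int, second_ack: int,
                           second_arrival: float, now: float,
                           threshold: float) -> bool:
    """Same-direction check; `second` is the earlier message.

    Hypothetical helper: True when the newer message makes no progress
    in either sequence or acknowledgement number after a long gap, which
    the text treats as a retransmission.
    """
    no_progress = first_seq <= second_seq and first_ack <= second_ack
    long_gap = (now - second_arrival) > threshold
    return no_progress and long_gap
```

A message that repeats sequence number 1000 and acknowledgement number 500 two seconds after the previous one, with a one-second threshold, is flagged; the same message arriving after half a second is not.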
The second decision logic described above also applies to RDMA scenarios. Specifically, the first message and the second message are messages of the same type in the RDMA protocol, the source IP address of the first message is the same as the source IP address of the second message, the destination IP address of the first message is the same as the destination IP address of the second message, the Queue Pair (QP) of the first message and the second message is the same, the first sequence number information includes the PSN of the first message, and the second sequence number information includes the PSN of the second message.
The first message is the latest RDMA message currently acquired by the forwarding device, and the second message is the previous RDMA message of the first message in the same transmission direction. I.e. the forwarding device receives the second message first and then the first message. And after the first message arrives at the forwarding equipment, the forwarding equipment judges whether the target service flow fails or not by taking the time of the first message arriving at the forwarding equipment as the current time. In the RDMA protocol, the PSN of an RDMA message indicates the location of the data content of the RDMA message within the full data content of the target traffic stream. Therefore, if the PSN of the first packet is smaller than or equal to the PSN of the second packet, and the difference between the current time and the first arrival time of the second packet is greater than the preset threshold, it is indicated that the first packet received by the forwarding device belongs to the retransmission packet, and the forwarding device can determine that the target service flow fails. The current time is the time when the first message arrives at the forwarding device.
In practical application, the forwarding device may establish a service flow table for the messages in the single transmission direction in the service flow, where sequence number information and arrival time of the messages in the single transmission direction from each service flow are recorded in the service flow table, so that the forwarding device may perform fault determination based on the first message and the second message of each service flow.
Referring to fig. 7, fig. 7 is a schematic diagram of another possible scenario of a communication method according to an embodiment of the application. As shown in fig. 7, both forwarding device A and forwarding device B may establish a service flow table for service flows arriving at the forwarding device, where the service flow table is used to record sequence number information and arrival times of messages in a single delivery direction of each service flow. In the scenario illustrated in fig. 7, the forwarding device builds a service flow table only for the uplink messages of the service flow. The description takes as an example forwarding device A and forwarding device B building the table according to the five-tuple of the uplink message. Uplink and downlink are relative terms; in the example of fig. 7, a message from forwarding device A to forwarding device B is taken as an uplink message, and a message from forwarding device B to forwarding device A is taken as a downlink message. The KEY field in each entry of the service flow table records an identifier of the service flow received by the forwarding device, for example, a five-tuple or a triplet of the service flow. The Additional Data (AD) field in the entry corresponding to the service flow records the sequence number of the uplink message (l2r seq in the table), the acknowledgement number of the uplink message (l2r ack seq in the table), the arrival time of the uplink message (l2r timestamp in the table), and the like.
In practical application, the forwarding device may perform table building according to the five-tuple or the three-tuple of the uplink message, and the forwarding device performs fault determination based on the two messages in the uplink direction, or may perform table building according to the five-tuple or the three-tuple of the downlink message, and the forwarding device performs fault determination based on the two messages in the downlink direction.
When the initial uplink message of the target service flow arrives at the forwarding device, the forwarding device identifies the message as an uplink message and establishes an entry for the target service flow in the service flow table according to the five-tuple of the uplink message, where the entry includes the sequence number of the uplink message (l2r seq in the table), the acknowledgement number of the uplink message (l2r ack seq in the table), and the arrival time of the uplink message (l2r timestamp in the table). When the forwarding device subsequently receives a new uplink message of the target service flow, the forwarding device is triggered to execute the second decision logic provided by the embodiment of the application. The forwarding device takes the time at which the new uplink message arrives at the forwarding device as the current time (curTS), takes the message recorded in the service flow table as the second message, and takes the newly received message as the first message. The sequence number of the message recorded in the service flow table is the second sequence number of the second message (l2r seq in the table), the acknowledgement number of the message recorded in the service flow table is the second acknowledgement number of the second message (l2r ack seq in the table), and the arrival time of the message recorded in the service flow table is the first timestamp of the second message (l2r timestamp in the table). The sequence number of the newly received message is the first sequence number of the first message (seq), and the acknowledgement number of the newly received message is the first acknowledgement number of the first message (ack seq).
Specifically, when seq <= l2r seq, ack seq <= l2r ack seq, and curTS - l2r timestamp > the preset threshold (Threshold) are all satisfied, the forwarding device determines that the service flow corresponding to the entry fails. If the condition is not satisfied, the forwarding device determines that the target service flow has not currently failed, takes the newly received message as the second message, and writes the sequence number, the acknowledgement number, and the arrival time of the newly received message into the entry for the target service flow in the service flow table.
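The per-arrival processing for fig. 7 can be sketched as a small stateful table. This is a sketch under assumptions: the class and field names are invented, and the entry is left unchanged when a fault is reported, a case the text does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (src_ip, dst_ip, src_port, dst_port, protocol)
FlowKey = Tuple[str, str, int, int, str]


@dataclass
class UplinkEntry:
    l2r_seq: int
    l2r_ack_seq: int
    l2r_timestamp: float


class UplinkFlowTable:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.entries: Dict[FlowKey, UplinkEntry] = {}

    def on_uplink(self, key: FlowKey, seq: int, ack_seq: int,
                  cur_ts: float) -> Optional[dict]:
        """Process one uplink message; return fault info or None."""
        prev = self.entries.get(key)
        if prev is None:
            # Initial uplink message of the flow: create the entry.
            self.entries[key] = UplinkEntry(seq, ack_seq, cur_ts)
            return None
        if (seq <= prev.l2r_seq and ack_seq <= prev.l2r_ack_seq
                and cur_ts - prev.l2r_timestamp > self.threshold):
            # Assumption: keep the stored entry as-is after a fault.
            return {"curTS": cur_ts, "seq": seq, "ack_seq": ack_seq,
                    "l2r_seq": prev.l2r_seq,
                    "l2r_ack_seq": prev.l2r_ack_seq, "five_tuple": key}
        # No fault: the new message becomes the stored "second message".
        self.entries[key] = UplinkEntry(seq, ack_seq, cur_ts)
        return None
```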
After determining that the service flow fails, the forwarding device generates failure information for the service flow, where the failure information indicates that the service flow fails. Specifically, the fault information may include, but is not limited to, a current time (curTS), a sequence number (seq) of the first message, an acknowledgement number (ack seq) of the first message, a sequence number (l 2r seq) of the second message, an acknowledgement number (l 2r ack seq) of the second message, and a five tuple (or triplet) of the traffic flow. Optionally, the fault information of the target service flow may be queried by a manager in real time, or the forwarding device may further send fault information for the service flow to the controller, where the fault information is used to indicate that the service flow has a fault. After the controller receives the fault information for the service flow, the controller can analyze the fault of the service flow.
The second decision logic described above also applies to RDMA scenarios. Specifically, the first message and the second message are messages of the same type in the RDMA protocol, the source IP address of the first message is the same as the source IP address of the second message, the destination IP address of the first message is the same as the destination IP address of the second message, the Queue Pair (QP) of the first message and the second message is the same, the first sequence number information includes the PSN of the first message, and the second sequence number information includes the PSN of the second message. If the PSN of the first message is smaller than or equal to the PSN of the second message, and the difference between the current time and the arrival time of the second message is greater than a preset threshold, the forwarding device determines that the target service flow fails. Specifically, RDMA messages include, but are not limited to, SEND messages, RECEIVE messages, READ messages, WRITE messages, and the like, where the PSNs of different types of RDMA messages are independent of each other. In the communication method of the embodiment of the application, the fault determination is therefore performed based on two messages (the first message and the second message) of the same type in the RDMA protocol.
Referring to fig. 8, fig. 8 is a schematic diagram of a communication method applied to an RDMA scenario in an embodiment of the application. As shown in fig. 8, both forwarding device A and forwarding device B may establish a service flow table for service flows arriving at the forwarding device, where the service flow table is used to record sequence number information and arrival time of messages in a single delivery direction of each service flow. In the scenario illustrated in fig. 8, the forwarding device builds a service flow table only for the uplink messages of the service flow. Uplink and downlink are relative terms; in the example of fig. 8, a message from forwarding device A to forwarding device B is taken as an uplink message, and a message from forwarding device B to forwarding device A is taken as a downlink message. The description takes as an example forwarding device A and forwarding device B building the table according to the triplet of the uplink messages of a certain RDMA type. The KEY field in each entry of the service flow table records the identifier of the service flow received by the forwarding device, for example, the triplet (source IP address, destination IP address, and QP) of the RDMA flow. The Additional Data (AD) field in the entry corresponding to the RDMA flow records the PSN of the uplink message (l2r PSN in the table), the arrival time of the uplink message (l2r timestamp in the table), and so on.
In practical application, the forwarding device may build a table according to the triplets of the uplink messages, and the forwarding device performs fault determination based on the two uplink messages, or may build a table according to the triplets of the downlink messages, and the forwarding device performs fault determination based on the two downlink messages.
When the initial uplink message of the target service flow arrives at the forwarding device, the forwarding device identifies the message as an uplink message and establishes an entry for the target service flow in the service flow table according to the triplet of the uplink message, where the entry includes the PSN of the uplink message (l2r PSN in the table) and the arrival time of the uplink message (l2r timestamp in the table). When the forwarding device subsequently receives a new uplink message of the target service flow, the forwarding device is triggered to execute the second decision logic provided by the embodiment of the application. The forwarding device takes the time at which the new uplink message arrives at the forwarding device as the current time (curTS), takes the message recorded in the service flow table as the second message, and takes the newly received message as the first message. The PSN of the message recorded in the service flow table is the second PSN of the second message (l2r PSN in the table), the arrival time of the message recorded in the service flow table is the first timestamp of the second message (l2r timestamp in the table), and the PSN of the newly received message is the first PSN of the first message (PSN).
Specifically, when both PSN <= l2r PSN and curTS - l2r timestamp > the preset threshold (Threshold) are satisfied, the forwarding device determines that the service flow corresponding to the entry fails. If the condition is not satisfied, the forwarding device determines that the target service flow has not currently failed, takes the newly received message as the second message, and writes the PSN and the arrival time of the newly received message into the entry for the target service flow in the service flow table.
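For the RDMA case of fig. 8, the table is keyed by the triplet and stores only a PSN and a timestamp. The sketch below assumes one such table per RDMA message type (since PSNs of different types are independent), uses invented names, ignores PSN wraparound, and leaves the entry untouched when a fault is reported.

```python
from typing import Dict, Tuple

RdmaKey = Tuple[str, str, int]      # (src_ip, dst_ip, qp)
RdmaEntry = Tuple[int, float]       # (l2r_psn, l2r_timestamp)


def on_rdma_uplink(table: Dict[RdmaKey, RdmaEntry], key: RdmaKey,
                   psn: int, cur_ts: float, threshold: float) -> bool:
    """Process one uplink RDMA message; True means a fault is flagged."""
    prev = table.get(key)
    if prev is not None:
        l2r_psn, l2r_timestamp = prev
        if psn <= l2r_psn and cur_ts - l2r_timestamp > threshold:
            return True  # PSN did not advance after a long gap
    # Initial message, or no fault: store the new PSN and arrival time.
    table[key] = (psn, cur_ts)
    return False
```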
The third decision logic is that the forwarding device determines a fault of the target service flow based on two messages in different transmission directions of the target service flow.
In the TCP scene, the source IP address of the first message is the same as the destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, the source port of the first message is the same as the destination port of the second message, and the destination port of the first message is the same as the source port of the second message. The first Sequence Number information of the first message includes a Sequence Number (Sequence Number) and an acknowledgement Number (Acknowledgment Number) in a TCP header of the first message, and the second Sequence Number information of the second message includes at least a Sequence Number (Sequence Number) and an acknowledgement Number (Acknowledgment Number) in a TCP header of the second message.
In one possible implementation, the first message and the second message are respectively the latest messages acquired by the forwarding device in two opposite delivery directions. Taking the first message as an uplink message and the second message as a downlink message as an example, the first message is the latest uplink message acquired by the forwarding device in the uplink direction, and the second message is the latest downlink message acquired by the forwarding device (or other devices establishing peer link with the forwarding device) in the downlink direction.
The forwarding device obtains the first sequence number information (i.e., the sequence number and the acknowledgement number of the first message) and the second sequence number information (i.e., the sequence number and the acknowledgement number of the second message). If the acknowledgement number in the TCP header of the first message is smaller than the sequence number in the TCP header of the second message, and the difference between the current time and the second arrival time of the first message is greater than a preset threshold, this indicates that the forwarding device can receive the target service flow in the direction of the second message, but has not received the target service flow in the direction of the first message for a long time (exceeding the preset duration), so that the acknowledgement number of the first message has stopped growing. Therefore, the forwarding device determines that the target service flow fails, and that the upstream path of the second message is normal and has not failed, where the upstream path of the second message refers to the path that the second message passes through on its way to the forwarding device. In this way, while perceiving the fault of the target service flow, the forwarding device can further determine a normal path traversed by the target service flow, which facilitates subsequent fault location.
The third decision logic described above also applies to RDMA scenarios. Specifically, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, and the QP of the first packet is different from the QP of the second packet. That is, the first packet and the second packet are packets in different transmission directions of the same RDMA stream; in other words, they have the same Queue Pair Context (QPC) and Queue Pair Key (QKey), but the destination QP (Destination QP) of the first packet differs from the destination QP of the second packet. Optionally, the first message may be a request message in an RDMA service flow and the second message a response message in that flow, or the first message may be a response message in an RDMA service flow and the second message a request message in that flow. The first sequence number information includes the PSN of the first message, and the second sequence number information includes the PSN of the second message. As described above, in an RDMA scenario where the response mechanism is enabled, the PSN of an RDMA message indicates the location of the data content of the RDMA message within the complete data content of the target service flow, and also indicates the location of the message in the target service flow that the destination device has successfully received. That is, for the two opposite delivery directions of the target service flow, the PSN of a message in one direction corresponds to the PSN of a message in the other direction.
If the PSN of the first packet is smaller than the PSN of the second packet and the difference between the current time and the second arrival time of the first packet is greater than the preset threshold, it indicates that the forwarding device may receive the target traffic in the direction of the second packet, but the forwarding device has not received the target traffic in the same direction as the first packet for a long time (exceeding the preset duration), so that the PSN of the first packet stops increasing. Therefore, the forwarding device determines that the target traffic flow fails, and an upstream path of the second packet is normal and does not fail, wherein the upstream path of the second packet refers to a path that the second packet passes through in the process of transmitting to the forwarding device.
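The third decision logic differs from the first in that it also yields a statement about a healthy path. A minimal sketch, with invented names: for TCP the caller passes the acknowledgement number of the first message and the sequence number of the second, and for RDMA the two PSNs.

```python
from typing import NamedTuple


class Verdict(NamedTuple):
    fault: bool
    # True only when a fault is found; no conclusion is drawn otherwise.
    second_upstream_path_known_good: bool


def third_logic(first_progress: int, second_progress: int,
                first_arrival: float, now: float,
                threshold: float) -> Verdict:
    """Hypothetical helper for the third decision logic.

    first_progress:  ack number (TCP) or PSN (RDMA) of the first message.
    second_progress: seq number (TCP) or PSN (RDMA) of the second message.
    """
    lagging = first_progress < second_progress
    stale = (now - first_arrival) > threshold
    fault = lagging and stale
    return Verdict(fault, fault)
```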
Referring to fig. 9, fig. 9 is a schematic diagram of a possible scenario of a communication method according to an embodiment of the application. As shown in fig. 9, both forwarding device A and forwarding device B may establish a traffic flow table for the traffic flows arriving at the forwarding device, where the traffic flow table is used to record the sequence number information and arrival time of packets from each traffic flow. It is assumed that the uplink packet and the downlink packet are packets in the two opposite directions of the target traffic flow. The terms uplink and downlink are relative; in the example of fig. 7, a packet from forwarding device B to forwarding device A is taken as an uplink packet, and a packet from forwarding device A to forwarding device B is taken as a downlink packet. The KEY field in each entry of the traffic flow table records an identifier of the traffic flow received by the forwarding device, for example, a five-tuple or a three-tuple of the traffic flow. The additional data (Additional Data, AD) field in the entry corresponding to the traffic flow records the acknowledgement number of the uplink packet (l2r ack seq in the table), the sequence number of the downlink packet (r2l seq in the table), the arrival time of the downlink packet (r2l timestamp in the table), and the like. Taking forwarding device B shown in fig. 9 as an example, if seq7 > seq6 and the current time (curTS) minus the second arrival time (ts3) is greater than the preset threshold (Threshold), that is, curTS - ts3 > Threshold, forwarding device B determines that the target traffic flow fails and that the upstream path of the second packet is normal and has not failed. The specific table-building process is similar to the description of the embodiments shown in fig. 5 to 8, and is not repeated here.
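As a non-authoritative sketch, a traffic flow table with a KEY field and an AD field could be organized as shown below. The class names, the method names and the five-tuple layout of `FlowKey` are assumptions made for illustration. The sketch also assumes that the caller has already normalized the key so that both directions of a flow map to the same entry, and it covers only the recording of the fields and the timeout test; the comparison of sequence numbers is left to whichever decision logic is configured.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical KEY layout: (src IP, dst IP, src port, dst port, protocol).
FlowKey = Tuple[str, str, int, int, str]


@dataclass
class AdditionalData:
    """AD field of one entry; the names mirror the labels used in the table."""
    l2r_ack_seq: Optional[int] = None      # acknowledgement number of the uplink packet
    r2l_seq: Optional[int] = None          # sequence number of the downlink packet
    r2l_timestamp: Optional[float] = None  # arrival time of the downlink packet


class FlowTable:
    def __init__(self) -> None:
        self._entries: Dict[FlowKey, AdditionalData] = {}

    def _entry(self, key: FlowKey) -> AdditionalData:
        # Create the entry the first time a flow is seen.
        return self._entries.setdefault(key, AdditionalData())

    def record_uplink(self, key: FlowKey, ack_seq: int) -> None:
        self._entry(key).l2r_ack_seq = ack_seq

    def record_downlink(self, key: FlowKey, seq: int, now: float) -> None:
        entry = self._entry(key)
        entry.r2l_seq = seq
        entry.r2l_timestamp = now

    def lookup(self, key: FlowKey) -> Optional[AdditionalData]:
        return self._entries.get(key)

    def downlink_idle_longer_than(self, key: FlowKey, now: float,
                                  threshold: float) -> bool:
        """True if the recorded downlink arrival time is older than threshold."""
        entry = self._entries.get(key)
        if entry is None or entry.r2l_timestamp is None:
            return False
        return (now - entry.r2l_timestamp) > threshold
```

A dictionary keyed by the flow identifier is used here only for brevity; a forwarding device would more plausibly hold such a table in hardware lookup memory.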
In one possible implementation, the transmission path of the target service flow includes a plurality of forwarding devices, where each forwarding device may apply the communication method provided by the embodiment of the present application to determine a failure occurring in the target service flow. Wherein, in each forwarding device, one or more decision logics provided by the embodiment of the application can be configured arbitrarily. If the first message and the second message received by the forwarding device meet any one of the judging logic in the embodiment of the application, the forwarding device can determine that the target service flow fails.
Further, as shown in fig. 9, with the third decision logic in the embodiment of the present application, the forwarding device can also determine that the upstream path of the second packet is normal and has not failed. In this case, the cause of the fault is unrelated to the devices on the upstream path of the second packet (that is, the upstream devices of the second packet). The forwarding device may therefore send a fault notification to the upstream device of the second packet, where the fault notification is used to indicate that the upstream path of the second packet is normal, so that the upstream device of the second packet can avoid performing an invalid path switch. Illustratively, the fault notification includes, but is not limited to, a five-tuple or a three-tuple of the target traffic flow.
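The embodiment does not specify an encoding for the fault notification. Purely as an assumption for illustration, the sketch below carries the flow identifier and an "upstream path normal" flag in a JSON payload; the class `FaultNotification`, its field names and the helper `decode` are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass
class FaultNotification:
    flow_id: Tuple[object, ...]  # five-tuple or three-tuple of the target traffic flow
    upstream_path_normal: bool   # True: the upstream path of the second packet is normal

    def encode(self) -> bytes:
        # JSON is an assumed wire format, chosen only to keep the sketch short.
        return json.dumps(asdict(self)).encode("utf-8")


def decode(payload: bytes) -> FaultNotification:
    fields = json.loads(payload.decode("utf-8"))
    return FaultNotification(flow_id=tuple(fields["flow_id"]),
                             upstream_path_normal=fields["upstream_path_normal"])
```

An upstream device that decodes such a notification and finds the flag set could, under this sketch, skip the path switch for the identified flow.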
Correspondingly, the embodiment of the application also provides a related device for implementing the scheme. Specifically, referring to fig. 10, fig. 10 is a schematic structural diagram of a communication device according to an embodiment of the present application. The communication device in fig. 10 may be a chip, a chip system, or a processor for supporting the forwarding apparatus to implement the method, or the communication device may also be a logic node, a logic module, or software for implementing all or part of the functions of the communication device, or the communication device may also be a generic term for a plurality of logic nodes, a plurality of logic modules, or a plurality of software for implementing the communication method. As shown in fig. 10, the communication apparatus includes:
a transceiver unit 201, configured to receive a first message, where the first message includes first sequence number information;
The transceiver unit 201 is further configured to obtain second sequence number information of a second packet, where the first packet and the second packet correspond to a target service flow;
The processing unit 202 is configured to determine that the target traffic flow fails according to the first sequence number information and the second sequence number information.
In one possible implementation, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the source port of the first packet is the same as the destination port of the second packet, the destination port of the first packet is the same as the source port of the second packet, the first sequence number information includes a sequence number in a Transmission Control Protocol (TCP) header of the first packet, and the second sequence number information includes an acknowledgement number in a TCP header of the second packet;
the processing unit 202 is specifically configured to determine that the target traffic flow fails when a sequence number in a TCP header of the first packet is greater than an acknowledgement number in a TCP header of the second packet, and a difference between a current time and a first arrival time of the second packet is greater than a preset threshold.
In one possible implementation, the source IP address of the first packet is the same as the source IP address of the second packet, the destination IP address of the first packet is the same as the destination IP address of the second packet, the source port of the first packet is the same as the source port of the second packet, the destination port of the first packet is the same as the destination port of the second packet, the first sequence number information includes a sequence number and an acknowledgement number in a TCP header of the first packet, and the second sequence number information includes a sequence number and an acknowledgement number in a TCP header of the second packet;
The processing unit 202 is specifically configured to determine that the target traffic flow fails when the sequence number in the TCP header of the first packet is less than or equal to the sequence number in the TCP header of the second packet, the acknowledgement number in the TCP header of the first packet is less than or equal to the acknowledgement number in the TCP header of the second packet, and the difference between the current time and the first arrival time of the second packet is greater than a preset threshold.
In one possible implementation, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the source port of the first packet is the same as the destination port of the second packet, the destination port of the first packet is the same as the source port of the second packet, the first sequence number information includes a sequence number and an acknowledgement number in a TCP header of the first packet, and the second sequence number information includes a sequence number and an acknowledgement number in a TCP header of the second packet;
The processing unit 202 is specifically configured to determine that the upstream path of the second message is normal when the acknowledgement number in the TCP header of the first message is smaller than the sequence number in the TCP header of the second message, and the difference between the current time and the second arrival time of the first message is greater than a preset threshold.
In a possible implementation, the transceiver unit 201 is further configured to send a fault notification to an upstream device of the second packet, where the fault notification is used to indicate that an upstream path of the second packet is normal.
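The three TCP-based conditions evaluated by the processing unit 202 can be summarized in the following sketch. It is an illustration under stated assumptions rather than the implementation of the embodiment: the type `TcpObservation` and the three function names are hypothetical, each observation is assumed to hold the sequence number, acknowledgement number and arrival time recorded for one packet, and the numbers are compared as plain integers without handling wraparound of the 32-bit TCP sequence space.

```python
from dataclasses import dataclass


@dataclass
class TcpObservation:
    seq: int             # sequence number in the TCP header
    ack: int             # acknowledgement number in the TCP header
    arrival_time: float  # arrival time at the forwarding device, in seconds


def flow_failed_reverse_ack_behind(first: TcpObservation, second: TcpObservation,
                                   now: float, threshold: float) -> bool:
    """Opposite directions: data sent by the first packet is still
    unacknowledged by the second packet after the threshold has elapsed."""
    return first.seq > second.ack and (now - second.arrival_time) > threshold


def flow_failed_same_direction_stalled(first: TcpObservation, second: TcpObservation,
                                       now: float, threshold: float) -> bool:
    """Same direction: neither the sequence number nor the acknowledgement
    number has advanced, and the threshold has elapsed."""
    return (first.seq <= second.seq and first.ack <= second.ack
            and (now - second.arrival_time) > threshold)


def upstream_path_of_second_normal(first: TcpObservation, second: TcpObservation,
                                   now: float, threshold: float) -> bool:
    """Opposite directions: the first packet acknowledges less than the second
    packet has sent, and the first packet's direction has been silent."""
    return first.ack < second.seq and (now - first.arrival_time) > threshold
```

Keeping each condition in its own function reflects that one or more decision logics may be configured independently on each forwarding device.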
In one possible implementation, the first packet and the second packet are packets in a remote direct memory access RDMA protocol, a source IP address of the first packet is the same as a source IP address of the second packet, a destination IP address of the first packet is the same as a destination IP address of the second packet, a queue pair QP of the first packet is the same as a QP of the second packet, the first sequence number information includes a packet sequence number PSN of the first packet, and the second sequence number information includes a PSN of the second packet;
The processing unit 202 is specifically configured to determine that the target traffic flow fails when the PSN of the first packet is less than or equal to the PSN of the second packet, and the difference between the current time and the first arrival time of the second packet is greater than a preset threshold.
In one possible implementation, the first and second messages are messages in a remote direct memory access RDMA protocol, the source IP address of the first message is the same as the destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, the QP of the first message is different from the QP of the second message, the first sequence number information includes the PSN of the first message, and the second sequence number information includes the PSN of the second message;
The processing unit 202 is specifically configured to determine that the target traffic flow fails when the PSN of the first packet is greater than the PSN of the second packet and the difference between the current time and the first arrival time of the second packet is greater than a preset threshold.
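The two RDMA conditions in the preceding implementations, one for packets in the same direction with the same QP and one for packets in opposite directions with different QPs, can be sketched as follows. The function and parameter names are hypothetical, the PSNs are assumed to have been extracted from the packets already, and the comparison ignores wraparound of the PSN field.

```python
def rdma_same_direction_stalled(first_psn: int, second_psn: int,
                                second_arrival_time: float,
                                now: float, threshold: float) -> bool:
    """Same direction, same QP: the PSN has not advanced past the PSN of the
    second packet, and the threshold has elapsed since that packet arrived."""
    return first_psn <= second_psn and (now - second_arrival_time) > threshold


def rdma_reverse_direction_behind(first_psn: int, second_psn: int,
                                  second_arrival_time: float,
                                  now: float, threshold: float) -> bool:
    """Opposite directions, different QPs: the PSN of the first packet is ahead
    of the PSN of the second packet, and the threshold has elapsed since the
    second packet arrived."""
    return first_psn > second_psn and (now - second_arrival_time) > threshold
```

Both functions return true only when the sequence condition and the timeout condition hold together, which matches the two-part test described for the processing unit 202.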
In one possible implementation, the source IP address of the first packet is the same as the destination IP address of the second packet, the destination IP address of the first packet is the same as the source IP address of the second packet, the QP of the first packet is different from the QP of the second packet, the first sequence number information includes the PSN of the first packet, and the second sequence number information includes the PSN of the second packet;
The processing unit 202 is specifically configured to determine that the upstream path of the second message is normal when the PSN of the first message is smaller than the PSN of the second message and the difference between the current time and the second arrival time of the first message is greater than a preset threshold.
In a possible implementation, the transceiver unit 201 is specifically configured to receive a second packet, where the second packet includes second sequence number information.
In one possible implementation, the second sequence number information is received over a peer link.
In a possible implementation, the transceiver unit 201 is further configured to send fault information for the target traffic flow to the controller, where the fault information is used to indicate that the target traffic flow is faulty.
It should be noted that the information interaction and execution processes between the modules/units in the communication device are based on the same concept as the method embodiment corresponding to fig. 3 of the present application. For details, refer to the description in the foregoing method embodiment of the present application; they are not repeated here.
Referring to fig. 11, fig. 11 is a schematic diagram of a logic structure of a communication device 30 according to an embodiment of the application. The communication device 30 in fig. 11 may be a chip, a chip system, or a processor for supporting the forwarding device to implement the method, or the communication device 30 may also be a logic node, a logic module, or software for implementing all or part of the functions of the communication device 30, or the communication device 30 may also be a generic term for a plurality of logic nodes, a plurality of logic modules, or a plurality of software for implementing the communication method. The communication device 30 may be provided with communication means as described in the corresponding embodiment of fig. 10 for implementing the functionality implemented by the forwarding device in the corresponding embodiment of fig. 3. The communication device 30 comprises a memory 301, a processor 302, a communication interface 303 and a bus 304. The memory 301, the processor 302 and the communication interface 303 are connected to each other by a bus 304.
The memory 301 may be a Read Only Memory (ROM), a static storage device, a dynamic storage device, or a random access memory (random access memory, RAM). The memory 301 may store a program which, when executed by the processor 302, is used by the processor 302 and the communication interface 303 to perform the steps 101-103 of the communication method embodiments described above.
The processor 302 may be a central processing unit (central processing unit, CPU), a microprocessor, an application-specific integrated circuit (ASIC), a graphics processor (graphics processing unit, GPU), a digital signal processor (digital signal processor, DSP), a field programmable gate array (field programmable gate array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or any combination thereof, and is configured to execute related programs to perform one or more of steps 101-103 of the communication method embodiments of the present application. The steps of the data processing method disclosed in connection with the embodiments of the present application may be performed by a compiler and an executor, where the compiler and the executor may be implemented by a hardware decoding processor or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 301, and the processor 302 reads the information in the memory 301 and, in combination with its hardware, performs one or more of steps 101-103 of the communication method embodiments of the present application.
The communication interface 303 enables communication between the communication device 30 and other devices or communication networks using transceiving means such as, but not limited to, a transceiver.
Bus 304 may include a pathway for transferring information among the various components of the communication device 30 (e.g., memory 301, processor 302, and communication interface 303). Bus 304 may be a peripheral component interconnect (peripheral component interconnect, PCI) bus, an extended industry standard architecture (extended industry standard architecture, EISA) bus, or the like. Buses may be classified into address buses, data buses, control buses, and so on. For ease of illustration, only one thick line is shown in fig. 11, but this does not mean that there is only one bus or only one type of bus.
It should be noted that the information interaction and execution processes between the modules/units in the communication device 30 are based on the same concept as the method embodiment corresponding to fig. 3 of the present application. For details, refer to the description in the foregoing method embodiment of the present application; they are not repeated here.
Embodiments of the present application also provide a computer program product comprising instructions. The computer program product may be software or a program product containing instructions that is capable of running on a computing device or of being stored in any usable medium. The computer program product, when run on at least one computer device, causes the at least one computer device to perform the method described in the embodiment shown in fig. 3.
The embodiment of the application also provides a computer readable storage medium. The computer readable storage medium may be any usable medium on which a computing device can store data, or a data storage device, such as a data center, that contains one or more usable media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk), etc. The computer readable storage medium includes instructions that instruct a computing device to perform the method described in the embodiment shown in fig. 3.
The communication device provided by the embodiment of the application can be a chip, wherein the chip comprises a processing unit and a communication unit, the processing unit can be a processor, and the communication unit can be an input/output interface, a pin or a circuit and the like. The processing unit may execute the computer-executable instructions stored in the storage unit to cause the chip to perform the method described in the embodiment shown in fig. 3. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, or the like, and the storage unit may also be a storage unit in the wireless access device side located outside the chip, such as a read-only memory (ROM) or other type of static storage device that may store static information and instructions, a random access memory (random access memory, RAM), or the like.
It should be further noted that the above described embodiments of the apparatus are only schematic, where the units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the device embodiment drawings provided by the embodiment of the application, the connection relation between the modules represents that the modules have communication connection, and the connection relation can be specifically realized as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments of the present application may be implemented by software plus necessary general purpose hardware, or by special purpose hardware including application specific integrated circuits, special purpose CPUs, special purpose memories, special purpose components, and the like. Generally, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structures used to implement the same function can vary, for example analog circuits, digital circuits, or dedicated circuits. However, for the embodiments of the present application, a software program implementation is the preferred implementation in most cases. Based on such understanding, the technical solution of the embodiments of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a personal computer, a training device, a network device, or the like) to perform the method according to the embodiments of the present application.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions in accordance with the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired manner (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer readable storage medium may be any usable medium on which a computer can store data, or a data storage device, such as a training device or a data center, that integrates one or more usable media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (solid state disk, SSD)), etc.

Claims (16)

1. A method of communication, comprising:
Receiving a first message, wherein the first message comprises first sequence number information;
Acquiring second sequence number information of a second message, wherein the first message and the second message correspond to a target service flow;
and determining that the target service flow fails according to the first sequence number information and the second sequence number information.
2. The method of claim 1, wherein a source IP address of the first message is the same as a destination IP address of the second message, wherein the destination IP address of the first message is the same as the source IP address of the second message, wherein a source port of the first message is the same as the destination port of the second message, wherein the destination port of the first message is the same as the source port of the second message, wherein the first sequence number information comprises a sequence number in a Transmission Control Protocol (TCP) header of the first message, and wherein the second sequence number information comprises an acknowledgement number in a TCP header of the second message;
Determining that the target service flow fails according to the first sequence number information and the second sequence number information, including:
and if the sequence number in the TCP header of the first message is larger than the acknowledgement number in the TCP header of the second message and the difference between the current time and the first arrival time of the second message is larger than a preset threshold value, determining that the target service flow fails.
3. The method of claim 1, wherein a source IP address of the first message is the same as a source IP address of the second message, a destination IP address of the first message is the same as a destination IP address of the second message, a source port of the first message is the same as a source port of the second message, a destination port of the first message is the same as a destination port of the second message, the first sequence number information includes a sequence number and an acknowledgement number in a TCP header of the first message, and the second sequence number information includes a sequence number and an acknowledgement number in a TCP header of the second message;
Determining that the target service flow fails according to the first sequence number information and the second sequence number information, including:
If the sequence number in the TCP header of the first message is smaller than or equal to the sequence number in the TCP header of the second message, and the acknowledgement number in the TCP header of the first message is smaller than or equal to the acknowledgement number in the TCP header of the second message, and the difference between the current time and the first arrival time of the second message is larger than a preset threshold, determining that the target service flow fails.
4. The method of claim 1, wherein a source IP address of the first message is the same as a destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, a source port of the first message is the same as the destination port of the second message, the destination port of the first message is the same as the source port of the second message, the first sequence number information includes a sequence number and an acknowledgement number in a TCP header of the first message, and the second sequence number information includes a sequence number and an acknowledgement number in a TCP header of the second message;
Determining that the target service flow fails according to the first sequence number information and the second sequence number information, including:
if the acknowledgement number in the TCP header of the first message is smaller than the sequence number in the TCP header of the second message, and the difference between the current time and the second arrival time of the first message is larger than a preset threshold value, determining that the upstream path of the second message is normal.
5. The method according to claim 4, wherein the method further comprises:
and sending a fault notification to upstream equipment of the second message, wherein the fault notification is used for indicating that an upstream path of the second message is normal.
6. The method of claim 1, wherein the first message and the second message are messages in a remote direct memory access, RDMA, protocol, a source IP address of the first message is the same as a destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, a QP of the first message is different from a QP of the second message, the first sequence number information includes a PSN of the first message, and the second sequence number information includes a PSN of the second message;
Determining that the target service flow fails according to the first sequence number information and the second sequence number information, including:
If the PSN of the first message is larger than that of the second message and the difference between the current time and the first arrival time of the second message is larger than a preset threshold, determining that the target service flow fails.
7. The method of claim 1, wherein the first message and the second message are messages of a same type in a remote direct memory access RDMA protocol, a source IP address of the first message is identical to a source IP address of the second message, a destination IP address of the first message is identical to a destination IP address of the second message, a queue pair QP of the first message is identical to a QP of the second message, the first sequence number information includes a packet sequence number PSN of the first message, and the second sequence number information includes a PSN of the second message;
Determining that the target service flow fails according to the first sequence number information and the second sequence number information, including:
if the PSN of the first message is smaller than or equal to that of the second message, and the difference between the current time and the first arrival time of the second message is larger than a preset threshold, determining that the target service flow fails.
8. The method of claim 1, wherein a source IP address of the first message is the same as a destination IP address of the second message, the destination IP address of the first message is the same as the source IP address of the second message, a QP of the first message is different from a QP of the second message, the first sequence number information includes a PSN of the first message, and the second sequence number information includes a PSN of the second message;
Determining that the target service flow fails according to the first sequence number information and the second sequence number information, including:
if the PSN of the first message is smaller than that of the second message and the difference between the current time and the second arrival time of the first message is larger than a preset threshold, determining that the upstream path of the second message is normal.
9. The method according to any one of claims 2 to 8, further comprising:
and receiving a second message, wherein the second message comprises second sequence number information.
10. The method according to any of the claims 2 to 8, wherein the second sequence number information is received via a peer link.
11. The method according to any one of claims 1 to 10, further comprising:
And sending fault information aiming at the target service flow to a controller, wherein the fault information is used for indicating that the target service flow breaks down.
12. A communication device, comprising:
The receiving and transmitting unit is used for receiving a first message, wherein the first message comprises first sequence number information;
the receiving and transmitting unit is further configured to obtain second sequence number information of a second packet, where the first packet and the second packet come from a target service flow;
And the processing unit is used for determining that the target service flow fails according to the first sequence number information and the second sequence number information.
13. A communication device comprising a processor coupled to a memory;
the memory is used for storing instructions;
the processor configured to execute the instructions in the memory, to cause the communication device to perform the method of any one of claims 1 to 11.
14. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program which, when executed by a processor, implements the method according to any of claims 1 to 11.
15. A computer program product having computer readable instructions stored therein, which when executed by a processor, implement the method of any of claims 1 to 11.
16. A chip, characterized in that the chip comprises a processor, and the processor is coupled to a memory;
the memory is used for storing instructions;
the processor configured to execute instructions in the memory, causing the chip to perform the method of any one of claims 1 to 11.
CN202410165492.3A 2024-02-02 2024-02-02 Communication method, communication device and communication equipment Pending CN120434153A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202410165492.3A CN120434153A (en) 2024-02-02 2024-02-02 Communication method, communication device and communication equipment
PCT/CN2025/070717 WO2025161854A1 (en) 2024-02-02 2025-01-06 Communication method, communication apparatus, and communication device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410165492.3A CN120434153A (en) 2024-02-02 2024-02-02 Communication method, communication device and communication equipment

Publications (1)

Publication Number Publication Date
CN120434153A true CN120434153A (en) 2025-08-05

Family

ID=96560527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410165492.3A Pending CN120434153A (en) 2024-02-02 2024-02-02 Communication method, communication device and communication equipment

Country Status (2)

Country Link
CN (1) CN120434153A (en)
WO (1) WO2025161854A1 (en)

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100120823A (en) * 2009-05-07 2010-11-17 충남대학교산학협력단 Voip anomaly traffic detection method with flow-level data
CN107645409B (en) * 2017-08-18 2021-02-12 上海华为技术有限公司 Method and device for determining transmission fault reason of data
US10893004B2 (en) * 2018-11-20 2021-01-12 Amazon Technologies, Inc. Configurable detection of network traffic anomalies at scalable virtual traffic hubs
CN109672929B (en) * 2018-12-14 2021-04-27 中国联合网络通信集团有限公司 Method and device for detecting video service message
CN110225419A (en) * 2019-05-15 2019-09-10 深圳市麦谷科技有限公司 A packet loss retransmission method for realizing flow control
CN112637015B (en) * 2020-12-23 2022-08-26 苏州盛科通信股份有限公司 Packet loss detection method and device for realizing RDMA (remote direct memory Access) network based on PSN (packet switched network)

Also Published As

Publication number Publication date
WO2025161854A1 (en) 2025-08-07

Legal Events

Date Code Title Description
PB01 Publication