WO2016101409A1 - Data switching method, device and system - Google Patents
Data switching method, device and system
- Publication number
- WO2016101409A1 (PCT/CN2015/073416)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- module
- nfs
- socket
- tcp
- Prior art date
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/34—Signalling channels for network management communication
Definitions
- The present invention relates to the field of network data storage, and in particular to a data switching method, device and system.
- A file lock is a means of keeping files synchronized. When multiple users operate on the same file at the same time, the file lock ensures that their data does not conflict. Many database systems need file lock support when reading and writing data.
- Network File System (NFS) is a powerful network file system. File locks play a vital role in maintaining file synchronization, so maintaining file locks is also crucial for NFS.
- When NFS runs in cluster mode, if a node becomes abnormal during data access, the file lock information may be lost.
- In the related art, the client checks the lock status through the state protocol in order to recover the lock, but recovering a lock through lock state monitoring takes a long time, and in special cases some locks cannot be recovered. A technique for quickly recovering or saving the lock state is therefore urgently needed.
- The present invention provides a data switching method, device and system, whose main purpose is to solve the technical problem of how to quickly recover file lock information when a node fails.
- a method of data switching including:
- when the first device receives a Transmission Control Protocol (TCP) switching request, the first device sends first information to the second device, where the first information is used by the second device to copy the socket connection of the first device;
- the sending the first information to the second device includes:
- the TCP module of the first device acquires the first information, where the first information includes at least a key field of the control block structure in the TCP module of the first device and a key field of the management structure in the socket module of the first device;
- the TCP module of the first device sends the first information to a TCP module of the second device.
- the sending the second information to the second device includes:
- the TCP module of the first device acquires the second information, where the second information includes at least the packets in the send buffer of the TCP module and the packets in the receive buffer of the socket;
- the TCP module of the first device sends the second information to the TCP module of the second device through a cluster channel, so that the TCP module of the second device assigns the second information to the copied socket connection.
- the sending the third information to the second device includes:
- the NFS module of the first device acquires the third information, where the third information includes at least information about the control plane management structure of the NFS module and the input/output (IO) requests that the NFS module has not completed, and where the information of the control plane management structure of the NFS module includes NFS lock information;
- the NFS module of the first device sends the third information to the NFS module of the second device by using a cluster channel.
- the NFS module of the first device acquires the third information, including:
- the NFS module of the first device acquires NFS lock information
- the NFS module of the first device sends the third information to the NFS module of the second device by using a cluster channel, including:
- the NFS module of the first device encapsulates the NFS lock information according to a packet format, where the packet format includes at least the information of the socket connection, the packet fragment number, the number of NFS lock information items, and the end identifier of the packet;
- the NFS module of the first device sends the encapsulated NFS lock information to the NFS module of the second device.
- the method further includes:
- the first device receives a zero window message sent by the second device.
- the method further includes:
- after sending the second information to the second device, the TCP module of the first device sends a switching-completion message to the NFS module of the first device;
- if the NFS module of the first device receives the switching-completion message sent by the TCP module of the first device and has already sent the third information to the second device, the NFS module of the first device closes the TCP switching request;
- if the NFS module of the first device receives the switching-completion message sent by the TCP module of the first device and has not yet sent the third information to the second device, the NFS module of the first device closes the TCP switching request after sending the third information to the second device.
- a method of data switching including:
- the second device receives the second information and the third information sent by the first device, generates a socket ID of the socket connection according to the second information, and, according to the third information, assigns the network file system (NFS) related information of the first device to the socket connection corresponding to the socket ID.
- the second device receives the second information and the third information that are sent by the first device, and generates the socket ID of the socket connection according to the second information, including:
- the TCP module of the second device receives the second information
- the TCP module of the second device assigns the second information to the copied socket connection, and generates a socket ID of the socket connection;
- the NFS module of the second device receives the third information
- the NFS module of the second device matches the socket connection corresponding to the socket ID according to the quintuple information in the third information, and if the match succeeds, assigns the third information to the socket connection corresponding to the socket ID.
- after the NFS module of the second device matches the socket connection corresponding to the socket ID according to the quintuple information in the third information and, if the match succeeds, assigns the third information to the socket connection corresponding to the socket ID, the method further includes:
- the NFS module of the second device sends a switching-completion message to the TCP module of the second device.
- the IP layer of the second device's protocol stack re-enables packet reception on the NET, the TCP module of the second device sends a window recovery packet to the first device, and the NFS module of the second device resumes sending and receiving packets.
- the method further includes:
- a first device includes a first transmission control protocol TCP module and a first network file system NFS module;
- the first TCP module is configured to: when the first device receives the Transmission Control Protocol (TCP) switching request, send the first information to the second device, where the first information is used by the second device to copy the socket connection of the first device;
- the first TCP module is further configured to: send second information to the second device, where the second information is used by the second device to generate a socket ID of the socket connection;
- the first NFS module is configured to: send third information to the second device, where the third information is used to assign NFS related information of the first device to the socket connection corresponding to the socket ID.
- the first TCP module includes:
- a first acquiring unit configured to: when the first device receives the TCP switching request, acquire the first information, where the first information includes at least a key field of the control block structure in the TCP module of the first device and a key field of the management structure in the socket module of the first device;
- the first sending unit is configured to: send the first information to the second TCP module.
- the first TCP module includes:
- a second acquiring unit configured to: acquire the second information, where the second information includes at least a packet in the TCP module sending buffer and a packet in the socket receiving buffer;
- the second sending unit is configured to: send the second information to the second TCP module by using a cluster channel, so that the second TCP module assigns the second information to the copied socket connection.
- the first NFS module includes:
- the third obtaining unit is configured to: acquire the third information, where the third information includes at least information about the control plane management structure of the NFS module and the input/output (IO) requests that the NFS module has not completed, and where the information of the control plane management structure of the NFS module includes NFS lock information;
- the third sending unit is configured to: send the third information to the second NFS module by using a cluster channel.
- the first NFS module includes:
- the fourth obtaining unit is configured to: obtain NFS lock information
- the encapsulating unit is configured to: encapsulate the NFS lock information according to a packet format, where the packet format includes at least the socket connection information, the packet fragment number, the number of NFS lock information items, and the end identifier of the packet; and
- the fourth sending unit is configured to: send the encapsulated NFS lock information to the second NFS module.
- the first TCP module further includes:
- the first receiving unit is configured to: receive a zero window message sent by the second device.
- the first TCP module further includes:
- the fifth sending unit is configured to: after sending the second information to the second device, send a message that the switching is completed to the first NFS module;
- the first NFS module further includes:
- the closing unit is configured to: if the first NFS module receives the switching-completion message sent by the first TCP module and the first NFS module has already sent the third information to the second device, close the TCP switching request;
- the closing unit is further configured to: if the first NFS module receives the switching-completion message sent by the first TCP module and the first NFS module has not yet sent the third information to the second device, close the TCP switching request after the third information is sent to the second device.
- a second device includes a second transmission control protocol TCP module and a second network file system NFS module;
- the second TCP module is configured to: receive the first information that the first device sends on receiving the TCP switching request, and copy the socket connection of the first device according to the first information;
- the second TCP module is configured to: receive second information sent by the first device, and generate a socket ID of the socket connection according to the second information;
- the second NFS module is configured to receive the third information sent by the first device, and assign the NFS related information of the first device to the socket connection corresponding to the socket ID according to the third information.
- the second TCP module includes:
- a second receiving unit configured to: receive the second information
- a generating unit configured to: assign the second information to the copied socket connection, and generate a socket ID of the socket connection;
- the second NFS module includes:
- a third receiving unit configured to: receive the third information
- a matching unit configured to: match a socket connection corresponding to the socket ID according to the quintuple information in the third information;
- the assignment unit is configured to: if matched, assign the third information to the socket connection corresponding to the socket ID.
- the second NFS module further includes:
- a sixth sending unit configured to: send a message that the switching is completed to the second TCP module
- the second TCP module further includes:
- the seventh sending unit is configured to: send a window recovery message to the first device.
- the second TCP module further includes:
- the eighth sending unit is configured to: send a zero window message to the first device.
- a system comprising the first device described above and the second device.
- a computer readable storage medium storing program instructions that, when executed, implement the methods described above.
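- In the embodiments of the present invention, when the first device receives the TCP switching request, it sends the first information to the second device, where the first information is used by the second device to copy the socket connection of the first device; the first device then sends the second information and the third information to the second device, where the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID. In this way, when a node fails, the lock information of the faulty node and its corresponding data connection can be migrated to another normal node, thereby ensuring fast recovery of the file lock.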
- FIG. 1 is a schematic flow chart of a first embodiment of a method for data switching
- FIG. 2 is a schematic flow chart of a second embodiment of a method for data switching
- FIG. 3 is a schematic flow chart of a third embodiment of a method for data switching
- FIG. 4 is a schematic flow chart of a fourth embodiment of a method for data switching
- FIG. 5 is a schematic flowchart of a fifth embodiment of a method for data switching
- FIG. 6 is a schematic flow chart of a sixth embodiment of a method for data switching
- FIG. 7 is a schematic diagram of an interaction process of a seventh embodiment of a method for data switching
- FIG. 8 is a schematic diagram of functional modules of a first embodiment of a first device of the present invention.
- FIG. 9 is a schematic diagram of functional modules of a second embodiment of a first device of the present invention.
- FIG. 10 is a schematic diagram of functional modules of a third embodiment of a first device of the present invention.
- FIG. 11 is a schematic diagram of functional modules of a first embodiment of a second device of the present invention.
- FIG. 12 is a schematic diagram of functional modules of a second embodiment of a second device of the present invention.
- FIG. 13 is a schematic diagram of functional modules of a third embodiment of a second device of the present invention.
- FIG. 14 is a schematic diagram of functional modules of a first embodiment of the system of the present invention.
- the invention provides a method of data switching.
- FIG. 1 is a schematic flowchart of a first embodiment of a method for data switching.
- the data switching method includes:
- Step 101 If the first device receives the TCP switching request, the first information is sent to the second device, where the first information is used by the second device to copy the socket connection of the first device.
- the sending the first information to the second device includes:
- the TCP module of the first device acquires the first information, where the first information includes at least a key field of the control block structure in the TCP module of the first device and a key field of the management structure in the socket module of the first device;
- the TCP module of the first device sends the first information to a TCP module of the second device.
- the first device may be a faulty node, and the second device may be a takeover node.
- the user can manually migrate the IP to the takeover node, and then power off the faulty node.
- the TCP module of the faulty node obtains the connections on the IP that is currently being switched.
- the TCP module of the faulty node acquires the key fields in the control block structure of the TCP module and in the management structure of the socket module, and sends them through the cluster channel to the TCP module of the takeover node, so that the TCP module of the takeover node can clone a new socket connection as early as possible.
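- The hand-off of the first information can be illustrated with a minimal Python sketch. The patent does not enumerate the key fields, so every field in `FirstInfo`, the JSON encoding, and the function names are assumptions made only for illustration; a real implementation would operate on kernel TCP control blocks rather than on a dataclass.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FirstInfo:
    """Hypothetical key fields of the TCP control block and socket management structure."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    snd_nxt: int  # next sequence number to send (assumed field)
    rcv_nxt: int  # next sequence number expected (assumed field)
    snd_wnd: int  # send window advertised by the peer (assumed field)


def export_first_info(info: FirstInfo) -> bytes:
    """Faulty node: serialize the key fields for the cluster channel."""
    return json.dumps(asdict(info)).encode("utf-8")


def clone_socket(payload: bytes) -> FirstInfo:
    """Takeover node: rebuild an equivalent connection record from the payload."""
    return FirstInfo(**json.loads(payload.decode("utf-8")))
```

- In this sketch the takeover node ends up with a record equal to the one on the faulty node, which stands in for the cloned socket connection that later receives the second information.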
- Step 102: The first device sends the second information and the third information to the second device, where the second information is used by the second device to generate a socket ID of the socket connection, and the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID.
- the sending the second information to the second device includes:
- the TCP module of the first device acquires the second information, where the second information includes at least the packet in the TCP module sending buffer and the packet in the socket receiving buffer.
- the TCP module of the first device sends the second information to the TCP module of the second device through a cluster channel, so that the TCP module of the second device assigns the second information to the copied socket connection.
- the sending the third information to the second device includes:
- the network file system (NFS) module of the first device acquires the third information, where the third information includes at least information about the control plane management structure of the NFS module and the input/output (IO) requests that the NFS module has not completed, and where the information of the control plane management structure of the NFS module includes NFS lock information;
- the first NFS module sends the third information to the second NFS module through a cluster channel.
- the TCP layer of the faulty node notifies the NFS service layer, and the NFS service and the TCP layer start switching the connection at the same time.
- On receiving the switch message, NFS stops IO operations on the back-end disk and packet sending and receiving on the front end; NFS then collects the information of the connection being switched, including the NFS control plane management structure and the unfinished IO requests, and sends it to the peer through the cluster channel.
- the NFS control plane management structure includes NFS lock information.
- the TCP layer likewise collects the data packets in the TCP send buffer and the packets in the socket receive ring and sends them through the cluster channel to the TCP layer of the takeover node, which assigns this information to the new socket connection to complete the migration of the TCP connection. The NFS and TCP steps are performed simultaneously here.
- the TCP module of the takeover node restores the received switch data to the new socket connection.
- the TCP module of the takeover node processes all the data and sends the new socket ID to the NFS module of the takeover node.
- the NFS module that takes over the node needs to temporarily store the received switching data and then restore the NFS lock information.
- the recovery process of the NFS lock matches the temporarily stored switching data against the new socket ID; the matching condition is that the quintuple information (protocol, source and destination IP, source and destination port) of the TCP layer matches the socket connection corresponding to the new socket ID.
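- The quintuple matching described above can be sketched as follows. This is only an illustration under stated assumptions: the class `TakeoverNfs`, its method names, and the use of dictionaries for temporary storage are hypothetical and are not taken from the patent.

```python
from typing import Any, Dict, NamedTuple


class FiveTuple(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class TakeoverNfs:
    """Hypothetical NFS module on the takeover node."""

    def __init__(self) -> None:
        self.pending: Dict[FiveTuple, Any] = {}  # staged switching data, keyed by quintuple
        self.restored: Dict[int, Any] = {}       # socket ID -> restored NFS state

    def stage(self, key: FiveTuple, nfs_state: Any) -> None:
        """Temporarily store switching data received over the cluster channel."""
        self.pending[key] = nfs_state

    def on_socket_ready(self, socket_id: int, key: FiveTuple) -> bool:
        """Called when the TCP module reports a new socket ID; returns True on a match."""
        state = self.pending.pop(key, None)
        if state is None:
            return False
        self.restored[socket_id] = state
        return True
```

- Keying the staged data by quintuple lets the NFS data and the socket ID arrive in either order, which fits the statement that the NFS and TCP steps run simultaneously.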
- the sending the third information to the second device includes:
- the NFS module of the first device acquires NFS lock information
- the NFS module of the first device encapsulates the NFS lock information according to a packet format, where the packet format includes at least the information of the socket connection, the packet fragment number, the number of the NFS lock information, and the The end identifier of the message;
- the NFS module of the first device sends the encapsulated NFS lock information to the NFS module of the second device.
- To recover the NFS lock information quickly, the system detects that access to the faulty node is abnormal and triggers port switching for the faulty node. After the target node of the switchover is determined, the NFS lock information migration starts.
- After receiving the NFS lock information migration message, the faulty node first finds the locks of the corresponding socket connection according to the socket information and encapsulates the lock information in the packet format shown in Table 1. When there is too much lock information, the message is fragmented, and the takeover node reassembles the fragments according to their fragment sequence numbers. After the NFS lock information of a connection has been sent successfully, the faulty node clears the NFS lock information of that connection, and this continues until the lock information of every connection that needs to be switched on the faulty node has been sent.
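- The encapsulation and reassembly of lock information can be sketched as below. Table 1 is not reproduced in this text, so the byte layout is an assumption: the header carries the socket connection information, the fragment number, the number of lock items in the fragment, and an end flag, and each lock item is a hypothetical (offset, length, type) record. Only the Python standard library `struct` and `socket` modules are used.

```python
import socket
import struct

# Assumed header: protocol, src IP, dst IP, src port, dst port,
# fragment number, lock count in this fragment, end flag.
HEADER = struct.Struct("!B4s4sHHHHB")
# Assumed lock record: byte offset, byte length, lock type.
LOCK = struct.Struct("!QQB")


def encapsulate(conn, locks, max_locks_per_fragment=2):
    """Faulty node: split the lock list into fragments and pack each one."""
    proto, src_ip, src_port, dst_ip, dst_port = conn
    n = max_locks_per_fragment
    chunks = [locks[i:i + n] for i in range(0, len(locks), n)] or [[]]
    packets = []
    for frag_no, chunk in enumerate(chunks):
        end = 1 if frag_no == len(chunks) - 1 else 0
        header = HEADER.pack(proto, socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                             src_port, dst_port, frag_no, len(chunk), end)
        packets.append(header + b"".join(LOCK.pack(*lock) for lock in chunk))
    return packets


def reassemble(packets):
    """Takeover node: order fragments by number; return None until all have arrived."""
    fragments, last = {}, None
    for packet in packets:
        *_conn, frag_no, count, end = HEADER.unpack_from(packet)
        body = packet[HEADER.size:]
        fragments[frag_no] = [LOCK.unpack_from(body, i * LOCK.size) for i in range(count)]
        if end:
            last = frag_no
    if last is None or set(fragments) != set(range(last + 1)):
        return None
    return [lock for n in range(last + 1) for lock in fragments[n]]
```

- Because each fragment carries its own number and the last one carries the end flag, the takeover node can tolerate out-of-order arrival and can tell when the lock list is complete.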
- After the takeover node receives a switching packet from the faulty node, it processes the packet according to the packet type.
- For file lock information, the takeover node uses the end identifier of the packet to decide whether fragments still need to be reassembled. Once the lock information is complete, if the port switching of the current connection has succeeded, the takeover node parses the migrated file lock information and actively initiates a lock recovery operation, restoring the file locks of the faulty node on the takeover node; if the port switching of the current connection has not completed, it waits.
- When the wait exceeds the maximum delay of the client's lock operation (the NFSv4 protocol uses 90 s as the default lease time), the migrated lock information is released, because the lock has by then expired on the client and recovering it would have no value. However, port switching is usually much faster than lock migration, so this happens only in the rare case where lock migration finishes ahead of port switching and the delay exceeds the client's default maximum delay.
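- The wait-or-release decision for migrated lock information can be sketched as a small function. The function name and the returned labels are hypothetical; the 90 s value is the default lease time cited in the text, and the sketch assumes the caller tracks how long the lock information has been waiting.

```python
LEASE_SECONDS = 90.0  # default NFSv4 lease time as cited in the text


def next_action(port_switched: bool, waited_seconds: float,
                lease_seconds: float = LEASE_SECONDS) -> str:
    """Decide what the takeover node does with fully received lock information."""
    if port_switched:
        return "recover"  # parse the locks and initiate lock recovery
    if waited_seconds > lease_seconds:
        return "release"  # the client lease has expired, so recovery has no value
    return "wait"         # port switching has not completed yet
```

- For example, `next_action(False, 30.0)` keeps waiting, while `next_action(False, 120.0)` releases the migrated lock information.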
- the invention further provides a method of data switching.
- FIG. 2 is a schematic flowchart of a second embodiment of a method for data switching.
- the method further includes:
- Step 103 Receive a zero window message sent by the second device.
- After the takeover node establishes the new socket connection, it sends a zero window message to the faulty node and temporarily stops the corresponding NET from receiving packets at the IP layer of the network stack.
- FIG. 3 is a schematic flowchart of a third embodiment of a method for data switching.
- the method further includes:
- Step 104: After the TCP module of the first device sends the second information to the second device, it sends a switching-completion message to the NFS module of the first device, and the NFS module of the first device closes the TCP switching request as follows:
- if the NFS module of the first device receives the switching-completion message sent by the TCP module of the first device and has already sent the third information to the second device, the NFS module of the first device closes the TCP switching request;
- if the NFS module of the first device receives the switching-completion message sent by the TCP module of the first device and has not yet sent the third information to the second device, the NFS module of the first device closes the TCP switching request after sending the third information to the second device.
- That is, after the TCP layer finishes its part of the switchover, it sends a message to the NFS service.
- When NFS receives the message, if NFS has also finished switching, NFS actively closes the request; if NFS has not yet finished switching its data, it closes the connection once the NFS switchover is done.
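- The two closing cases of Step 104 can be sketched as follows. The class `FirstNfsModule`, its attributes, and the injected `send_third_info` callable are hypothetical stand-ins for the NFS module of the first device and its cluster channel.

```python
from typing import Callable


class FirstNfsModule:
    """Hypothetical NFS module on the first (faulty) device."""

    def __init__(self, send_third_info: Callable[[], None]) -> None:
        self._send = send_third_info  # sends the third information over the cluster channel
        self.third_info_sent = False
        self.request_closed = False

    def send_third_info(self) -> None:
        self._send()
        self.third_info_sent = True

    def on_tcp_switch_complete(self) -> None:
        """Handle the switching-completion message from the TCP module."""
        if not self.third_info_sent:
            # Case 2: finish sending the third information first.
            self.send_third_info()
        # Both cases end with closing the TCP switching request.
        self.request_closed = True
```

- In either order of events, the switching request is closed only once the third information has been sent, which is the invariant the two cases in the text share.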
- FIG. 4 is a schematic flowchart of a fourth embodiment of a method for data switching.
- the data switching method includes:
- Step 401 The second device receives the first information that is sent by the first device when receiving the TCP switching request, and copies the socket connection of the first device according to the first information.
- the first device may be a faulty node, and the second device may be a takeover node.
- the user can manually migrate the IP to the takeover node, and then power off the faulty node.
- the TCP module of the faulty node obtains the connections on the IP that is currently being switched.
- the TCP module of the faulty node acquires the key fields in the control block structure of the TCP module and in the management structure of the socket module, and sends them through the cluster channel to the TCP module of the takeover node, so that the TCP module of the takeover node can clone a new socket connection as early as possible.
- Step 402: The second device receives the second information and the third information sent by the first device, generates a socket ID of the socket connection according to the second information, and, according to the third information, assigns the NFS related information of the first device to the socket connection corresponding to the socket ID.
- the receiving the second information and the third information that are sent by the first device, and generating the socket ID of the socket connection according to the second information including:
- the TCP module of the second device receives the second information
- the TCP module of the second device assigns the second information to the copied socket connection, and generates a socket ID of the socket connection;
- the NFS module of the second device receives the third information
- the NFS module of the second device matches the socket connection corresponding to the socket ID according to the quintuple information in the third information, and if the match succeeds, assigns the third information to the socket connection corresponding to the socket ID.
- the TCP layer of the faulty node notifies the NFS service layer, and the NFS service and the TCP layer start switching the connection at the same time.
- On receiving the switch message, NFS stops IO operations on the back-end disk and packet sending and receiving on the front end; NFS then collects the information of the connection being switched, including the NFS control plane management structure and the unfinished IO requests, and sends it to the peer through the cluster channel.
- the NFS control plane management structure includes NFS lock information.
- the TCP layer likewise collects the data packets in the TCP send buffer and the packets in the socket receive ring and sends them through the cluster channel to the TCP layer of the takeover node.
- the TCP layer of the takeover node assigns this information to the new socket connection to complete the migration of the TCP connection. The NFS and TCP steps are performed simultaneously here.
- the TCP module of the takeover node restores the received switch data to the new socket connection.
- the TCP module of the takeover node processes all the data and sends the new socket ID to the NFS module of the takeover node.
- the NFS module that takes over the node needs to temporarily store the received switching data and then restore the NFS lock information.
- the recovery process of the NFS lock matches the temporarily stored switching data against the new socket ID; the matching condition is that the quintuple information (protocol, source and destination IP, source and destination port) of the TCP layer matches the socket connection corresponding to the new socket ID.
- To recover the NFS lock information quickly, the system detects that access to the faulty node is abnormal and triggers port switching for the faulty node. After the target node of the switchover is determined, the NFS lock information migration starts.
- After receiving the NFS lock information migration message, the faulty node first finds the locks of the corresponding socket connection according to the socket information and encapsulates the lock information in the packet format shown in Table 1. When there is too much lock information, the message is fragmented, and the takeover node reassembles the fragments according to their fragment sequence numbers. After the NFS lock information of a connection has been sent successfully, the faulty node clears the NFS lock information of that connection, and this continues until the lock information of every connection that needs to be switched on the faulty node has been sent.
- After the takeover node receives a switching packet from the faulty node, it processes the packet according to the packet type.
- For file lock information, the takeover node uses the end identifier of the packet to decide whether fragments still need to be reassembled. Once the lock information is complete, if the port switching of the current connection has succeeded, the takeover node parses the migrated file lock information and initiates the lock recovery operation, restoring the file locks of the faulty node on the takeover node; if the port switching of the current connection has not completed, it waits. When the wait exceeds the maximum delay of the client's lock operation (the NFSv4 protocol uses 90 s as the default lease time), the migrated lock information is released, because the lock has by then expired on the client and recovering it would have no value. However, port switching is usually much faster than lock migration, so this happens only in the rare case where lock migration finishes ahead of port switching and the delay exceeds the client's default maximum delay.
- The first information is sent to the second device and is used by the second device to copy the socket connection of the first device; the NFS-related information of the first device is assigned to the socket connection corresponding to the socket ID. As a result, when the faulty node has a problem, its lock information and the corresponding data connection can be migrated to another normal node, ensuring fast recovery of the file lock.
- FIG. 5 is a schematic flowchart of a fifth embodiment of a method for data switching.
- the method further includes:
- Step 403: The NFS module of the second device sends a switching-complete message to the TCP module of the second device.
- The IP layer of the second device's protocol stack resumes receiving packets for the corresponding NET, the TCP module of the second device sends a window recovery message to the first device, and the second NFS module begins sending and receiving packets.
- After the TCP module has finished sending its data, it notifies the NFS service.
- When NFS receives the message, if NFS has also completed its switchover, it actively closes the request; if NFS has not yet completed its data switchover, it finishes the switchover first and closes the connection when done.
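The close decision described above depends on two independent events finishing in either order. A minimal Python sketch, with the hypothetical class `FaultySideNfsModule` and method names chosen purely for illustration, might look like this:

```python
class FaultySideNfsModule:
    """Tracks the two completion conditions for one switched connection."""

    def __init__(self):
        self.tcp_switch_complete = False   # TCP module reported it is done
        self.nfs_data_sent = False         # NFS finished sending its own data
        self.closed = False

    def on_tcp_switch_complete(self):
        self.tcp_switch_complete = True
        self._maybe_close()

    def on_nfs_data_sent(self):
        self.nfs_data_sent = True
        self._maybe_close()

    def _maybe_close(self):
        # Close only once both halves of the switchover have finished.
        if self.tcp_switch_complete and self.nfs_data_sent:
            self.closed = True
```

Checking the condition after each event means the connection closes as soon as the slower of the two migrations finishes, whichever that is.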
- In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switching request, it sends first information to the second device, which the second device uses to copy the socket connection of the first device. The first device then sends second information and third information to the second device, where the second information is used by the second device to generate a socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to the socket ID. Thus, when a problem occurs on the faulty node, the lock information of the faulty node and its corresponding data connection can be migrated to another normal node, ensuring fast recovery of the file lock.
- FIG. 6 is a schematic flowchart of a sixth embodiment of a method for data switching.
- the method further includes:
- Step 404: The second device sends a zero window message to the first device.
- After the takeover node establishes the new socket connection, it sends a zero window message to the faulty node and temporarily stops the corresponding NET from receiving packets at the IP (network) layer.
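In standard TCP, a zero window message is a segment whose 16-bit window field is 0, which tells the peer to pause sending. The text does not say how the takeover node builds this segment, so the following Python sketch only shows the header layout; the function name `tcp_header` is hypothetical, and the checksum is left at 0 rather than computed.

```python
import struct


def tcp_header(src_port, dst_port, seq, ack, window, flags=0x10):
    """Pack a 20-byte TCP header without options (flags=0x10 is ACK)."""
    data_offset = 5                               # header length in 32-bit words
    offset_flags = (data_offset << 12) | flags
    # Fields: ports, sequence, acknowledgment, offset/flags, window,
    # checksum (left as 0 in this sketch), urgent pointer.
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_flags, window, 0, 0)


zero_window = tcp_header(2049, 1023, seq=1000, ack=2000, window=0)
```

A later window recovery message would be the same kind of segment with a non-zero `window` value.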
- FIG. 7 is a schematic diagram of an interaction process of a seventh embodiment of a method for data switching.
- the data switching method includes:
- Step 701: After the TCP layer of the faulty node's protocol stack obtains the takeover node, it notifies NFS to start switching;
- Step 702: The TCP module sends key information, such as the TCP layer protocol control block, to the takeover node over the cross-node cluster channel, and a socket connection is established on the takeover node;
- Step 703: The TCP module and the NFS module of the faulty node simultaneously send their respective data to the takeover node through the cluster channel;
- Step 704: When the TCP module of the faulty node has finished transmitting its data, the faulty node notifies NFS to close the socket, and the takeover node notifies NFS that a new connection (socket) has arrived;
- Step 705: The NFS module of the takeover node acquires the new socket connection and notifies TCP to start working normally.
- The embodiment of the invention provides a technology for quickly completing lock migration during NFS switching in cluster mode.
- The principle is that the server's NFS connection is copied to the takeover end, so that the client does not perceive the switchover.
- The copied information is divided into two categories: control information and data information.
- For the control plane, mainly the key fields are copied; for the data plane, all uncompleted data request messages need to be copied to the takeover end;
- The key information of the control plane includes the NFS lock information; handling NFS locks correctly avoids conflicting access to files. The NFS lock information is collected on the faulty side and recovered on the takeover side.
- The faulty end notifies its local TCP layer of the switchover, and TCP notifies the application layer to switch, which is equivalent to migrating the server's TCP connection to the takeover end;
- A cluster channel serves as the data channel for the switchover; it can take various physical forms, and each node controller can access the shared storage pool at the back end.
- The application layer transfers the NFS file locks, NFS request data and other related information to the takeover node through the cluster's internal communication, and the lock information is quickly recovered on the takeover node.
- The whole process is divided into TCP-layer migration and NFS data migration. To speed up migration, the two are performed in parallel, and because the cluster channel is a high-speed, reliable channel, data migration is fast: the whole migration process completes at the millisecond level.
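The parallel TCP and NFS migration described above can be sketched in Python with two worker threads writing to a shared channel. The `queue.Queue` here merely stands in for the cluster channel, and the function names and message contents are hypothetical placeholders; the text does not specify the actual transport or threading model.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def send_tcp_state(channel, conn_id):
    # Placeholder for the TCP send buffer and socket receive buffer contents.
    channel.put(("tcp", conn_id, "send/receive buffers"))
    return "tcp"


def send_nfs_state(channel, conn_id):
    # Placeholder for NFS lock information and unfinished IO requests.
    channel.put(("nfs", conn_id, "lock info + pending IO"))
    return "nfs"


def migrate_connection(channel, conn_id):
    """Run both migrations concurrently and wait for both to finish."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(send_tcp_state, channel, conn_id),
                   pool.submit(send_nfs_state, channel, conn_id)]
        return sorted(f.result() for f in futures)
```

Since neither transfer waits for the other, the total migration time is bounded by the slower of the two rather than by their sum.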
- The embodiments of the present invention apply to high availability of the distributed network file system NFS in cluster mode; the technologies involved include reliable access to application services and reliable access to the distributed network file system in cluster mode. They mainly address how to keep the NFS service reliable and stable when a single-point link failure occurs in a big data storage cluster environment.
- The specific solution is: when a link fault occurs between the client and the access node, the node sends the file locks and related information to another takeover node through network-port-based NFS switching, and a new connection is cloned on the takeover node.
- The first information is sent to the second device and is used by the second device to copy the socket connection of the first device; the second information is used by the second device to generate a socket ID of the socket connection; and the third information is used to assign NFS-related information of the first device to the socket connection corresponding to the socket ID. Thus, when a fault occurs, the lock information of the faulty node and its corresponding data connection can be migrated to another normal node, ensuring fast recovery of the file lock.
- the present invention provides an apparatus.
- FIG. 8 is a schematic diagram of functional modules of a first embodiment of a first device of the present invention.
- the first device includes:
- the first TCP module 801 is configured to: when the first device receives the Transmission Control Protocol (TCP) switching request, send the first information to the second device, where the first information is used by the second device to copy the socket connection of the first device;
- the first TCP module 801 includes:
- the first obtaining unit 8011 is configured to: when the first device receives the TCP switching request, acquire the first information, where the first information includes at least the keywords of the control block structure in the TCP module of the first device and the keywords of the management structure in the socket module of the first device;
- the first sending unit 8012 is configured to: send the first information to the second TCP module.
- the first device may be a faulty node, and the second device may be a takeover node.
- the user can manually migrate the IP to the takeover node, and then power off the faulty node.
- the TCP module of the faulty end acquires the IP address currently being switched.
- the TCP module of the faulty end acquires the key fields in the control block structure of the TCP module and in the management structure of the socket module, and sends them through the cluster channel to the TCP module of the takeover node, so that the takeover node's TCP module can clone a new socket connection as early as possible.
- the first TCP module 801 is configured to: send second information to the second device, where the second information is used by the second device to generate a socket ID of the socket connection;
- the first NFS module 802 is configured to: send third information to the second device, where the third information is used to assign NFS-related information of the first device to the socket connection corresponding to the socket ID.
- the first TCP module 801 includes:
- the second obtaining unit 8013 is configured to: acquire the second information, where the second information includes at least the packet in the TCP module sending buffer and the packet in the socket receiving buffer;
- the second sending unit 8014 is configured to: send the second information to the second TCP module by using a cluster channel, so that the second TCP module assigns the second information to the copied socket connection.
- the first NFS module 802 includes:
- the third obtaining unit 8021 is configured to: acquire the third information, where the third information includes at least information about the control plane management structure of the NFS module and the IO requests not yet completed by the NFS module, and the information about the control plane management structure of the NFS module includes NFS lock information;
- the third sending unit 8022 is configured to: send the third information to the second NFS module by using a cluster channel.
- the TCP layer of the faulty end notifies the NFS service layer, and the NFS service and the TCP layer start switching the connection at the same time.
- When NFS receives the switch message, it stops IO operations on the back-end disk and packet handling on the front end; NFS then collects the information of the connection being switched, including the NFS control plane management structure and the unfinished IO requests, and sends it to the peer node through the cluster channel.
- the NFS control plane management structure includes NFS lock information.
- the TCP layer likewise collects the data packets in the TCP transmission buffer and the packets in the socket receiving ring, and sends them to the TCP layer of the takeover node through the cluster channel.
- the TCP layer of the takeover node assigns this information to the new socket connection, completing the migration of the TCP connection. The NFS and TCP migrations are performed simultaneously here.
- the TCP module of the takeover node restores the received switch data to the new socket connection.
- the TCP module of the takeover node processes all the data and sends the new socket ID to the NFS module of the takeover node.
- the NFS module of the takeover node first stores the received switching data temporarily and then restores the NFS lock information.
- the recovery process of the NFS lock matches the temporarily stored switching data against the new socket ID; the matching condition is that the TCP-layer quintuple information (protocol, source and destination IP, source and destination port) matches the socket connection corresponding to the new socket ID.
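The quintuple matching described above can be sketched in Python as a dictionary lookup. The `FiveTuple` type, the `restore_locks` function and the shape of the two input dictionaries are assumptions made for illustration; the text only states that the five-tuple is the matching condition.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def restore_locks(pending, sockets):
    """Attach temporarily stored lock data to newly cloned sockets.

    pending: FiveTuple -> list of lock records received over the cluster channel
    sockets: FiveTuple -> new socket ID reported by the takeover TCP module
    """
    restored, unmatched = {}, []
    for five_tuple, locks in pending.items():
        socket_id = sockets.get(five_tuple)
        if socket_id is None:
            unmatched.append(five_tuple)    # keep waiting for this connection
        else:
            restored[socket_id] = locks
    return restored, unmatched
```

Entries left in `unmatched` correspond to lock data that arrived before the matching socket was cloned, which is why the switching data is stored temporarily first.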
- the first NFS module 802 includes:
- the fourth obtaining unit 8023 is configured to: obtain NFS lock information
- the encapsulating unit 8024 is configured to: encapsulate the NFS lock information according to a packet format, where the packet format includes at least the socket connection information, a packet fragment number, the number of the NFS lock information, and the end identifier of the message;
- the fourth sending unit 8025 is configured to: send the encapsulated NFS lock information to the second NFS module.
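The encapsulation step can be illustrated with a fixed binary header in Python. The text names the fields (socket connection information, fragment number, lock information number, end identifier) but not their sizes or order, so the layout in `HEADER`, the use of IPv4 addresses and ports as the connection information, and the `encode`/`decode` helpers are all hypothetical.

```python
import socket
import struct

# Hypothetical layout: src IP, src port, dst IP, dst port,
# fragment number, lock number, end flag (network byte order, no padding).
HEADER = struct.Struct("!4sH4sHHIB")


def encode(src_ip, src_port, dst_ip, dst_port, fragment_no, lock_no, is_end, payload=b""):
    header = HEADER.pack(socket.inet_aton(src_ip), src_port,
                         socket.inet_aton(dst_ip), dst_port,
                         fragment_no, lock_no, 1 if is_end else 0)
    return header + payload


def decode(message):
    src, sport, dst, dport, frag, lock, end = HEADER.unpack_from(message)
    return {
        "src": (socket.inet_ntoa(src), sport),
        "dst": (socket.inet_ntoa(dst), dport),
        "fragment_no": frag,
        "lock_no": lock,
        "is_end": bool(end),
        "payload": message[HEADER.size:],
    }
```

Carrying the connection information in every fragment lets the receiver associate each piece with the right connection even when several connections are migrated at once.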
- Port switching of the faulty node is triggered. Once the target node of the switchover is determined, migration of the NFS lock information begins.
- After the faulty node receives the NFS lock information migration message, it first finds the locks of the corresponding connection according to the socket information, and encapsulates the lock information in the packet format shown in Table 1. When there is too much lock information for one message, the message is fragmented; after receiving the fragments, the takeover node reassembles them according to the fragment sequence numbers. After the NFS lock information has been sent successfully, the faulty node clears the NFS lock information of the corresponding connection, and this repeats until all connections that need to be switched on the faulty node have been sent.
- After the receiving node receives a switching packet from the faulty node, it processes the packet according to the packet type.
- The receiving node uses the end-of-packet marker to decide whether the file lock information needs to be reassembled. If port switching for the current connection has succeeded, the node parses the migrated file lock information and initiates the lock recovery operation, restoring the file locks of the faulty node on the takeover node. If port switching for the current connection has not completed, the node waits; once the maximum delay of a client lock operation is exceeded (the NFSv4 protocol uses a default lease time of 90 s), the migrated lock information is released, because by then the lock has already expired on the client and recovery has lost its value. Port switching is usually much faster than lock migration, so it is rare for lock migration to finish ahead of port switching with a delay that exceeds the client's default maximum.
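The wait-or-release decision described above reduces to a three-way choice. A minimal Python sketch follows, using the 90 s lease time cited in the text; the function name `next_action` and the string results are hypothetical.

```python
LEASE_SECONDS = 90.0  # lease time cited in the text for NFSv4


def next_action(port_switch_done: bool, waited_seconds: float,
                lease: float = LEASE_SECONDS) -> str:
    """Decide what the takeover node does with fully received lock data."""
    if port_switch_done:
        return "recover"    # parse the lock data and restore the locks
    if waited_seconds > lease:
        return "release"    # the client's lease has lapsed; recovery is pointless
    return "wait"           # port switching is still in progress
```

Checking `port_switch_done` first means a connection whose port switch completes late, but before the check runs, is still recovered rather than released.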
- the invention further provides a first device.
- FIG. 9 is a schematic diagram of a functional module of a second embodiment of a first device of the present invention.
- the first TCP module 801 further includes:
- the first receiving unit 8015 is configured to: receive a zero window message sent by the second device.
- After the takeover node establishes the new socket connection, it sends a zero window message to the faulty node and temporarily stops the corresponding NET from receiving packets at the IP (network) layer.
- FIG. 10 is a schematic diagram of a functional module of a third embodiment of a first device of the present invention.
- the first TCP module 801 further includes:
- the fifth sending unit 8016 is configured to: after sending the second information to the second device, send a message that the switching is completed to the first NFS module;
- the first NFS module 802 further includes:
- the closing unit 8026 is configured to: if the first NFS module receives the switching-complete message sent by the first TCP module, and the first NFS module has already sent the third information to the second device, close the TCP switching request;
- the closing unit 8026 is further configured to: if the first NFS module receives the switching-complete message sent by the first TCP module, and the first NFS module has not yet sent the third information to the second device, close the TCP switching request after sending the third information to the second device.
- After the TCP module has finished sending its data, it notifies the NFS service.
- When NFS receives the message, if NFS has also completed its switchover, it actively closes the request; if NFS has not yet completed its data switchover, it finishes the switchover first and closes the connection when done.
- FIG. 11 is a schematic diagram of functional modules of a first embodiment of a second device of the present invention.
- the second device includes a second TCP module 1101 and a second NFS module 1102;
- the second TCP module 1101 is configured to: when the first device receives the TCP switching request, receive the first information sent by the first device, and copy the socket connection of the first device according to the first information;
- the first device may be a faulty node, and the second device may be a takeover node.
- the user can manually migrate the IP to the takeover node, and then power off the faulty node.
- the TCP module of the faulty end obtains the connections on the IP currently being switched.
- the TCP module of the faulty end acquires the key fields in the control block structure of the TCP module and in the management structure of the socket module, and sends them through the cluster channel to the TCP module of the takeover node, so that the takeover node's TCP module can clone a new socket connection as early as possible.
- the second TCP module 1101 is configured to: receive second information sent by the first device, and generate a socket ID of the socket connection according to the second information;
- the second TCP module 1101 includes:
- the second receiving unit 11011 is configured to: receive the second information
- the generating unit 11012 is configured to: assign the second information to the copied socket connection, and generate a socket ID of the socket connection;
- the second NFS module 1102 is configured to: receive the third information sent by the first device, and assign the NFS related information of the first device to the socket connection corresponding to the socket ID according to the third information.
- the second NFS module 1102 includes:
- the third receiving unit 11021 is configured to: receive the third information
- the matching unit 11022 is configured to: match the socket connection corresponding to the socket ID according to the quintuple information in the third information;
- the assignment unit 11023 is configured to: if matched, assign the third information to the socket connection corresponding to the socket ID.
- To implement fast recovery of the NFS lock information, the system detects that access to the faulty node is abnormal and triggers port switching for that node. Once the target node of the switchover is determined, migration of the NFS lock information begins.
- After the faulty node receives the NFS lock information migration message, it first finds the locks of the corresponding connection according to the socket information, and encapsulates the lock information in the packet format shown in Table 1. When there is too much lock information for one message, the message is fragmented; after receiving the fragments, the takeover node reassembles them according to the fragment sequence numbers. After the NFS lock information has been sent successfully, the faulty node clears the NFS lock information of the corresponding connection, and this repeats until all connections that need to be switched on the faulty node have been sent.
- After the receiving node receives a switching packet from the faulty node, it processes the packet according to the packet type.
- The receiving node uses the end-of-packet marker to decide whether the file lock information needs to be reassembled. If port switching for the current connection has succeeded, the node parses the migrated file lock information and actively initiates the lock recovery operation, restoring the file locks of the faulty node on the takeover node. If port switching for the current connection has not completed, the node waits; once the maximum delay of a client lock operation is exceeded (the NFSv4 protocol uses a default lease time of 90 s), the migrated lock information is released, because even if it were recovered the lock would already have expired on the client and recovery would have no value. Port switching is usually much faster than lock migration, so it is rare for lock migration to finish ahead of port switching with a delay that exceeds the client's default maximum.
- FIG. 12 is a schematic diagram of functional modules of a second embodiment of a second device of the present invention.
- the second NFS module 1102 further includes:
- the sixth sending unit 11024 is configured to: send a message that the switching is completed to the second TCP module;
- the second TCP module 1101 further includes:
- the seventh sending unit 11013 is configured to: send a window recovery message to the first device.
- After the TCP module has finished sending its data, it notifies the NFS service.
- When NFS receives the message, if NFS has also completed its switchover, it actively closes the request; if NFS has not yet completed its data switchover, it finishes the switchover first and closes the connection when done.
- FIG. 13 is a schematic diagram of functional modules of a third embodiment of a second device of the present invention.
- the second TCP module 1101 further includes:
- the eighth sending unit 11014 is configured to: send a zero window message to the first device.
- After the takeover node establishes the new socket connection, it sends a zero window message to the faulty node and temporarily stops the corresponding NET from receiving packets at the IP (network) layer.
- the module functions of the first device 800 shown in FIGS. 8 to 10 and the module functions of the second device 110 shown in FIGS. 11 to 13 may also be integrated together in a single device.
- the invention further provides a system.
- FIG. 14 is a schematic structural diagram of a system according to a first embodiment of the system of the present invention.
- the system includes a first device 800 as shown in Figures 8-10 and a second device 110 as shown in Figures 11-13.
- all or part of the steps of the above embodiments may also be implemented using integrated circuits. The steps may be fabricated separately as individual integrated circuit modules, or multiple modules or steps may be fabricated as a single integrated circuit module. Thus, the invention is not limited to any specific combination of hardware and software.
- the devices/function modules/functional units in the above embodiments may be implemented by a general-purpose computing device, which may be centralized on a single computing device or distributed over a network of multiple computing devices.
- When each device/function module/functional unit in the above embodiments is implemented in the form of a software function module and sold or used as a stand-alone product, it can be stored in a computer readable storage medium.
- the above mentioned computer readable storage medium may be a read only memory, a magnetic disk or an optical disk or the like.
- the lock information of the faulty node and its corresponding data connection can be migrated to another normal node, thereby ensuring fast recovery of the file lock.
Landscapes
- Computer And Data Communications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
The present invention relates to the field of network data storage, and in particular, to a data switching method, device and system.
A file lock is a means of keeping files synchronized. When multiple users operate on the same file at the same time, file locks ensure that the data does not conflict, and many database programs require file lock support when reading and writing data. For the Network File System (NFS), a powerful network file system, file locks play a vital role in keeping files synchronized, so maintaining file locks is crucial for NFS. In cluster mode, if a node becomes abnormal during data access, the file lock information may be lost. In most cases the client recovers locks by checking the lock state through the status protocol, but recovering locks by monitoring lock state takes a long time, and in special cases some locks cannot be recovered. A technique that can quickly recover or preserve lock state is urgently needed to solve this problem.
Summary of the Invention
The present invention provides a data switching method, device and system, the main purpose of which is to solve the technical problem of how to quickly recover file lock information when a node fails.
A data switching method, including:
when a first device receives a Transmission Control Protocol (TCP) switching request, the first device sends first information to a second device, where the first information is used by the second device to copy the socket connection of the first device;
the first device sends second information and third information to the second device, where the second information is used by the second device to generate a socket ID for the socket connection, and the third information is used to assign Network File System (NFS) related information of the first device to the socket connection corresponding to the socket ID.
Optionally, sending the first information to the second device when the first device receives the TCP switching request includes:
when the first device receives the TCP switching request, the TCP module of the first device acquires the first information, where the first information includes at least key fields of the control block structure in the TCP module of the first device and key fields of the management structure in the socket module of the first device;
the TCP module of the first device sends the first information to the TCP module of the second device.
Optionally, sending the second information to the second device includes:
the TCP module of the first device acquires the second information, where the second information includes at least the packets in the send buffer of the TCP module and the packets in the socket receive buffer;
the TCP module of the first device sends the second information to the TCP module of the second device through a cluster channel, so that the TCP module of the second device assigns the second information to the copied socket connection.
Optionally, sending the third information to the second device includes:
the NFS module of the first device acquires the third information, where the third information includes at least information of the control-plane management structure of the NFS module and the uncompleted input/output (I/O) requests of the NFS module, and the information of the control-plane management structure includes NFS lock information;
the NFS module of the first device sends the third information to the NFS module of the second device through a cluster channel.
Optionally, acquiring the third information by the NFS module of the first device includes:
the NFS module of the first device acquires NFS lock information;
and sending the third information from the NFS module of the first device to the NFS module of the second device through the cluster channel includes:
the NFS module of the first device encapsulates the NFS lock information according to a message format, where the message format includes at least the information of the socket connection, a message fragment number, the number of the NFS lock information, and an end-of-message flag;
the NFS module of the first device sends the encapsulated NFS lock information to the NFS module of the second device.
Optionally, after the first information is sent to the second device, the method further includes:
the first device receives a zero-window message sent by the second device.
Optionally, after the second information and the third information are sent to the second device, the method further includes:
after sending the second information to the second device, the TCP module of the first device sends a switching-complete message to the NFS module of the first device;
if the NFS module of the first device receives the switching-complete message from the TCP module of the first device and has already finished sending the third information to the second device, the NFS module of the first device closes the TCP switching request;
if the NFS module of the first device receives the switching-complete message from the TCP module of the first device but has not yet finished sending the third information to the second device, the NFS module of the first device closes the TCP switching request after it finishes sending the third information to the second device.
A data switching method, including:
a second device receives first information that a first device sends to the second device upon receiving a TCP switching request, and copies the socket connection of the first device according to the first information;
the second device receives second information and third information sent by the first device, generates a socket ID for the socket connection according to the second information, and, according to the third information, assigns the NFS related information of the first device to the socket connection corresponding to the socket ID.
Optionally, receiving the second information and the third information sent by the first device and generating the socket ID of the socket connection according to the second information includes:
the TCP module of the second device receives the second information;
the TCP module of the second device assigns the second information to the copied socket connection and generates the socket ID of the socket connection;
the NFS module of the second device receives the third information;
the NFS module of the second device uses the five-tuple information in the third information to match the socket connection corresponding to the socket ID and, if they match, assigns the third information to that socket connection.
Optionally, after the NFS module of the second device matches the socket connection corresponding to the socket ID according to the five-tuple information in the third information and assigns the third information to that socket connection, the method further includes:
the NFS module of the second device sends a switching-complete message to the TCP module of the second device;
the IP layer of the protocol stack of the second device re-enables packet sending and receiving on the NET, the TCP module of the second device sends a window-recovery message to the first device, and the NFS module of the second device sends and receives packets.
Optionally, after the second device receives the first information sent by the first device, the method further includes:
the second device sends a zero-window message to the first device.
A first device, including a first Transmission Control Protocol (TCP) module and a first Network File System (NFS) module;
the first TCP module is configured to: when the first device receives a TCP switching request, send first information to a second device, where the first information is used by the second device to copy the socket connection of the first device;
the first TCP module is further configured to: send second information to the second device, where the second information is used by the second device to generate a socket ID for the socket connection;
the first NFS module is configured to: send third information to the second device, where the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID.
Optionally, the first TCP module includes:
a first acquiring unit, configured to: when the first device receives the TCP switching request, acquire the first information, where the first information includes at least key fields of the control block structure in the TCP module of the first device and key fields of the management structure in the socket module of the first device; and
a first sending unit, configured to: send the first information to the second TCP module.
Optionally, the first TCP module includes:
a second acquiring unit, configured to: acquire the second information, where the second information includes at least the packets in the send buffer of the TCP module and the packets in the socket receive buffer; and
a second sending unit, configured to: send the second information to the second TCP module through a cluster channel, so that the second TCP module assigns the second information to the copied socket connection.
Optionally, the first NFS module includes:
a third acquiring unit, configured to: acquire the third information, where the third information includes at least information of the control-plane management structure of the NFS module and the uncompleted input/output (I/O) requests of the NFS module, and the information of the control-plane management structure includes NFS lock information; and
a third sending unit, configured to: send the third information to the second NFS module through a cluster channel.
Optionally, the first NFS module includes:
a fourth acquiring unit, configured to: acquire NFS lock information;
an encapsulating unit, configured to: encapsulate the NFS lock information according to a message format, where the message format includes at least the information of the socket connection, a message fragment number, the number of the NFS lock information, and an end-of-message flag; and
a fourth sending unit, configured to: send the encapsulated NFS lock information to the second NFS module.
Optionally, the first TCP module further includes:
a first receiving unit, configured to: receive a zero-window message sent by the second device.
Optionally, the first TCP module further includes:
a fifth sending unit, configured to: after the second information has been sent to the second device, send a switching-complete message to the first NFS module;
the first NFS module further includes:
a closing unit, configured to: if the first NFS module receives the switching-complete message from the first TCP module and has already finished sending the third information to the second device, close the TCP switching request;
the closing unit is further configured to: if the first NFS module receives the switching-complete message from the first TCP module but has not yet finished sending the third information to the second device, close the TCP switching request after the third information has been sent to the second device.
A second device, including a second Transmission Control Protocol (TCP) module and a second Network File System (NFS) module;
the second TCP module is configured to: when a first device receives a TCP switching request, receive first information sent by the first device, and copy the socket connection of the first device according to the first information;
the second TCP module is further configured to: receive second information sent by the first device, and generate a socket ID for the socket connection according to the second information;
the second NFS module is configured to: receive third information sent by the first device and, according to the third information, assign the NFS related information of the first device to the socket connection corresponding to the socket ID.
Optionally, the second TCP module includes:
a second receiving unit, configured to: receive the second information; and
a generating unit, configured to: assign the second information to the copied socket connection and generate the socket ID of the socket connection;
the second NFS module includes:
a third receiving unit, configured to: receive the third information;
a matching unit, configured to: match the socket connection corresponding to the socket ID according to the five-tuple information in the third information;
an assigning unit, configured to: if they match, assign the third information to the socket connection corresponding to the socket ID.
Optionally, the second NFS module further includes:
a sixth sending unit, configured to: send a switching-complete message to the second TCP module;
the second TCP module further includes:
a seventh sending unit, configured to: send a window-recovery message to the first device.
Optionally, the second TCP module further includes:
an eighth sending unit, configured to: send a zero-window message to the first device.
A system, including the first device and the second device described above.
A computer-readable storage medium storing program instructions that, when executed, implement the methods described above.
In the embodiments of the present invention, when the first device receives a TCP switching request, it sends first information to the second device, where the first information is used by the second device to copy the socket connection of the first device. It then sends second information and third information to the second device, where the second information is used by the second device to generate a socket ID for the socket connection, and the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID. As a result, when a node fails, its lock information and the corresponding data connections can be migrated to another healthy node, which ensures fast recovery of file locks.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a first embodiment of the data switching method;
FIG. 2 is a schematic flowchart of a second embodiment of the data switching method;
FIG. 3 is a schematic flowchart of a third embodiment of the data switching method;
FIG. 4 is a schematic flowchart of a fourth embodiment of the data switching method;
FIG. 5 is a schematic flowchart of a fifth embodiment of the data switching method;
FIG. 6 is a schematic flowchart of a sixth embodiment of the data switching method;
FIG. 7 is a schematic diagram of the interaction flow of a seventh embodiment of the data switching method;
FIG. 8 is a schematic diagram of the functional modules of a first embodiment of the first device of the present invention;
FIG. 9 is a schematic diagram of the functional modules of a second embodiment of the first device of the present invention;
FIG. 10 is a schematic diagram of the functional modules of a third embodiment of the first device of the present invention;
FIG. 11 is a schematic diagram of the functional modules of a first embodiment of the second device of the present invention;
FIG. 12 is a schematic diagram of the functional modules of a second embodiment of the second device of the present invention;
FIG. 13 is a schematic diagram of the functional modules of a third embodiment of the second device of the present invention;
FIG. 14 is a schematic diagram of the functional modules of a first embodiment of the system of the present invention.
The realization of the objectives, the functional features, and the advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Preferred Embodiments of the Invention
It should be understood that the specific embodiments described here are intended only to explain the present invention and not to limit it.
The present invention provides a data switching method.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a first embodiment of the data switching method.
In the first embodiment, the data switching method includes:
Step 101: when the first device receives a TCP switching request, it sends first information to the second device, where the first information is used by the second device to copy the socket connection of the first device;
Optionally, sending the first information to the second device when the first device receives the Transmission Control Protocol (TCP) switching request includes:
when the first device receives the TCP switching request, the TCP module of the first device acquires the first information, where the first information includes at least key fields of the control block structure in the TCP module of the first device and key fields of the management structure in the socket module of the first device;
the TCP module of the first device sends the first information to the TCP module of the second device.
The first device may be a faulty node, and the second device may be a takeover node.
When a node fails or is being upgraded, the user can manually migrate its IP address to the takeover node and then power the faulty node off. After the IP address has been migrated, the TCP module on the faulty side obtains the connections on the IP address being switched, collects the key fields of the TCP module's control block structure and of the socket module's management structure, and sends them through the cluster channel to the TCP module of the takeover node, so that the takeover node's TCP module can clone a new socket connection as early as possible.
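The hand-off of the first information can be illustrated with a minimal Python sketch. The text does not list which key fields are carried, so every field name below (`snd_nxt`, `rcv_nxt`, `snd_wnd`, `sock_state`) is a hypothetical stand-in, and JSON is used only as a convenient encoding for the cluster channel, not as the format the invention prescribes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class FirstInfo:
    """Hypothetical snapshot of the key fields sent by the faulty node."""
    # Five-tuple identifying the connection being switched.
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    # Assumed key fields of the TCP control block structure.
    snd_nxt: int
    rcv_nxt: int
    snd_wnd: int
    # Assumed key field of the socket management structure.
    sock_state: str


def encode_first_info(info: FirstInfo) -> bytes:
    """Faulty node: serialize the snapshot for the cluster channel."""
    return json.dumps(asdict(info), sort_keys=True).encode("utf-8")


def clone_connection(payload: bytes) -> FirstInfo:
    """Takeover node: rebuild the snapshot so a new socket can be cloned."""
    return FirstInfo(**json.loads(payload.decode("utf-8")))
```

In this sketch the takeover side ends up with an object equal to the one the faulty side captured, which is the property the cloning step relies on.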
Step 102: the first device sends second information and third information to the second device, where the second information is used by the second device to generate a socket ID for the socket connection, and the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID.
Optionally, sending the second information to the second device includes:
the TCP module of the first device acquires the second information, where the second information includes at least the packets in the send buffer of the TCP module and the packets in the socket receive buffer;
the TCP module of the first device sends the second information to the TCP module of the second device through a cluster channel, so that the TCP module of the second device assigns the second information to the copied socket connection.
Optionally, sending the third information to the second device includes:
the Network File System (NFS) module of the first device acquires the third information, where the third information includes at least information of the control-plane management structure of the NFS module and the uncompleted input/output (I/O) requests of the NFS module, and the information of the control-plane management structure includes NFS lock information;
the NFS module of the first device sends the third information to the NFS module of the second device through a cluster channel.
The TCP layer on the faulty side notifies the NFS service layer, and the NFS service and the TCP layer begin switching the connection at the same time. On receiving the switching message, NFS stops I/O operations on the back-end disks and stops sending packets on the front end. NFS then collects the information for the connection being switched, including the NFS control-plane management structure (which contains the NFS lock information) and the uncompleted I/O requests, and sends it through the cluster channel to the peer node. Meanwhile, the TCP layer collects the data packets in the TCP send buffer and the packets in the socket receive ring and sends them through the cluster channel to the TCP layer of the takeover node, which must assign this information to the new socket connection to complete the migration of the TCP connection. The NFS and TCP operations proceed concurrently.
The TCP module of the takeover node restores the received switching data into the new socket connection. Once it has processed all of the data, it sends the new socket ID to the NFS module of the takeover node. The NFS module of the takeover node must first stage the switching data it receives and then restore the NFS lock information. To restore the NFS locks, the staged switching data is matched against the new socket ID; the match condition is that the TCP-layer five-tuple (protocol, source and destination IP addresses, source and destination ports) matches the socket connection corresponding to the new socket ID.
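The staging and five-tuple matching described above might look like the following Python sketch. The class and method names (`TakeoverNfs`, `stage`, `on_socket_ready`) are illustrative assumptions; the text specifies only that staged data is matched to the new socket ID by five-tuple.

```python
from typing import Any, Dict, NamedTuple


class FiveTuple(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class TakeoverNfs:
    """Hypothetical NFS module on the takeover node."""

    def __init__(self) -> None:
        self.staged: Dict[FiveTuple, Any] = {}   # switching data awaiting a socket
        self.restored: Dict[int, Any] = {}       # socket ID -> restored NFS data

    def stage(self, key: FiveTuple, data: Any) -> None:
        """Hold switching data received from the faulty node."""
        self.staged[key] = data

    def on_socket_ready(self, socket_id: int, key: FiveTuple) -> bool:
        """Called when the TCP module reports a new socket ID.

        Returns True if staged data with the same five-tuple was found
        and bound to the socket, False otherwise.
        """
        data = self.staged.pop(key, None)
        if data is None:
            return False
        self.restored[socket_id] = data
        return True
```

Keying the staging table by five-tuple means the lookup does not depend on the order in which NFS data and socket IDs arrive, which suits the concurrent TCP and NFS transfers described in the text.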
Optionally, sending the third information to the second device includes:
the NFS module of the first device acquires NFS lock information;
the NFS module of the first device encapsulates the NFS lock information according to a message format, where the message format includes at least the information of the socket connection, a message fragment number, the number of the NFS lock information, and an end-of-message flag;
the NFS module of the first device sends the encapsulated NFS lock information to the NFS module of the second device.
To recover NFS lock information quickly, the system triggers port switching for the faulty node when it detects abnormal access on that node, and NFS lock information migration begins once the target node of the switching has been determined.
After receiving the NFS lock information migration message, the faulty node first finds the locks belonging to the corresponding socket according to the socket information and encapsulates the lock information in the message format shown in Table 1. If there is too much lock information, the message is fragmented, and the takeover node reassembles it according to the fragment sequence numbers after receiving it. After the NFS lock information has been sent successfully, the faulty node clears the NFS lock information for that connection, and this continues until all connections on the faulty node that need to be switched have been sent.
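The fragmentation and reassembly of lock messages can be sketched in Python as follows. The exact layout of Table 1 is not available here, so the `LockFragment` fields simply mirror the four items the text names (socket connection information, fragment number, lock number, end flag), and the lock records are treated as opaque strings; both choices are assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class LockFragment:
    conn: str                     # socket connection information
    frag_no: int                  # message fragment number
    locks: List[Tuple[int, str]]  # (lock number, opaque lock record)
    end: bool                     # end-of-message flag


def fragment_locks(conn: str, records: Sequence[str],
                   per_fragment: int) -> List[LockFragment]:
    """Faulty node: split lock records into numbered fragments."""
    if per_fragment < 1:
        raise ValueError("per_fragment must be at least 1")
    numbered = list(enumerate(records))
    chunks = [numbered[i:i + per_fragment]
              for i in range(0, len(numbered), per_fragment)] or [[]]
    last = len(chunks) - 1
    return [LockFragment(conn, n, chunk, n == last)
            for n, chunk in enumerate(chunks)]


def reassemble(fragments: Sequence[LockFragment]) -> List[str]:
    """Takeover node: order fragments by number and rebuild the records."""
    ordered = sorted(fragments, key=lambda f: f.frag_no)
    if not ordered or not ordered[-1].end:
        raise ValueError("end fragment missing")
    if [f.frag_no for f in ordered] != list(range(len(ordered))):
        raise ValueError("fragment sequence has a gap")
    return [record for f in ordered for _, record in f.locks]
```

Because reassembly sorts by fragment number and checks for gaps, fragments may arrive out of order over the cluster channel without corrupting the restored lock list.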
Table 1
After the takeover node receives a switching message from the faulty node, it processes the message according to its type. When it receives lock migration messages, it reassembles the received file lock information according to the end-of-message flag. If port switching for the corresponding connection has already succeeded, the takeover node parses the migrated file lock information locally and actively initiates lock recovery, restoring the faulty node's file locks on the takeover node. If port switching for the connection has not completed, it waits. Once the wait exceeds the maximum delay of the client's lock operation (the NFSv4 lease time, 90 s by default), the migrated lock information is released, because by then the lock has already expired on the client and recovering it would serve no purpose. Port switching is usually much faster than lock migration, however, so lock migration finishing ahead of port switching with a delay beyond the client's default maximum occurs only in rare cases.
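The decision the takeover node makes for reassembled lock information reduces to three outcomes, which the following Python sketch captures. The function name and the string return values are illustrative; only the 90 s default lease comes from the text.

```python
NFSV4_DEFAULT_LEASE_S = 90.0  # default lease time cited in the text


def lock_action(port_switched: bool, waited_s: float,
                lease_s: float = NFSV4_DEFAULT_LEASE_S) -> str:
    """Decide what to do with migrated lock information.

    "restore": port switching succeeded, so the locks are recovered.
    "release": still not switched and the wait exceeded the lease,
               so the client-side lock has already expired.
    "wait":    still not switched, but the lease has not run out.
    """
    if port_switched:
        return "restore"
    if waited_s > lease_s:
        return "release"
    return "wait"
```

For example, `lock_action(False, 30.0)` keeps waiting, while `lock_action(False, 91.0)` gives up and releases the migrated locks.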
In this embodiment of the present invention, when the first device receives a TCP switching request, it sends first information to the second device, where the first information is used by the second device to copy the socket connection of the first device. It then sends second information and third information to the second device, where the second information is used by the second device to generate a socket ID for the socket connection, and the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID. As a result, when a node fails, its lock information and the corresponding data connections can be migrated to another healthy node, which ensures fast recovery of file locks.
The present invention further provides a data switching method.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a second embodiment of the data switching method.
In the second embodiment, after step 101, the method further includes:
Step 103: the first device receives a zero-window message sent by the second device.
Specifically, after the takeover node establishes the new socket connection, it sends a zero-window message to the faulty node and, at the IP layer (network layer), temporarily blocks the corresponding NET from receiving packets.
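For readers unfamiliar with zero-window advertisements, the Python sketch below builds a standard 20-byte TCP header whose window field is 0. The text does not give the layout of its zero-window message, so this shows only the generic TCP mechanism; the checksum is left at 0 rather than computed, and the helper names are assumptions.

```python
import struct

TCP_FLAG_ACK = 0x10
_TCP_HEADER = "!HHIIBBHHH"  # ports, seq, ack, offset, flags, window, checksum, urgent


def build_tcp_header(src_port: int, dst_port: int, seq: int, ack: int,
                     window: int) -> bytes:
    """Pack a minimal TCP header (no options, checksum not computed)."""
    data_offset = 5 << 4  # header length of five 32-bit words
    checksum = 0          # placeholder; a real stack fills this in
    urgent = 0
    return struct.pack(_TCP_HEADER, src_port, dst_port, seq, ack,
                       data_offset, TCP_FLAG_ACK, window, checksum, urgent)


def advertised_window(header: bytes) -> int:
    """Read the window field back out of a packed header."""
    return struct.unpack(_TCP_HEADER, header[:20])[6]


# A zero-window advertisement tells the peer to pause sending.
zero_window = build_tcp_header(2049, 40000, 1000, 2000, window=0)
```

A later header with a non-zero window plays the role of the window-recovery message: it tells the peer that sending may resume once the switching has finished.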
In this embodiment of the present invention, when the first device receives a TCP switching request, it sends first information to the second device, where the first information is used by the second device to copy the socket connection of the first device. It then sends second information and third information to the second device, where the second information is used by the second device to generate a socket ID for the socket connection, and the third information is used to assign the NFS related information of the first device to the socket connection corresponding to the socket ID. As a result, when a node fails, its lock information and the corresponding data connections can be migrated to another healthy node, which ensures fast recovery of file locks.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of a third embodiment of the data switching method.
In the third embodiment, after step 102, the method further includes:
Step 104: after sending the second information to the second device, the TCP module of the first device sends a switching-complete message to the NFS module of the first device, and the NFS module of the first device closes the TCP switching request;
If the NFS module of the first device receives the switching-complete message from the TCP module of the first device and has already finished sending the third information to the second device, the NFS module of the first device closes the TCP switching request;
if the NFS module of the first device receives the switching-complete message from the TCP module of the first device but has not yet finished sending the third information to the second device, the NFS module of the first device closes the TCP switching request after it finishes sending the third information to the second device.
When TCP switching on the faulty side is complete, a switching-complete message is sent to the NFS service. On receiving this message, NFS actively closes the request if its own switching has also finished; if NFS has not yet finished switching its data, it closes the connection after its switching completes.
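The closing rule on the faulty node amounts to waiting for two independent events, whichever arrives last. A minimal Python sketch, with hypothetical class and method names, is:

```python
class FaultyNodeNfs:
    """Hypothetical NFS module on the faulty node during switching."""

    def __init__(self) -> None:
        self.tcp_done = False   # switching-complete message received from TCP
        self.nfs_done = False   # third information fully sent
        self.closed = False     # TCP switching request closed

    def _maybe_close(self) -> bool:
        # Close only once both the TCP and the NFS sides have finished.
        if self.tcp_done and self.nfs_done:
            self.closed = True
        return self.closed

    def on_tcp_switch_complete(self) -> bool:
        """TCP module reports that the second information has been sent."""
        self.tcp_done = True
        return self._maybe_close()

    def on_third_info_sent(self) -> bool:
        """NFS module has finished sending the third information."""
        self.nfs_done = True
        return self._maybe_close()
```

Each handler returns whether the request is now closed, so the same code covers both orderings described in the text: NFS finishing first and TCP finishing first.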
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a fourth embodiment of the data switching method.
In the fourth embodiment, the data switching method includes:
Step 401: the second device receives the first information, which the first device sends upon receiving a TCP switchover request, and copies the socket connection of the first device according to the first information;
The first device may be a faulty node, and the second device may be a takeover node.
When the faulty node has failed or is being upgraded, the user can manually migrate its IP address to the takeover node and then power off the faulty node. After the IP address has been migrated, the TCP module on the faulty side obtains the connections on the IP address being switched, extracts the key fields from the TCP module's control block structure and from the socket module's management structure, and sends these key fields over the cluster channel to the TCP module of the takeover node, so that the takeover node's TCP module can clone a new socket connection as early as possible.
Step 402: the second device receives the second information and the third information sent by the first device, generates the socket ID of the socket connection according to the second information, and, according to the third information, assigns the NFS-related information of the first device to the socket connection corresponding to the socket ID.
Optionally, receiving the second information and the third information sent by the first device and generating the socket ID of the socket connection according to the second information includes:
the TCP module of the second device receives the second information;
the TCP module of the second device assigns the second information to the copied socket connection and generates the socket ID of the socket connection;
the NFS module of the second device receives the third information;
the NFS module of the second device uses the five-tuple information in the third information to match the socket connection corresponding to the socket ID and, if they match, assigns the third information to the socket connection corresponding to the socket ID.
The TCP layer on the faulty side notifies the NFS service layer, and the NFS service and the TCP layer start switching over the connection at the same time. On receiving the switchover message, NFS stops I/O operations on the back-end disks and stops sending packets on the front end. NFS then collects the information for the connection being switched over, including the NFS control-plane management structure and the uncompleted I/O requests, and sends it over the cluster channel to the peer node; the NFS control-plane management structure includes the NFS lock information. Meanwhile, the TCP layer collects the data packets in the TCP send buffer and the packets in the socket receive ring and sends them over the cluster channel to the TCP layer of the takeover node, which must assign this information to the new socket connection to complete the migration of the TCP connection. The NFS and TCP transfers run in parallel.
The TCP module of the takeover node restores the received switchover data into the new socket connection and, once it has processed all of the data, sends the new socket ID to the NFS module of the takeover node. The NFS module of the takeover node must stage the switchover data it receives and then restore the NFS lock information. Restoring the NFS locks consists of matching the staged switchover data to the new socket ID; the match condition is that the TCP-layer five-tuple (protocol, source and destination IP addresses, source and destination ports) matches the socket connection corresponding to the new socket ID.
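The staging and five-tuple matching described above can be pictured as a lookup keyed by the connection's five-tuple. This is a minimal sketch under stated assumptions: the names `FiveTuple` and `LockRestorer`, and the use of a plain dictionary for the NFS data, are hypothetical and not taken from the patent.

```python
from typing import Dict, NamedTuple


class FiveTuple(NamedTuple):
    """TCP-layer five-tuple used as the match key."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class LockRestorer:
    """Hypothetical takeover-side NFS helper: stage data, then bind it to a socket ID."""

    def __init__(self) -> None:
        self.staged: Dict[FiveTuple, dict] = {}   # switchover data waiting for its socket
        self.restored: Dict[int, dict] = {}       # socket ID -> restored NFS data

    def stage(self, key: FiveTuple, nfs_data: dict) -> None:
        """Hold received NFS switchover data until the cloned socket is ready."""
        self.staged[key] = nfs_data

    def on_socket_ready(self, socket_id: int, key: FiveTuple) -> bool:
        """Bind staged data to the new socket ID if the five-tuples match."""
        data = self.staged.pop(key, None)
        if data is None:
            return False  # nothing staged for this connection
        self.restored[socket_id] = data
        return True
```

Keying on the five-tuple lets the NFS data arrive before or after the socket ID without any ordering requirement between the two transfers.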
To restore NFS lock information quickly, the system triggers a port switchover for the faulty node when it detects abnormal access to that node, and NFS lock information migration starts once the target node of the switchover has been determined.
After receiving the NFS lock information migration message, the faulty node first uses the socket information to find the locks associated with that socket and encapsulates the lock information in the packet format shown in Table 1. If there is too much lock information, the message is fragmented, and the takeover node reassembles the message according to the fragment sequence numbers after receiving it. Once the NFS lock information has been sent successfully, the faulty node clears the NFS lock information for the corresponding connection. This continues until all connections on the faulty node that need to be switched over have been sent.
Table 1
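Table 1 itself is not reproduced in this text. As a rough illustration only, the sketch below models a lock-migration fragment using the fields that the description of the encapsulation unit lists for the packet format (socket connection information, fragment number, number of the NFS lock information, end marker). The field names, types, and the 1024-byte default fragment size are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LockFragment:
    """Hypothetical layout of one lock-migration fragment (not the real Table 1)."""
    conn: Tuple[str, str, int, str, int]  # socket connection information (five-tuple here)
    frag_no: int                          # fragment sequence number
    lock_no: int                          # number of the NFS lock information
    is_last: bool                         # end marker of the message
    payload: bytes                        # slice of the serialized lock information


def fragment_lock_info(conn: Tuple[str, str, int, str, int], lock_no: int,
                       data: bytes, max_payload: int = 1024) -> List[LockFragment]:
    """Split serialized lock information into numbered fragments."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # An empty message still produces one (empty) fragment carrying the end marker.
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)] or [b""]
    last = len(chunks) - 1
    return [LockFragment(conn, i, lock_no, i == last, chunk)
            for i, chunk in enumerate(chunks)]
```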
After receiving a switchover packet from the faulty node, the takeover node processes it according to the packet type. When it receives lock migration packets, it reassembles the received file lock information according to the end-of-packet marker. If the port switchover for the corresponding connection has already succeeded, the takeover node parses the migrated file lock information and actively initiates lock recovery, restoring the faulty node's file locks on the takeover node. If the port switchover for the connection has not yet completed, the takeover node waits. If the wait exceeds the maximum delay for client lock operations (the lease time, 90 s by default in the NFSv4 protocol), the migrated lock information is released, because by then the lock has already expired on the client and restoring it would serve no purpose. Port switchover is usually much faster than lock migration, however, so only in rare cases does lock migration finish ahead of port switchover with a delay that exceeds the client's default maximum.
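The reassembly and lease check described above could be sketched as follows. This is an illustration under assumptions: fragments are represented as `(frag_no, is_last, payload)` tuples, the function names are hypothetical, and the check order (timeout first, then port switchover status) is one reading of the paragraph rather than something the text prescribes.

```python
from typing import List, Optional, Tuple

LEASE_SECONDS = 90.0  # default NFSv4 lease time cited in the text


def reassemble(fragments: List[Tuple[int, bool, bytes]]) -> Optional[bytes]:
    """Join fragments by sequence number; return None while any piece is missing."""
    by_no = {no: (last, payload) for no, last, payload in fragments}
    end_nos = [no for no, (last, _) in by_no.items() if last]
    if not end_nos:
        return None  # end marker not seen yet
    end = end_nos[0]
    if any(i not in by_no for i in range(end + 1)):
        return None  # a fragment before the end marker is still missing
    return b"".join(by_no[i][1] for i in range(end + 1))


def decide(port_switch_done: bool, waited_seconds: float) -> str:
    """Return 'restore', 'wait', or 'release' for reassembled lock information."""
    if waited_seconds > LEASE_SECONDS:
        return "release"   # the client's lease has lapsed; restoring is pointless
    if port_switch_done:
        return "restore"   # connection is in place, recover the locks now
    return "wait"          # port switchover still in progress
```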
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
Referring to FIG. 5, FIG. 5 is a schematic flowchart of a fifth embodiment of the data switching method.
In the fourth embodiment, after step 402, the method further includes:
Step 403: the NFS module of the second device sends a switchover-complete message to the TCP module of the second device;
The IP layer of the second device's protocol stack re-enables packet sending and receiving on the NET, the TCP module of the second device sends a window-recovery packet to the first device, and the second NFS module sends and receives packets.
When the TCP switchover on the faulty side completes, the TCP layer sends a switchover-complete message to the NFS service. On receiving this message, NFS actively closes the request if its own switchover has also completed; if NFS has not yet finished switching over its data, it closes the connection once its switchover completes.
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
Referring to FIG. 6, FIG. 6 is a schematic flowchart of a sixth embodiment of the data switching method.
In the fourth or fifth embodiment, after step 401, the method further includes:
Step 404: the second device sends a zero-window packet to the first device.
After the takeover node establishes the new socket connection, it sends a zero-window packet to the faulty node and temporarily stops the corresponding NET from receiving packets at the IP (network) layer.
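In standard TCP, a zero-window packet is an ordinary ACK segment whose 16-bit window field is 0, which tells the peer to pause sending. The sketch below only packs and parses the fixed 20-byte TCP header to show where that field sits. The port and sequence values are placeholders, the checksum is left at 0 (so this is not a valid on-wire segment), and the function names are hypothetical.

```python
import struct

# Fixed TCP header: src port, dst port, seq, ack, data offset, flags,
# window, checksum, urgent pointer (network byte order, 20 bytes).
TCP_HEADER = struct.Struct("!HHIIBBHHH")
ACK_FLAG = 0x10


def build_zero_window_ack(src_port: int, dst_port: int, seq: int, ack: int) -> bytes:
    """Pack an ACK header that advertises a receive window of zero."""
    data_offset = 5 << 4  # header length of five 32-bit words, no options
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           data_offset, ACK_FLAG, 0, 0, 0)


def advertised_window(header: bytes) -> int:
    """Read the window field back out of a packed TCP header."""
    return TCP_HEADER.unpack(header[:TCP_HEADER.size])[6]
```

A later window-recovery packet would be the same kind of segment with a non-zero window value.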
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
Referring to FIG. 7, FIG. 7 is a schematic diagram of the interaction flow of a seventh embodiment of the data switching method.
In the seventh embodiment, the data switching method includes:
Step 701: after the TCP layer of the faulty node's protocol stack has identified the takeover node, it notifies NFS to start the switchover;
Step 702: over the cross-node cluster channel, the TCP module first sends key information (key), such as the TCP-layer protocol control block, to the takeover node, and a socket connection is established on the takeover node;
Step 703: the TCP module and the NFS module of the faulty node simultaneously send their respective data to the takeover node over the cluster channel;
Step 704: when the faulty node's TCP module has finished transmitting its data, the faulty node notifies NFS to close the socket, and the takeover node notifies NFS that a new connection (socket) has arrived;
Step 705: the NFS module on the takeover side obtains the new socket connection and notifies TCP to start working normally.
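The ordering of steps 701 to 705 can be sketched as a small driver in which only step 703 runs in parallel. This is a toy model under assumptions: threads stand in for the TCP and NFS modules, a shared list stands in for the cluster channel, and the function name `run_switchover` is hypothetical.

```python
import threading
from typing import List


def run_switchover() -> List[str]:
    """Record the step order; the TCP and NFS transfers of step 703 run concurrently."""
    log: List[str] = []
    guard = threading.Lock()

    def record(event: str) -> None:
        with guard:  # keep concurrent appends orderly
            log.append(event)

    record("701: TCP notifies NFS to start the switchover")
    record("702: key information sent; socket cloned on the takeover node")

    # Step 703: both modules send their data at the same time.
    workers = [threading.Thread(target=record, args=(f"703: {name} data sent",))
               for name in ("TCP", "NFS")]
    for w in workers:
        w.start()
    for w in workers:
        w.join()  # step 704 must not start until both transfers finish

    record("704: old socket closed; NFS told of the new socket")
    record("705: NFS has the new socket; TCP resumes normal work")
    return log
```

The two step 703 entries may appear in either order, while every other step keeps its position.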
This embodiment of the present invention provides a technique for NFS switchover in cluster mode that completes lock migration quickly during the switchover. The principle is to copy the NFS connection: the server-side NFS connection is copied to the takeover side without the client being aware of it. The copied information falls into two categories, control information and data information. For the control plane, copying the key fields is sufficient; for the data plane, all data request packets that have been initiated but not yet completed must be copied to the takeover side. The control-plane key information includes the NFS lock information, and handling the NFS locks avoids conflicting access to files. The NFS lock information is collected on the faulty side and restored on the takeover side.
This embodiment adopts the following technical solution. After the front-end IP migration completes, the faulty side notifies its local TCP layer to switch over, and TCP in turn notifies the application layer to switch over, which amounts to migrating the server-side TCP connection to the takeover side. Throughout the process there must be a cluster channel to serve as the data channel for the switchover; this cluster channel can be any form of physical channel, and every node controller can access the back-end shared storage pool. When the application layer receives the switchover message, it transfers the NFS file locks, NFS request data, and other related information to the node being switched to through intra-cluster communication, and the lock information is quickly restored on the takeover node. The process consists of TCP-layer and NFS data migration. To speed up migration, the TCP and NFS data migrations run in parallel, and because the cluster channel is a high-speed, reliable channel, data migration is fast and the whole migration completes on the order of milliseconds.
This embodiment applies to high availability of the distributed network file system NFS in cluster mode. The techniques involved include reliable access to application services and reliable access to the distributed network file system in cluster mode. It mainly addresses how to keep the NFS service reliable and stable when a single link fails in a big-data storage cluster. Specifically, when a link fault occurs between the client and the access node, that node sends the file locks and related information to a takeover node through a network-port-based NFS switchover. A new connection is cloned on the takeover node and the old NFS connection data is copied onto it, so the new connection has all the attributes and state of the old one, including the NFS lock information. From the client's point of view the server-side connection has not changed, and the file locks are migrated without the client being aware of the server-side NFS service switchover. This technique also supports one active node with multiple standbys: by adding the ports of other nodes in the cluster as standby ports for the access node's port, the lock information on the access node can be migrated to any node in the cluster.
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
The present invention provides a device.
Referring to FIG. 8, FIG. 8 is a schematic diagram of the functional modules of a first embodiment of the first device of the present invention.
In the first embodiment, the first device includes:
a first TCP module 801 and a first NFS module 802;
The first TCP module 801 is configured to: when the first device receives a Transmission Control Protocol (TCP) switchover request, send first information to the second device, the first information being used by the second device to copy the socket connection of the first device;
Optionally, the first TCP module 801 includes:
a first obtaining unit 8011, configured to: obtain the first information when the first device receives the TCP switchover request, the first information including at least the key fields of the control block structure in the TCP module of the first device and the key fields of the management structure in the socket module of the first device;
a first sending unit 8012, configured to: send the first information to the second TCP module.
The first device may be a faulty node, and the second device may be a takeover node.
Specifically, when the faulty node has failed or is being upgraded, the user can manually migrate its IP address to the takeover node and then power off the faulty node. After the IP address has been migrated, the TCP module on the faulty side obtains the connections on the IP address being switched, extracts the key fields from the TCP module's control block structure and from the socket module's management structure, and sends these key fields over the cluster channel to the TCP module of the takeover node, so that the takeover node's TCP module can clone a new socket connection as early as possible.
The first TCP module 801 is configured to: send second information to the second device, the second information being used by the second device to generate the socket ID of the socket connection;
The first NFS module 802 is configured to: send third information to the second device, the third information being used to assign the NFS-related information of the first device to the socket connection corresponding to the socket ID.
Optionally, the first TCP module 801 includes:
a second obtaining unit 8013, configured to: obtain the second information, the second information including at least the packets in the send buffer of the TCP module and the packets in the socket receive buffer;
a second sending unit 8014, configured to: send the second information to the second TCP module over the cluster channel, so that the second TCP module assigns the second information to the copied socket connection.
Optionally, the first NFS module 802 includes:
a third obtaining unit 8021, configured to: obtain the third information, the third information including at least the information of the control-plane management structure of the NFS module and the uncompleted I/O requests of the NFS module, where the information of the control-plane management structure of the NFS module includes the NFS lock information;
a third sending unit 8022, configured to: send the third information to the second NFS module over the cluster channel.
The TCP layer on the faulty side notifies the NFS service layer, and the NFS service and the TCP layer start switching over the connection at the same time. On receiving the switchover message, NFS stops I/O operations on the back-end disks and stops sending packets on the front end. NFS then collects the information for the connection being switched over, including the NFS control-plane management structure and the uncompleted I/O requests, and sends it over the cluster channel to the peer node; the NFS control-plane management structure includes the NFS lock information. Meanwhile, the TCP layer collects the data packets in the TCP send buffer and the packets in the socket receive ring and sends them over the cluster channel to the TCP layer of the takeover node, which must assign this information to the new socket connection to complete the migration of the TCP connection. The NFS and TCP transfers run in parallel.
The TCP module of the takeover node restores the received switchover data into the new socket connection and, once it has processed all of the data, sends the new socket ID to the NFS module of the takeover node. The NFS module of the takeover node must stage the switchover data it receives and then restore the NFS lock information. Restoring the NFS locks consists of matching the staged switchover data to the new socket ID; the match condition is that the TCP-layer five-tuple (protocol, source and destination IP addresses, source and destination ports) matches the socket connection corresponding to the new socket ID.
Optionally, the first NFS module 802 includes:
a fourth obtaining unit 8023, configured to: obtain the NFS lock information;
an encapsulation unit 8024, configured to: encapsulate the NFS lock information according to a packet format, the packet format including at least the information of the socket connection, the packet fragment number, the number of the NFS lock information, and the end marker of the packet;
a fourth sending unit 8025, configured to: send the encapsulated NFS lock information to the second NFS module.
To restore NFS lock information quickly, the system triggers a port switchover for the faulty node when it detects abnormal access to that node, and NFS lock information migration starts once the target node of the switchover has been determined.
After receiving the NFS lock information migration message, the faulty node first uses the socket information to find the locks associated with that socket and encapsulates the lock information in the packet format shown in Table 1. If there is too much lock information, the message is fragmented, and the takeover node reassembles the message according to the fragment sequence numbers after receiving it. Once the NFS lock information has been sent successfully, the faulty node clears the NFS lock information for the corresponding connection. This continues until all connections on the faulty node that need to be switched over have been sent.
Table 1
After receiving a switchover packet from the faulty node, the takeover node processes it according to the packet type. When it receives lock migration packets, it reassembles the received file lock information according to the end-of-packet marker. If the port switchover for the corresponding connection has already succeeded, the takeover node parses the migrated file lock information and actively initiates lock recovery, restoring the faulty node's file locks on the takeover node. If the port switchover for the connection has not yet completed, the takeover node waits. If the wait exceeds the maximum delay for client lock operations (the lease time, 90 s by default in the NFSv4 protocol), the migrated lock information is released, because by then the lock has already expired on the client and restoring it would serve no purpose. Port switchover is usually much faster than lock migration, however, so only in rare cases does lock migration finish ahead of port switchover with a delay that exceeds the client's default maximum.
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
The present invention further provides a first device.
Referring to FIG. 9, FIG. 9 is a schematic diagram of the functional modules of a second embodiment of the first device of the present invention.
On the basis of the first embodiment, the first TCP module 801 further includes:
a first receiving unit 8015, configured to: receive the zero-window packet sent by the second device.
After the takeover node establishes the new socket connection, it sends a zero-window packet to the faulty node and temporarily stops the corresponding NET from receiving packets at the IP (network) layer.
In this embodiment of the present invention, when the first device receives a Transmission Control Protocol (TCP) switchover request, it sends first information to the second device, the first information being used by the second device to copy the socket connection of the first device. The first device then sends second information and third information to the second device: the second information is used by the second device to generate the socket ID of the socket connection, and the third information is used to assign the NFS-related information of the first device to the socket connection corresponding to that socket ID. As a result, when a node fails, its lock information and the corresponding data connection can be migrated to another healthy node, which ensures fast recovery of the file locks.
Referring to FIG. 10, FIG. 10 is a schematic diagram of the functional modules of a third embodiment of the first device of the present invention.
On the basis of the first embodiment, the first TCP module 801 further includes:
a fifth sending unit 8016, configured to send a switchover-complete message to the first NFS module after the second information has been sent to the second device;
The first NFS module 802 further includes:
a closing unit 8026, configured to close the TCP switchover request if the first NFS module has received the switchover-complete message sent by the first TCP module and has already finished sending the third information to the second device;
the closing unit 8026 is further configured to, if the first NFS module has received the switchover-complete message sent by the first TCP module but has not yet finished sending the third information to the second device, close the TCP switchover request once the third information has been sent to the second device.
Here, once the TCP switchover on the faulty side is complete, the TCP layer sends a switchover-complete message to the NFS service. On receiving this message, NFS closes the request immediately if its own switchover has also finished; if NFS has not yet finished switching its data, it closes the connection after its switchover completes.
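The closing rule described above amounts to waiting for two independent events and closing on whichever arrives last. A minimal sketch, assuming a hypothetical `SwitchoverCloser` class whose method names are illustrative and not taken from the patent:

```python
class SwitchoverCloser:
    """Closes the switchover request once both TCP and NFS have finished."""

    def __init__(self):
        self.tcp_done = False   # TCP module reported switchover complete
        self.nfs_done = False   # NFS module finished sending the third information
        self.closed = False     # the switchover request has been closed

    def on_tcp_switch_complete(self):
        """Called when the TCP module's switchover-complete message arrives."""
        self.tcp_done = True
        self._maybe_close()

    def on_nfs_transfer_complete(self):
        """Called when NFS has sent all of its data to the takeover node."""
        self.nfs_done = True
        self._maybe_close()

    def _maybe_close(self):
        # Close exactly once, and only after both sides are done.
        if self.tcp_done and self.nfs_done and not self.closed:
            self.closed = True


closer = SwitchoverCloser()
closer.on_tcp_switch_complete()     # NFS still transferring: stays open
still_open = not closer.closed
closer.on_nfs_transfer_complete()   # both done: request is closed
```

Because the check runs in both handlers, the order in which the two events arrive does not matter.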
Referring to FIG. 11, FIG. 11 is a schematic diagram of the functional modules of a first embodiment of the second device of the present invention.
In the first embodiment, the second device includes a second TCP module 1101 and a second NFS module 1102;
The second TCP module 1101 is configured to: when the first device has received a TCP switchover request, receive the first information sent by the first device and copy the socket connection of the first device according to the first information;
Here, the first device may be the faulty node, and the second device may be the takeover node.
When a node fails or is being upgraded, the user can manually migrate its IP address to the takeover node and then power off the faulty node. After the IP address has been migrated, the TCP module on the faulty side obtains the connections on the IP address being switched, extracts the key fields from the TCP module's control block structure and from the socket module's management structure, and sends these key fields over the cluster channel to the TCP module of the takeover node, so that the takeover node's TCP module can clone a new socket connection as early as possible.
The second TCP module 1101 is further configured to: receive the second information sent by the first device and generate the socket ID of the socket connection according to the second information;
Optionally, the second TCP module 1101 includes:
a second receiving unit 11011, configured to receive the second information;
a generating unit 11012, configured to assign the second information to the copied socket connection and generate the socket ID of the socket connection;
The second NFS module 1102 is configured to: receive the third information sent by the first device and, according to the third information, assign the NFS-related information of the first device to the socket connection corresponding to the socket ID.
Optionally, the second NFS module 1102 includes:
a third receiving unit 11021, configured to receive the third information;
a matching unit 11022, configured to match, according to the five-tuple information in the third information, the socket connection corresponding to the socket ID;
an assigning unit 11023, configured to, if a match is found, assign the third information to the socket connection corresponding to the socket ID.
Here, the TCP layer on the faulty side notifies the NFS service layer, and the NFS service and the TCP layer begin switching the connection at the same time. On receiving the switchover message, NFS stops I/O operations on the back-end disks and stops sending packets on the front end. NFS then collects the information for the connection being switched, including the NFS control-plane management structure and any outstanding I/O requests, and sends it over the cluster channel to the peer node; the NFS control-plane management structure includes the NFS lock information. Meanwhile, the TCP layer collects the data packets in the TCP send buffer and the packets in the socket receive ring and sends them over the cluster channel to the TCP layer of the takeover node, which must assign this information to the new socket connection to complete the migration of the TCP connection. The NFS and TCP steps run concurrently.
The TCP module of the takeover node restores the received switchover data into the new socket connection. Once it has processed all the data, it sends the new socket ID to the NFS module of the takeover node. The NFS module of the takeover node first stages the switchover data it receives and then restores the NFS lock information. Restoring the NFS locks means matching the staged switchover data against the new socket ID; the match condition is that the TCP-layer five-tuple (protocol, source and destination IP addresses, source and destination ports) matches the socket connection corresponding to the new socket ID.
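The staging and five-tuple matching step can be pictured as a lookup table keyed by the five-tuple. The sketch below is only an illustration under assumed names: `FiveTuple`, `NfsRestorer`, and the shape of the staged data are hypothetical, and the addresses are documentation placeholders.

```python
from typing import Any, Dict, NamedTuple


class FiveTuple(NamedTuple):
    """TCP-layer five-tuple used as the match key."""
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class NfsRestorer:
    """Stages migrated NFS data until the matching socket ID is announced."""

    def __init__(self) -> None:
        self.staged: Dict[FiveTuple, Any] = {}   # waiting for a socket ID
        self.restored: Dict[int, Any] = {}       # socket ID -> NFS data

    def stage(self, key: FiveTuple, nfs_data: Any) -> None:
        """Hold switchover data received from the faulty node."""
        self.staged[key] = nfs_data

    def on_socket_ready(self, socket_id: int, key: FiveTuple) -> bool:
        """Attach staged data to the new socket if the five-tuples match."""
        nfs_data = self.staged.pop(key, None)
        if nfs_data is None:
            return False  # nothing staged for this connection
        self.restored[socket_id] = nfs_data
        return True


restorer = NfsRestorer()
conn = FiveTuple("tcp", "192.0.2.10", 51000, "192.0.2.1", 2049)
restorer.stage(conn, {"locks": ["lock-a", "lock-b"]})
matched = restorer.on_socket_ready(7, conn)
unmatched = restorer.on_socket_ready(8, conn._replace(src_port=51001))
```

Keying on the whole five-tuple means a connection that differs in any one field, such as the source port in the last call, is not matched.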
To recover NFS lock information quickly, port switchover of the faulty node is triggered as soon as the system detects that access to the faulty node is abnormal, and NFS lock information migration begins once the target node of the switchover has been determined.
After the faulty node receives the NFS lock information migration message, it first uses the socket information to find the locks associated with that socket and encapsulates the lock information in the packet format shown in Table 1. If there is too much lock information for one message, the message is fragmented, and the takeover node reassembles the fragments according to their sequence numbers. Once the NFS lock information for a connection has been sent successfully, the faulty node clears its own NFS lock information for that connection; this continues until all connections on the faulty node that need to be switched have been sent.
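The fragmentation and sequence-number reassembly described above can be sketched as follows. Since the packet format of Table 1 is not reproduced in this text, the `Fragment` fields used here (a sequence number, an end flag, and a payload) are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Fragment:
    seq: int        # position of this fragment in the message
    last: bool      # end flag: True on the final fragment
    payload: bytes


def fragment(data: bytes, max_payload: int) -> List[Fragment]:
    """Split lock data into fragments of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    chunks = [data[i:i + max_payload]
              for i in range(0, len(data), max_payload)] or [b""]
    return [Fragment(i, i == len(chunks) - 1, chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(fragments: Iterable[Fragment]) -> Optional[bytes]:
    """Rebuild the message by sequence number, or None if it is incomplete."""
    by_seq = {f.seq: f for f in fragments}
    last_seqs = [f.seq for f in by_seq.values() if f.last]
    if not last_seqs:
        return None  # final fragment has not arrived yet
    total = last_seqs[0] + 1
    if any(seq not in by_seq for seq in range(total)):
        return None  # a middle fragment is still missing
    return b"".join(by_seq[seq].payload for seq in range(total))


lock_data = b"0123456789"
parts = fragment(lock_data, max_payload=4)
rebuilt = reassemble(reversed(parts))          # arrival order does not matter
incomplete = reassemble([parts[0], parts[2]])  # middle fragment missing
```

Returning `None` for an incomplete set lets the receiver keep buffering fragments until the end flag and every sequence number are present.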
Table 1
After the takeover node receives a switchover packet from the faulty node, it processes the packet according to its type. When it receives lock migration packets, it reassembles the received file lock information as needed, according to the end-of-packet flag. If the port switchover for the corresponding connection has already succeeded, the takeover node parses the migrated file lock information locally and actively initiates lock recovery, restoring the faulty node's file locks on the takeover node. If the port switchover for the connection has not yet completed, the takeover node waits. If the wait exceeds the maximum delay for client lock operations (the NFSv4 protocol uses a default lease time of 90 s), the migrated lock information is released, because by then the lock has already expired on the client and restoring it would serve no purpose. In practice, port switchover is usually much faster than lock migration, so lock migration running ahead of port switchover by more than the client's default maximum delay happens only in rare cases.
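The recover, wait, or release decision on the takeover node can be summarized as a small function. This is a sketch under assumptions: the function name and the string results are hypothetical, and the 90-second default is the lease time quoted in the text above.

```python
LEASE_SECONDS = 90.0  # default lease time quoted in the description


def lock_recovery_action(port_switched: bool,
                         waited_seconds: float,
                         lease_seconds: float = LEASE_SECONDS) -> str:
    """Decide what to do with reassembled lock information.

    Returns "recover" once the connection's port switchover has succeeded,
    "release" if the wait has outlasted the client lease, else "wait".
    """
    if port_switched:
        return "recover"   # parse the locks and restore them locally
    if waited_seconds > lease_seconds:
        return "release"   # lock already expired on the client
    return "wait"          # port switchover still in progress


after_switch = lock_recovery_action(True, 2.0)
still_waiting = lock_recovery_action(False, 10.0)
expired = lock_recovery_action(False, 91.0)
```

In a real system this check would presumably be re-evaluated on a timer or whenever the port switchover status changes; the patent does not specify the mechanism.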
Referring to FIG. 12, FIG. 12 is a schematic diagram of the functional modules of a second embodiment of the second device of the present invention.
In the first embodiment, the second NFS module 1102 further includes:
a sixth sending unit 11024, configured to send a switchover-complete message to the second TCP module;
The second TCP module 1101 further includes:
a seventh sending unit 11013, configured to send a window-recovery packet to the first device.
Here, once the TCP switchover on the faulty side is complete, the TCP layer sends a switchover-complete message to the NFS service. On receiving this message, NFS closes the request immediately if its own switchover has also finished; if NFS has not yet finished switching its data, it closes the connection after its switchover completes.
Referring to FIG. 13, FIG. 13 is a schematic diagram of the functional modules of a third embodiment of the second device of the present invention.
In the first or second embodiment, the second TCP module 1101 further includes:
an eighth sending unit 11014, configured to send a zero-window packet to the first device.
Here, after the takeover node establishes the new socket connection, it sends a zero-window packet to the faulty node and, at the IP (network) layer, temporarily blocks the corresponding NET from receiving packets.
In addition, an actual device may combine the module functions of the first device 800 shown in FIGS. 8 to 10 with the module functions of the second device 110 shown in FIGS. 11 to 13.
The present invention further provides a system.
Referring to FIG. 14, FIG. 14 is a schematic diagram of the system architecture of a first embodiment of the system of the present invention. The system includes the first device 800 shown in FIGS. 8 to 10 and the second device 110 shown in FIGS. 11 to 13.
The above are merely optional embodiments of the present invention and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.
A person of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be implemented as a computer program flow. The computer program may be stored in a computer-readable storage medium and executed on a corresponding hardware platform (such as a system, apparatus, device, or component); when executed, it includes one of the steps of the method embodiments or a combination thereof.
Optionally, all or some of the steps of the above embodiments may also be implemented using integrated circuits. The steps may each be fabricated as a separate integrated circuit module, or several of the modules or steps may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The apparatuses, functional modules, and functional units in the above embodiments may be implemented with general-purpose computing devices; they may be concentrated on a single computing device or distributed across a network of multiple computing devices.
When the apparatuses, functional modules, and functional units in the above embodiments are implemented as software functional modules and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The embodiments of the present invention make it possible, when a node fails, to migrate the faulty node's lock information and the corresponding data connections to another, healthy node, thereby ensuring fast recovery of file locks.
Claims (24)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410812351.2 | 2014-12-23 | ||
CN201410812351.2A CN105790985B (en) | 2014-12-23 | 2014-12-23 | Data switching method, first device, second device and system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016101409A1 true WO2016101409A1 (en) | 2016-06-30 |
Family
ID=56149037
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2015/073416 WO2016101409A1 (en) | 2014-12-23 | 2015-02-28 | Data switching method, device and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105790985B (en) |
WO (1) | WO2016101409A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
CN105790985A (en) | 2016-07-20 |
CN105790985B (en) | 2020-06-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 15871487 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 15871487 Country of ref document: EP Kind code of ref document: A1 |