CN114722125B

CN114722125B - Method, device, apparatus and computer-readable medium for database transaction processing

Info

Publication number: CN114722125B
Application number: CN202210373077.8A
Authority: CN
Inventors: 王学伟
Original assignee: Jingdong Technology Information Technology Co Ltd
Current assignee: Xi'an Tongxing Hengyao Information Technology Co.,Ltd.
Priority date: 2022-04-11
Filing date: 2022-04-11
Publication date: 2025-01-07
Anticipated expiration: 2042-04-11
Also published as: CN114722125A

Abstract

The invention discloses a method, an apparatus, a device and a computer-readable medium for database transaction processing, and relates to the field of computer technologies. The method comprises: recording a data file name of a database transaction into a recovery file in a temporary directory of a distributed node; when no recovery file exists in the cloud platform, moving the data file of the database transaction from the temporary directory into a data file directory of the cloud platform, and moving the recovery file from the temporary directory into the data file directory of the cloud platform to commit the database transaction; updating a symbolic link name of the data file in a visibility file based on a transaction sequence number of the database transaction; and moving the recovery file of the cloud platform into a directory of the cloud platform shared storage to determine that the database transaction has been committed successfully. This embodiment can improve the efficiency of database transaction processing in a cloud native scenario.

Description

Method, apparatus, device and computer readable medium for database transaction processing

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a computer readable medium for database transaction processing.

Background

Transactions are an important feature of relational databases. As databases evolve from single-machine, centralized systems to distributed databases, single-machine database transactions become distributed transactions. Distributed transactions are typically implemented using the two-phase commit protocol (2PC). As its name suggests, a two-phase commit is executed in two phases: the voting phase and the commit phase.

In implementing the present invention, the inventors found at least the following problem in the prior art: distributed transactions use the two-phase commit protocol, which is designed for common distributed scenarios. In a cloud native scenario that relies on the underlying services provided by the cloud platform, the efficiency of processing transactions is low.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, a device, and a computer readable medium for database transaction processing, which can improve the efficiency of database transaction processing in a cloud native scenario.

To achieve the above object, according to one aspect of an embodiment of the present invention, there is provided a method of database transaction processing, including:

recording the data file name of the database transaction into a recovery file of a temporary directory of the distributed node;

When the recovery file does not exist in the cloud platform, moving the data file of the database transaction from the temporary directory to the data file directory of the cloud platform, and moving the recovery file of the temporary directory to the data file directory of the cloud platform to submit the database transaction;

updating a symbolic link name of the data file in a visibility file based on the transaction sequence number of the database transaction, wherein the visibility file is used for concurrency control of the distributed nodes;

moving the recovery file of the cloud platform to a directory of the cloud platform shared storage to determine that the database transaction has been committed successfully.

The database transaction includes an update transaction;

Before the data file names of the database transactions are recorded in the recovery file of the temporary directory of the distributed node, the method further comprises the following steps:

creating a temporary data file at the distributed node and executing the update transaction in the temporary data file to update the database transaction.

The database transaction includes a non-update transaction;

Before the data file names of the database transactions are recorded in the recovery file of the temporary directory of the distributed node, the method further comprises the following step:

storing the data file of the database transaction as a data file in the temporary directory.

When a recovery file exists in the cloud platform, the method further comprises the following step:

after deleting the visibility file of the data file in the cloud platform, deleting the newly added file of the data file in the cloud platform, and updating the visibility file of the cloud platform.

The method further comprises the steps of:

Traversing the directory shared and stored by the cloud platform, and determining a cleaning sequence number according to the reference count of the data file, wherein the reference count is used for recording the times of calling the data file;

and deleting the data files with the transaction sequence numbers smaller than or equal to the cleaning sequence numbers.

Updating the symbolic link name of the data file in the visibility file based on the transaction sequence number of the database transaction comprises the following steps:

creating a new symbolic link file and associating it with the data file corresponding to the transaction sequence number of the database transaction;

and, in the visibility file, updating the symbolic link name of the data file with the transaction sequence number of the database transaction.

Before the database transaction is submitted, locking operation is executed;

after the successful commit of the database transaction is determined, an unlocking operation is performed.

According to a second aspect of an embodiment of the present invention, there is provided an apparatus for database transaction processing, including:

the recording module is used for recording the data file names of the database transactions into the recovery file of the temporary directory of the distributed node;

The submitting module is used for moving the data file of the database transaction from the temporary directory to the data file directory of the cloud platform and moving the recovery file of the temporary directory to the data file directory of the cloud platform so as to submit the database transaction when the recovery file does not exist in the cloud platform;

The updating module is used for updating the symbolic link name of the data file in the visibility file based on the transaction serial number of the database transaction, wherein the visibility file is used for concurrency control of the distributed nodes;

And the moving module is used for moving the recovery file of the cloud platform to a directory of the shared storage of the cloud platform so as to determine that the database transaction is successfully submitted.

According to a third aspect of an embodiment of the present invention, there is provided an electronic device for database transaction processing, including:

one or more processors;

storage means for storing one or more programs,

The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the methods as described above.

According to a fourth aspect of embodiments of the present invention, there is provided a computer readable medium having stored thereon a computer program which when executed by a processor implements a method as described above.

One embodiment of the invention has the following advantages or beneficial effects. The data file name of a database transaction is recorded into a recovery file of a temporary directory of a distributed node. When the recovery file does not exist in the cloud platform, the data file of the database transaction is moved from the temporary directory into a data file directory of the cloud platform, and the recovery file of the temporary directory is moved into the data file directory of the cloud platform to commit the database transaction. The symbolic link name of the data file in a visibility file, which is used for concurrency control of the distributed nodes, is updated based on the transaction sequence number of the database transaction. The recovery file of the cloud platform is then moved into a directory of the cloud platform shared storage to determine that the database transaction has been committed successfully. Because the distributed node uploads the data files of the database transaction to the cloud platform to process the database transaction, and other distributed nodes can obtain those data files, the efficiency of database transaction processing in the cloud native scenario can be improved.

Further effects of the above optional implementations are described below in connection with the embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic flow diagram of a method of database transaction processing according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a distributed database architecture according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a framework for committing database transactions according to an embodiment of the invention;

FIG. 4 is a flow diagram of background cleaning according to an embodiment of the invention;

FIG. 5 is a flow chart of updating symbolic link names in a visibility file in accordance with an embodiment of the present invention;

FIG. 6 is a schematic diagram of the main structure of an apparatus for database transaction processing according to an embodiment of the present invention;

FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be applied;

FIG. 8 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention.

Detailed Description

Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

The two-phase commit protocol for a distributed transaction includes a voting phase and a commit phase. In the voting phase, the coordinator sends a request to execute the operation to the participants of the transaction and waits for their responses; each participant executes the corresponding transaction operation and returns the result to the coordinator. The result is either agreement or termination. When all participants have returned their results, the two-phase commit enters the commit phase.

In the commit phase, the coordinator sends a COMMIT or ABORT instruction to all participants based on the results of the voting phase. When all participants of the transaction decide to COMMIT, the coordinator sends a COMMIT request to the participants; after completing the operation and releasing their resources, the participants return a completion message to the coordinator, and the coordinator ends the whole transaction once it has received the completion messages of all participants. Conversely, when a participant decides to ABORT the current transaction, the coordinator sends an ABORT request to the participants of the transaction, and the participants roll back and send a completion message to the coordinator.
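The two phases can be illustrated with a minimal in-memory sketch. The `Participant` class and the `two_phase_commit` function are hypothetical names introduced only for illustration; the sketch assumes the usual 2PC rule that the coordinator commits only if every participant agrees, and it ignores networking, timeouts and failures.

```python
from typing import List


class Participant:
    """Hypothetical participant that votes, then applies the coordinator's decision."""

    def __init__(self, name: str, will_agree: bool) -> None:
        self.name = name
        self.will_agree = will_agree
        self.state = "init"

    def prepare(self) -> bool:
        # Voting phase: execute the operation and report agreement or termination.
        self.state = "prepared" if self.will_agree else "refused"
        return self.will_agree

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> str:
    # Phase 1 (voting): the coordinator waits for every participant's result.
    votes = [p.prepare() for p in participants]
    # Phase 2 (commit): COMMIT only if all agreed, otherwise ABORT everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"
```

Note that every participant stays blocked between `prepare` and the final instruction, which is the source of the drawbacks discussed for the cloud scenario.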

The two-phase commit described above is directed to a distributed scenario of the common MPP architecture. The MPP architecture is to distribute tasks to a plurality of servers and nodes in parallel, and after calculation is completed on each node, the results of the respective parts are summarized together to obtain a final result. Databases employing the MPP architecture are referred to as MPP databases.

For cloud native scenarios, transaction processing with two-phase commit is inefficient. A cloud native scenario is one in which software runs on the basic services provided by a cloud platform, such as virtual machine services, cloud storage services, and cloud file services. It differs from the traditional scenario in which software is deployed on a physical machine and relies on the hardware of that machine. The cloud native scenario is also referred to below as the cloud scenario. A cloud scenario is characterized in that computing nodes can be added easily and the cluster can be large.

In the execution process of the two-stage submission, the following defects exist in the cloud scene:

All nodes participating in the transaction operation are in a blocking state, and each participant cannot perform other task operations in the process of waiting for responses of other participants, so that the performance is low.

The existing two-phase commit introduces the role of a coordinator, which is critical to the whole protocol: once the coordinator has a problem, the second phase cannot run. More seriously, if the coordinator fails in the second phase, the other participants remain in a state of locking the transaction resources and cannot complete the transaction operation.

If a network failure or a failure of some participants occurs in the first phase, the coordinator aborts the transaction, which reduces the probability that the transaction succeeds. This is particularly the case when the cluster is large.

If, in the second phase, a network anomaly occurs or the coordinator fails while it is sending the commit request to the participants, only some of the participants may receive the commit request. Those participants perform the commit operation after receiving the request, but the machines that did not receive the commit request cannot commit the transaction. The data in the distributed system as a whole then becomes inconsistent.

In summary, for cloud scenarios that rely on the underlying services provided by the cloud platform, the efficiency of processing transactions is low.

In order to solve the problem that in a cloud scene, the transaction processing efficiency is low, the following technical scheme in the embodiment of the invention can be adopted.

Referring to FIG. 1, FIG. 1 is a main flow diagram of a method for database transaction processing according to an embodiment of the present invention, in which database transactions are committed to a cloud platform through distributed nodes. As shown in FIG. 1, the method specifically comprises the following steps:

S101, recording the data file name of the database transaction into a recovery file of the temporary directory of the distributed node.

The technical scheme in the embodiment of the invention is mainly applied to a cloud platform that provides a cloud file service.

Referring to FIG. 2, FIG. 2 is a schematic diagram of a distributed database architecture according to an embodiment of the present invention. FIG. 2 includes two clients and multiple computing nodes, where the clients access the cloud file service through server nodes.

The cloud platform provides a cloud file service, which is a form of shared storage. On shared storage, when one computing node writes a file, the other computing nodes can see the file in real time. Because the storage shares files with all computing nodes, it is called shared storage.

A client may communicate with any one of the computing nodes. A computing node is a computing unit of the cloud service, such as a virtual machine or a container. The computing nodes perform computation, access the underlying stored data, and provide services such as queries to the upper-layer clients. A computing node may distribute computing tasks to other computing nodes to accelerate computation. The storage unit is the storage of the cloud service; it is typically a cloud file service or an object storage service.

The scheme in the embodiment of the invention realizes the distributed transaction based on the function provided by the cloud file service. The following describes 4 basic features of the cloud file service:

Feature 1: the cloud file service stores 3 redundant copies, providing very high stability and reliability.

Feature 2: the standard NFSv4.0 and NFSv4.1 protocols are supported and a fully managed service is provided; applications need no modification, and only the standard file system mounting steps are required.

Feature 3: multiple cloud hosts can access the file storage created in the cloud file service through the NFS protocol and read and write files, so that data is shared among multiple computing nodes.

Feature 4: file operations, such as moving a file, creating a file, and renaming a file, are all atomic.

Since shared storage is used, all distributed nodes have the same data, and the transaction does not require two phases to commit. Recording commit information to shared storage at commit of each node ensures that other nodes have consistent data. By utilizing the characteristic that the shared storage file operation has atomicity, all nodes can have the same transaction state, and only one specific file is established to represent the transaction commit when the transaction is submitted, so that all nodes can confirm the transaction commit.

Referring to FIG. 3, FIG. 3 is a schematic diagram of a framework for committing database transactions according to an embodiment of the invention. In shared storage, the framework of database transaction commit involves multiple directories and files.

As can be seen from FIG. 3, a table consists of a plurality of files. The files under the table are described below.

An N suffix indicates that there may be multiple such files or directories. The Lock file is used for locking operations, for example as a spin lock. A lock on the cloud file service is a distributed lock: once one node holds the lock, other nodes cannot acquire it.

The recovery file is used for recovery, similar in function to a redo log and an undo log. A data file is a file that holds the actual data stored by an object. The visibility file is mainly used to implement multi-version concurrency control (MVCC): the creation transaction sequence number and the deletion transaction sequence number of the data file are recorded in the visibility file name, in the format create_drop.

The transaction sequence number file is generated at transaction commit, and its file name is taken from a monotonically increasing sequence (1, 2, 3, ...); successful generation of the file means that the transaction commit succeeded. The reference count file is a hard link to the transaction sequence number file currently being accessed, and is used to confirm whether any query is currently accessing the data files produced by that transaction. Temporary data files are data files generated while a transaction is running; they are not available until commit.

In an embodiment of the invention, in order to process a database transaction on a cloud platform, a data file name of the database transaction is recorded in a recovery file of a temporary directory of a distributed node.

The NodeN_tmp directory in FIG. 3 is the temporary directory of the distributed node, and the recovery file is the recovery file under the NodeN_tmp directory in FIG. 3.

The recovery file of the temporary directory is used for recovery and functions similarly to a redo log and an undo log. Writing files to the temporary directory first serves two purposes.

First, a write operation is not atomic, whereas the rename or move performed after the write has finished is atomic, so the data file cannot be damaged by downtime.

Second, writing the file content takes a long time, which is undesirable during the commit phase, whereas a rename is very fast. In the cloud file service, a move operation is equivalent to a rename operation.
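The write-then-rename pattern described above can be sketched as follows. This is a minimal illustration on a local POSIX file system, not the cloud file service itself; `publish_data_file` and the directory names in the usage are hypothetical.

```python
import os
import tempfile


def publish_data_file(tmp_dir: str, data_dir: str, name: str, payload: bytes) -> str:
    """Write the payload in the temporary directory, then move it into place."""
    tmp_path = os.path.join(tmp_dir, name)
    with open(tmp_path, "wb") as f:
        f.write(payload)      # slow, non-atomic step, done outside the commit path
        f.flush()
        os.fsync(f.fileno())  # make the bytes durable before the move
    final_path = os.path.join(data_dir, name)
    os.rename(tmp_path, final_path)  # fast step: readers see the whole file or nothing
    return final_path


# Usage with throwaway directories standing in for NodeN_tmp and the data directory.
with tempfile.TemporaryDirectory() as root:
    node_tmp = os.path.join(root, "Node1_tmp")
    data_dir = os.path.join(root, "data")
    os.makedirs(node_tmp)
    os.makedirs(data_dir)
    path = publish_data_file(node_tmp, data_dir, "file_a", b"rows")
    published = os.path.exists(path)
    left_in_tmp = os.listdir(node_tmp)
```

The rename is atomic only when both directories are on the same file system, which is assumed here.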

In one embodiment of the invention, the database transaction comprises an update transaction. In order not to affect the read operation, a temporary data file is created at the distributed node and an update transaction is performed in the temporary data file to update the database transaction.

That is, instead of updating the data file in place, the data file to be modified is copied to a new temporary file, and the temporary file is then modified. Reads and writes therefore never touch the same file, and no read-write conflict can occur.
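A minimal sketch of this copy-then-modify approach is shown below. The function name `update_via_copy` and the `modify` callback are assumptions made for illustration; the source does not describe how the modification itself is expressed.

```python
import os
import shutil
import tempfile
from typing import Callable


def update_via_copy(data_path: str, tmp_dir: str,
                    modify: Callable[[bytes], bytes]) -> str:
    """Copy the data file into the temporary directory and modify only the copy."""
    tmp_path = os.path.join(tmp_dir, os.path.basename(data_path))
    shutil.copyfile(data_path, tmp_path)
    with open(tmp_path, "rb") as f:
        content = f.read()
    with open(tmp_path, "wb") as f:
        f.write(modify(content))
    return tmp_path


with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data")
    node_tmp = os.path.join(root, "Node1_tmp")
    os.makedirs(data_dir)
    os.makedirs(node_tmp)
    original = os.path.join(data_dir, "file_a")
    with open(original, "wb") as f:
        f.write(b"v1")
    copy_path = update_via_copy(original, node_tmp, lambda b: b + b"+v2")
    with open(original, "rb") as f:
        original_after = f.read()   # readers still see the old content
    with open(copy_path, "rb") as f:
        copy_after = f.read()       # the update exists only in the temporary copy
```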

In one embodiment of the invention, the database transaction comprises a non-update transaction. In this case, the data file of the database transaction is stored as a data file in the temporary directory of the distributed node.

That is, if the database transaction is not an update transaction, the data file of the database transaction is written directly to a data file in the temporary directory of the distributed node.

In the steps, the data file and the data file name of the database transaction are recorded in the temporary directory of the distributed node, and a foundation is laid for storing data to the cloud platform.

S102, when the recovery file does not exist in the cloud platform, moving the data file from the temporary data directory to the data file directory of the cloud platform, and moving the recovery file of the temporary directory to the data file directory of the cloud platform to submit the database transaction.

To guarantee atomicity of transactions, all rollback or aborted transactions must restore the data that has been modified to a state prior to the start of the transaction. If errors are found to occur in the execution process of the transaction, or constraint is not satisfied, rollback is needed, and all operations are directly rolled back according to the information recorded in the memory. The purpose of a recovery file is to recover data.

In one embodiment of the invention, when no recovery file exists in the cloud platform, the previous database transaction was committed successfully and no recovery is needed. The recovery file of the current commit is moved directly from the temporary directory of the distributed node to the data file directory of the cloud platform. The data files of the database transaction are then moved from the temporary directory of the distributed node into the data file directory of the cloud platform.

In one embodiment of the present invention, when a recovery file exists in the cloud platform, this means that the last database transaction commit fails and a rollback is required. The rollback operation comprises deleting the newly added files of the data files in the cloud platform after deleting the visibility files of the data files in the cloud platform, and updating the visibility files of the cloud platform.

Specifically, the visibility file of the data file of the database transaction is deleted first, and then the newly added file of the data file in the cloud platform is deleted. If the newly added file of the data file does not exist in the cloud platform, no processing is needed. These deletion operations are repeatable: if an exception occurs during this process, the deletion can simply be run again without additional handling.
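The repeatable rollback can be sketched as below. The recovery file format used here (one line per entry, holding a visibility file name and a data file name separated by whitespace) is an assumption made only for this example; the source states only that the recovery file records data file names.

```python
import os
import tempfile


def rollback(recovery_path: str, visibility_dir: str, data_dir: str) -> int:
    """Delete the files listed in a (hypothetically formatted) recovery file."""
    removed = 0
    with open(recovery_path) as f:
        entries = [line.split() for line in f if line.strip()]
    for visibility_name, data_name in entries:
        # Order from the text: visibility file first, then the newly added data file.
        for path in (os.path.join(visibility_dir, visibility_name),
                     os.path.join(data_dir, data_name)):
            try:
                os.unlink(path)
                removed += 1
            except FileNotFoundError:
                pass  # already gone, so running the rollback again is harmless
    return removed


with tempfile.TemporaryDirectory() as root:
    vis_dir = os.path.join(root, "visibility")
    data_dir = os.path.join(root, "data")
    os.makedirs(vis_dir)
    os.makedirs(data_dir)
    open(os.path.join(vis_dir, "5_0"), "w").close()
    open(os.path.join(data_dir, "file_b"), "w").close()
    recovery = os.path.join(data_dir, "recovery")
    with open(recovery, "w") as f:
        f.write("5_0 file_b\n")
    first_run = rollback(recovery, vis_dir, data_dir)
    second_run = rollback(recovery, vis_dir, data_dir)  # repeat after a "crash"
```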

In one embodiment of the present invention, to improve efficiency, the visibility file and the newly added file of the data file need not be deleted immediately; in a specific implementation, the files may be moved to a temporary directory and cleaned up in the background. The specific deletion scheme is as follows.

Referring to fig. 4, fig. 4 is a schematic flow chart of background cleaning according to an embodiment of the invention. The method specifically comprises the following steps:

S401, traversing the directory shared and stored by the cloud platform, determining a cleaning sequence number according to the reference count of the data file, wherein the reference count is used for recording the times of calling the data file.

In the embodiment of the invention, to speed up the processing of database transactions, the deletion operation is performed in the background; that is, S401 and S402 are performed in the background. Specifically, a deleted file is cleaned up in the background if no query is accessing it.

Visibility files whose deletion transaction sequence number is not 0 in the visibility directory of the shared storage in the cloud platform are usually not accessed by any query, so they are cleaned up in the background.

In the embodiment of the invention, when a query obtains a transaction sequence number, a hard link to the transaction sequence number file is established; this hard link is called the reference count. The reference count is placed in the temporary directory of the corresponding distributed node. Each time a data file is created, the reference count for the transaction sequence number is incremented by one. The reference count records the number of times the data file is called: the smaller the reference count, the less frequently the file is used, and the larger the reference count, the more frequently the file is used.
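On a POSIX file system, a reference count of this kind can be read from the link count of the file. The sketch below is illustrative only: the link naming scheme and the helper names `add_reference` and `reference_count` are assumptions, and local directories stand in for the TIDS directory and a node's temporary directory.

```python
import os
import tempfile


def add_reference(tid_file: str, node_tmp_dir: str, query_id: str) -> str:
    """Create a hard link to a transaction sequence number file (hypothetical naming)."""
    link_name = f"ref_{os.path.basename(tid_file)}_{query_id}"
    link_path = os.path.join(node_tmp_dir, link_name)
    os.link(tid_file, link_path)
    return link_path


def reference_count(tid_file: str) -> int:
    # st_nlink counts the file's own name plus every hard link to it.
    return os.stat(tid_file).st_nlink


with tempfile.TemporaryDirectory() as root:
    tids = os.path.join(root, "TIDS")
    node_tmp = os.path.join(root, "Node1_tmp")
    os.makedirs(tids)
    os.makedirs(node_tmp)
    tid_file = os.path.join(tids, "7")
    open(tid_file, "w").close()
    count_before = reference_count(tid_file)
    link = add_reference(tid_file, node_tmp, "q1")
    count_during = reference_count(tid_file)
    os.unlink(link)  # the query has finished
    count_after = reference_count(tid_file)
```

A count of 1 therefore means that no query currently references the transaction sequence number file, and a count greater than 1 means at least one does.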

To clean up files, a cleaning sequence number, also called the cleaning identifier, must be determined. A plurality of transaction sequence number files are stored under the TIDS directory. The directories in the cloud platform, such as the TIDS directory, are traversed, and the maximum transaction sequence number whose reference count is greater than 1 is taken as the cleaning identifier.

Specifically, the TIDS directory in the cloud platform is traversed to find files with a reference count greater than 1. If there is only one transaction sequence number file, no file needs to be cleaned. If the reference count of every transaction sequence number file is 1, the cleaning identifier is the largest of all transaction sequence numbers. If the reference counts are not all 1, the cleaning identifier is the transaction sequence number whose reference count is not 1.

S402, deleting the data file with the transaction sequence number smaller than or equal to the cleaning identification.

All data files in the cloud platform are traversed, and the files whose transaction sequence numbers are smaller than or equal to the cleaning identifier are deleted. Among the visibility files, all files whose transaction sequence numbers are smaller than or equal to the cleaning identifier are deletable.

The files of the cleaning identifier and of earlier transactions can be deleted because no query is accessing them and none will access them later. Finally, all files corresponding to transaction sequence numbers smaller than or equal to the cleaning identifier are deleted.

In an embodiment of the invention, the cleaning operation includes querying files and deleting files. Both the query and the deletion are reentrant: if the target file is not found when it is to be deleted, it is simply ignored.

In the embodiment of fig. 4, to increase the speed of processing database transactions, the operation of deleting files in the cloud platform may be implemented in the background.

S103, based on the transaction serial number of the database transaction, the symbolic link name of the data file in the visibility file is updated, and the visibility file is used for concurrency control of the distributed nodes.

The transaction sequence number of the database transaction is used to identify the database transaction. As one example, the transaction sequence number of a database transaction is the current maximum transaction sequence number of the database transaction plus one. The transaction sequence number of a database transaction may be determined in the following manner.

The transaction sequence number (ID) of the current database transaction is obtained as follows: the TIDS directory is traversed once to obtain the file name with the largest transaction sequence number, and one is added to it to obtain the transaction ID X of the current database transaction.
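Following the rule stated above that the new sequence number is the current maximum plus one, the lookup can be sketched as follows. The helper name `next_transaction_id` is hypothetical, and starting from 1 when the directory is empty is an assumption based on the sequence 1, 2, 3 mentioned earlier.

```python
import os
import tempfile


def next_transaction_id(tids_dir: str) -> int:
    """Return the largest numeric file name in the TIDS directory plus one."""
    ids = [int(name) for name in os.listdir(tids_dir) if name.isdigit()]
    return max(ids, default=0) + 1  # assumption: an empty directory starts at 1


with tempfile.TemporaryDirectory() as tids:
    id_when_empty = next_transaction_id(tids)
    for committed in ("1", "2", "3"):
        open(os.path.join(tids, committed), "w").close()
    id_after_three = next_transaction_id(tids)
```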

Referring to fig. 5, fig. 5 is a schematic flow chart of updating symbolic link names in a visibility file according to an embodiment of the present invention. The method specifically comprises the following steps:

S501, creating a new symbolic link file and associating it with the data file corresponding to the transaction sequence number of the database transaction.

The symbolic link name of the data file in the visibility file in the cloud platform points to the data file of the database transaction. To make it easy to retrieve this data file, the symbolic link names of the data files in the visibility file need to be updated based on the transaction sequence number of the database transaction.

Specifically, a symbolic link file for the transaction sequence number of the database transaction is established in the visibility directory and associated with the data file of the database transaction, and the symbolic link name of the original data file is modified.

As one example, the transaction sequence number of the database transaction is X. A symbolic link file named X_0 is established in the visibility directory and pointed to the data file of the database transaction, and the symbolic link name of the original data file is changed from create_0 to create_X.

S502, in the visibility file, the symbolic link name of the data file is updated with the transaction sequence number of the database transaction.

After the symbolic link name has been updated with the transaction sequence number of the database transaction, the data file can be retrieved through the symbolic link name.

In the embodiment of FIG. 5, the data file can be obtained based on the updated symbolic link name.
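The create_0 to create_X example can be sketched with ordinary POSIX symbolic links. The function name `update_visibility`, its parameters, and the file names in the usage are hypothetical; the sketch also assumes that at most one live link per creation number exists, which the source does not state.

```python
import os
import tempfile


def update_visibility(visibility_dir: str, data_dir: str,
                      old_create: int, new_data_name: str, x: int):
    """Publish a new version under X_0 and retire the old one as create_X."""
    # New version: "X_0" points to the data file written by transaction X.
    new_link = os.path.join(visibility_dir, f"{x}_0")
    os.symlink(os.path.join(data_dir, new_data_name), new_link)
    # Old version: rename "create_0" to "create_X" to record deletion by X.
    old_link = os.path.join(visibility_dir, f"{old_create}_0")
    retired = os.path.join(visibility_dir, f"{old_create}_{x}")
    os.rename(old_link, retired)
    return new_link, retired


with tempfile.TemporaryDirectory() as root:
    vis_dir = os.path.join(root, "visibility")
    data_dir = os.path.join(root, "data")
    os.makedirs(vis_dir)
    os.makedirs(data_dir)
    for name in ("file_a_v1", "file_a_v2"):
        open(os.path.join(data_dir, name), "w").close()
    os.symlink(os.path.join(data_dir, "file_a_v1"), os.path.join(vis_dir, "3_0"))
    new_link, retired = update_visibility(vis_dir, data_dir, 3, "file_a_v2", 5)
    names = sorted(os.listdir(vis_dir))
    new_target = os.path.basename(os.readlink(new_link))
    old_target = os.path.basename(os.readlink(retired))
```

Renaming a symbolic link does not change its target, so the retired name still leads to the old data file for queries that are allowed to read it.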

S104, moving the recovery file of the cloud platform to a directory of the shared storage of the cloud platform to determine that the database transaction is successfully submitted.

To determine that the commit database transaction is successful, the recovery file of the cloud platform needs to be moved to a cloud platform shared storage directory, such as the directory, specifically the TIDS directory. Once successful, this movement operation means that the database transaction commit is successful. When querying the data file, the latest version of the accessed data file is determined by traversing TIDS the directory.
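The final move can be sketched as a single rename. Naming the moved file after the transaction sequence number X is an assumption that follows from the earlier description of the transaction sequence number file; `mark_committed` is a hypothetical helper, and the sketch assumes the caller already holds the lock so that no other node commits the same number.

```python
import os
import tempfile


def mark_committed(recovery_path: str, tids_dir: str, x: int) -> str:
    """Move the recovery file into the TIDS directory under the name X."""
    target = os.path.join(tids_dir, str(x))
    os.rename(recovery_path, target)  # the commit point: one atomic step
    return target


with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data")
    tids = os.path.join(root, "TIDS")
    os.makedirs(data_dir)
    os.makedirs(tids)
    recovery = os.path.join(data_dir, "recovery")
    open(recovery, "w").close()
    marker = mark_committed(recovery, tids, 5)
    committed_ids = os.listdir(tids)
    recovery_still_present = os.path.exists(recovery)
```

Because the recovery file disappears from the data file directory in the same step, its absence at the next commit signals that no rollback is required.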

In the embodiment of the invention, once the data files have been renamed successfully, all of the data is already stored on the cloud file server. The data is therefore durable and does not need to be recovered through a redo log.

It should be noted that, when reading a data file, a cache may be used so that the data file does not have to be read from the cloud file service each time.

In the embodiment of the invention, the disk data is stored on the shared storage, and the data seen by all the distributed nodes is consistent. That is, the data read by each distributed node is identical to ensure data consistency.

When a distributed node reads data, it first obtains a read number from the TIDS directory of the cloud platform, that is, it traverses the TIDS directory and obtains the maximum transaction sequence number. It then fetches the data files that are readable at that maximum transaction sequence number.
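Obtaining the read number might look like the following Python sketch. It assumes that entries in the TIDS directory are named with decimal transaction sequence numbers and that an empty directory yields a read number of 0; both are illustrative assumptions, as is the function name.

```python
import os
import tempfile


def read_number(tids_dir: str) -> int:
    """Return the largest transaction sequence number found in the TIDS directory."""
    # Compare numerically, not lexicographically, so that 10 ranks above 9.
    seqs = [int(name) for name in os.listdir(tids_dir) if name.isdigit()]
    # Assumption: with no committed transaction yet, the read number is 0.
    return max(seqs, default=0)
```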

The judgment rule is: creation version number <= read number < deletion version number of the file.

If database transaction X fails to commit, there is no X file in the TIDS directory, so the read number obtained by a query can be at most X-1. The visibility files written by database transaction X are named X_0 or A_X. Under the above judgment rule, reads are unaffected even if these files are not rolled back, because the query does not read the file corresponding to X_0 and still reads the file corresponding to A_X. The effect is the same as rolling back transaction X.
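The judgment rule and the failed-commit example can be checked with a small Python sketch. It assumes link names of the form `<creation number>_<deletion number>` and treats a deletion number of 0 as "not yet deleted"; that special case is an interpretation of names such as X_0, not something the text states explicitly.

```python
def is_visible(link_name: str, read_no: int) -> bool:
    """Apply: creation number <= read number < deletion number."""
    create_part, delete_part = link_name.split("_")
    create_no, delete_no = int(create_part), int(delete_part)
    # Assumption: deletion number 0 means the version has not been deleted.
    not_deleted_yet = delete_no == 0 or read_no < delete_no
    return create_no <= read_no and not_deleted_yet
```

With X = 7 and A = 3, a failed commit leaves the read number at 6, so `is_visible("7_0", 6)` is false and `is_visible("3_7", 6)` is true. Once X commits and the read number becomes 7, the two results swap.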

That is, even in the case of distributed query, only the X number acquired at the master node needs to be transferred to other distributed nodes for query, and the queried data is consistent.

In one embodiment of the invention, to avoid concurrent operation of multiple distributed nodes, a locking operation may be performed prior to committing a database transaction, and an unlocking operation may be performed after successful commit of the database transaction.

Specifically, the locking operation is performed first, the database transaction is then committed, and the unlocking operation is performed after the database transaction is determined to have been committed successfully. As one example, a distributed lock may be implemented using an external program such as ZooKeeper.

The SQL standard defines four isolation levels: read uncommitted, read committed, repeatable read, and serializable. Each level specifies which changes made inside and outside a transaction are visible and which are not. Lower isolation levels generally support higher concurrency and have lower overhead.

In an embodiment of the invention, the read committed isolation level is supported. A read transaction never conflicts with other transactions and is a completely lock-free read. An insert transaction and an update transaction can conflict only at the commit stage, but the locking period during the commit of a database transaction is very short, so concurrency is very high.

Two update transactions cannot be performed at the same time. Isolation between them is guaranteed by a distributed lock, which is either provided externally or implemented on the shared storage.
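The lock, commit, unlock ordering, with the lock kept on shared storage, could be sketched in Python as follows. The sketch uses exclusive file creation as a stand-in for a distributed lock. It assumes the shared storage honors `O_CREAT | O_EXCL` semantics, and it does not handle stale locks left by a crashed node or retry on contention; the name `commit_lock` and the lock path are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def commit_lock(lock_path: str):
    """Hold an exclusive lock file for the duration of the commit stage."""
    # O_CREAT | O_EXCL fails with FileExistsError if another holder created the file.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        yield
    finally:
        # Unlock: release the descriptor and remove the lock file.
        os.close(fd)
        os.unlink(lock_path)
```

A caller would wrap only the commit steps in `with commit_lock(path): ...`, which keeps the locked period short, as the text requires.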

In the embodiment of the invention, in order to ensure the atomicity of database transactions, every rolled-back or aborted transaction must restore the data it has modified to the state before the transaction began. If an error occurs during the execution of a transaction, or a constraint is not satisfied, a rollback is needed, and all operations are rolled back directly according to the information recorded in memory. The following mainly addresses transaction rollback caused by downtime during the processing of a database transaction.

The main abnormal conditions include the following three:

Abnormal case 1:

The failure occurs between the start of processing the database transaction and the moving of the recovery file in the temporary directory to the recovery file of the cloud platform. At this point, the cloud platform contains no recovery file, and all files that need to be rolled back are in the temporary directory.

The rollback operation is simple: the temporary directory of the corresponding distributed node is emptied. This can be implemented in two parts. The first is the node restart stage: when a distributed node restarts, it empties its own temporary directory. The second is handled by the other distributed nodes in the distributed system: if they detect that a distributed node has failed, they clean up the temporary directory of that node.

Abnormal case 2:

The failure occurs after the recovery file has been moved to the recovery file of the cloud platform and before that recovery file is moved to a file named with the preset identifier. As an example, if the preset identifier is X, the recovery file is to be moved to the X file. At this stage the recovery file exists, and the rollback operation is handed over to the commit process of the next database transaction. If no further database transaction is executed, no rollback is needed, because neither reads nor consistency are affected.

Abnormal case 3:

The recovery file of the cloud platform has been moved to a data file and the transaction has ended. The recovery file no longer exists at this point, so the transaction is considered committed and no rollback is needed.
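The three abnormal cases can be summarized in a minimal Python sketch of what a restarted node might do. The function name, the return values, and the idea of telling case 1 from case 3 by whether the temporary directory still holds files are illustrative assumptions, not details taken from the text.

```python
import os
import shutil
import tempfile


def recover_after_crash(temp_dir: str, cloud_recovery_file: str) -> str:
    """Decide the rollback action from where the recovery file currently is."""
    if os.path.exists(cloud_recovery_file):
        # Case 2: rollback is left to the commit process of the next transaction.
        return "deferred"
    leftovers = os.listdir(temp_dir)
    if leftovers:
        # Case 1: everything to roll back is in the temporary directory; empty it.
        for name in leftovers:
            path = os.path.join(temp_dir, name)
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
        return "temp_emptied"
    # Case 3: no recovery file and nothing left behind; treated as committed.
    return "nothing_to_do"
```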

In this embodiment, the data file name of the database transaction is recorded in a recovery file in the temporary directory of a distributed node. When no recovery file exists in the cloud platform, the data file of the database transaction is moved from the temporary directory to the data file directory of the cloud platform, and the recovery file of the temporary directory is moved to the data file directory of the cloud platform to commit the database transaction. The symbolic link name of the data file in the visibility file, which is used for concurrency control of the distributed nodes, is updated based on the transaction sequence number of the database transaction. The recovery file of the cloud platform is then moved to a directory of the cloud platform shared storage to determine that the database transaction has been committed successfully. Because the distributed node uploads the data files of the database transaction to the cloud platform for processing and other distributed nodes can obtain those data files, the efficiency of database transaction processing in a cloud-native scenario can be improved.

Referring to fig. 6, fig. 6 is a schematic diagram of the main structure of an apparatus for database transaction processing according to an embodiment of the present invention. The apparatus may implement the method of database transaction processing and, as shown in fig. 6, specifically includes:

a recording module 601, configured to record a data file name of a database transaction into a recovery file of a temporary directory of a distributed node;

a commit module 602, configured to, when no recovery file exists in the cloud platform, move a data file of the database transaction from the temporary directory to a data file directory of the cloud platform, and move the recovery file of the temporary directory to the data file directory of the cloud platform to commit the database transaction;

an updating module 603, configured to update, based on a transaction sequence number of the database transaction, a symbolic link name of the data file in a visibility file, where the visibility file is used for concurrency control of distributed nodes;

and a moving module 604, configured to move the recovery file of the cloud platform to a directory of the shared storage of the cloud platform, so as to determine that the database transaction is successfully committed.
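The work of the recording module and the commit module listed above can be sketched in Python as two functions. The sketch assumes the temporary directory and the cloud data file directory sit on the same mounted file system, so that `os.rename` can move files between them, and it uses the hypothetical file name `recovery` and hypothetical function names throughout.

```python
import os
import tempfile


def record_data_files(temp_dir: str, data_file_names: list) -> str:
    """Recording step: write the data file names into the node's recovery file."""
    recovery = os.path.join(temp_dir, "recovery")
    with open(recovery, "w") as f:
        f.write("\n".join(data_file_names))
    return recovery


def submit(temp_dir: str, cloud_data_dir: str) -> bool:
    """Commit step: move data files, then the recovery file, to the cloud directory."""
    cloud_recovery = os.path.join(cloud_data_dir, "recovery")
    if os.path.exists(cloud_recovery):
        # A recovery file already exists in the cloud platform; do not proceed here.
        return False
    with open(os.path.join(temp_dir, "recovery")) as f:
        names = f.read().splitlines()
    for name in names:
        os.rename(os.path.join(temp_dir, name), os.path.join(cloud_data_dir, name))
    # The recovery file moves last, after every data file it lists.
    os.rename(os.path.join(temp_dir, "recovery"), cloud_recovery)
    return True
```

The updating and moving steps that follow are not repeated in this sketch.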

In one embodiment of the invention, the database transaction comprises an update transaction;

The recording module 601 is further configured to create a temporary data file at a distributed node, and execute the update transaction in the temporary data file to update the database transaction.

In one embodiment of the invention, the database transaction comprises a non-update transaction;

The recording module 601 is further configured to store the data file of the database transaction in the data file in the temporary directory.

In one embodiment of the invention, in the case that a recovery file exists in the cloud platform,

the commit module 602 is further configured to delete the visibility file of the data file in the cloud platform, delete the newly added file of the data file in the cloud platform, and update the visibility file of the cloud platform.

In one embodiment of the present invention, the updating module 603 is further configured to traverse the directory of the cloud platform shared storage and determine a cleaning sequence number according to the reference count of the data file, where the reference count records the number of times the data file is called;

and to delete the data files whose transaction sequence numbers are less than or equal to the cleaning sequence number.

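In one embodiment of the present invention, the updating module 603 is further configured to associate the newly created symbolic link file in the data file with the data file corresponding to the transaction sequence number of the database transaction, and to update, in the visibility file, the symbolic link name of the data file with the transaction sequence number of the database transaction.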

In one embodiment of the invention, the commit module 602 is further configured to perform a locking operation before committing the database transaction.

The moving module 604 is further configured to perform an unlocking operation after the database transaction is determined to have been committed successfully.

Fig. 7 illustrates an exemplary system architecture 700 of a database transaction method or database transaction device to which embodiments of the present invention may be applied.

As shown in fig. 7, a system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 is the medium used to provide communication links between the terminal devices 701, 702, 703 and the server 705. The network 704 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

A user may interact with the server 705 via the network 704 using the terminal devices 701, 702, 703 to receive or send messages or the like. Various communication client applications such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 701, 702, 703.

The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 705 may be a server providing various services, such as a background management server (by way of example only) providing support for shopping-type websites browsed by users using the terminal devices 701, 702, 703. The background management server may analyze and otherwise process received data such as a product information query request, and feed the processing result (for example, target push information or product information, by way of example only) back to the terminal device.

It should be noted that, the method for processing database transaction provided in the embodiment of the present invention is generally executed by the server 705, and accordingly, the device for processing database transaction is generally disposed in the server 705.

It should be understood that the number of terminal devices, networks and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

Referring now to FIG. 8, there is illustrated a schematic diagram of a computer system 800 suitable for use in implementing an embodiment of the present invention. The terminal device shown in fig. 8 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiment of the present invention.

As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data required for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

Connected to the I/O interface 805 are an input section 806 including a keyboard, a mouse, and the like, an output section 807 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker, and the like, a storage section 808 including a hard disk, and the like, and a communication section 809 including a network interface card such as a LAN card, a modem, and the like. The communication section 809 performs communication processing via a network such as the internet. The drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as needed so that a computer program read out therefrom is mounted into the storage section 808 as needed.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable media 811. The above-described functions defined in the system of the present invention are performed when the computer program is executed by a Central Processing Unit (CPU) 801.

The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of a computer-readable storage medium may include, but are not limited to, an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules involved in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including a recording module, a commit module, an updating module, and a moving module. In some cases the names of these modules do not limit the modules themselves; for example, the recording module may also be described as "a module for recording the data file name of a database transaction into a recovery file of a temporary directory of a distributed node".

As a further aspect, the invention also provides a computer readable medium, which may be included in the device described in the above embodiments or may exist alone without being fitted into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the method of database transaction processing described in the above embodiments.

According to the technical solution of the embodiments of the invention, the data file name of the database transaction is recorded in a recovery file in the temporary directory of a distributed node. When no recovery file exists in the cloud platform, the data file of the database transaction is moved from the temporary directory to the data file directory of the cloud platform, and the recovery file of the temporary directory is moved to the data file directory of the cloud platform to commit the database transaction. The symbolic link name of the data file in the visibility file, which is used for concurrency control of the distributed nodes, is updated based on the transaction sequence number of the database transaction. The recovery file of the cloud platform is then moved to a directory of the cloud platform shared storage to determine that the database transaction has been committed successfully. Because the distributed node uploads the data files of the database transaction to the cloud platform for processing and other distributed nodes can obtain those data files, the efficiency of database transaction processing in a cloud-native scenario can be improved.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. A method for database transaction processing, comprising:

Record the data file name of the database transaction in the recovery file of the temporary directory of the distributed node;

When the recovery file does not exist in the cloud platform, the data file of the database transaction is moved from the temporary directory to the data file directory of the cloud platform, and the recovery file of the temporary directory is moved to the data file directory of the cloud platform to submit the database transaction;

Based on the transaction sequence number of the database transaction, updating the symbolic link name of the data file in the visibility file, the visibility file being used for concurrency control of distributed nodes;

Moving the recovery file of the cloud platform to a directory of the shared storage of the cloud platform to confirm successful submission of the database transaction;

The updating of the symbolic link name of the data file in the visibility file based on the transaction sequence number of the database transaction includes:

Associating the newly created symbolic link file in the data file with the data file corresponding to the transaction sequence number of the database transaction;

In the visibility file, the newly created symbolic link name of the data file is updated with the transaction sequence number of the database transaction.

2. The method for database transaction processing according to claim 1, wherein the database transaction comprises an update transaction;

Before recording the data file name of the database transaction into the recovery file of the temporary directory of the distributed node, the method further includes:

A temporary data file is newly created in the distributed node, and the update transaction is executed in the temporary data file to update the database transaction.

3. The method for database transaction processing according to claim 1, wherein the database transaction comprises a non-update transaction;

Before recording the data file name of the database transaction into the recovery file of the temporary directory of the distributed node, the method further includes:

The data file of the database transaction is stored in the data file in the temporary directory.

4. The method for database transaction processing according to claim 1, characterized in that when there is a recovery file in the cloud platform;

The method further comprises:

After deleting the visibility file of the data file in the cloud platform, the newly added file of the data file in the cloud platform is deleted, and the visibility file of the cloud platform is updated.

5. The method for database transaction processing according to claim 1, characterized in that the method further comprises:

Traverse the directory of the shared storage of the cloud platform and determine the cleaning sequence number according to the reference count of the data file, where the reference count is used to record the number of times the data file is called;

Delete data files whose transaction sequence numbers are less than or equal to the cleanup sequence number.

6. The method for database transaction processing according to claim 1, characterized in that before submitting the database transaction, it also includes: performing a locking operation;

After determining that the database transaction is successfully submitted, the method further includes: performing an unlocking operation.

7. A device for database transaction processing, comprising:

A recording module is used to record the data file name of the database transaction into the recovery file of the temporary directory of the distributed node;

A submission module, configured to move the data file of the database transaction from the temporary directory to the data file directory of the cloud platform, and to move the recovery file of the temporary directory to the data file directory of the cloud platform, so as to submit the database transaction when the recovery file does not exist in the cloud platform;

An update module, configured to update the symbolic link name of the data file in the visibility file based on the transaction sequence number of the database transaction, the visibility file being used for concurrent control of distributed nodes; and to associate the newly created symbolic link file in the data file with the data file corresponding to the transaction sequence number of the database transaction; in the visibility file, update the newly created symbolic link name of the data file with the transaction sequence number of the database transaction;

The moving module is used to move the recovery file of the cloud platform to a directory of the shared storage of the cloud platform to determine that the database transaction is successfully submitted.

8. An electronic device for database transaction processing, comprising:

one or more processors;

a storage device for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 6.

9. A computer-readable medium having a computer program stored thereon, wherein when the program is executed by a processor, the method according to any one of claims 1 to 6 is implemented.