
CN115017180B - Database access method, device, electronic device and readable storage medium - Google Patents

Info

Publication number
CN115017180B
Authority
CN
China
Prior art keywords
transaction
transactions
data
specified
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210674339.4A
Other languages
Chinese (zh)
Other versions
CN115017180A (en
Inventor
马占峰
陈默
杨新军
黄贵
李飞飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Cloud Computing Ltd
Original Assignee
Alibaba Cloud Computing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Cloud Computing Ltd
Priority to CN202210674339.4A
Publication of CN115017180A
Application granted
Publication of CN115017180B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to the technical field of databases, and in particular to a database access method, a database access apparatus, an electronic device, and a readable storage medium. The database access method comprises: opening a transaction group according to a single database access statement to a specified data table, the transaction group comprising a plurality of transactions; determining a data set in the specified data table that is visible to the transactions, wherein the data set is the same for all transactions in the transaction group; and executing access operations on the data set through the plurality of transactions. The method and apparatus address the low efficiency of executing a single complex database access statement.

Description

Database access method, device, electronic equipment and readable storage medium
Technical Field
The disclosure relates to the technical field of databases, and in particular relates to a database access method, a database access device, electronic equipment and a readable storage medium.
Background
A distributed database includes a CN (Compute Node, responsible for parsing and optimizing database access statements and for generating execution plans) cluster and a DN (Data Node) cluster. To improve the performance of the distributed database when processing a single complex, time-consuming database access statement, if the target data tables are located on the same target DN, the distributed database can split the statement into finer-grained subtasks and execute the transactions corresponding to those subtasks in parallel over a plurality of database connections to the target DN. By default, however, transactions on different database connections are independent of each other, so a logically globally consistent read cannot be guaranteed, which can lead to seriously incorrect execution results.
Disclosure of Invention
To solve the problems in the related art, embodiments of the present disclosure provide a database access method, apparatus, electronic device, and readable storage medium.
In a first aspect, an embodiment of the present disclosure provides a database access method, the method including:
opening a transaction group according to a single database access statement to a specified data table, the transaction group comprising a plurality of transactions;
determining a set of data in the specified data table that is visible to the transactions, wherein the set of data is the same for all transactions in the transaction group; and
executing an access operation on the data set through the plurality of transactions.
According to an embodiment of the present disclosure, wherein:
the specified data table comprises one or more data tables on a specified data node;
the opening of the transaction group according to the single database access statement to the specified data table comprises grouping, into one transaction group, a plurality of transactions corresponding to a plurality of subtasks obtained by splitting the single database access statement.
According to an embodiment of the disclosure, different transactions in the transaction group correspond to different database sessions of a designated computing node to the designated data node.
According to an embodiment of the present disclosure, wherein:
the transaction group comprises a master transaction and a plurality of slave transactions, wherein the master transaction is opened before the slave transactions;
the data set visible to the transactions in the specified data table comprises data created by the master transaction and data created by any transaction that was opened earlier than the master transaction and does not belong to the corresponding active transaction set of the transactions, wherein the corresponding active transaction set of the transactions is the set of transactions active on the specified data node when the master transaction is opened.
According to an embodiment of the present disclosure, wherein:
The master transaction has read and write rights to the specified data table, and the slave transaction has only read rights to the specified data table.
According to an embodiment of the present disclosure, the method further comprises:
assigning a transaction group identifier and a transaction identifier to the transaction when the transaction group is opened;
searching, among the active transactions opened on the specified data node, for a transaction whose transaction group identifier matches that of a specified transaction in the transaction group;
if no active transaction opened on the specified data node has a transaction group identifier matching that of the specified transaction, determining the specified transaction to be a master transaction, determining the identifier of the specified transaction to be the corresponding master transaction identifier of the specified transaction, and determining the set of transactions active on the specified data node when the specified transaction is opened to be the corresponding active transaction set of the specified transaction.
According to an embodiment of the present disclosure, wherein:
The transaction group identifier includes a specified field in a transaction name of the transaction;
the assigning of a transaction group identifier to the transaction includes assigning a value to the specified field when the transaction is opened.
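As a minimal sketch of carrying the transaction group identifier in a field of the transaction name: the `base;key=value` name layout, the field name `grp`, and the helper names below are assumptions made purely for illustration and are not specified by the disclosure.

```python
from typing import Optional

GROUP_FIELD = "grp"  # hypothetical name of the specified field


def make_trx_name(base: str, group_id: str) -> str:
    """Build a transaction name whose specified field carries the group id."""
    return f"{base};{GROUP_FIELD}={group_id}"


def group_id_of(trx_name: str) -> Optional[str]:
    """Return the group identifier stored in the name, or None if absent."""
    for part in trx_name.split(";"):
        key, sep, value = part.partition("=")
        if sep and key == GROUP_FIELD:
            return value
    return None


def same_group(name_a: str, name_b: str) -> bool:
    """Two transactions match when both names carry the same group id."""
    gid = group_id_of(name_a)
    return gid is not None and gid == group_id_of(name_b)
```

A transaction whose name carries no such field is treated here as belonging to no group, so it never matches another transaction.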
According to an embodiment of the present disclosure, the method further comprises:
if an active transaction opened on the specified data node has a transaction group identifier matching that of the specified transaction, determining the specified transaction to be a slave transaction, determining the corresponding master transaction identifier of the matching transaction to be the corresponding master transaction identifier of the specified transaction, and taking the corresponding active transaction set of the matching transaction as the corresponding active transaction set of the specified transaction.
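The master/slave determination described above can be sketched as follows. This is an illustrative toy model, not the disclosed implementation: the `Trx` and `DataNode` classes, the dictionary standing in for the active transaction list, and the string group identifiers are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Trx:
    trx_id: int
    group_id: str
    master_id: int                 # corresponding master transaction identifier
    active_set: FrozenSet[int]     # corresponding active transaction set
    is_master: bool


class DataNode:
    """Toy model of one data node; all names here are illustrative."""

    def __init__(self, next_trx_id: int = 1) -> None:
        self.next_trx_id = next_trx_id
        self.active: Dict[int, Trx] = {}   # active transactions by ID

    def open_in_group(self, group_id: str) -> Trx:
        # Look for an active transaction with a matching group identifier.
        peer: Optional[Trx] = next(
            (t for t in self.active.values() if t.group_id == group_id), None
        )
        trx_id = self.next_trx_id
        self.next_trx_id += 1
        if peer is None:
            # No match: this transaction becomes the master and snapshots
            # the set of transactions active at the moment it is opened.
            trx = Trx(trx_id, group_id, trx_id, frozenset(self.active), True)
        else:
            # Match: this is a slave; it inherits the master identifier and
            # the active transaction set from the matching transaction.
            trx = Trx(trx_id, group_id, peer.master_id, peer.active_set, False)
        self.active[trx_id] = trx
        return trx
```

Because a slave copies both values from a transaction already in its group, every member of the group ends up with the same master identifier and the same active transaction set, regardless of when it was opened.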
According to an embodiment of the disclosure, the performing, by the plurality of transactions, of an access operation on the data set includes, for any one of the plurality of transactions:
determining the transaction that created the specified data to be accessed by that transaction;
the specified data is visible to that transaction if the transaction that created the specified data was opened before the corresponding master transaction of that transaction and does not belong to the corresponding active transaction set of that transaction, or if the transaction that created the specified data is the corresponding master transaction of that transaction; otherwise, the specified data is not visible to that transaction.
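The group visibility rule can be written as a single predicate. The sketch below assumes, as the detailed description states for a single data node, that a smaller transaction ID means an earlier open time; the function and parameter names are illustrative.

```python
from typing import FrozenSet


def visible_to_group_member(
    creator_id: int, master_id: int, active_set: FrozenSet[int]
) -> bool:
    """Visibility for any transaction in a group (illustrative sketch)."""
    if creator_id == master_id:
        return True  # data created by the group's master transaction
    # Assumption: a smaller ID means the creator was opened earlier.
    opened_before_master = creator_id < master_id
    return opened_before_master and creator_id not in active_set
```

The predicate depends only on the master identifier and the shared active transaction set, never on the ID of the transaction doing the reading, which is why every member of the group reaches the same answer for the same data.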
In a second aspect, an embodiment of the present disclosure provides a database access apparatus, including:
An opening module configured to open a transaction group according to a single database access statement to a specified data table, the transaction group comprising a plurality of transactions;
a first determination module configured to determine a set of data in the specified data table that is visible to the transaction, wherein the set of data is the same for all transactions in the transaction group;
an execution module configured to execute an access operation on the data set through the plurality of transactions.
According to an embodiment of the present disclosure, wherein:
the specified data table comprises one or more data tables on a specified data node;
the opening of the transaction group according to the single database access statement to the specified data table comprises grouping, into one transaction group, a plurality of transactions corresponding to a plurality of subtasks obtained by splitting the single database access statement.
According to an embodiment of the disclosure, different transactions in the transaction group correspond to different database sessions of a designated computing node to the designated data node.
According to an embodiment of the present disclosure, wherein:
the transaction group comprises a master transaction and a plurality of slave transactions, wherein the master transaction is opened before the slave transactions;
the data set visible to the transactions in the specified data table comprises data created by the master transaction and data created by any transaction that was opened earlier than the master transaction and does not belong to the corresponding active transaction set of the transactions, wherein the corresponding active transaction set of the transactions is the set of transactions active on the specified data node when the master transaction is opened.
According to an embodiment of the present disclosure, wherein:
The master transaction has read and write rights to the specified data table, and the slave transaction has only read rights to the specified data table.
According to an embodiment of the present disclosure, the apparatus further comprises:
an allocation module configured to allocate a transaction group identifier and a transaction identifier to the transaction when the transaction group is opened;
a lookup module configured to search, among the active transactions opened on the specified data node, for a transaction whose transaction group identifier matches that of a specified transaction in the transaction group;
a second determining module configured to, if no active transaction opened on the specified data node has a transaction group identifier matching that of the specified transaction, determine the specified transaction to be a master transaction, determine the identifier of the specified transaction to be the corresponding master transaction identifier of the specified transaction, and determine the set of transactions active on the specified data node when the specified transaction is opened to be the corresponding active transaction set of the specified transaction.
According to an embodiment of the present disclosure, wherein:
The transaction group identifier includes a specified field in a transaction name of the transaction;
the assigning of a transaction group identifier to the transaction includes assigning a value to the specified field when the transaction is opened.
According to an embodiment of the present disclosure, the apparatus further comprises:
a third determining module configured to, if an active transaction opened on the specified data node has a transaction group identifier matching that of the specified transaction, determine the specified transaction to be a slave transaction, determine the corresponding master transaction identifier of the matching transaction to be the corresponding master transaction identifier of the specified transaction, and take the corresponding active transaction set of the matching transaction as the corresponding active transaction set of the specified transaction.
According to an embodiment of the disclosure, the performing, by the plurality of transactions, of an access operation on the data set includes, for any one of the plurality of transactions:
determining the transaction that created the specified data to be accessed by that transaction;
the specified data is visible to that transaction if the transaction that created the specified data was opened before the corresponding master transaction of that transaction and does not belong to the corresponding active transaction set of that transaction, or if the transaction that created the specified data is the corresponding master transaction of that transaction; otherwise, the specified data is not visible to that transaction.
In a third aspect, embodiments of the present disclosure provide a distributed database comprising a cluster of computing nodes and a cluster of data nodes, wherein:
The computing node obtains a single database access statement for a specified data table;
a designated data node in the data node cluster that stores the specified data table opens a transaction group according to the single database access statement, the transaction group comprising a plurality of transactions; determines a data set in the specified data table that is visible to the transactions, wherein the data set is the same for all transactions in the transaction group; and executes an access operation on the data set through the plurality of transactions.
In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including a memory and a processor, wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the method of any of the first aspects.
In a fifth aspect, embodiments of the present disclosure provide a computer-readable storage medium having stored thereon computer instructions which, when executed by a processor, implement a method according to the first aspect.
In a sixth aspect, embodiments of the present disclosure provide a computer program product comprising computer instructions which, when executed by a processor, implement a method as in any of the first aspects.
According to the technical solution provided by the embodiments of the present disclosure, the group of transactions corresponding to the subtasks obtained by splitting a single database access statement is logically placed in one transaction group, and the data accessible to every transaction in the group is identical. This ensures that the transactions in a transaction group read logically globally consistent data and achieve the same result as executing the task in a single transaction.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments, taken in conjunction with the accompanying drawings. In the drawings:
Fig. 1 illustrates a schematic structure of a distributed database according to an embodiment of the present disclosure.
FIG. 2 illustrates a schematic diagram of splitting execution of a single database access statement into multiple subtasks, according to an embodiment of the present disclosure.
FIG. 3 illustrates a transaction ID schematic diagram that is globally incremented in accordance with an embodiment of the present disclosure.
Fig. 4 shows a flowchart of a database access method according to an embodiment of the present disclosure.
Fig. 5 shows a block diagram of a database access apparatus according to an embodiment of the present disclosure.
Fig. 6 shows a block diagram of an electronic device according to an embodiment of the disclosure.
Fig. 7 shows a schematic diagram of a computer system suitable for use in implementing methods according to embodiments of the present disclosure.
Detailed Description
Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. In addition, for the sake of clarity, portions irrelevant to description of the exemplary embodiments are omitted in the drawings.
In this disclosure, it should be understood that terms such as "comprises" or "comprising," etc., are intended to indicate the presence of features, numbers, steps, acts, components, portions, or combinations thereof disclosed in this specification, and are not intended to exclude the possibility that one or more other features, numbers, steps, acts, components, portions, or combinations thereof are present or added.
In addition, it should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates a schematic structure of a distributed database according to an embodiment of the present disclosure.
As shown in fig. 1, the distributed database includes a DN cluster, a CN cluster, and a GMS (Global Metadata Service) cluster.
The DN is the storage engine used by the distributed database, and each DN may be a cluster including one master node (leader) and two slave nodes (slaves). The DN uses an MVCC (Multi-Version Concurrency Control) mechanism to control concurrency: operations such as inserting, modifying, and deleting data create new versions of the data, and when another transaction later reads the data, the DN determines which versions are visible based on when the current transaction was opened and on the current status of all committed and active transactions. This process is called "visibility determination".
The CN is the database access engine used by the distributed database. The CN has two processing modes for a database access statement sent by a client. In the first, the CN itself parses and optimizes the statement, generates an execution plan, and sends the plan to the DN for execution. In the second, the CN forwards the statement to the DN, which parses and optimizes it, generates an execution plan, and executes the plan locally. The database access statement may be, for example, an SQL (Structured Query Language) statement. When executing a database access statement, the CN creates and maintains a database session to the DN where the target data is located; the target DN executes the execution plan corresponding to the statement and returns the execution result to the CN through the database session for further processing; finally, the processing result is sent to the client.
The DN opens a transaction when executing the execution plan corresponding to a database access statement. Transactions can be divided into general transactions, used when the accessed data is on a single DN, and distributed transactions, used when the accessed data is on different DNs of the distributed system.
In the DN, a transaction is bound to a database session, so a transaction can execute in only one session. If a transaction contains a single database access statement that is complex and long-running (e.g., a query on a large table), the statement can only be executed serially, with poor performance. Typical scenarios include union (merge) and insert.
To improve the performance of the distributed database system, when the CN processes such a single database access statement and finds that the plurality of data tables used by the statement are located on the same DN, the CN splits the execution of the statement into a plurality of subtasks and executes the transactions corresponding to the subtasks using a plurality of database sessions to the target DN, so that the statement is executed in parallel, as shown in fig. 2.
FIG. 2 illustrates a schematic diagram of splitting execution of a single database access statement into multiple subtasks, according to an embodiment of the present disclosure.
As shown in fig. 2, when the CN finds that the plurality of data tables used by a single database access statement are located on the same DN, it splits the execution of the statement into a plurality of subtasks and executes the transactions corresponding to the subtasks using a plurality of database sessions to the target DN, so that the statement is executed in parallel. Fig. 2 shows the case where a database access statement uses two data tables T1 and T2 and is split into two subtasks (corresponding to transactions TRX1 and TRX2, respectively), but it will be appreciated that a database access statement may use more data tables and be split into more subtasks accordingly.
The DN needs to open a transaction for the execution through each database session and determines, through visibility determination, which data is visible to the current transaction. If a single database session is simply replaced by multiple database sessions, then, because transactions opened on different database sessions are independent of each other, modifications made by other concurrent transactions may be visible to the transactions opened on some database sessions but not to those opened on others. Logically globally consistent data therefore cannot be read, and an erroneous execution result is ultimately obtained.
The DN maintains a global transaction ID that increases monotonically within a single DN. A newly opened transaction uses the currently available global transaction ID as its transaction ID, and the global transaction ID is incremented by 1 after it has been used. The smaller the transaction ID, the earlier the transaction was opened. The DN also maintains a global active transaction linked list for quickly obtaining information on all active transactions; when a transaction commits or rolls back, it is removed from the linked list.
When a new transaction is opened, the DN traverses the global active transaction linked list to obtain the set of IDs of all currently active transactions. The minimum transaction ID in the set is denoted trx_id_min, and the currently available global transaction ID is denoted trx_id_max. The structure formed by this information is called the read view.
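The ID allocation and read view construction just described can be sketched with a toy model. The `ReadView` and `DataNode` classes, the Python list standing in for the active transaction linked list, and the choice to snapshot the active set before the new transaction joins it are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class ReadView:
    """Snapshot taken when a transaction is opened (names are illustrative)."""
    active_ids: FrozenSet[int]      # IDs of transactions active at open time
    trx_id_min: Optional[int]       # smallest active ID, None if none active
    trx_id_max: int                 # currently available global transaction ID


class DataNode:
    """Toy single-DN transaction manager; not a real storage engine."""

    def __init__(self, next_trx_id: int = 1) -> None:
        self.next_trx_id = next_trx_id   # incremented within this DN only
        self.active: List[int] = []      # stands in for the active linked list

    def open_transaction(self) -> Tuple[int, ReadView]:
        # Assumption: snapshot the active set before the new transaction joins.
        active_ids = frozenset(self.active)
        view = ReadView(
            active_ids=active_ids,
            trx_id_min=min(active_ids) if active_ids else None,
            trx_id_max=self.next_trx_id,
        )
        trx_id = self.next_trx_id
        self.next_trx_id += 1            # increment after the ID is used
        self.active.append(trx_id)
        return trx_id, view

    def finish(self, trx_id: int) -> None:
        # Commit or rollback removes the transaction from the active list.
        self.active.remove(trx_id)
```

Opening transactions 1001 to 1004, finishing 1002, and then opening one more transaction yields ID 1005 with an active set of 1001, 1003, and 1004.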
FIG. 3 illustrates a transaction ID schematic diagram that is globally incremented in accordance with an embodiment of the present disclosure.
As shown in fig. 3, assume that when the current transaction is opened, the currently available global transaction ID is 1005. The DN assigns global transaction ID 1005 to the newly opened current transaction and traverses the global active transaction linked list, determining that the currently active transactions are trx1 (ID 1001), trx3 (ID 1003), and trx4 (ID 1004), while the transactions preceding trx1 (ID 1001) and trx2 (ID 1002) are committed transactions. The read view of the newly opened current transaction therefore contains the following information:
1. The active transaction ID set: (1001, 1003, 1004).
2. trx_id_min, the minimum value in the active transaction ID set: 1001 in this example.
3. trx_id_max, the ID assigned to the current transaction: 1005 in this example.
When the newly opened current transaction reads a data row, the ID of the transaction that created the data (by inserting or modifying it) is obtained from the data row and denoted trx_id. Visibility is then judged against the read view of the current transaction according to the following rules:
1. If trx_id is less than trx_id_min, the minimum of the active transaction ID set (e.g., any ID less than 1001), the creating transaction had already committed when the current transaction was opened, and the data it created is visible to the current transaction.
2. If trx_id is greater than trx_id_max (e.g., any ID greater than 1005), the creating transaction was opened later than the current transaction, and the data it created is not visible to the current transaction.
3. If trx_id is in the active transaction ID set (e.g., ID 1003), the creating transaction was already open and still active when the current transaction was opened, and the data it created is not visible to the current transaction.
4. If trx_id is not in the active transaction ID set (e.g., ID 1002), the creating transaction had already committed when the current transaction was opened, and the data it created is visible to the current transaction.
5. If the active transaction ID set is empty, no other transaction was active when the current transaction was opened; that is, the transaction that created the data had committed by then, and the data it created is visible to the current transaction.
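The five rules can be condensed into one function. This is a sketch under one stated assumption: the rules are checked in the order listed, so rule 2 (a creator opened later than the current transaction) takes precedence over rules 4 and 5. The function name and signature are illustrative.

```python
from typing import FrozenSet


def is_visible(trx_id: int, active_ids: FrozenSet[int], trx_id_max: int) -> bool:
    """Decide whether data created by trx_id is visible under a read view."""
    if active_ids and trx_id < min(active_ids):
        return True    # rule 1: creator committed before the view was taken
    if trx_id > trx_id_max:
        return False   # rule 2: creator was opened after the current transaction
    if trx_id in active_ids:
        return False   # rule 3: creator was still active when the view was taken
    return True        # rules 4 and 5: creator is not among the active IDs


# Read view from the fig. 3 example.
view_active = frozenset({1001, 1003, 1004})
view_max = 1005
checks = {tid: is_visible(tid, view_active, view_max)
          for tid in (1000, 1002, 1003, 1006)}
```

With the fig. 3 read view, data created by transactions 1000 and 1002 is visible, while data created by 1003 (still active) and 1006 (opened later) is not.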
The state of globally active transactions changes dynamically over time (e.g., new transactions open, and transactions commit or roll back). If different transactions are opened at different points in time using different database sessions, they will have different transaction IDs and different active transaction ID sets. As a result, data created or modified by some concurrent transactions may be visible to the transactions opened on some database sessions but not to those opened on others.
Thus, if the subtasks obtained when the CN splits a single database access statement are simply executed in transactions opened on different database sessions, the subtasks run as independent, unrelated transactions, a globally consistent read cannot be achieved, and logic errors result. To avoid such errors, logical read consistency must be guaranteed when executing these subtasks; that is, any given data row must be either visible to the transactions of all the subtasks or invisible to all of them.
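The inconsistency can be reproduced with a small worked example. The timeline and IDs below are hypothetical, and `is_visible` is a condensed form of the per-transaction visibility rules given earlier.

```python
from typing import FrozenSet


def is_visible(trx_id: int, active_ids: FrozenSet[int], trx_id_max: int) -> bool:
    """Per-transaction read-view check, condensed from the visibility rules."""
    if trx_id > trx_id_max:
        return False
    return trx_id not in active_ids


# Hypothetical timeline on one DN: trx 1003 is active when session A opens
# its transaction (ID 1005), then commits before session B opens its own
# transaction (ID 1006).
view_a = (frozenset({1003}), 1005)         # (active ID set, trx_id_max)
view_b = (frozenset({1005}), 1006)         # 1003 has committed; 1005 is active

row_creator = 1003                          # the row was written by trx 1003
seen_by_a = is_visible(row_creator, *view_a)
seen_by_b = is_visible(row_creator, *view_b)
```

Here the same row is invisible to the subtask on session A but visible to the subtask on session B, which is exactly the situation that logical read consistency rules out.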
To solve this problem, embodiments of the present disclosure logically place the set of transactions corresponding to the subtasks obtained by splitting a single database access statement into one transaction group, and provide that all transactions within a transaction group can access the same data. This ensures that the transactions within a transaction group read logically globally consistent data and achieve the same result as executing the task in a single transaction.
Fig. 4 shows a flowchart of a database access method according to an embodiment of the present disclosure. As shown in fig. 4, the database access method includes the following steps S401 to S403:
In step S401, a transaction group is opened according to a single database access statement to a specified data table, the transaction group comprising a plurality of transactions;
In step S402, determining a set of data visible to the transaction in the specified data table, wherein the set of data is the same for all transactions in the transaction group;
In step S403, an access operation is performed on the data set by the plurality of transactions.
According to an embodiment of the disclosure, the specified data tables include one or more data tables on the specified data node. According to embodiments of the present disclosure, different transactions in a transaction group may access the same specified data table, as well as different specified data tables. Since the data visible to the transactions in the transaction group is the same, logical read consistency can be guaranteed even if multiple transactions access the same data table.
According to an embodiment of the disclosure, opening the transaction group according to the single database access statement to the specified data table may include, for example, dividing a plurality of transactions, which correspond to a plurality of subtasks obtained by splitting the single database access statement, into one transaction group.
According to an embodiment of the disclosure, different transactions in the transaction group correspond to different database sessions from a designated computing node to the designated data node. The designated computing node may be the computing node that receives the single database access statement from the client, or any other computing node used to feed back the data access result to the client.
According to an embodiment of the present disclosure, performing an access operation on the data set by the plurality of transactions includes performing access operations on the data set by at least two transactions of the plurality of transactions in parallel. Here, in parallel means that the time periods in which the at least two transactions perform access operations on the data set overlap at least partially.
For example, for a single complex SQL statement "SELECT * FROM t1 UNION ALL SELECT * FROM t2", in serial processing mode, one transaction is opened using a single database session to the DN, the data tables t1 and t2 are scanned serially, and the scan results are sent to the CN for aggregation. In parallel mode, the CN splits the statement into two subtasks, "SELECT * FROM t1" and "SELECT * FROM t2", opens transactions through two database sessions to the DN to execute them, scans t1 and t2 in parallel, and the respective scan results are sent to the CN for aggregation.
According to an embodiment of the present disclosure, the single SQL statement may be of any of the following types: merge sort join, union all, insert, and select.
According to an embodiment of the disclosure, the transaction group comprises a master transaction and a plurality of slave transactions, and the master transaction is opened before the slave transactions. The data set visible to the transactions in the specified data table comprises data created by the master transaction, and data created by transactions that were opened earlier than the master transaction and do not belong to the corresponding active transaction set of the transactions. The corresponding active transaction set of the transactions is the active transaction set on the specified data node when the master transaction is opened.
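As a non-limiting illustration, the parallel mode described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the tables are hypothetical in-memory lists, each worker thread stands in for one database session to the DN, and the names TABLES, split_union_all, scan and run_parallel are introduced here for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-ins for the data tables t1 and t2 on the DN.
TABLES = {"t1": [1, 2, 3], "t2": [10, 20]}


def split_union_all(statement):
    """Split 'A UNION ALL B ...' into its component SELECT subtasks."""
    return [part.strip() for part in statement.split(" UNION ALL ")]


def scan(subtask):
    """Simulate one DN session scanning the table named last in the subtask."""
    table_name = subtask.split()[-1]
    return list(TABLES[table_name])


def run_parallel(statement):
    """Play the role of the CN: split, scan in parallel, then aggregate."""
    subtasks = split_union_all(statement)
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        # map() returns results in subtask order, so aggregation is deterministic.
        partial_results = list(pool.map(scan, subtasks))
    merged = []
    for rows in partial_results:
        merged.extend(rows)
    return merged


result = run_parallel("SELECT * FROM t1 UNION ALL SELECT * FROM t2")
```

The sketch only shows the split and aggregation; it does not by itself guarantee that the two scans see the same data, which is the problem the transaction group addresses.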
According to an embodiment of the present disclosure, the master transaction has read and write rights to the specified data table, and the slave transaction has only read rights to the specified data table.
For example, the first transaction opened in a transaction group is referred to as the master transaction. Only the master transaction can perform write operations (including insert, delete, update, etc.), and there is only one master transaction in a transaction group. The other transactions are slave transactions and can perform only read operations. Data modifications made by the master transaction are visible not only to itself but also to all other transactions in the same transaction group.
In this way, when the subtasks split by the CN are executed through different database sessions, the transactions in which these subtasks run can be assigned to the same transaction group. Because they use the same corresponding active transaction set and master transaction ID, a globally consistent read is logically guaranteed.
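As a non-limiting illustration, the read and write rights of the master transaction and the slave transactions can be sketched in Python as follows. The class GroupTransaction and its methods are hypothetical names introduced for this sketch, and the use of PermissionError to reject a write is an assumption; the disclosure does not specify how a write attempted by a slave transaction is rejected.

```python
class GroupTransaction:
    """Hypothetical model of a transaction that belongs to a transaction group."""

    def __init__(self, trx_id, master_trx_id):
        self.trx_id = trx_id
        # For the master transaction, master_trx_id equals its own trx_id.
        self.master_trx_id = master_trx_id

    @property
    def is_master(self):
        return self.trx_id == self.master_trx_id

    def read(self, table):
        # Both master and slave transactions may read.
        return list(table)

    def write(self, table, row):
        # Only the master transaction may modify data.
        if not self.is_master:
            raise PermissionError("slave transactions are read-only")
        table.append(row)
```

In this sketch a transaction is recognized as the master simply by comparing its own ID with the master transaction ID it holds.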
The following is a simple example. Assuming transactions trx 1, trx 2 are open, which constitute a transaction group, then in accordance with an embodiment of the present disclosure:
1. If trx 1 is opened first, it becomes the master transaction and can perform read and write operations; the transaction ID of trx 1 is the master transaction ID of the transaction group;
2. trx 2 is a slave transaction and can perform only read operations. trx 2 copies the corresponding active transaction set and master transaction ID of trx 1;
3. if trx 1 modifies the data, these modifications are also visible to trx 2.
Taking the SQL statement "insert into t1 values (...); insert into t2 select * from t1" as an example, the CN first inserts data into t1, then obtains all data of t1 and inserts it into t2. To achieve parallel processing, the CN splits the SQL statement into 2 transactions, which are executed through different database sessions. The master transaction inserts data into t1 and inserts the data obtained from the CN into t2, and the slave transaction reads data from t1 and sends it to the CN. Since the master transaction has not committed, by default the slave transaction cannot read the data that the master transaction inserted into t1. In transaction group mode, every transaction in the transaction group can read the data that the master transaction inserted into t1, because every transaction in the group holds the master transaction ID and the modifications of the master transaction are visible to all transactions in the group.
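As a non-limiting illustration, the difference between the default mode and the transaction group mode in this example can be sketched in Python as follows. This is a deliberately simplified model: visibility is reduced to "committed, own, or created by the held master transaction", the transaction IDs 100 and 101 and the row values are hypothetical, and the function names are introduced for this sketch only.

```python
def readable_rows(table, reader_id, held_master_id, committed_ids):
    """Return the values a reader may see under the simplified rule."""
    return [
        value
        for value, creator_id in table
        if creator_id in committed_ids
        or creator_id == reader_id
        or creator_id == held_master_id
    ]


def insert_select(group_mode):
    """Simulate 'insert into t1 ...; insert into t2 select * from t1'."""
    t1, t2 = [], []                  # each row is (value, creator transaction ID)
    committed_ids = set()            # nothing commits while the statement runs
    master_id, slave_id = 100, 101   # hypothetical transaction IDs

    # Master transaction: insert into t1 (not yet committed).
    t1.append(("a", master_id))
    t1.append(("b", master_id))

    # Slave transaction: read t1 and send the rows to the CN.
    # Outside group mode the slave only knows its own ID.
    held_master_id = master_id if group_mode else slave_id
    rows_for_cn = readable_rows(t1, slave_id, held_master_id, committed_ids)

    # Master transaction: insert the rows received from the CN into t2.
    for value in rows_for_cn:
        t2.append((value, master_id))
    return [value for value, _ in t2]
```

With group_mode set to False the slave transaction sees nothing in t1, so t2 stays empty; with group_mode set to True the uncommitted rows of the master transaction reach t2.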
According to an embodiment of the present disclosure, system variables are added at the kernel layer of the DN for explicitly turning on or off the transaction group mode. The transaction group mode defaults to off and needs to be set in the database connection in advance when the transaction group mode needs to be turned on or off.
If the transaction group mode is turned on in the current database session, it takes effect only for the current database session. If the transaction group mode is turned on globally, it takes effect for all database sessions. After the variable is modified, the change applies only to transactions opened subsequently; transactions that are currently executing are not affected.
In accordance with embodiments of the present disclosure, in transaction group mode, the transactions opened within a period of time need to be logically partitioned into one or more different transaction groups. However, opening a transaction is a relatively independent process, so it is necessary to determine which of the transactions currently being opened belong to the same transaction group.
There are various methods for determining which of the transactions currently being opened belong to the same transaction group. The most straightforward method is to explicitly specify a transaction group ID when a transaction is opened. However, this scheme requires modifying the SQL syntax, and during visibility judgment the globally active transaction list must be traversed multiple times to find the transactions with the same transaction group ID, which directly affects performance and is costly.
Alternatively, according to an embodiment of the present disclosure, the transaction group identifier includes a specified field in the transaction name of the transaction, and assigning the transaction group identifier to the transaction includes assigning a value to the specified field when the transaction is opened. For example, the CN may use an SQL statement such as "XA START xid" to open a transaction, where xid is the transaction name in the format gtrid [, bqual [, formatID ]], and gtrid and bqual are strings that can be specified arbitrarily, so that transaction groups can be matched through an agreed assignment convention. According to an embodiment of the present disclosure, xids that meet the following preset rules belong to the same transaction group:
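As a non-limiting illustration, the scope of the switch can be sketched in Python as follows. The disclosure does not name the system variable, so the class TransactionGroupSwitch and its methods are hypothetical; the sketch also assumes that a session-level setting takes precedence over the global setting, which the disclosure does not state.

```python
class TransactionGroupSwitch:
    """Hypothetical model of the on/off variable with session and global scope."""

    def __init__(self):
        self.global_enabled = False      # the mode defaults to off
        self.session_overrides = {}      # session ID -> explicit session setting

    def set_global(self, enabled):
        self.global_enabled = enabled

    def set_session(self, session_id, enabled):
        self.session_overrides[session_id] = enabled

    def effective(self, session_id):
        # Assumption: an explicit session setting wins over the global one.
        return self.session_overrides.get(session_id, self.global_enabled)


class Transaction:
    """A transaction captures the mode once, at the moment it is opened."""

    def __init__(self, switch, session_id):
        self.session_id = session_id
        self.group_mode = switch.effective(session_id)
```

Because the value is captured when the transaction is opened, a later change of the variable affects only transactions opened afterwards.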
1. gtrid is the same; and
2. the bqual format is "value@xxxx", where "value" is the same and "xxxx" is a 4-digit number that must not be repeated.
Specifically, when the xids of two transactions have the same gtrid, and their bqual fields have the same value part but different xxxx parts, the two transactions belong to the same transaction group.
According to embodiments of the present disclosure, when a transaction is opened, values may be assigned to the specified fields gtrid and bqual in xid according to the preset rules described above, so as to partition the transaction into a transaction group. According to embodiments of the present disclosure, the gtrid and value@xxxx fields in xid may be regarded as the transaction group identifier. When two transaction group identifiers meet a preset matching rule, they are matching transaction group identifiers. For example, when two transaction group identifiers have the same gtrid and the same value part but different xxxx parts, they are matching transaction group identifiers.
According to an embodiment of the disclosure, when the transaction group is opened, a transaction group identifier and a transaction identifier are allocated to each transaction, and the active transactions opened on the specified data node are searched for a transaction whose transaction group identifier matches that of a specified transaction in the transaction group. If no such transaction is found, the specified transaction is determined to be the master transaction, the identifier of the specified transaction is determined to be its corresponding master transaction identifier, and the active transaction set on the specified data node at the time the specified transaction is opened is determined to be its corresponding active transaction set. If such a transaction is found, the specified transaction is determined to be a slave transaction, the corresponding master transaction identifier of the matching transaction is taken as the corresponding master transaction identifier of the specified transaction, and the corresponding active transaction set of the matching transaction is taken as the corresponding active transaction set of the specified transaction.
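As a non-limiting illustration, the matching rule for transaction group identifiers can be sketched in Python as follows. The names GroupKey, parse_xid and same_group are hypothetical and introduced for this sketch; the sketch also assumes that an xid whose bqual does not follow the "value@xxxx" format never matches any group, a case the disclosure does not address.

```python
import re
from typing import NamedTuple, Optional


class GroupKey(NamedTuple):
    gtrid: str
    value: str   # part of bqual before the '@'
    seq: str     # the 4-digit 'xxxx' part of bqual


# bqual must end in '@' followed by exactly four ASCII digits.
_BQUAL_PATTERN = re.compile(r"^(?P<value>.+)@(?P<seq>[0-9]{4})$")


def parse_xid(gtrid: str, bqual: str) -> Optional[GroupKey]:
    """Parse the gtrid and bqual fields of an xid; return None if bqual is malformed."""
    match = _BQUAL_PATTERN.match(bqual)
    if match is None:
        return None
    return GroupKey(gtrid, match.group("value"), match.group("seq"))


def same_group(xid_a, xid_b) -> bool:
    """Apply the preset rule: same gtrid, same value part, different xxxx part."""
    key_a, key_b = parse_xid(*xid_a), parse_xid(*xid_b)
    if key_a is None or key_b is None:
        return False
    return (
        key_a.gtrid == key_b.gtrid
        and key_a.value == key_b.value
        and key_a.seq != key_b.seq
    )
```

For example, same_group(("g1", "q@0001"), ("g1", "q@0002")) evaluates to True, whereas two xids with identical xxxx parts or different gtrid fields do not match.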
After the transaction group mode is started, the information related to the transaction group needs to be recorded, and a member variable trx_group_master_trx_id is newly added in an object of each transaction in the memory and is used for storing a main transaction ID of the transaction group to which the current transaction belongs.
If the transaction group mode is off, the read view is created in the same way as for a normal transaction; the only additional step is to set the trx_group_master_trx_id of the current transaction to the ID of the current transaction.
If the transaction group mode is started, when a current transaction is started, the currently available global transaction ID is allocated to the current transaction as the transaction ID thereof, and the current transaction is matched with the active transaction started on the corresponding DN according to the assignment rule of xid. If no active transaction belongs to the same transaction group as the current transaction, which indicates that the current transaction is the first opened transaction of the transaction group or is possibly the only transaction of the transaction group, the trx_group_master_trx_id of the current transaction is set as the transaction ID of the current transaction, and a read view of the current transaction is generated, wherein the read view comprises a corresponding active transaction set (indicated by the active transaction ID set on the DN when the current transaction is opened) of the current transaction, the minimum transaction ID in the active transaction ID set, and the transaction ID of the current transaction.
If the current transaction group is found to have transactions open and active according to the assignment rule of xid, i.e. a matching transaction of the current transaction group is found, the trx_group_master_trx_id of the current transaction is set to the trx_group_master_trx_id of the matched transaction, and the read view of the matching transaction is copied.
Thus, when all transactions belonging to the same transaction group are open, they will have a common read view and master transaction ID (i.e., trx_group_master_trx_id).
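As a non-limiting illustration, the process of opening a transaction in transaction group mode can be sketched in Python as follows. Only the member name trx_group_master_trx_id is taken from the description; the classes ReadView, Trx and DataNode, the method open_trx, and the simplification of the group key to the pair (gtrid, value part of bqual) are assumptions of this sketch, which also assumes that the xxxx parts of the xids are distinct.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ReadView:
    active_ids: FrozenSet[int]      # active transaction IDs when the view was created
    min_active_id: Optional[int]    # smallest ID in active_ids, None if empty
    creator_trx_id: int             # ID of the transaction that created the view


@dataclass
class Trx:
    trx_id: int
    group_key: Tuple[str, str]      # hypothetical: (gtrid, value part of bqual)
    trx_group_master_trx_id: int
    read_view: ReadView


class DataNode:
    """Hypothetical DN that hands out transaction IDs and tracks active transactions."""

    def __init__(self):
        self.next_trx_id = 1
        self.active: Dict[int, Trx] = {}

    def open_trx(self, group_key):
        trx_id = self.next_trx_id
        self.next_trx_id += 1
        # Look for an active transaction of the same transaction group.
        match = next(
            (t for t in self.active.values() if t.group_key == group_key), None
        )
        if match is None:
            # First transaction of the group: it becomes the master transaction.
            active_ids = frozenset(self.active)
            view = ReadView(active_ids, min(active_ids, default=None), trx_id)
            trx = Trx(trx_id, group_key, trx_id, view)
        else:
            # Slave transaction: copy the master ID and share the read view.
            trx = Trx(trx_id, group_key, match.trx_group_master_trx_id, match.read_view)
        self.active[trx_id] = trx
        return trx
```

Because ReadView is immutable in this sketch, sharing the same object between the transactions of a group is equivalent to copying it.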
According to an embodiment of the disclosure, performing access operations on the data set by the plurality of transactions includes, for any transaction of the plurality of transactions, determining the transaction that created the specified data to be accessed by that transaction. If the transaction that created the specified data was opened before the corresponding master transaction of that transaction and does not belong to the corresponding active transaction set of that transaction, or if the transaction that created the specified data is the corresponding master transaction of that transaction, the specified data is visible to that transaction; otherwise, the specified data is invisible to that transaction.
According to embodiments of the present disclosure, the read view of all transactions in a transaction group are the same and their corresponding master transaction IDs are the same, so that for any transaction in the transaction group, the data visibility determined based on the read view and the master transaction ID is the same, thereby ensuring a globally consistent read of the transactions in the transaction group.
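As a non-limiting illustration, the visibility determination can be sketched in Python as follows. The function name is_visible is hypothetical, and the sketch assumes that transaction IDs are allocated in increasing order of opening time, so that "opened before the master transaction" can be tested by comparing IDs.

```python
def is_visible(creator_trx_id, master_trx_id, active_ids):
    """Decide whether data created by creator_trx_id is visible to a group member.

    master_trx_id and active_ids are the master transaction ID and the
    corresponding active transaction set shared by all transactions of the group.
    """
    # Data created by the group's own master transaction is always visible.
    if creator_trx_id == master_trx_id:
        return True
    # Assumption: a smaller ID means the creator was opened before the master.
    opened_before_master = creator_trx_id < master_trx_id
    # The creator must not have been active when the master was opened.
    return opened_before_master and creator_trx_id not in active_ids
```

Since every transaction of a transaction group passes the same master_trx_id and active_ids, the function returns the same answer for all of them, which corresponds to the globally consistent read described above.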
According to an embodiment of the disclosure, the database access method can be implemented on the DN, based on the storage engine layer of the DN, and works together with the CN's splitting and parallel processing of complex, time-consuming SQL statements. The following SQL statement was tested against the table orders (1.5 million rows) and the table customer (150,000 rows): SELECT o_orderkey,o_custkey,c_custkey,c_name FROM orders JOIN customer on o_custkey=c_custkey ORDER BY o_custkey limit 5. The test results show that the execution time with the transaction group function turned on is less than 1 second, which saves about 2 seconds compared with the execution time without the transaction group function. Therefore, the database access method provided by the disclosure can effectively improve the processing speed of complex SQL.
Fig. 5 shows a block diagram of a database access apparatus according to an embodiment of the present disclosure. The device may be implemented as part or all of the DN by software, hardware, or a combination of both.
As shown in fig. 5, the database access apparatus 500 includes an opening module 510, a first determining module 520, and an executing module 530.
The opening module 510 is configured to open a transaction group according to a single database access statement to a specified data table, the transaction group comprising a plurality of transactions;
The first determination module 520 is configured to determine a set of data in the specified data table that is visible to the transaction, wherein the set of data is the same for all transactions in the transaction group;
The execution module 530 is configured to perform access operations on the data set through the plurality of transactions.
According to an embodiment of the present disclosure, wherein:
the specified data table comprises one or more data tables on a specified data node;
The opening of the transaction group according to the single database access statement of the appointed data table comprises dividing a plurality of transactions corresponding to a plurality of subtasks obtained by dividing the single database access statement into one transaction group.
According to an embodiment of the disclosure, different transactions in the transaction group correspond to different database sessions of a designated computing node to the designated data node.
According to an embodiment of the present disclosure, wherein:
The transaction group comprises a master transaction and a plurality of slave transactions, wherein the master transaction is started before the slave transactions;
The data set visible to the transaction in the appointed data table comprises data created by the main transaction and data created by a transaction which is earlier in opening time than the main transaction and does not belong to a corresponding active transaction set of the transaction, wherein the corresponding active transaction set of the transaction is the active transaction set on the appointed node when the main transaction is opened.
According to an embodiment of the present disclosure, wherein:
The master transaction has read and write rights to the specified data table, and the slave transaction has only read rights to the specified data table.
According to an embodiment of the present disclosure, the apparatus 500 further includes:
an allocation module 540 configured to allocate a transaction group identifier and a transaction identifier to the transaction when the transaction group is opened;
A lookup module 550 configured to lookup a transaction of the active transactions opened on the specified data node that has a matching transaction group identifier with a specified transaction of the transaction group;
A second determining module 560 is configured to determine the designated transaction as a master transaction if there is no transaction having a matching transaction group identifier with the designated transaction in the active transactions opened on the designated data node, determine the identifier of the designated transaction as a corresponding master transaction identifier of the designated transaction, and determine the active transaction set on the designated data node as a corresponding active transaction set of the designated transaction when the designated transaction is opened.
According to an embodiment of the present disclosure, wherein:
The transaction group identifier includes a specified field in a transaction name of the transaction;
The assigning a transaction group identifier to the transaction includes assigning a transaction group identifier to the transaction by assigning a value to the designated field when the transaction is started.
According to an embodiment of the present disclosure, the apparatus 500 further includes:
A third determining module 570 configured to determine a specified transaction as a slave transaction if, among the active transactions opened on the specified data node, there is a transaction having a transaction group identifier matching that of the specified transaction, determine the corresponding master transaction identifier of the matching transaction as the corresponding master transaction identifier of the specified transaction, and take the corresponding active transaction set of the matching transaction as the corresponding active transaction set of the specified transaction.
According to an embodiment of the disclosure, the performing, by the plurality of transactions, an access operation on the data set includes, for any of the plurality of transactions:
Determining a transaction that creates specified data to be accessed by the any transaction;
The specified data is visible to the any transaction if the transaction that created the specified data is open before the corresponding master transaction of the any transaction and does not belong to the corresponding active transaction set of the any transaction, or the transaction that created the specified data is the corresponding master transaction of the any transaction, otherwise the specified data is not visible to the any transaction.
The present disclosure also discloses an electronic device, and fig. 6 shows a block diagram of the electronic device according to an embodiment of the present disclosure.
As shown in fig. 6, the electronic device includes a memory and a processor, wherein the memory is configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement a method in accordance with an embodiment of the present disclosure.
The embodiment of the disclosure provides a database access method, which is characterized by comprising the following steps:
opening a transaction group according to a single database access statement to a specified data table, the transaction group comprising a plurality of transactions;
Determining a set of data in the specified data table that is visible to the transaction, wherein the set of data is the same for all transactions in the transaction group;
and executing access operation on the data set through the plurality of transactions.
According to an embodiment of the present disclosure, wherein:
the specified data table comprises one or more data tables on a specified data node;
The opening of the transaction group according to the single database access statement of the appointed data table comprises dividing a plurality of transactions corresponding to a plurality of subtasks obtained by dividing the single database access statement into one transaction group.
According to an embodiment of the disclosure, different transactions in the transaction group correspond to different database sessions of a designated computing node to the designated data node.
According to an embodiment of the present disclosure, wherein:
The transaction group comprises a master transaction and a plurality of slave transactions, wherein the master transaction is started before the slave transactions;
The data set visible to the transaction in the appointed data table comprises data created by the main transaction and data created by a transaction which is earlier in opening time than the main transaction and does not belong to a corresponding active transaction set of the transaction, wherein the corresponding active transaction set of the transaction is the active transaction set on the appointed node when the main transaction is opened.
According to an embodiment of the present disclosure, wherein:
The master transaction has read and write rights to the specified data table, and the slave transaction has only read rights to the specified data table.
According to an embodiment of the present disclosure, the method further comprises:
assigning a transaction group identifier and a transaction identifier to the transaction when the transaction group is opened;
Searching transactions with matched transaction group identifiers with specified transactions in the transaction groups in active transactions started on the specified data node;
If the active transaction started on the appointed data node has no transaction with the matched transaction group identifier with the appointed transaction, determining the appointed transaction as a main transaction, determining the identifier of the appointed transaction as the corresponding main transaction identifier of the appointed transaction, and determining the active transaction set on the appointed data node as the corresponding active transaction set of the appointed transaction when the appointed transaction is started.
According to an embodiment of the present disclosure, wherein:
The transaction group identifier includes a specified field in a transaction name of the transaction;
The assigning a transaction group identifier to the transaction includes assigning a transaction group identifier to the transaction by assigning a value to the designated field when the transaction is started.
According to an embodiment of the present disclosure, the method further comprises:
If, among the active transactions opened on the specified data node, there is a transaction having a transaction group identifier matching that of the specified transaction, determining the specified transaction as a slave transaction, determining the corresponding master transaction identifier of the matching transaction as the corresponding master transaction identifier of the specified transaction, and taking the corresponding active transaction set of the matching transaction as the corresponding active transaction set of the specified transaction.
According to an embodiment of the disclosure, the performing, by the plurality of transactions, an access operation on the data set includes, for any of the plurality of transactions:
Determining a transaction that creates specified data to be accessed by the any transaction;
The specified data is visible to the any transaction if the transaction that created the specified data is open before the corresponding master transaction of the any transaction and does not belong to the corresponding active transaction set of the any transaction, or the transaction that created the specified data is the corresponding master transaction of the any transaction, otherwise the specified data is not visible to the any transaction.
The present disclosure also provides a distributed database comprising a cluster of computing nodes and a cluster of data nodes, wherein:
The computing node obtains a single database access statement for a specified data table;
And opening a transaction group according to the single database access statement by a designated data node storing the designated data table in the data node cluster, wherein the transaction group comprises a plurality of transactions, and determining a data set visible to the transactions in the designated data table, wherein the data set is the same for all the transactions in the transaction group, and executing access operation on the data set through the plurality of transactions.
A distributed database according to an embodiment of the present disclosure may have a structure as shown in fig. 1.
According to an embodiment of the present disclosure, wherein:
the specified data table comprises one or more data tables on a specified data node;
The opening of the transaction group according to the single database access statement of the appointed data table comprises dividing a plurality of transactions corresponding to a plurality of subtasks obtained by dividing the single database access statement into one transaction group.
According to an embodiment of the present disclosure, wherein:
Different transactions in the transaction group correspond to different database sessions of a designated compute node to the designated data node.
According to an embodiment of the present disclosure, wherein:
The transaction group comprises a master transaction and a plurality of slave transactions, wherein the master transaction is started before the slave transactions;
The data set visible to the transaction in the appointed data table comprises data created by the main transaction and data created by a transaction which is earlier in opening time than the main transaction and does not belong to a corresponding active transaction set of the transaction, wherein the corresponding active transaction set of the transaction is the active transaction set on the appointed node when the main transaction is opened.
According to an embodiment of the present disclosure, wherein:
The master transaction has read and write rights to the specified data table, and the slave transaction has only read rights to the specified data table.
According to an embodiment of the disclosure, the designated data node is further configured to:
assigning a transaction group identifier and a transaction identifier to the transaction when the transaction group is opened;
Searching transactions with matched transaction group identifiers with specified transactions in the transaction groups in active transactions started on the specified data node;
If the active transaction started on the appointed data node has no transaction with the matched transaction group identifier with the appointed transaction, determining the appointed transaction as a main transaction, determining the identifier of the appointed transaction as the corresponding main transaction identifier of the appointed transaction, and determining the active transaction set on the appointed data node as the corresponding active transaction set of the appointed transaction when the appointed transaction is started.
According to an embodiment of the present disclosure, wherein:
The transaction group identifier includes a specified field in a transaction name of the transaction;
The assigning a transaction group identifier to the transaction includes assigning a transaction group identifier to the transaction by assigning a value to the designated field when the transaction is started.
According to an embodiment of the disclosure, the designated data node is further configured to:
If, among the active transactions opened on the specified data node, there is a transaction having a transaction group identifier matching that of the specified transaction, determining the specified transaction as a slave transaction, determining the corresponding master transaction identifier of the matching transaction as the corresponding master transaction identifier of the specified transaction, and taking the corresponding active transaction set of the matching transaction as the corresponding active transaction set of the specified transaction.
According to an embodiment of the disclosure, the performing, by the plurality of transactions, an access operation on the data set includes, for any of the plurality of transactions:
Determining a transaction that creates specified data to be accessed by the any transaction;
The specified data is visible to the any transaction if the transaction that created the specified data is open before the corresponding master transaction of the any transaction and does not belong to the corresponding active transaction set of the any transaction, or the transaction that created the specified data is the corresponding master transaction of the any transaction, otherwise the specified data is not visible to the any transaction.
Fig. 7 shows a schematic diagram of a computer system suitable for use in implementing methods according to embodiments of the present disclosure.
As shown in fig. 7, the computer system includes a processing unit that can execute the various methods in the above embodiments according to a program stored in a Read Only Memory (ROM) or a program loaded from a storage section into a Random Access Memory (RAM). In the RAM, various programs and data required for the operation of the computer system are also stored. The processing unit, ROM and RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
Connected to the I/O interface are an input section including a keyboard, a mouse, etc.; an output section including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, etc.; a storage section including a hard disk, etc.; and a communication section including a network interface card such as a LAN card, a modem, etc. The communication section performs communication processing via a network such as the Internet. A drive is also connected to the I/O interface as needed. Removable media such as magnetic disks, optical disks, magneto-optical disks, semiconductor memories, and the like are mounted on the drive as needed, so that a computer program read therefrom is installed into the storage section as needed. The processing unit may be implemented as a CPU, GPU, TPU, FPGA, NPU, or similar processing unit.
In particular, according to embodiments of the present disclosure, the methods described above may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the methods described above. In such embodiments, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules referred to in the embodiments of the present disclosure may be implemented in software or in programmable hardware. The described units or modules may also be provided in a processor; in some cases, the names of these units or modules do not limit the units or modules themselves.
As another aspect, the present disclosure also provides a computer-readable storage medium, which may be a computer-readable storage medium included in the electronic device or the computer system in the above-described embodiment, or may be a computer-readable storage medium that exists alone and is not assembled into the device. The computer-readable storage medium stores one or more programs for use by one or more processors in performing the methods described in the present disclosure.
The foregoing description covers only the preferred embodiments of the present disclosure and the technical principles employed. Those skilled in the art will appreciate that the scope of the invention referred to in this disclosure is not limited to technical solutions formed by the specific combination of the features described above, but also encompasses other technical solutions formed by any combination of those features or their equivalents without departing from the inventive concept, for example, solutions in which the above features are interchanged with (but not limited to) technical features having similar functions disclosed in the present disclosure.

Claims (13)

1. A database access method, characterized in that it comprises:
opening a transaction group according to a single database access statement for a specified data table, wherein the transaction group includes a plurality of transactions;
determining a data set in the specified data table that is visible to the transactions, wherein the data set is the same for all transactions in the transaction group, and the specified data table includes one or more data tables on a specified data node;
performing access operations on the data set through the plurality of transactions;
wherein the transaction group includes one master transaction and a plurality of slave transactions, and the master transaction is started before the slave transactions;
the data set in the specified data table that is visible to the transactions includes data created by the master transaction and data created by transactions that were started earlier than the master transaction and do not belong to the corresponding active transaction set of the transactions, the corresponding active transaction set of the transactions being the set of active transactions on the specified data node when the master transaction is started.

2. The method according to claim 1, wherein:
opening the transaction group according to the single database access statement for the specified data table includes grouping, into one transaction group, the plurality of transactions corresponding to a plurality of subtasks obtained by splitting the single database access statement.

3. The method according to claim 1, wherein:
different transactions in the transaction group correspond to different database sessions from a specified computing node to the specified data node.

4. The method according to claim 1, wherein:
the master transaction has read and write permissions for the specified data table, and the slave transactions have only read permission for the specified data table.

5. The method according to claim 1, further comprising:
when opening the transaction group, assigning a transaction group identifier and a transaction identifier to each of the transactions;
searching the active transactions started on the specified data node for a transaction having a transaction group identifier that matches that of a specified transaction in the transaction group;
if, among the active transactions started on the specified data node, there is no transaction having a transaction group identifier that matches that of the specified transaction, determining the specified transaction to be the master transaction, determining the identifier of the specified transaction to be the corresponding master transaction identifier of the specified transaction, and determining the set of active transactions on the specified data node at the time the specified transaction is started to be the corresponding active transaction set of the specified transaction.

6. The method according to claim 5, wherein:
the transaction group identifier includes a specified field in the transaction name of the transaction;
assigning the transaction group identifier to the transaction includes assigning a value to the specified field when starting the transaction.

7. The method according to claim 5, further comprising:
if, among the active transactions started on the specified data node, there is a transaction having a transaction group identifier that matches that of the specified transaction, determining the specified transaction to be a slave transaction, determining the corresponding master transaction identifier of the matching transaction to be the corresponding master transaction identifier of the specified transaction, and taking the corresponding active transaction set of the matching transaction as the corresponding active transaction set of the specified transaction.

8. The method according to claim 7, wherein performing access operations on the data set through the plurality of transactions comprises, for any one of the plurality of transactions:
determining the transaction that created the specified data to be accessed by that transaction;
if the transaction that created the specified data was started before the corresponding master transaction of that transaction and does not belong to the corresponding active transaction set of that transaction, or the transaction that created the specified data is the corresponding master transaction of that transaction, the specified data is visible to that transaction; otherwise, the specified data is not visible to that transaction.

9. A database access device, characterized in that it comprises:
an opening module, configured to open a transaction group according to a single database access statement for a specified data table, wherein the transaction group includes a plurality of transactions;
a first determination module, configured to determine a data set in the specified data table that is visible to the transactions, wherein the data set is the same for all transactions in the transaction group, and the specified data table includes one or more data tables on a specified data node;
an execution module, configured to perform access operations on the data set through the plurality of transactions;
wherein the transaction group includes one master transaction and a plurality of slave transactions, and the master transaction is started before the slave transactions;
the data set in the specified data table that is visible to the transactions includes data created by the master transaction and data created by transactions that were started earlier than the master transaction and do not belong to the corresponding active transaction set of the transactions, the corresponding active transaction set of the transactions being the set of active transactions on the specified data node when the master transaction is started.

10. A distributed database, comprising a computing node cluster and a data node cluster, wherein:
a computing node obtains a single database access statement for a specified data table;
a specified data node in the data node cluster that stores the specified data table opens a transaction group according to the single database access statement, the transaction group including a plurality of transactions; determines a data set in the specified data table that is visible to the transactions, wherein the data set is the same for all transactions in the transaction group and the specified data table includes one or more data tables on the specified data node; and performs access operations on the data set through the plurality of transactions;
the transaction group includes one master transaction and a plurality of slave transactions, and the master transaction is started before the slave transactions;
the data set in the specified data table that is visible to the transactions includes data created by the master transaction and data created by transactions that were started earlier than the master transaction and do not belong to the corresponding active transaction set of the transactions, the corresponding active transaction set of the transactions being the set of active transactions on the specified data node when the master transaction is started.

11. An electronic device, characterized in that it comprises a memory and a processor, wherein the memory is used to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method steps of any one of claims 1 to 8.

12. A computer-readable storage medium having computer instructions stored thereon, characterized in that the computer instructions, when executed by a processor, implement the method steps of any one of claims 1 to 8.

13. A computer program product, comprising computer instructions which, when executed by a processor, implement the method steps of any one of claims 1 to 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210674339.4A CN115017180B (en) 2022-06-14 2022-06-14 Database access method, device, electronic device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210674339.4A CN115017180B (en) 2022-06-14 2022-06-14 Database access method, device, electronic device and readable storage medium

Publications (2)

Publication Number Publication Date
CN115017180A CN115017180A (en) 2022-09-06
CN115017180B true CN115017180B (en) 2025-03-11

Family

ID=83074431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210674339.4A Active CN115017180B (en) 2022-06-14 2022-06-14 Database access method, device, electronic device and readable storage medium

Country Status (1)

Country Link
CN (1) CN115017180B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113961315A (en) * 2020-07-21 2022-01-21 阿里巴巴集团控股有限公司 Transaction processing method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090037452A1 (en) * 2007-07-31 2009-02-05 Ahmad Baitalmal System and Method for Synchronizing Applications
US9990224B2 (en) * 2015-02-23 2018-06-05 International Business Machines Corporation Relaxing transaction serializability with statement-based data replication
CN114003644B (en) * 2021-10-21 2022-11-15 河南星环众志信息科技有限公司 Distributed transaction processing method, device, medium and database system

Also Published As

Publication number Publication date
CN115017180A (en) 2022-09-06

Similar Documents

Publication Publication Date Title
US11093459B2 (en) Parallel and efficient technique for building and maintaining a main memory, CSR-based graph index in an RDBMS
US6112198A (en) Optimization of data repartitioning during parallel query optimization
US7689612B2 (en) Handling of queries of transient and persistent data
US8601474B2 (en) Resuming execution of an execution plan in a virtual machine
US20150081637A1 (en) Difference determination in a database environment
US20180046643A1 (en) Consistent execution of partial queries in hybrid dbms
US10885031B2 (en) Parallelizing SQL user defined transformation functions
US20070027860A1 (en) Method and apparatus for eliminating partitions of a database table from a join query using implicit limitations on a partition key value
US20190278773A1 (en) Grouping in analytical databases
JP2007501458A (en) Dynamic reassignment of data ownership
US20120330890A1 (en) Propagating tables while preserving cyclic foreign key relationships
US20080109813A1 (en) Resource assignment method, resource assignment program and management computer
US11714794B2 (en) Method and apparatus for reading data maintained in a tree data structure
US11720561B2 (en) Dynamic rebuilding of query execution trees and reselection of query execution operators
US7293011B1 (en) TQ distribution that increases parallism by distributing one slave to a particular data block
CN117931765A (en) Re-validating propagated fine-grained decisions
US11940972B2 (en) Execution of operations on partitioned tables
Guo et al. Distributed join algorithms on multi-CPU clusters with GPUDirect RDMA
CN115017180B (en) Database access method, device, electronic device and readable storage medium
US11275571B2 (en) Unified installer
US11914598B2 (en) Extended synopsis pruning in database management systems
Shi et al. PECC: parallel expansion based on clustering coefficient for efficient graph partitioning
CN109388638B (en) Method and system for distributed massively parallel processing of databases
US12430339B2 (en) Pipelined execution of database queries processing streaming data
US12326866B1 (en) Out-of-core BFS for shortest path graph queries

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant