CN118467592B - Metadata caching method, device, equipment and medium of distributed database - Google Patents
- Publication number
- CN118467592B (application number CN202410925120.6A)
- Authority
- CN
- China
- Prior art keywords
- metadata
- invalid message
- local
- coordinator
- transaction
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2255—Hash tables
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a metadata caching method, apparatus, device, and medium for a distributed database, relating to the technical field of databases. The method comprises the following steps: a coordinator obtains metadata through a metadata server and processes operations related to the metadata, the related operations comprising creating, deleting, and modifying metadata tables and generating a transaction execution plan according to the metadata; and the coordinator distributes the metadata information and the generated transaction execution plan to all participants, and all the participants access the data in the shared storage according to the metadata information and the transaction execution plan, execute the corresponding transaction, and output a transaction execution result. The invention minimizes the number of participants that access the metadata server, and the number of times they do so, to obtain the latest metadata and invalidation messages, thereby improving system performance, reducing the load on the metadata server, and ensuring metadata consistency.
Description
Technical Field
The present invention relates to the field of database technologies, and in particular, to a method, an apparatus, a device, and a medium for caching metadata in a distributed database.
Background
In modern database systems, data storage and computation are typically tightly coupled, meaning that storage and computing resources are managed and operated together. However, with the explosive growth of data volume and the increasing demand for computing resources, the storage-compute separation architecture is becoming a trend. In such an architecture, data storage and computing resources are placed on different servers, which allows storage and computing capacity to be scaled flexibly and resource utilization to be optimized.
Metadata management becomes particularly important in a storage-compute separation architecture. Metadata, including table structures, indexes, and statistics, is key to the efficient operation of a database system. However, since metadata is typically stored on a separate metadata server, the computing nodes need to access this centrally stored metadata frequently, which presents several problems: (1) High latency: each time a computing node needs metadata, it must obtain it from the metadata server, and network delays can significantly affect system performance. (2) High load: the metadata server must handle requests from multiple computing nodes and easily becomes a system bottleneck. (3) Consistency: concurrent access to and modification of metadata by multiple computing nodes may lead to consistency problems, which further increase the complexity and overhead of the system.
Consider PostgreSQL, for example. PostgreSQL is a widely used open-source relational database management system whose architecture mainly follows the traditional model of integrated storage and computation. It has some drawbacks in handling storage-compute separation and efficient metadata management. Under a storage-compute separation architecture, a computing node must communicate with the metadata server over the network each time it needs to access or modify metadata. Such frequent network communication leads to high latency that affects system performance. PostgreSQL lacks an efficient mechanism for caching this metadata and therefore cannot significantly reduce this latency. Greenplum, a powerful data product widely used in the industry, faces similar problems to PostgreSQL under a storage-compute separation architecture.
Couchbase. It has a built-in cache layer that improves data access speed, and it supports high availability and a flexible data model. However, ensuring data consistency requires additional mechanisms and cost, and the system complexity is high.
Memcached and Redis. Both are efficient in-memory cache systems that can significantly reduce direct requests to the metadata server, and both support distributed deployment. However, consistency between the cache and the metadata server cannot be guaranteed, especially under highly concurrent modification, and the system complexity and the operation and maintenance costs increase.
Disclosure of Invention
In view of this, the embodiments of the present application provide a metadata caching method, apparatus, device, and medium for a distributed database, which can ensure data consistency and achieve high efficiency at the same time, and improve system performance.
The embodiment of the application provides the following technical scheme: a metadata caching method for a distributed database, comprising:
The coordinator obtains metadata through a metadata server and processes related operations of the metadata; the related operations comprise creation, deletion, modification of metadata tables and generation of a transaction execution plan according to the metadata;
and the coordinator distributes the metadata information and the generated transaction execution plan to all participants, and all the participants access the data in the shared storage according to the metadata information and the transaction execution plan, execute the corresponding transaction and output a transaction execution result.
According to one embodiment of the application, the method further comprises:
the coordinator identifies metadata invalidation messages in the system cache, the relation cache, and the page cache, and clears the metadata corresponding to the invalidation messages;
storing the invalidation messages into an invalidation message set, and sending the invalidation message set to the metadata server, wherein the metadata server stores the invalidation message set in a local session space isolated from other transactions;
and when the current transaction or sub-transaction is committed, the metadata server publishes all messages of the invalidation message set stored in the local session space to a global invalidation message queue.
According to one embodiment of the application, the method further comprises:
And if the current transaction or sub-transaction is cancelled, the metadata server destroys the invalid message set stored in the local session space.
According to one embodiment of the application, the method further comprises:
when executing the corresponding transaction, the participant compares the global variable of the invalidation messages, the local variable, and the local invalidation message record value; if the local invalidation message record value is smaller than the global variable or the local variable, the participant immediately initiates an invalidation message queue update request to the metadata server, performs the clearing operation for the invalidation messages, and updates the local invalidation message record value; wherein the global variable represents the global queue position or identifier of the next invalidation message to be processed, and the local variable represents the local queue position or identifier of the next invalidation message to be processed.
According to one embodiment of the present application, the coordinator obtains metadata through the metadata server, and processes related operations of the metadata, and further includes:
the coordinator obtains metadata through the metadata server, creates metadata tables, generates a relation cache by locally caching a plurality of metadata tables, and serializes the generated or updated relation cache and stores it as a local file;
traversing the syntax tree, determining the metadata database tables to be accessed by the transaction, reading the corresponding metadata information from the local file of the relation cache according to the determined metadata database tables, and distributing the metadata information to the participants.
According to one embodiment of the application, the method further comprises:
when the coordinator modifies a metadata table, generating an invalidation message that indicates the modified metadata table and the modified information in the table;
at a command boundary, the coordinator applies the local invalidation messages and collects them into a Hash table, wherein the Hash table is used to track modifications of the metadata;
scanning the transaction execution plan table when generating a transaction execution plan, and determining the metadata tables to be accessed;
and determining, according to the Hash table, the earliest metadata modification record in each metadata table to be accessed, and distributing the metadata modification records to the participants.
According to one embodiment of the present application, the coordinator distributes the metadata information and the generated transaction execution plan to all participants, further comprising:
the coordinator distributes the local invalidation messages and the maximum local invalidation message record value received so far to the participants;
and each participant applies the invalidation messages locally, clearing the corresponding metadata, and records the received maximum local invalidation message record value.
The application also provides a metadata caching device of the distributed database, which comprises: metadata server, coordinator and participants;
The coordinator is used for acquiring metadata through a metadata server and processing related operations of the metadata; the related operations comprise creation, deletion, modification of metadata tables and generation of a transaction execution plan according to the metadata; the coordinator is also used for distributing the metadata information and the generated transaction execution plan to all participants;
And the participant is used for accessing the data in the shared storage according to the metadata information and the transaction execution plan, executing the corresponding transaction and outputting a transaction execution result.
The application also provides a computer device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the metadata caching method of the distributed database when executing the computer program.
The present application also provides a computer-readable storage medium storing a computer program for executing the metadata caching method of the distributed database.
Compared with the prior art, the beneficial effects achievable by at least one of the technical solutions adopted in the embodiments of this specification include at least the following: the coordinator generates and distributes the metadata, which minimizes the number of participants that access the metadata server, and the number of times they do so, to obtain the latest metadata and invalidation messages; this reduces access to the metadata server as far as possible, improves speed, and reduces the load on the metadata server. At the same time, the method ensures metadata consistency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a metadata caching method of a distributed database according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a metadata cache arrangement of a distributed database according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a computer device of the present invention.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Other advantages and effects of the present application will become apparent to those skilled in the art from the following disclosure, which describes the embodiments of the present application with reference to specific examples. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. The application may also be practiced or applied through other embodiments, and the details of the present description may be modified or varied without departing from the spirit and scope of the present application. It should be noted that the following embodiments and the features in the embodiments may be combined with each other without conflict. All other embodiments that can be obtained by those skilled in the art based on the embodiments of the application without inventive effort are intended to be within the scope of the application.
As shown in fig. 1, an embodiment of the present invention provides a metadata caching method for a distributed database, including:
S101, a coordinator acquires metadata through a metadata server and processes related operations of the metadata; the related operations comprise creation, deletion, modification of metadata tables and generation of a transaction execution plan according to the metadata;
S102, the coordinator distributes the metadata information and the generated transaction execution plan to all participants, and all the participants access the data in the shared storage according to the metadata information and the transaction execution plan, execute the corresponding transaction and output a transaction execution result.
In the two-phase commit protocol (2PC) of a distributed database system, participants and coordinators are two important roles. The coordinator is responsible for coordinating operations such as the commit or rollback of a distributed transaction. It is typically a central node or service that manages the flow of the entire transaction. Participants are the individual nodes or services that take part in a distributed transaction, perform the specific transaction operations, and communicate with the coordinator to determine the final outcome of the transaction.
The embodiment of the invention provides a metadata caching method of a distributed database, which ensures data consistency while achieving high efficiency. Specifically, in a mode where metadata and data storage are shared, the coordinator is responsible for metadata-related operations such as creating and deleting tables. It is also responsible for generating a transaction execution plan from the metadata and dispatching the plan to all participants, which complete it together. The participants access the data on the shared storage according to the metadata information to produce the execution result, so that access to the metadata server is reduced as far as possible, speed is improved, and the load on the metadata server is reduced.
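The division of labour in S101 and S102 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the dictionary-based "plan", the round-robin slicing of rows among participants, and the storage location string are all assumptions made for illustration. The point it shows is that only the coordinator contacts the metadata server, while participants work from the metadata shipped with the plan.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Hypothetical stand-in for the central metadata server."""
    tables: dict
    requests: int = 0  # number of round trips served

    def fetch(self, table):
        self.requests += 1
        return self.tables[table]


@dataclass
class Participant:
    """Executes its slice of a plan using only the metadata it was sent."""
    index: int
    shared_storage: dict  # storage location -> list of rows

    def execute(self, plan, metadata):
        # No call to the metadata server happens on this path.
        rows = self.shared_storage[metadata[plan["table"]]["location"]]
        my_rows = rows[self.index::plan["workers"]]
        return [r for r in my_rows if plan["predicate"](r)]


@dataclass
class Coordinator:
    server: MetadataServer
    participants: list = field(default_factory=list)

    def run(self, table, predicate):
        metadata = {table: self.server.fetch(table)}  # single fetch, by the coordinator
        plan = {"table": table, "predicate": predicate,
                "workers": len(self.participants)}    # toy "execution plan"
        results = []
        for p in self.participants:
            results.extend(p.execute(plan, metadata))  # plan and metadata travel together
        return sorted(results)
```

With two participants and one table, a query costs a single metadata server request regardless of how many participants execute the plan.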
In one embodiment, the method further comprises:
the coordinator identifies metadata invalidation messages in the system cache, the relation cache, and the page cache, and clears the metadata corresponding to the invalidation messages;
storing the invalidation messages into an invalidation message set, and sending the invalidation message set to the metadata server, wherein the metadata server stores the invalidation message set in a local session space isolated from other transactions;
and when the current transaction or sub-transaction is committed, the metadata server publishes all messages of the invalidation message set stored in the local session space to a global invalidation message queue.
When the computing cluster modifies metadata, it attempts to invalidate the corresponding cached metadata. It is therefore necessary to collect, step by step, the original invalidation messages of the syscache (system cache) and the relcache (relation cache), as well as the invalidation messages for page-cached data.
In practice, at CommandCounterIncrement (command counter increment), the process is as follows:
(1) The invalidation messages of the page cache are found, and the local page cache, including the index cache, is cleared. The page cache invalidation messages are then stored in the invalidation message set.
(2) The complete set of invalidation messages is sent to the metadata server. At this point the metadata server only places the messages in the session's own space, where they are not visible to other transactions.
(3) When the transaction or sub-transaction commits, the metadata server publishes the session's invalidation messages to the global invalidation message queue.
In one embodiment, the method further comprises: if the current transaction or sub-transaction is cancelled, the metadata server destroys the invalidation message set stored in the local session space.
In implementations, the uncommitted data may be written locally only, and written to the metadata server at commit time. In addition, when the memory space is insufficient, part of the cache data can be written to the disk.
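The message lifecycle on the metadata server (session-private storage, publication on commit, destruction on cancellation) can be sketched as follows. This is a simplified model under stated assumptions: the class name, method names, and the use of plain strings as invalidation messages are hypothetical and do not come from the patent.

```python
from collections import defaultdict


class MetadataServerQueue:
    """Hypothetical model of the metadata server's invalidation handling."""

    def __init__(self):
        self.global_queue = []                  # visible to every session
        self.session_space = defaultdict(list)  # session id -> pending messages

    def receive(self, session_id, messages):
        # Messages sent at a command boundary stay private to the session.
        self.session_space[session_id].extend(messages)

    def commit(self, session_id):
        # On commit, publish the session's whole set to the global queue.
        pending = self.session_space.pop(session_id, [])
        self.global_queue.extend(pending)
        return len(self.global_queue)           # next global queue position

    def abort(self, session_id):
        # On cancellation, the pending set is simply discarded.
        self.session_space.pop(session_id, None)
```

Because nothing reaches `global_queue` before `commit`, other transactions never observe the invalidations of a transaction that is later cancelled.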
In one embodiment, the method further comprises:
when executing the corresponding transaction, the participant compares the global variable of the invalidation messages, the local variable, and the local invalidation message record value; if the local invalidation message record value is smaller than the global variable or the local variable, the participant immediately initiates an invalidation message queue update request to the metadata server, performs the clearing operation for the invalidation messages, and updates the local invalidation message record value; wherein the global variable represents the global queue position or identifier of the next invalidation message to be processed, and the local variable represents the local queue position or identifier of the next invalidation message to be processed.
While a transaction is in progress, up-to-date metadata may be required, at which point the participant may need to receive all invalidation messages from the metadata server. In practice, globalInvalmsgNextQD (the global variable), localInvalmsgNextQD (the local variable), and the local invalidation message record value are compared first. If the local record value is larger, the local invalidation message queue is already up to date, so the transaction can continue without immediately requesting new invalidation messages from the metadata server, which reduces network traffic and improves efficiency. Otherwise, the participant fetches the messages from the metadata server and then performs the invalidation. The latest value is recorded in the corresponding cache as the new local invalidation message record value.
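The comparison can be sketched in Python as below. The function and class names, the `fetch_since` callback, and the treatment of queue positions as integers are assumptions for illustration; the sketch also assumes that equal values count as fresh, following the "smaller than" condition stated for the update request.

```python
def needs_refresh(local_record, global_next, local_next):
    """Return True when the participant must pull from the metadata server."""
    return local_record < global_next or local_record < local_next


class ParticipantCache:
    def __init__(self, fetch_since):
        # fetch_since is a hypothetical callable:
        # position -> (list of invalidated keys, new queue position)
        self.fetch_since = fetch_since
        self.local_record = 0
        self.metadata = {}

    def ensure_fresh(self, global_next, local_next):
        if not needs_refresh(self.local_record, global_next, local_next):
            return False                      # already fresh, no round trip
        messages, new_pos = self.fetch_since(self.local_record)
        for key in messages:
            self.metadata.pop(key, None)      # clear invalidated entries
        self.local_record = new_pos           # remember how far we have read
        return True
```

The return value tells the caller whether a request to the metadata server was actually made, which makes the saving in network traffic easy to observe.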
In one embodiment, the coordinator obtains metadata through the metadata server and processes related operations of the metadata, and further includes:
the coordinator obtains metadata through the metadata server, creates metadata tables, generates a relation cache by locally caching a plurality of metadata tables, and serializes the generated or updated relation cache and stores it as a local file;
traversing the syntax tree, determining the metadata database tables to be accessed by the transaction, reading the corresponding metadata information from the local file of the relation cache according to the determined metadata database tables, and distributing the metadata information to the participants.
In a specific implementation, the relcache (relation cache) is composed of information from multiple metadata tables, which may include table structures, index information, view definitions, and so on, and it is time-consuming to build. The coordinator therefore distributes the relation cache information directly to the participants, reducing how often the participants access the metadata server. Because the relcache is itself relatively large, for efficiency and recoverability it is serialized and stored as a local file each time it is generated, so that the system can reload the relcache from the local file at startup or during failure recovery. While generating the transaction execution plan, the syntax tree is traversed, the database tables to be accessed are found, and the relcache data is read from the local file and sent to the participants, so that the participants can use this metadata to access the data on the shared storage.
On the other hand, when the relcache changes, for example through a table structure update or the addition or deletion of an index, the relcache also needs to be updated to reflect the change. To ensure that all participants use the most up-to-date metadata, the system sends an invalidation message to all participants when the relcache changes. Upon receiving the invalidation message, a participant applies it and deletes both the locally cached relcache and the locally stored relcache file, so that the latest metadata is reloaded at the next query or operation. This mechanism ensures that participants can efficiently access the latest metadata, reduces dependence on and access to the central metadata server, and improves the overall performance and scalability of the system.
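A minimal sketch of a file-backed relation cache follows. The patent does not specify a serialization format or file layout, so the use of JSON, the one-file-per-table naming scheme, and the class and method names are all assumptions.

```python
import json
import os


class RelcacheStore:
    """Hypothetical local relcache with a file-backed copy per table."""

    def __init__(self, directory):
        self.directory = directory
        self.memory = {}

    def _path(self, table):
        return os.path.join(self.directory, f"{table}.relcache.json")

    def put(self, table, entry):
        # Serialize every generated or updated entry to a local file.
        self.memory[table] = entry
        with open(self._path(table), "w", encoding="utf-8") as f:
            json.dump(entry, f)

    def get(self, table):
        # Prefer memory; fall back to the file (for example after a restart).
        if table not in self.memory and os.path.exists(self._path(table)):
            with open(self._path(table), encoding="utf-8") as f:
                self.memory[table] = json.load(f)
        return self.memory.get(table)

    def invalidate(self, table):
        # Drop both the in-memory entry and the local file.
        self.memory.pop(table, None)
        if os.path.exists(self._path(table)):
            os.remove(self._path(table))
```

A second `RelcacheStore` created over the same directory models a restart: it can serve entries from the files without rebuilding them, until an invalidation removes both copies.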
In one embodiment, the method further comprises:
when the coordinator modifies a metadata table, generating an invalidation message that indicates the modified metadata table and the modified information in the table;
at a command boundary, the coordinator applies the local invalidation messages and collects them into a Hash table, wherein the Hash table is used to track modifications of the metadata;
scanning the transaction execution plan table when generating a transaction execution plan, and determining the metadata tables to be accessed;
and determining, according to the Hash table, the earliest metadata modification record in each metadata table to be accessed, and distributing the metadata modification records to the participants.
In this embodiment, when a metadata table changes, session-level caching is required so that expired data is not used. This is implemented by tracking changes to the metadata, as follows:
During the collection of invalidation messages, the metadata entries that have changed are identified.
The intermediate results of metadata that is being modified must not be visible to other concurrent processes on the same machine; otherwise erroneous results may occur. Once this portion of data is identified, caching is enabled at the session level to isolate the metadata views of different sessions, ensuring that metadata under modification is invisible to other concurrent processes on the same machine. The specific operation process is as follows:
1. The coordinator records an invalidation message during modification; the invalidation message indicates which record of which metadata table was modified.
2. Local invalidation messages are applied at command boundaries to ensure that the current session uses the most up-to-date metadata. These changes are also collected into a Hash table, denoted METACHANGE, which is used to track the modifications of the metadata.
3. When generating the transaction execution plan, the plan table is scanned to determine the metadata tables to be accessed.
4. For each metadata table, the earliest modified metadata record is found and distributed to the participants, so that the participants can obtain the latest metadata.
5. If a participant finds that a metadata table has been modified, it needs to cache it locally. If the metadata change involves newly added data, the participant also needs to invalidate the corresponding page caches to avoid using outdated data.
In this way, the system ensures that each session has access to a consistent metadata view even if the metadata is being modified. Data inconsistency or erroneous results caused by concurrent modification are prevented, thereby ensuring data consistency.
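Steps 1 to 4 above can be sketched with a dictionary standing in for the METACHANGE Hash table. The class name, the sequence counter used to order modifications, and the example table names (`pg_class`, `pg_attribute`, `pg_index`, used here only as illustrative metadata table names) are assumptions, not details taken from the patent.

```python
class MetaChangeTracker:
    """Hypothetical coordinator-side tracker; METACHANGE maps table -> changes."""

    def __init__(self):
        self.metachange = {}  # metadata table -> list of (sequence, record key)
        self._seq = 0

    def record(self, table, record_key):
        # Called at a command boundary for each applied local invalidation.
        self._seq += 1
        self.metachange.setdefault(table, []).append((self._seq, record_key))

    def changes_for_plan(self, plan_tables):
        # For each metadata table the plan touches, pick the earliest change.
        result = {}
        for table in plan_tables:
            changes = self.metachange.get(table)
            if changes:
                result[table] = min(changes)  # smallest sequence number
        return result
```

Only tables that the plan actually accesses and that have recorded modifications appear in the result, so unrelated changes are never shipped to the participants.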
In one embodiment, the coordinator distributes the metadata information and the generated transaction execution plan to all participants, further comprising:
the coordinator distributes the local invalidation messages and the maximum local invalidation message record value received so far to the participants;
and each participant applies the invalidation messages locally, clearing the corresponding metadata, and records the received maximum local invalidation message record value.
In this embodiment, the coordinator also sends a local invalidation message to the participants during the dispatch of the transaction execution plan to keep the metadata information synchronized. The specific implementation method comprises the following steps:
1. The coordinator traverses the invalidation messages accumulated by the local transaction, finds the part not yet delivered, packages it, and sends it to the participants.
2. The maximum invalidation message number that the coordinator has received (that is, the maximum local invalidation message record value) is also dispatched to the participants.
3. After the participants receive the invalidation messages, they apply them locally, clearing the corresponding metadata.
4. Each participant records the largest invalidation message number.
5. When invalidation messages would otherwise need to be pulled from the metadata server, the pull is skipped for the time being as long as the local invalidation message number is larger than the maximum invalidation message number sent by the metadata server, since this indicates that the local data meets the freshness requirement.
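The five steps above can be sketched as a pair of small classes. All names and the payload layout are hypothetical. The sketch also assumes that a participant pulls only when its recorded number is strictly behind the server's, so equal numbers are treated as fresh.

```python
class DispatchingCoordinator:
    def __init__(self):
        self.local_messages = []  # invalidations accumulated by the local transaction
        self.delivered = 0        # how many of them were already sent
        self.max_received = 0     # largest invalidation message number seen so far

    def build_payload(self):
        # Steps 1 and 2: package only the undelivered part plus the max number.
        undelivered = self.local_messages[self.delivered:]
        self.delivered = len(self.local_messages)
        return {"messages": undelivered, "max_number": self.max_received}


class ReceivingParticipant:
    def __init__(self, metadata):
        self.metadata = dict(metadata)
        self.max_number = 0

    def apply(self, payload):
        # Steps 3 and 4: clear invalidated metadata, remember the max number.
        for key in payload["messages"]:
            self.metadata.pop(key, None)
        self.max_number = max(self.max_number, payload["max_number"])

    def must_pull(self, server_max):
        # Step 5: pull only when the local number is behind the server's.
        return self.max_number < server_max
```

Tracking `delivered` on the coordinator keeps repeated dispatches within one transaction from resending messages that participants have already applied.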
The embodiment of the invention mainly comprises the following parts: (1) Collection and application of invalidation messages: invalidation messages are collected along the metadata change path, and local and global messages are distinguished. (2) Generation and dispatch of the relation cache: the frequently used relation cache is received and cached by the participants, avoiding access to the metadata server. (3) Collection and application of metadata changes: local changes to metadata are recorded and the appropriate cache is selected, ensuring correctness. (4) Distribution and minimized use of invalidation messages: metadata under modification is used correctly, and the number of accesses to the metadata server is reduced. Embodiments of the present invention are designed for a storage-compute separation architecture with a shared metadata server. The method performs efficient, localized in-memory caching of the centrally stored metadata on the computing cluster side, thereby achieving performance similar to that of a database with integrated storage and computation.
As shown in fig. 2, the present application further provides a metadata caching apparatus of a distributed database, including: metadata server, coordinator and participants;
The coordinator is used for acquiring metadata through a metadata server and processing related operations of the metadata; the related operations comprise creation, deletion, modification of metadata tables and generation of a transaction execution plan according to the metadata; the coordinator is also used for distributing the metadata information and the generated transaction execution plan to all participants;
And the participant is used for accessing the data in the shared storage according to the metadata information and the transaction execution plan, executing the corresponding transaction and outputting a transaction execution result.
In specific implementation, the coordinator is further used for identifying metadata invalid messages in a system cache, a relation cache and a page cache, and clearing metadata corresponding to the invalid messages; storing the invalid message into an invalid message set, and sending the invalid message set to the metadata server, wherein the metadata server is used for storing the invalid message set into a local session space isolated from other transactions; and when the current transaction or sub-transaction is submitted, the metadata server issues the full set message of the invalid message set stored in the local session space to a global invalid message queue.
In the implementation, if the current transaction or sub-transaction is cancelled, the metadata server destroys the invalid message set stored in the local session space.
When implementing, the participator compares the global variable, the local variable and the local invalid message record value of the invalid message when executing the corresponding transaction, if the local invalid message record value is smaller than the global variable or the local variable of the invalid message, immediately initiates an invalid message queue update request to the metadata server, and executes the clearing operation of the invalid message, and simultaneously updates the local invalid message record value; wherein the global variable of the invalid message is used for representing the global queue position or identification of the next invalid message to be processed; the local variable is used to characterize the local queue location or identity of the next invalid message to be processed.
In a specific implementation, the coordinator acquires metadata through the metadata server, creates metadata tables, generates a relation cache by caching a plurality of metadata tables locally, and serializes the generated or updated relation cache and stores it as a local file; the coordinator then traverses the syntax tree, determines the metadata database tables to be accessed by the transaction, reads the corresponding metadata information from the local file of the relation cache according to the determined metadata database tables, and distributes the metadata information to the participants.
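The following Python sketch illustrates this flow under simplifying assumptions: the relation cache is a dictionary serialized as JSON (the application does not name a serialization format), the syntax tree is a nested dictionary whose nodes may carry a `table` key, and the function names are hypothetical.

```python
import json
import os
import tempfile


def save_relation_cache(relation_cache, path):
    # Serialize the generated or updated relation cache to a local file.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(relation_cache, f)


def tables_in_tree(node):
    # Depth-first traversal collecting the table names referenced by the tree.
    found = []
    if "table" in node:
        found.append(node["table"])
    for child in node.get("children", []):
        found.extend(tables_in_tree(child))
    return found


def metadata_for_transaction(tree, path):
    # Read only the metadata of the tables that the transaction will access.
    with open(path, encoding="utf-8") as f:
        cache = json.load(f)
    return {name: cache[name] for name in tables_in_tree(tree) if name in cache}


relation_cache = {
    "orders": {"columns": ["id", "amount"]},
    "users": {"columns": ["id", "name"]},
    "logs": {"columns": ["id", "line"]},
}
path = os.path.join(tempfile.mkdtemp(), "relcache.json")
save_relation_cache(relation_cache, path)

tree = {"op": "join", "children": [
    {"op": "scan", "table": "orders"},
    {"op": "scan", "table": "users"},
]}
to_send = metadata_for_transaction(tree, path)  # payload for the participants
```

Only the metadata of `orders` and `users` ends up in `to_send`; the unused `logs` entry stays in the local file and is not distributed.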
In a specific implementation, when modifying a metadata table, the coordinator generates an invalid message indicating the modified metadata table and the modified information in the table; at a command boundary, the coordinator applies the local invalid messages and collects them into a Hash table, wherein the Hash table is defined for tracking modifications of metadata; when generating a transaction execution plan, the coordinator scans the transaction execution plan table and determines the metadata tables to be accessed; according to the Hash table, the coordinator determines the earliest metadata modification record of each metadata table to be accessed and distributes the metadata modification records to the participants.
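A minimal Python sketch of such a Hash table is given below. The class `ModificationTracker`, its methods, and the use of a monotonically increasing sequence number to define "earliest" are assumptions made for illustration; the application does not specify how modification records are ordered.

```python
from collections import defaultdict


class ModificationTracker:
    """Hash table mapping a metadata table name to its modification records."""

    def __init__(self):
        self._by_table = defaultdict(list)
        self._seq = 0  # assumed ordering: a per-coordinator sequence number

    def record(self, table, change):
        # Collect an invalid message for `table` at a command boundary.
        self._seq += 1
        self._by_table[table].append((self._seq, change))

    def earliest_for(self, tables):
        # For each table the plan accesses, pick its earliest modification record.
        return {t: min(self._by_table[t]) for t in tables if t in self._by_table}


tracker = ModificationTracker()
tracker.record("orders", "add column status")
tracker.record("users", "drop column age")
tracker.record("orders", "rename column amount")

plan_tables = ["orders", "items"]  # tables found by scanning the execution plan
to_distribute = tracker.earliest_for(plan_tables)
```

Tables that the plan accesses but that were never modified (`items` here) produce no record, so nothing is distributed for them.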
In a specific implementation, the coordinator distributes the local invalid messages and the maximum local invalid message record value currently received to the participants; each participant applies the invalid messages locally, clears the metadata corresponding to the invalid messages, and records the received maximum local invalid message record value.
In this embodiment, the coordinator generates and distributes the metadata, so that the number of times the participants must access the metadata server to obtain the latest metadata and invalid messages is minimized while the consistency of the metadata is ensured.
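The distribution step can be sketched in Python as follows. The payload layout and the function names `build_dispatch` and `apply_dispatch` are hypothetical, and the participant state is reduced to a cache dictionary plus a single recorded value.

```python
def build_dispatch(local_invalid_messages, max_record_value):
    # Coordinator side: bundle the local invalid messages with the largest
    # invalid message record value received so far.
    return {
        "invalid_messages": list(local_invalid_messages),
        "max_record_value": max_record_value,
    }


def apply_dispatch(cache, state, dispatch):
    # Participant side: clear the metadata named by each invalid message,
    # then remember the record value that came with the dispatch.
    for table in dispatch["invalid_messages"]:
        cache.pop(table, None)
    state["recorded"] = max(state.get("recorded", 0), dispatch["max_record_value"])


participant_cache = {"orders": "meta-v1", "users": "meta-v1"}
participant_state = {"recorded": 4}

dispatch = build_dispatch(["orders"], max_record_value=7)
apply_dispatch(participant_cache, participant_state, dispatch)
```

Because the recorded value is advanced to the value supplied by the coordinator, a participant in this sketch would not need to contact the metadata server for messages it has already received through the dispatch.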
In one embodiment, a computer device is provided, as shown in FIG. 3, including a memory 201, a processor 202, and a computer program stored on the memory and executable on the processor, where the processor implements the metadata caching method of the distributed database described above when executing the computer program.
In particular, the computer device may be a computer terminal, a server or similar computing means.
In the present embodiment, a computer-readable storage medium storing a computer program for executing the metadata caching method of the distributed database described above is provided.
In particular, computer-readable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer-readable storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable storage media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the invention described above may be implemented with a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented in program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; in some cases, the steps shown or described may be performed in an order different from that shown or described herein. They may also be fabricated separately into individual integrated circuit modules, or a plurality of the modules or steps may be fabricated into a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto; any changes or substitutions readily conceivable by those skilled in the art within the scope of the present application shall be covered by the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (7)
1. A metadata caching method for a distributed database, comprising:
the coordinator obtains metadata through a metadata server and processes metadata-related operations; the related operations comprise creating, deleting, and modifying metadata tables and generating a transaction execution plan according to the metadata;
the coordinator distributes the metadata information and the generated transaction execution plan to all participants, and all the participants access the data in the shared storage according to the metadata information and the transaction execution plan, execute the corresponding transaction, and output a transaction execution result;
the method further comprises:
the coordinator identifies metadata invalid messages in the system cache, the relation cache, and the page cache, and clears the metadata corresponding to the invalid messages;
storing the invalid messages into an invalid message set, and sending the invalid message set to the metadata server, wherein the metadata server stores the invalid message set into a local session space isolated from other transactions;
when the current transaction or sub-transaction is committed, the metadata server publishes the complete invalid message set stored in the local session space to a global invalid message queue;
the method further comprises: when executing the corresponding transaction, the participant compares the global variable of the invalid message, the local variable, and the local invalid message record value; if the local invalid message record value is smaller than the global variable or the local variable of the invalid message, the participant immediately initiates an invalid message queue update request to the metadata server, executes the clearing operation for the invalid messages, and updates the local invalid message record value; wherein the global variable of the invalid message represents the global queue position or identifier of the next invalid message to be processed, and the local variable represents the local queue position or identifier of the next invalid message to be processed;
wherein the coordinator obtaining metadata through the metadata server and processing metadata-related operations further comprises:
the coordinator acquires metadata through the metadata server, creates metadata tables, generates a relation cache by caching a plurality of metadata tables locally, and serializes the generated or updated relation cache and stores it as a local file;
traversing the syntax tree, determining the metadata database tables to be accessed by the transaction, reading the corresponding metadata information from the local file of the relation cache according to the determined metadata database tables, and distributing the metadata information to the participants.
2. The method of metadata caching for a distributed database of claim 1, further comprising:
if the current transaction or sub-transaction is cancelled, the metadata server destroys the invalid message set stored in the local session space.
3. The method of metadata caching for a distributed database of claim 1, further comprising:
when the coordinator modifies a metadata table, generating an invalid message which indicates the modified metadata table and the modified information in the table;
at a command boundary, the coordinator applies the local invalid messages and collects them into a Hash table, wherein the Hash table is defined for tracking modifications of metadata;
scanning the transaction execution plan table when generating a transaction execution plan, and determining the metadata tables to be accessed;
determining, according to the Hash table, the earliest metadata modification record of each metadata table to be accessed, and distributing the metadata modification records to the participants.
4. The method of claim 1, wherein the coordinator distributes the metadata information and the generated transaction execution plan to all participants, further comprising:
the coordinator distributes the local invalid messages and the maximum local invalid message record value currently received to the participants;
each participant applies the invalid messages locally, clears the metadata corresponding to the invalid messages, and records the received maximum local invalid message record value.
5. A metadata caching apparatus of a distributed database, implementing the metadata caching method of a distributed database according to any one of claims 1 to 4, comprising: metadata server, coordinator and participants;
the coordinator is configured to acquire metadata through the metadata server and to process metadata-related operations; the related operations comprise creating, deleting, and modifying metadata tables and generating a transaction execution plan according to the metadata; the coordinator is further configured to distribute the metadata information and the generated transaction execution plan to all participants;
the participant is configured to access the data in the shared storage according to the metadata information and the transaction execution plan, execute the corresponding transaction, and output a transaction execution result.
6. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the metadata caching method of the distributed database of any one of claims 1 to 4 when executing the computer program.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that performs the metadata caching method of the distributed database according to any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410925120.6A CN118467592B (en) | 2024-07-11 | 2024-07-11 | Metadata caching method, device, equipment and medium of distributed database |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118467592A CN118467592A (en) | 2024-08-09 |
CN118467592B (en) | 2024-10-01 |
Family
ID=92164021
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410925120.6A (Active, CN118467592B) | 2024-07-11 | 2024-07-11 | Metadata caching method, device, equipment and medium of distributed database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118467592B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108829713A (en) * | 2018-05-04 | 2018-11-16 | 华为技术有限公司 | Distributed cache system, cache synchronization method and device |
CN111427966A (en) * | 2020-06-10 | 2020-07-17 | 腾讯科技(深圳)有限公司 | Database transaction processing method, device and server |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114238518B (en) * | 2021-12-22 | 2024-10-25 | 中国建设银行股份有限公司 | Data processing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN118467592A (en) | 2024-08-09 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||