
HK1170577B - Increasing database availability during fault recovery - Google Patents

Increasing database availability during fault recovery

Info

Publication number
HK1170577B
Authority
HK
Hong Kong
Prior art keywords
quorum
copy
reconfiguration
copies
sets
Prior art date
Application number
HK12111170.5A
Other languages
Chinese (zh)
Other versions
HK1170577A1 (en
Inventor
V. Shah
S. O. Voutilainen
T. Talius
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US12/948,541 (US8326801B2)
Application filed by Microsoft Technology Licensing, LLC
Publication of HK1170577A1
Publication of HK1170577B


Description

Increasing database availability during failover
Technical Field
The application relates to increasing availability of a database during failover. In particular, access to data of the data partition is provided during reconfiguration of the data partition.
Background
Computers have become highly integrated in work, homes, mobile devices, and many other places. Computers can process large amounts of information quickly and efficiently. Software applications designed to run on computer systems allow users to perform a wide variety of functions including business applications, school assignments, entertainment, and the like. Software applications are typically designed to perform specific tasks, such as word processor applications for drafting documents or email programs for sending, receiving and organizing email.
In many cases, software applications are designed to interact with other software applications or other computer systems. For example, internet browsers send user requests to web servers, and these web servers reply to the user's request. Web servers and other computer systems may be configured to access data stores as part of their responses to user requests. These data stores may store a large amount of information and may include copies that replicate the data for additional redundancy. In some cases, the copies are grouped together as a copy set or cluster. When one of the copies in the copy set becomes unavailable and subsequently comes back online, the copy set must be updated and reconfigured. During such reconfiguration, the copy set is not available to respond to data read or write requests.
Disclosure of Invention
Embodiments described herein are directed to providing database access during database reconfiguration and maintaining replication connections during database reconfiguration. In one embodiment, a computer system establishes a plurality of quorum copy sets to replicate data of a data partition. The quorum copy set ensures that at least a minimum number of copies are available to commit pending transactions during a partition reconfiguration. The computer system determines that a data partition reconfiguration has been initiated and provides access to the data of the data partition during the reconfiguration of the data partition using at least a quorum of the copies in each of the quorum sets of copies.
In another embodiment, a computer system establishes a plurality of quorum copy sets to replicate data of a data partition. The quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions during a partition reconfiguration. The computer system determines that the departure of a copy has initiated a reconfiguration of the data partition. The computer system prevents existing database replication connections from being broken when the copy departs, and provides access to the data of the data partition during the reconfiguration using at least a quorum of the copies in each of the quorum copy sets.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Additional features and advantages will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the teachings herein. The features and advantages of the invention may be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. Features of the present invention will become more fully apparent from the following description and appended claims, or may be learned by the practice of the invention as set forth hereinafter.
Drawings
To further clarify the above and other advantages and features of embodiments of the present invention, a more particular description of embodiments of the present invention will be rendered by reference to the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
FIG. 1 illustrates a computer architecture in which embodiments of the present invention may operate, including providing database access during database reconfiguration and maintaining replication connections during database reconfiguration.
FIG. 2 illustrates a flow diagram of an example method for providing database access during database reconfiguration.
FIG. 3 illustrates a flow diagram of an example method for maintaining replication connections during database reconfiguration.
FIG. 4 illustrates a flow diagram of a reconfiguration process.
Detailed Description
Embodiments described herein are directed to providing database access during database reconfiguration and maintaining replication connections during database reconfiguration. In one embodiment, a computer system establishes a plurality of quorum copy sets to replicate data of a data partition. The quorum copy set ensures that at least a minimum number of copies are available to commit pending transactions during a partition reconfiguration. The computer system determines that a data partition reconfiguration has begun and provides access to the data of the data partition during the reconfiguration of the data partition using at least a quorum of the copies in each of the quorum sets of copies.
In another embodiment, a computer system establishes a plurality of quorum copy sets to replicate data of a data partition. The quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions during a partition reconfiguration. The computer system determines that the departure of a copy has initiated a reconfiguration of the data partition. The computer system prevents existing database replication connections from being broken when the copy departs, and provides access to the data of the data partition during the reconfiguration using at least a quorum of the copies in each of the quorum copy sets.
The following discussion now refers to various methods and method acts that may be performed. It should be noted that while the method acts may be discussed in a certain order or depicted in a flowchart as occurring in a particular order, that particular order is not necessarily required unless specifically stated or required because one act is dependent on another act being completed before the act is performed.
Embodiments of the present invention may comprise or utilize a special purpose or general-purpose computer including computer hardware, such as, for example, one or more processors and system memory, as discussed in greater detail below. Embodiments within the scope of the present invention also include physical and other computer-readable media for carrying or storing computer-executable instructions and/or data structures. Such computer-readable media can be any available media that can be accessed by a general purpose or special purpose computer system. Computer-readable media storing computer-executable instructions are computer storage media. Computer-readable media carrying computer-executable instructions are transmission media. Thus, by way of example, and not limitation, embodiments of the invention can include at least two disparate types of computer-readable media: computer storage media and transmission media.
Computer storage media includes RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer.
A "network" is defined as one or more data links that allow electronic data to be transferred between computer systems and/or modules and/or other electronic devices. When information is transferred or provided over a network or another communications connection (either hardwired, wireless, or a combination of hardwired or wireless) to a computer, the computer properly views the connection as a transmission medium. Transmission media can include a network and/or data links which can be used to carry desired program code means in the form of computer-executable instructions or data structures and which can be accessed by a general purpose or special purpose computer. Combinations of the above should also be included within the scope of computer-readable media.
Further, upon reaching various computer system components, program code means in the form of computer-executable instructions or data structures can be transferred automatically from transmission media to computer storage media (or vice versa). For example, computer-executable instructions or data structures received over a network or a data link may be buffered in RAM within a network interface module (e.g., a "NIC") and then eventually transferred to computer system RAM and/or to less volatile computer storage media at a computer system. Thus, it should be understood that computer storage media can be included in computer system components that also utilize (or even primarily utilize) transmission media.
Computer-executable instructions comprise, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. The computer-executable instructions may be, for example, binaries, intermediate format instructions such as assembly language, or even source code. Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the features and acts described above are disclosed as example forms of implementing the claims.
Those skilled in the art will appreciate that the invention may be practiced in network computing environments with many types of computer system configurations, including, personal computers, desktop computers, laptop computers, message processors, hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, mobile telephones, PDAs, pagers, routers, switches, and the like. The invention may also be practiced in distributed system environments where local and remote computer systems, which are linked (either by hardwired data links, wireless data links, or by a combination of hardwired and wireless data links) through a network, both perform tasks. In a distributed system environment, program modules may be located in both local and remote memory storage devices.
FIG. 1 illustrates a computer architecture 100 in which the principles of the present invention may be employed. Computer architecture 100 includes a database 110. The database may be any type of database or data storage system and may include storage devices on one or more computing systems. For example, a database may be local to an organization or institution, or may be distributed across many different computer systems across a wide geographic area. Database 110 may include a Storage Area Network (SAN) or other storage scheme. The database is accessible over the internet and may be configured to receive requests from users. For example, user 105 may send a data request requesting data and/or services provided by a database. These requests may be stored by the database as pending transactions 115.
Database transactions ensure that no data is lost when a user request is performed. For example, if a user requests that their bank data be updated, the database transaction ensures that the data is updated as requested. The data of the database may be backed up in the form of copies. For example, each data storage partition may have one or more copies of its data. As shown in FIG. 1, these copies may be part of a quorum copy set (e.g., 126A/126B). Each quorum set may include a plurality of different copies. While quorum sets A and B are each shown with one primary copy (127A/127B) and two secondary copies (128A1/128A2 and 128B1/128B2), it should be understood that a different number of copies may be used. Quorum set establishing module 125 may establish various numbers of quorum sets based on various criteria. In some cases, each data partition has one quorum set and is then allocated a second, temporary quorum set for use during reconfiguration.
Database reconfiguration may occur when a copy goes down (i.e., stops working due to a computer failure, a network failure, or some other problem) or comes back up (i.e., begins working again). Thus, for example, if secondary copy 128A1 were to go down, quorum set 126A would need to be reconfigured. Similarly, if secondary copy 128A1 were to come back up at some later point in time, quorum set 126A would need to be reconfigured again. The reconfiguration module 120 can be used to reconfigure the quorum copy sets in such a manner that database service continues to be provided to the user while the reconfiguration is taking place. These and other concepts will be explained in more detail below with reference to FIGS. 2 and 3.
As indicated above, read and/or write operations may be performed on a database partition even while the partition is undergoing the reconfiguration process. In some embodiments, this may be accomplished by maintaining multiple dynamically defined quorum sets, which allow read/write access to the partition while keeping it transactionally consistent during the reconfiguration process. Replication connections between existing copies can be prevented from being broken during the reconfiguration process. This may allow a user (e.g., 105) to perform read/write operations during the reconfiguration process. Copies may be added to or removed from each quorum set in such a way that the partition remains transactionally consistent during reconfiguration and in the presence of user transactions. Also, operations that rely on reading from the primary copy of the database may be prevented from being reset when the database undergoes reconfiguration. Such operations may include building a new copy for the partition or making a copy of the partition.
In a distributed data storage system (e.g., database 110), a reconfiguration process is performed when the configuration of a partition is to be changed. The reconfiguration process involves changing the active configuration of the partition. As part of this process, existing replication connections between copies of the partition, which would ordinarily be broken, are maintained. Thus, a user can perform read/write operations on the partition during the process.
In some embodiments, a special case of reconfiguration, referred to as a minimal reconfiguration, may be implemented. A minimal reconfiguration may ensure that a user can perform read/write operations on the partition for the duration of the reconfiguration process. In some cases, to perform a minimal reconfiguration, a write quorum of copies must be up. This write quorum may be defined as the ceiling of (n + 1)/2, where n is the total number of copies in the configuration. In addition, the current primary copy must be up and running, and it will also be the primary copy after the minimal reconfiguration. Once these conditions are established, the minimal reconfiguration is initiated.
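The preconditions described above can be sketched as a small check. This is only an illustration under stated assumptions, not the implementation described in the patent: the names `write_quorum` and `can_start_minimal_reconfiguration` are hypothetical, copies are represented as strings in Python sets, and n is assumed to count the copies of the current configuration.

```python
def write_quorum(n: int) -> int:
    """Write quorum from the text: ceiling of (n + 1) / 2 for n copies."""
    # For a non-negative integer n, n // 2 + 1 equals ceil((n + 1) / 2).
    return n // 2 + 1


def can_start_minimal_reconfiguration(current_config, new_config, up_copies, primary):
    """Check the stated preconditions (hypothetical helper, for illustration).

    current_config, new_config, up_copies: sets of copy names.
    primary: name of the current primary copy.
    """
    # A write quorum of the current configuration's copies must be up.
    available = current_config & up_copies
    has_write_quorum = len(available) >= write_quorum(len(current_config))
    # The current primary must be up and running.
    primary_is_up = primary in up_copies
    # The primary is also the primary afterwards, so it must belong
    # to the new configuration as well.
    primary_stays = primary in new_config
    return has_write_quorum and primary_is_up and primary_stays
```

For example, with a current configuration of copies A and B, a new configuration of A, B, and C, and all three copies up, the check passes; if only B were up, it would fail and a regular reconfiguration would presumably be needed instead.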
In some embodiments, the minimal reconfiguration differs from a conventional reconfiguration in the following respects: 1) a plurality of quorum sets is maintained and dynamically updated during the reconfiguration process, 2) the primary copy is part of each of the plurality of quorum sets, 3) initially, every secondary copy is either in the first quorum set or outside of any quorum set, 4) at the end of the reconfiguration, every secondary copy is either in the first quorum set or outside of any quorum set, 5) at most two quorum copy sets are maintained for the duration of the reconfiguration process, 6) the quorum sets are modified by the members of the configuration for the duration of the reconfiguration, and 7) user transactions are committed on each quorum set before they are considered committed.
Because of item 7 above, a quorum of copies is available in each quorum set at any point during the reconfiguration. Thus, the user may successfully complete write transactions on the partition. Read transactions are also possible because at least a write quorum of copies is available (where the read quorum is the floor of (n + 1)/2 and the write quorum is the ceiling of (n + 1)/2, so a write quorum is never smaller than a read quorum). During this operation, if the number of available copies in the previous configuration drops below its write quorum, the reconfiguration agent will detect this, terminate the current minimal reconfiguration, and restart the reconfiguration as a regular reconfiguration.
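The commit rule and the fallback condition can be illustrated with a short sketch. The function names below are hypothetical, and the sketch assumes that "committed on each quorum set" means that a write quorum of the copies in every quorum set has acknowledged the transaction; the text itself does not spell out the acknowledgement mechanism.

```python
def read_quorum(n: int) -> int:
    """Read quorum from the text: floor of (n + 1) / 2."""
    return (n + 1) // 2


def write_quorum(n: int) -> int:
    """Write quorum from the text: ceiling of (n + 1) / 2."""
    return n // 2 + 1


def is_committed(acks, quorum_sets):
    """True only if every quorum set has a write quorum of acknowledgements.

    acks: set of copy names that acknowledged the transaction (assumed model).
    quorum_sets: iterable of sets of copy names, at most two per the text.
    """
    return all(
        len(acks & members) >= write_quorum(len(members))
        for members in quorum_sets
    )


def must_fall_back_to_regular(previous_config, up_copies):
    """True if the previous configuration has lost its write quorum."""
    available = previous_config & up_copies
    return len(available) < write_quorum(len(previous_config))
```

With quorum set 1 = {A, B} and quorum set 2 = {A, B, C}, acknowledgements from A and B satisfy both sets, whereas acknowledgements from only A and C do not satisfy quorum set 1.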
In view of the above-described systems and architectures, methodologies that may be implemented in accordance with the disclosed subject matter will be better appreciated with reference to the flow charts of FIGS. 2 and 3. For purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks. It is to be understood and appreciated, however, that the claimed subject matter is not limited by the order of the blocks, as some blocks may occur in different orders and/or concurrently with other blocks from what is depicted and described herein. Moreover, not all illustrated blocks may be required to implement the methodologies described hereinafter.
FIG. 2 illustrates a flow diagram of a method 200 for providing database access during database reconfiguration. The method 200 will now be described with frequent reference to the components and data of the environment 100.
Method 200 includes an act of establishing a plurality of quorum copy sets to replicate data for a given data partition, wherein the quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions during a partition reconfiguration (act 210). For example, quorum set establishing module 125 may establish quorum sets A and B (126A/126B) to replicate the data for a given data partition. The quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions 115 during a partition reconfiguration.
In some cases, transactions (or data from transactions) are replicated across a minimal copy set. For example, the minimal copy set may include a primary copy (e.g., 127A) and at least one (or at least two, etc.) secondary copy (e.g., 128A1 and 128A2). Each quorum copy set includes at least one primary copy and any number of secondary copies. In some cases, a single primary copy may be a member of multiple quorum copy sets. Thus, in FIG. 1, primary copy 127A in quorum set A can be the same primary copy as in quorum set B.
In some embodiments, a quorum copy set may be a temporary quorum copy set that is instantiated to answer requests during reconfiguration. Thus, for example, quorum set B (126B) may be a temporary quorum set that is established to respond to data requests (e.g., 106) or other pending transactions while the database is being reconfigured. In some cases, such a temporary quorum copy set is removed after the reconfiguration has concluded.
Access to a given data partition may be provided at a number of different stages of reconfiguration, as shown in FIG. 4. In the example shown in FIG. 4, a minimal reconfiguration is initiated when secondary copy C joins an existing quorum set. As shown at 410, secondary copy C is to be added to the existing quorum set AB, which comprises primary copy A and secondary copy B. As shown at 415, quorum set 1 includes A and B, while quorum set 2 includes only primary copy A.
At phase 1 (420), when the reconfiguration begins, the quorum set membership of all copies that belong to both the previous configuration and the new configuration is changed so that they are members of both quorum set 1 and quorum set 2, and all copies that belong only to the new configuration are added to quorum set 2. Thus, at 425, quorum set 1 has copies A and B, while quorum set 2 has copies A, B, and C. A phase 1 catch-up (430) may then be initiated, in which the joining copy C is brought up to date with secondary copy B. Quorum set 2 is updated at 435; it then has primary copy A and secondary copies B and C, and has a sufficient number and distribution of copies to commit transactions, as shown at 440.
During phase 2 (445), the quorum sets are not changed, as shown at 450. During phase 3 (455), the quorum set membership of all copies belonging to the new configuration is changed so that they are now part of quorum set 1. Moreover, the quorum set membership of all copies that do not belong to the new configuration is changed so that they are no longer part of any quorum set (i.e., they are outside of quorum). Thus, in the new configuration, quorum set 1 has copies A, B, and C, while quorum set 2 has only primary copy A, as shown at 460. During phase 4 (465), a commit message is sent, and quorum set 1 is fully operational with the updated secondary copy C.
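The membership changes walked through for FIG. 4 can be traced with a small simulation. This is a sketch under assumptions: the function name `minimal_reconfiguration_phases` is hypothetical, catch-up and the commit message are not modeled, and copies that belong only to the previous configuration are assumed to stay in quorum set 1 until phase 3.

```python
def minimal_reconfiguration_phases(previous, new, primary):
    """Return (label, quorum_set_1, quorum_set_2) for each step of FIG. 4.

    previous, new: iterables of copy names in the old and new configurations.
    primary: the primary copy, which must be in both configurations.
    """
    previous, new = set(previous), set(new)
    if primary not in previous or primary not in new:
        raise ValueError("primary must belong to both configurations")

    steps = []
    # Before reconfiguration: set 1 is the previous configuration,
    # set 2 holds only the primary.
    qs1, qs2 = set(previous), {primary}
    steps.append(("initial", set(qs1), set(qs2)))

    # Phase 1: copies in both configurations are in both sets;
    # copies only in the new configuration join quorum set 2.
    qs2 = {primary} | (previous & new) | (new - previous)
    steps.append(("phase 1", set(qs1), set(qs2)))

    # Phase 2: the quorum sets are not changed.
    steps.append(("phase 2", set(qs1), set(qs2)))

    # Phase 3: set 1 takes the new configuration; copies outside the
    # new configuration drop out of every quorum set.
    qs1, qs2 = set(new), {primary}
    steps.append(("phase 3", set(qs1), set(qs2)))

    # Phase 4: commit; membership stays as in phase 3.
    steps.append(("phase 4", set(qs1), set(qs2)))
    return steps
```

Calling `minimal_reconfiguration_phases({"A", "B"}, {"A", "B", "C"}, "A")` reproduces the example in the text: quorum set 2 grows to A, B, and C in phase 1, and quorum set 1 takes over that membership in phase 3.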
Returning to FIG. 2, method 200 includes an act of determining that a data partition reconfiguration has been initiated (act 220). For example, reconfiguration module 120 may determine that a data partition reconfiguration has been initiated for quorum set A (126A). The reconfiguration may be initiated by a copy (e.g., 128A2) leaving or joining one of the quorum copy sets. During the reconfiguration, the quorum copy set that the joining copy is joining is modified to include that copy. Similarly, when a copy leaves a quorum set, the quorum set is reconfigured. When a copy leaves a quorum set, the reconfiguration module 120 may prevent existing database replication connections from being broken due to the departure of the copy. Thus, if secondary copy 128A2 were to leave quorum set A (126A), the existing database replication connections between the database and primary copy 127A and secondary copy 128A1 would not be broken.
Method 200 includes providing access to data of the data partition during reconfiguration of the data partition using at least a quorum of the copies in each of the quorum copy sets. For example, database 110 may provide access to data for a given partition using primary copy 127A and secondary copy 128A1 of quorum set A during reconfiguration of the data partition. In some cases, a database transaction may be acknowledged by a majority of the copies in a quorum copy set (e.g., two of the three copies in a quorum set of FIG. 1). As quorum members are moved between quorum sets during the different phases of the reconfiguration, the data on the partition can be maintained in a transactionally consistent manner. In this way, access to the underlying data is provided in a transactionally consistent manner regardless of how many copies are changed or how they are changed. This ensures that no data is lost in any transaction. Also, operations that rely on reading from the primary copy of the database may be prevented from being reset during the reconfiguration process.
Turning now to FIG. 3, FIG. 3 illustrates a flow diagram of a method 300 for maintaining replication connections during database reconfiguration. The method 300 will now be described with frequent reference to the components and data of the environment 100.
Method 300 includes an act of establishing a plurality of quorum copy sets to replicate data of a given data partition, wherein the quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions during a partition reconfiguration (act 310). For example, quorum set establishing module 125 may establish quorum sets A and B (126A/126B) to replicate data for a given data partition. The quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions 115 during a partition reconfiguration.
Method 300 includes an act of determining that the departure of a copy has initiated a data partition reconfiguration (act 320). For example, reconfiguration module 120 may determine that a data partition reconfiguration has been initiated for quorum set A (126A). Reconfiguration may be initiated by, for example, a secondary copy joining or leaving quorum set A. Method 300 also includes an act of preventing an existing database replication connection from being broken when the copy leaves (act 330).
For example, reconfiguration module 120 may prevent any existing database replication connections to other copies of quorum set A (e.g., connections to primary copy 127A or secondary copy 128A1) from being disconnected or removed. In this way, the database replication connections to the unchanged copies remain intact, and those copies can continue to process transactions during the reconfiguration. A copy that is leaving a quorum set may be removed in such a way that the replicated partition remains transactionally consistent during the reconfiguration. In this way, any transactions processed will be consistent and will provide the transactional guarantees expected by the database user.
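The idea of keeping connections to unchanged copies intact can be sketched as follows. The function name `connections_after_departure` and the dictionary of connection objects are hypothetical; the text does not describe how replication connections are represented.

```python
def connections_after_departure(connections, leaving):
    """Return the replication connections to keep when `leaving` departs.

    connections: dict mapping copy name -> connection object (assumed model).
    Only the departing copy's entry is dropped. The connection objects for
    unchanged copies are reused as-is instead of being torn down and rebuilt.
    """
    return {copy: conn for copy, conn in connections.items() if copy != leaving}
```

For instance, if secondary copy 128A2 leaves quorum set A, the connections to 127A and 128A1 in the returned mapping are the very same objects as before, so in-flight replication on those connections is not interrupted in this model.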
Method 300 also includes providing access to the data of the data partition using at least a quorum of the copies in each of the quorum copy sets during the reconfiguration of the data partition (act 340). For example, quorum set A (126A) may provide access to a data partition of a database during the partition reconfiguration. The quorum set may provide such access using one or more of primary copy 127A and secondary copies 128A1 and 128A2. In some embodiments, various operations may be prevented from being reset during reconfiguration. In particular, operations that rely on reading from the primary copy of the database may be prevented from being reset. In this way, at least in some cases, a partition copy operation may be prevented from being reset during reconfiguration. Additionally, or alternatively, new copy creation operations may be prevented from being reset during reconfiguration.
Thus, systems, methods, and computer program products are provided that provide database access during database reconfiguration. Transactions may continue to be processed in a transactionally consistent manner during reconfiguration. Likewise, systems, methods, and computer program products are provided for maintaining replication connections during database reconfiguration. In this way, copies that are not changed as part of the reconfiguration can maintain their replication connections and can continue to provide database access during the reconfiguration.
The present invention may be embodied in other specific forms without departing from its spirit or essential characteristics. The described embodiments are to be considered in all respects only as illustrative and not restrictive. The scope of the invention is, therefore, indicated by the appended claims rather than by the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.

Claims (11)

1. In a computer networking environment comprising a plurality of computing systems, in a computer system comprising a processor and a memory, a computer-implemented method for providing database access during database reconfiguration, the method comprising:
an act of establishing, for a given data partition of a database, two quorum copy sets comprising a first quorum copy set and a second quorum copy set (126A/126B) to replicate data of the given data partition, such that there are two quorum copy sets for the same data partition, wherein the quorum copy sets ensure that at least a minimum number of copies are available to commit pending transactions during partition reconfiguration (120), and wherein the quorum sets comprise at least one primary copy having membership in both of the quorum sets;
an act of determining that a reconfiguration of the data partition has been initiated, the reconfiguration of the data partition being performed to change the configuration of the quorum sets to a new configuration; and
an act of providing access to data of the data partition during the reconfiguration (120) of the data partition using at least a quorum of the copies (127A-128A2) in each of the two quorum copy sets by:
when reconfiguration begins, changing the quorum membership of all copies in the first quorum set such that all copies in the first quorum set are members in both the first and second quorum sets;
changing the second quorum set to the new configuration but including the primary copy having membership in both of the quorum sets and having a sufficient number of copies and copy distributions to commit pending transactions during the reconfiguration so that the second quorum set can be used to commit pending transactions during the reconfiguration;
changing membership of all copies in the second quorum set configured as the new configuration to be members of the first quorum set so that the first quorum set is configured as the new configuration; and
changing the membership of any copies that do not belong to the new configuration so that they are no longer part of any quorum set.
2. The method of claim 1, wherein each copy set comprises a plurality of secondary copies.
3. The method of claim 2, wherein at least one of the quorum copy sets is a temporary quorum copy set instantiated to respond to requests during the reconfiguration.
4. The method of claim 3, wherein the temporary quorum copy set is removed after reconfiguration has ended.
5. The method of claim 1, wherein access to the database partition is provided during multiple stages of reconfiguration.
6. The method of claim 1, wherein the reconfiguration is initiated by a copy leaving or joining one of the quorum copy sets.
7. The method of claim 6, further comprising preventing existing database replication connections from being broken due to a departure of a copy.
8. A method for maintaining replication connections during database reconfiguration, comprising:
an act of establishing, for a given data partition of a database, two quorum copy sets comprising a first quorum copy set and a second quorum copy set (126A/126B) to replicate data of the given data partition, such that there are two quorum copy sets for the same data partition, wherein the quorum copy sets ensure at least a minimum number of copies (127A-128A2) for committing pending transactions during partition reconfiguration (120), and wherein the quorum sets include at least one primary copy that has membership in both of the quorum sets;
for the given data partition of the database, an act of determining that a departure of a copy (128A1) has initiated a reconfiguration of the data partition, the reconfiguration of the data partition being performed to change the configuration of the quorum sets to a new configuration; and
an act of using at least a quorum of the copies (127A-128A2) in each of the two quorum copy sets (126A/126B) to provide access to data of the data partition during the reconfiguration (120) of the data partition, such that a database replication connection can be maintained even if a copy leaves, by:
when reconfiguration begins, changing the quorum membership of all copies in the first quorum set such that all copies in the first quorum set are members in both the first and second quorum sets;
changing the second quorum set to the new configuration but including the primary copy having membership in both of the quorum sets and having a sufficient number of copies and copy distributions to commit pending transactions during the reconfiguration so that the second quorum set can be used to commit pending transactions during the reconfiguration;
changing membership of all copies in the second quorum set configured as the new configuration to be members of the first quorum set so that the first quorum set is configured as the new configuration; and
changing the membership of any copies that do not belong to the new configuration so that they are no longer part of any quorum set.
9. The method of claim 8, wherein the copy is removed in such a manner that the partition of the departing copy remains transactionally consistent during the reconfiguration.
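Claim 8 provides access using at least a quorum of the copies in each of the two quorum copy sets. A minimal sketch of such a dual-quorum commit check follows. The claims speak only of "a minimum number of copies", so treating a quorum as a strict majority is an assumption of this sketch, and the function names `has_quorum` and `can_commit` are hypothetical.

```python
def has_quorum(acks, members):
    """True when a strict majority of `members` acknowledged the write.

    Assumption: a quorum is a strict majority; the claims only require
    "a minimum number of copies".
    """
    return len(acks & members) > len(members) // 2


def can_commit(acks, first_set, second_set):
    """A pending transaction commits only with a quorum in both sets."""
    return has_quorum(acks, first_set) and has_quorum(acks, second_set)


# Mid-reconfiguration state: S1 has left and S3 is joining.
first_set = {"P", "S1", "S2"}
second_set = {"P", "S2", "S3"}

ok_after_departure = can_commit({"P", "S2"}, first_set, second_set)
blocked_primary_only = can_commit({"P"}, first_set, second_set)
blocked_old_only = can_commit({"P", "S1"}, first_set, second_set)
```

Under these assumptions, acknowledgements from `P` and `S2` satisfy both sets even though `S1` has departed, so the transaction can still commit. Acknowledgements that form a quorum in only one of the two sets are not sufficient.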
10. A method for providing database access during database reconfiguration, the method comprising:
an act of establishing, for a given data partition of a database, two quorum copy sets (126A/126B) comprising a first quorum copy set and a second quorum copy set to replicate data of the given data partition, such that there are two quorum copy sets for the same data partition, wherein the quorum copy sets ensure at least a minimum number of copies (127A-128A2) for committing pending transactions during partition reconfiguration, each copy set comprising one primary copy (127A) and at least one secondary copy (128A1), and wherein the quorum sets comprise at least one primary copy having membership in both of the quorum sets;
an act of determining that a reconfiguration (120) of the data partition has been initiated, the reconfiguration of the data partition being performed to change the configuration of the quorum sets to a new configuration; and
an act of providing access to data of the data partition during the reconfiguration (120) of the data partition using at least two copies (127A/128A1) of the two quorum copy sets (126A/126B) by:
when reconfiguration begins, changing the quorum membership of all copies in the first quorum set such that all copies in the first quorum set are members in both the first and second quorum sets;
changing the second quorum set to the new configuration but including the primary copy having membership in both of the quorum sets and having a sufficient number of copies and copy distributions to commit pending transactions during the reconfiguration so that the second quorum set can be used to commit pending transactions during the reconfiguration;
changing membership of all copies in the second quorum set configured as the new configuration to be members of the first quorum set so that the first quorum set is configured as the new configuration; and
changing the membership of any copies that do not belong to the new configuration so that they are no longer part of any quorum set.
11. A system for maintaining replication connections during database reconfiguration, comprising:
means for establishing, for a given data partition of a database, two quorum copy sets (126A/126B) comprising a first quorum copy set and a second quorum copy set to replicate data of the given data partition, such that there are two quorum copy sets for the same data partition, wherein the quorum copy sets ensure at least a minimum number of copies (127A-128A2) for committing pending transactions during partition reconfiguration (120), and wherein the quorum sets comprise at least one primary copy having membership in both of the quorum sets;
means for determining, for the given data partition of the database, that a departure of a copy (128A1) has initiated a reconfiguration of the data partition, the reconfiguration of the data partition being performed to change the configuration of the quorum sets to a new configuration; and
means for providing access to data of the data partition during the reconfiguration (120) of the data partition using at least a quorum of the copies (127A-128A2) in each of the two quorum copy sets (126A/126B), such that a database replication connection can be maintained even if a copy leaves, by:
when reconfiguration begins, changing the quorum membership of all copies in the first quorum set such that all copies in the first quorum set are members in both the first and second quorum sets;
changing the second quorum set to the new configuration but including the primary copy having membership in both of the quorum sets and having a sufficient number of copies and copy distributions to commit pending transactions during the reconfiguration so that the second quorum set can be used to commit pending transactions during the reconfiguration;
changing membership of all copies in the second quorum set configured as the new configuration to be members of the first quorum set so that the first quorum set is configured as the new configuration; and
changing the membership of any copies that do not belong to the new configuration so that they are no longer part of any quorum set.
HK12111170.5A 2010-11-17 2012-11-06 Increasing database availability during fault recovery HK1170577B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/948,541 US8326801B2 (en) 2010-11-17 2010-11-17 Increasing database availability during fault recovery
US12/948,541 2010-11-17

Publications (2)

Publication Number Publication Date
HK1170577A1 HK1170577A1 (en) 2013-03-01
HK1170577B HK1170577B (en) 2015-03-13
