WO2009065977A1 - Multi-version cache with relaxed isolation for replicated and non-replicated systems - Google Patents
- Publication number
- WO2009065977A1 (PCT/ES2008/000636)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- transaction
- cache
- transactions
- database
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/568—Storing data temporarily at an intermediate stage, e.g. caching
Definitions
- The invention falls within the field of data replication, which provides availability (fault tolerance) and scalability (increased performance through the addition of new nodes), and is applicable to software systems such as those based on multi-layer architectures and service-oriented architectures.
- In multi-layer architectures, it is applicable to stateful layers such as the application server layer and the database server layer.
- Transactions are a mechanism proposed to guarantee the consistency of data in databases and other information systems such as application servers, multi-layer architectures, service-oriented architectures and web services.
- Transactions provide two types of consistency: isolation and failure atomicity. Isolation determines what type of consistency is provided when a set of transactions executes concurrently and may access common data.
- Failure atomicity guarantees that a transaction executes as a unit: either it executes successfully in its entirety (in which case the transaction is said to commit) or, if any failure occurs (in which case the transaction is said to abort), the final result is as if it had never been applied.
- Transactions provide an additional property, known as durability, which dictates that once a transaction has committed, the updates it has made cannot be lost even in the event of system failures (e.g., a crash of the node on which the transactional system is running).
- levels of isolation such as serializability, snapshot isolation [Berenson95], read committed, etc.
- Serializability provides the highest level of consistency, ensuring that the concurrent execution of transactions is equivalent to a sequential execution of them. That is, the concurrent system behaves like a sequential system without any concurrency, which greatly simplifies the development of transactional applications.
- Serializability has an inherent cost, since it implies that writes to a data item conflict with both reads and writes of that item.
- Snapshot isolation is one of the most popular isolation levels, since it provides consistency very close to serializability but eliminates conflicts between reads and writes, retaining only conflicts between writes. These are less frequent than read-write conflicts, which allows systems to achieve better performance thanks to greater potential concurrency. Snapshot isolation gives transactions the illusion that the database (and therefore its data and their values) is frozen at the point at which the transaction starts, as if a photo, or snapshot, of it had been taken.
- Snapshot isolation does not allow two concurrent transactions to modify data in common, as is also the case with serializability. However, it is possible for one transaction to read item A and modify item B while another concurrent transaction modifies item A and reads item B, a situation not allowed by serializability. When implemented in databases, snapshot isolation generally employs multiple versions of the data.
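The anomaly described above (one transaction reads A and writes B while a concurrent one reads B and writes A) can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the database is a Python dict, a snapshot is a copy of it, and the names `run_write_skew` and `run_serial` are hypothetical and not taken from the patent.

```python
def run_write_skew():
    """Both transactions run against the same snapshot, as under snapshot isolation."""
    db = {"A": 1, "B": 1}

    # Both transactions start concurrently and see the same snapshot.
    snap1 = dict(db)
    snap2 = dict(db)

    # T1 reads A and writes B; T2 reads B and writes A.
    writes1 = {"B": snap1["A"] + 1}
    writes2 = {"A": snap2["B"] + 1}

    # The write sets are disjoint, so snapshot isolation lets both commit.
    assert not (writes1.keys() & writes2.keys())
    db.update(writes1)
    db.update(writes2)
    return db


def run_serial(order):
    """Run the same two transactions one after the other, in the given order."""
    db = {"A": 1, "B": 1}
    for txn in order:
        if txn == "T1":
            db["B"] = db["A"] + 1
        else:
            db["A"] = db["B"] + 1
    return db
```

In this toy model the concurrent run ends in a state that neither serial order produces, which is why the execution is allowed by snapshot isolation but not by serializability.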
- Read committed is one of the most relaxed isolation levels. It guarantees only that a transaction always reads the last committed values of the data. This isolation level has numerous anomalies [Berenson95] that make it difficult to program concurrent applications on top of it.
- Modern information systems are usually structured in multiple layers (tiers). Each layer is characterized by a type of server specialized in one kind of processing. The two most relevant layers are the data storage layer and the application server layer.
- The data storage layer is usually represented by a database server, a file server, a persistent repository of objects, etc.
- The application server layer provides functionality to deploy the business logic of user applications, typically providing the infrastructure necessary for their construction and deployment, such as a cache of the data storage server to increase efficiency, transactional semantics (typically being in charge of relating the transactions of the application server to those of the data storage server, usually a database), connections to data storage servers, session management for the application's clients, etc. Session management makes it possible to maintain volatile state between requests from the same client (e.g., the shopping cart in an Internet sales service).
- Transaction management makes it possible to offer transactional semantics to applications.
- An important point is the transactional consistency offered by combining the application server with a database (or other transactional data repository). At present, this combination offers formally characterized consistency only for serializability. Current application servers do not provide formally characterized consistency for lower isolation levels such as snapshot isolation or read committed.
- The present invention proposes a method for combining an application server and a database (or other transactional data repository) so that together they provide snapshot isolation, this being one of the novel contributions of the invention.
- each application server provides specific functionality for a given field (e.g., dynamic content for web pages).
- In the description of the present invention, the term application server is used in its most general sense, characterized by its ability to maintain a cache of the data in the data storage layer and by its support for transactions.
- Replication is the main technique for providing fault tolerance and scalability (an increase in the maximum throughput of the system obtained by increasing the number of machines, or nodes, in the system).
- Replication consists of executing multiple instances of a piece of software, called replicas (usually in a distributed system, with each instance on a different node), in a coordinated manner so that requests made to the system by its clients can be served even if some of the replicas fail (the node goes down, the software hangs, the node disconnects from the network, etc.). In this way, replication tolerates faults by introducing processing redundancy.
- Replication can also be used to increase the throughput of a system. Taking advantage of the redundancy of the software in a distributed system, the work to be done can be distributed among the different replicas (instances of the software) so that each node performs a fraction of the overall work of the system.
- Replication for increased throughput has been applied mainly to stateless services, that is, services whose result is independent of the requests received in the past and depends only on the parameters of the request and, at most, on a persistent read-only state. This approach has been used, for example, by grid systems.
- scalable replication solutions have been developed. The different solutions can be classified according to how they execute the transactions, when they propagate the updates of the transactions and the level of isolation they provide.
- Transactions can be executed using primary-backup techniques [Plattner04] or update-everywhere techniques [Kemme00, Patiño05, Lin05].
- The primary-backup technique allows update transactions (those that modify the database) to execute in a single replica only, called the primary.
- The rest of the replicas, called backups, can execute only read-only transactions.
- This technique has the disadvantage that the primary becomes a bottleneck, since it has to process all update transactions.
- all replicas can process all types of transactions, including update transactions, which avoids the bottleneck that exists in the primary-backup approach.
- Database replication techniques can also be classified according to when and how updates are propagated. This classification is relevant for replication techniques in which the update transaction is processed in a single replica, the local replica (to which the client is connected), and the updated data is then propagated to the rest of the replicas (remote replicas).
- The propagation of updates can be eager or lazy.
- Eager replication propagates the updates atomically with the commit of the transaction in the local replica, so that either all active replicas apply the updates of the transaction or none does.
- the consistency between replicas is also guaranteed when all updates are applied in a coordinated manner.
- In lazy replication, the updates are propagated independently of the commit of the transaction in the local replica, so the state of the replicas can diverge, resulting in inconsistencies: a client can observe the latest state of a data item and later observe an earlier state or, in general, clients can observe sequences of states that would not be possible in a non-replicated system.
- The consistency of a replicated database is based on 1-copy correctness, which requires that the behavior visible to the clients of a replicated system be equivalent to that of a non-replicated system.
- 1-copy correctness is 1-copy-serializability [Bernstein87], which dictates that the replicated system must behave like a non-replicated serializable system. That is, the replicated execution must be equivalent to a non-replicated execution in a system providing serializability as its isolation level. More recently, the notion of 1-copy-snapshot-isolation [Lin05] has been defined, in which a replicated database behaves equivalently to a database with snapshot isolation.
- replication techniques are applicable to database replication. That is, the techniques consider only replication of the data store and the consistency between its different replicas.
- a procedure is proposed for the replication of systems with multiple layers in which a copy of the data, generally called a cache, is maintained in a layer other than the data storage layer.
- The updates of the database replicas must be coordinated with the replicas of the system that keep a copy of the data in a cache (e.g., the application server) to ensure consistency between the data held in the different layers, a subject not addressed by database replication.
- JEE Java TM Enterprise Edition
- API Java TM Enterprise Edition
- database persistent state
- WuO4 persistent state
- The main problem with this solution is that the shared database becomes the system's bottleneck and a single point of failure.
- This type of solution has also been applied in CORBA recently [ZhaoO ⁇ ].
- The present patent solves the bottleneck problem of the shared database by simultaneously replicating the application server layer and the database.
- Caching systems generally maintain a single version of the data. It has recently been proposed, in the context of web pages and similar systems, that the cache may hold multiple versions for the same identifier (e.g., web pages in different languages) and return to each client the version most appropriate to its needs [Jacobs04]. This kind of versioning in the cache has a purpose radically different from the one proposed in the present patent, which applies to multi-layer transactional systems in order to provide snapshot isolation.
- Garbage collection methods have also been proposed to eliminate unnecessary copies of the cache.
- One of these methods, proposed in [Mattis01] for caches of multiple fragments of an object, organizes the cache into different areas and determines how to delete fragments of the object from the cache and how to reorganize them after deletion.
- The proposed patent, on the other hand, focuses on determining when versions can be eliminated from the cache without violating the properties of snapshot isolation.
- The present invention presents a cache system for multi-layer transactional systems. Multi-layer systems with at least two layers are considered: a data store layer (hereinafter, the database) and another layer that maintains a cache of the data store (hereinafter, the application server).
- it is not possible to provide snapshot isolation to the transactions executed by the clients unless the cache is disabled. That is, only if copies of the data are not kept in the cache, and the data is always accessed through the database, could snapshot isolation be satisfied. Unfortunately, disabling the cache results in performance losses.
- The present invention proposes a cache system that maintains multiple versions of the data, holding the values they had when transactions committed, which, together with a procedure for managing that cache in both non-replicated and replicated systems, provides snapshot isolation.
- the present invention proposes a cache system that maintains multiple versions of each data corresponding to the generated values-
- the invention also proposes a procedure for managing the data of the cache in which each transaction is guaranteed
- The cache system proposed in the present invention maintains multiple versions of each data item, and when a transaction is about to modify an item in the cache, a private version of that item is created that only that transaction can see. In this way the existing versions make it possible to offer the other transactions that are being
- The private version of the data item is the one that the transaction modifies and reads in subsequent accesses. In this way the transaction observes its own modifications, a requirement for satisfying snapshot isolation.
- When the transaction is about to commit, it is verified that no already committed transaction that was concurrent with it has modified data in common. If this has occurred, the transaction aborts instead of committing. If the transaction commits and there are other transactions in progress that have modified the same data or attempt to modify it, they are aborted.
- The transaction executed in the database is synchronized with the application server transaction. This is achieved by starting an associated transaction in the database at the same time as a transaction is started on the application server. In this way, when data that is not in the cache is read, the database, thanks to its snapshot isolation and the synchronization between both transactions, provides the appropriate version of the data. Since the database is not assumed to provide information about its internal versioning, the application server labels the version read as unknown.
- When the application server transaction commits, the changes are first propagated to the database in the context of the associated database transaction, which is then also committed. If the transaction in the application server aborts, its associated transaction in the database is also aborted.
- the system determines which version it should read. The version read is the most recent one corresponding to a transaction that committed before the start of the reading transaction. If none of the known versions meets this condition and there is a version labeled as unknown, that version is returned.
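The read rule above can be sketched as a small function. This is a minimal illustration, not the patented implementation: versions are assumed to be stored as a dict from label to value, "committed before the start" is assumed to mean a commit timestamp less than or equal to the reader's start timestamp, and the names `UNKNOWN` and `select_version` are hypothetical.

```python
UNKNOWN = -1  # assumed label for a version read from the database


def select_version(versions, start_ts):
    """Pick the value a transaction with start timestamp start_ts should read.

    versions maps a version label (commit timestamp, or UNKNOWN) to a value.
    """
    # Known versions written by transactions that committed before the reader started.
    eligible = [ts for ts in versions if ts != UNKNOWN and ts <= start_ts]
    if eligible:
        return versions[max(eligible)]
    # Otherwise fall back to the version whose label is unknown, if there is one.
    if UNKNOWN in versions:
        return versions[UNKNOWN]
    # Nothing suitable in the cache; a caller would go to the database here.
    return None
```

For example, with an unknown version holding `"b"` and a version committed at timestamp 11 holding `"d"`, a reader that started at timestamp 10 gets `"b"`, while a reader that started at 11 gets `"d"`.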
- The cache management guarantees that a suitable version is always available for all running transactions by not removing versions from the cache until it is known that they will no longer be required by any running transaction.
- The creation of a data item is a special case of update. When an item is created, no versions of it exist, so a private version is created with no public versions. Deletion is another special case of update.
- If all versions of a data item were deleted when a transaction deletes the item and commits, snapshot isolation would be violated, since transactions that started before the commit must see the state of the database in which the item still existed. Therefore, when a transaction deletes a data item, a tombstone version of it is created. When the transaction commits, the tombstone version is made public with its corresponding label. If a transaction reads a tombstone version, the result is as if it did not find the data, this
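A deletion marker of this kind can be sketched as follows. The sketch makes several assumptions that are not in the original: versions are keyed by commit timestamp, visibility means a commit timestamp less than or equal to the reader's start timestamp, and the names `TOMBSTONE` and `VersionedItem` are hypothetical.

```python
TOMBSTONE = object()  # hypothetical sentinel marking "deleted as of this version"


class VersionedItem:
    """Public versions of one cached data item, keyed by commit timestamp."""

    def __init__(self):
        self.versions = {}

    def commit_write(self, commit_ts, value):
        self.versions[commit_ts] = value

    def commit_delete(self, commit_ts):
        # Older versions are kept so that earlier snapshots still see the item.
        self.versions[commit_ts] = TOMBSTONE

    def read(self, start_ts):
        visible = [ts for ts in self.versions if ts <= start_ts]
        if not visible:
            return None
        value = self.versions[max(visible)]
        # Reading a deletion marker behaves as if the item did not exist.
        return None if value is TOMBSTONE else value
```

A transaction that started before the deleting transaction committed still reads the old value, while later transactions see the item as absent.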
- Garbage collection does not eliminate versions whose removal could result in violations of snapshot isolation. Therefore, the proposed procedure only deletes a version when you have
- A complementary alternative to garbage collection is hibernating data from the cache. It is always possible to free cache memory by ejecting some data to another medium (for example, a disk or another server). In this case, all versions of the data item must be ejected. If a transaction attempts to access hibernated data, all versions must be brought back and the corresponding version determined. Hibernation and dehibernation can be done at any time.
- The database does not have to provide snapshot isolation; it can provide serializability or read committed isolation, or even provide no transactional support at all.
- snapshot isolation must continue to be provided and, on the other hand, the replicated execution must be coordinated so that the result is equivalent to that of a non-replicated system (that is, replication introduces no anomalies or inconsistencies).
- The replication model proposed in the present invention takes application server and database pairs as the unit of replication.
- each application server is connected to a single database (which we will call local) and the application servers interact with each other to ensure consistency between replicas and the cache, as well as the transparency of replication.
- Each client connects to the application server of any one of the replicas. This is the local replica for the transactions the client executes, and the rest are remote replicas. Transactions are first executed locally and then undergo remote processing that involves all replicas. Local processing is very similar to the processing described above for the non-replicated model. The only steps treated differently are commit and garbage collection, detailed below.
- a message is sent to all the replicas, including the local replica, with the updated data, as well as the start mark of the transaction.
- This message is sent atomically to all replicas, so that it reaches all of them or none.
- The message must have the same relative order in all replicas, so that it is processed in that order in all of them. This order can be achieved by any of the methods known in distributed systems, such as total-order broadcast, the use of a sequencer, an agreement protocol, etc.
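One of the options mentioned, a sequencer, can be sketched in a few lines. This is an illustrative toy, not the mechanism of the patent: it ignores failures and atomic delivery, assumes a single sequencer process, and the class names `Sequencer` and `Replica` are hypothetical.

```python
import itertools


class Sequencer:
    """Assigns consecutive sequence numbers to update messages."""

    def __init__(self):
        self._counter = itertools.count()

    def stamp(self, message):
        return (next(self._counter), message)


class Replica:
    """Delivers stamped messages strictly in sequence-number order."""

    def __init__(self):
        self.next_seq = 0
        self.pending = {}    # messages that arrived ahead of their turn
        self.delivered = []  # messages in the order they were processed

    def receive(self, stamped):
        seq, message = stamped
        self.pending[seq] = message
        # Deliver every message whose turn has come.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
```

Even if the network hands the messages to two replicas in different arrival orders, both process them in the same relative order, which is the property the validation step relies on.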
- This validation verifies that concurrent execution in different replicas does not violate replication transparency, and thus guarantees that the replicated execution of the committed transactions is equivalent to a non-replicated execution, in addition to satisfying snapshot isolation.
- The validation verifies that the transaction being validated has not modified data in common with other concurrent transactions that have already committed. If it meets this condition, it commits (that is, it passes validation); otherwise, it is aborted.
- application servers maintain conversational state with the client (known as a session: volatile state that is maintained between calls from the same client), such as stateful session beans in JEE.
- To provide high availability, it is necessary to combine session replication with the data replication proposed above. Thus, if an application maintains conversational state and the replica to which a client is connected fails, the client can be reconnected transparently to another replica and continue its execution without losing the session.
- The way to combine session replication with the proposed data replication is to propagate the session state to the other replicas after each client invocation.
- Clients connect to the application server through an API or proxy generally provided by the application server (for example, in JEE through JNDI - Java Naming and Directory Interface).
- Since the proxies are generated by the application server, they can incorporate the replication logic in a way that is completely transparent to clients.
- The proxy uses some method to discover the available replicas, for example broadcast protocols such as IP multicast, or access to a server that keeps the list of active replicas. Using IP multicast, for instance, the proxy would send a discovery message to a multicast IP address associated with the replicated system, so that all replicas receive the discovery message.
- Replicas have different identifiers (eg 0 to n-1, where n is the number of replicas).
- The requests sent by clients are also uniquely associated with a client and carry a consecutive, increasing request number (maintained by the proxy). That is, each client sends requests identified by a unique client identifier and a request identifier that has a known initial value (e.g., 0) and increases by one with each request.
- One or more system replicas will reply to the discovery message indicating the list of available replicas, their network addresses (eg their IP address) and an indication of the load of each replica. For this, the replicas periodically exchange messages informing about their current load.
- the proxy then randomly selects a replica with a probability inversely proportional to its current load to balance the load between the replicas. The proxy then connects to the selected replica.
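The load-aware random choice can be sketched as follows. This is a minimal illustration that assumes loads are reported as strictly positive numbers; the function name `pick_replica` and the dict-based input are hypothetical, not taken from the patent.

```python
import random


def pick_replica(loads, rng=random):
    """Choose a replica id with probability inversely proportional to its load.

    loads maps replica id -> current load; loads are assumed to be > 0.
    """
    ids = list(loads)
    weights = [1.0 / loads[replica_id] for replica_id in ids]
    # random.choices normalizes the weights internally.
    return rng.choices(ids, weights=weights, k=1)[0]
```

With loads `{0: 1.0, 1: 4.0}`, replica 0 would be selected roughly four times as often as replica 1. A replica reporting zero load would need special handling, which this sketch leaves out.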
- each replica contains an application server and database pair. If either of them fails or the node they are in fails, the replica is considered failed. Mechanisms can be used to detect partial failures (only the application server or the database fails) to force the replica to disconnect from the system.
- The fault is detected when the timer started at the last request sent to the application server expires without a response having been received. The proxy then reconnects to a new replica and resends the request.
- There are two possible scenarios: (1) the replica failed before the updates were propagated to the other replicas; (2) the replica successfully propagated the updates to the other replicas before failing.
- The new replica to which the proxy connects will have a copy of the last state of the session associated with the client. This replica will regenerate the session from the last message received from the replica to which the client was connected.
- Figure 1 illustrates the replicated model.
- Each replica (4) consists of an application server (2) and database (3) pair.
- Each client (1) connects to one of the replicas.
- the application servers (4) communicate through a communication network to coordinate and ensure the consistency of the replication.
- FIG. 2 shows an example of the evolution of the local processing of the cache (ignoring remote processing).
- The figure illustrates the application server (2), the database (3), and the cache (1) that the application server maintains.
- For the data items X and Y, the evolution of their versions is shown.
- the cache shows its evolution starting from an initial state in which the cache is empty.
- For each data item (X and Y), the value of the item (4) and its version (5) are shown.
- the execution of two transactions T1 and T2 is illustrated.
- T2 has the following steps (6) Start of transaction, (7) Read (X), (8) Write (Y, d), (9) Read (Y), (10) Commit.
- T1 has the following steps (11) Start of transaction, (12) Read (X), (13) Read (Y), (14) Write (X, c), (15) Commit.
- T1 and T2 obtain the same start timestamp, 10, and an associated transaction is created in the database for each of them.
- T2 reads X. Since there is no version of X in the cache, the data is read from the database and a version labeled -1 (to represent that the version is unknown) of X is created. The value of X in the cache is a.
- T1 reads X and Y. Since version -1 of X is in the cache and -1 < 10, T1 reads version -1 of X. In addition, it reads Y from the database, and the value read is labeled as version -1. The value of version -1 of Y is b.
- T1 asks to commit.
- It receives the commit timestamp CT(T1) = 11; the private versions of X and Y are labeled with this timestamp and are made public in the cache.
- The associated transaction in the database commits, the writes on X and Y having previously been applied to the database in the context of the associated transaction.
- When T2 now reads Y, it does not read version 11 of Y, since 11 > ST(T2). Instead, it reads version -1 of Y, that is, the old value of Y, b.
- Since T2 is read-only, it simply commits in the database, without any timestamp being assigned.
- Remote processing is illustrated by an example in Figures 3 and 4.
- Each replica contains an application server (4) and database (3) pair.
- the application server maintains a data cache (5).
- the sequence of transaction processing (6) is also shown.
- For each data item, its value (7) and version (8) are shown.
- Transaction T1 consists of the following steps: Start, Read (X), Write (X, b), Commit.
- Transaction T2 consists of the following steps: Start, Read (X), Write (X, c), Commit.
- The relative order of the updates is T1, T2, and there are no other concurrent conflicting transactions.
- CT(T1) = 1) and the transaction will commit.
- T2 will be processed.
- During the validation of T2 it will be determined that T1 is concurrent with T2, conflicts with it, and has already committed (ST(T2) < CT(T1)). Therefore, T2 will abort.
- the transactions will be validated in the same order. However, during the validation of T1 it will be found that there is a lock on X held by T2. T2 will be aborted and T1 will commit. Thus, when the changes of T2 are received, the validation will fail. In this way, the two replicas commit the same transactions.
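The example can be replayed with a small deterministic validator that each replica would run over the same ordered stream of updates. This is a sketch under stated assumptions: each update is represented as a `(txn_id, start_ts, write_set)` tuple, locks and the database are left out, commit timestamps start from 0, and the function name `validate_in_order` is hypothetical.

```python
def validate_in_order(updates):
    """Validate update messages in their agreed total order.

    updates: list of (txn_id, start_ts, write_set) tuples.
    Returns (committed, aborted), where committed holds (txn_id, commit_ts).
    """
    commit_ts = 0
    last_writer = {}  # item -> commit timestamp of its last committed write
    committed, aborted = [], []
    for txn_id, start_ts, write_set in updates:
        # Conflict: an item was written by a transaction that committed
        # after this transaction started.
        conflict = any(last_writer.get(item, -1) > start_ts for item in write_set)
        if conflict:
            aborted.append(txn_id)
            continue
        commit_ts += 1
        for item in write_set:
            last_writer[item] = commit_ts
        committed.append((txn_id, commit_ts))
    return committed, aborted
```

Because the function depends only on the ordered input, every replica that receives T1 and then T2 (both started at timestamp 0 and both writing X) commits T1 and aborts T2.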
- An embodiment of the invention is presented for the case in which the data store provides snapshot isolation [Berenson95], both for the replicated and the non-replicated system.
- the proposed implementation uses time stamps to determine when transactions begin and end, as well as to label the different versions of the data.
- The implementation uses locking to detect conflicts between concurrent transactions early.
- The embodiment of the invention is explained in two steps. The first step details how the local processing of a request is carried out in the replica that receives it, to guarantee coherence at the local level; the second step details how remote processing is performed in the rest of the replicas, to guarantee coherence at the global level.
- the local processing of the transactions is detailed below.
- Both the application server and the database have their own transaction management system.
- the application server maintains the relationship between the transactions of both systems, that is, when a transaction is initiated in the application server, a transaction is also initiated in the database.
- the application server also maintains the relationship between client identifier and transaction.
- Each application server transaction is associated with a start timestamp, assigned when it starts on the application server, and a commit timestamp, assigned when the transaction commits.
- The commit mark MC(T) of a transaction T is an increasing number that reflects the snapshots through which the data progresses (that is, the number of transactions committed).
- The start mark MI(T) of a transaction T is the largest commit mark at the time T starts.
- MI(T) represents the last committed transaction T' and indicates that if T reads a data item updated (modified, created or deleted) by T', it must read the values updated by T'. If the data item was not updated by T', T must read the value updated by the transaction with the highest commit mark.
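The relationship between start marks and commit marks can be sketched with a simple counter. This is an illustration only: the class name `TimestampManager` is hypothetical, the initial mark of 0 is an assumption, and concurrency control around the counter is omitted.

```python
class TimestampManager:
    """Hands out start marks and commit marks from a single counter."""

    def __init__(self, initial=0):
        # Largest commit mark assigned so far (assumed to start at 0).
        self.last_commit_mark = initial

    def start_mark(self):
        # A starting transaction sees the snapshot of the last committed one.
        return self.last_commit_mark

    def next_commit_mark(self):
        # Each committed update transaction advances the snapshot by one.
        self.last_commit_mark += 1
        return self.last_commit_mark
```

Two transactions that start before any commit receive the same start mark, and a transaction that starts after one commit sees a start mark equal to that commit mark.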
- All replicas begin with the same commit timestamp (which is also assigned as the start timestamp), e.g., 0.
- The proposed method guarantees that all update transactions (those that modify, create or delete data) receive the same commit mark on all replicas.
- Transactions that conflict (update a common data item) are committed in the database in all replicas in the same relative order.
- The application server maintains a cache holding copies of some or all of the data in the database.
- Each data item X in the cache is labeled with a timestamp i and is called version i of X (Xi).
- i indicates the commit timestamp of the transaction that updated and committed that version. Since the database is assumed not to provide information about the internal versioning of the data, when a data item X is read from the database its version number is unknown. For this reason, an item read from the database is labeled with a special version to indicate that it is unknown. Hereinafter, the version identifier -1 is used to indicate that the version of the data is unknown.
- the lock is checked: if the highest version of the data item existing in the cache, Xj, is greater than the start mark MI(Ti) of Ti, then this version was created by a concurrent transaction that has committed, and Ti must abort (a concurrent transaction has updated that item and has committed).
- Ti can perform the update, that is, create its own version. This version is private and can be seen only by transaction Ti as long as Ti has not committed. If Ti writes the same data item two or more times, the second and subsequent writes access its private version directly, without performing the previous checks. This ensures that a transaction observes its own updates and prevents other transactions from observing uncommitted changes.
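The write path described above can be sketched as follows. The sketch is deliberately incomplete and rests on assumptions: locking is omitted, public versions are kept in a dict of dicts keyed by commit timestamp, an abort is modeled as a `RuntimeError`, and the class names `Cache` and `Transaction` are hypothetical.

```python
class Cache:
    """Public (committed) versions: item -> {commit_ts: value}."""

    def __init__(self):
        self.public = {}


class Transaction:
    def __init__(self, cache, start_ts):
        self.cache = cache
        self.start_ts = start_ts
        self.private = {}  # item -> value, visible only to this transaction

    def write(self, item, value):
        if item not in self.private:
            # First write: abort if a newer committed version already exists.
            versions = self.cache.public.get(item, {})
            if versions and max(versions) > self.start_ts:
                raise RuntimeError("abort: concurrent committed update")
        # Later writes go straight to the private version, with no checks.
        self.private[item] = value

    def read(self, item):
        if item in self.private:
            return self.private[item]  # the transaction sees its own updates
        versions = self.cache.public.get(item, {})
        visible = [ts for ts in versions if ts <= self.start_ts]
        return versions[max(visible)] if visible else None
```

A transaction that has written X reads back its own value, while a concurrent transaction with the same start mark still reads the last public version.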
- If the transaction fails validation, it is aborted in the application server and in the database, the private versions of the updated data are discarded, and the locks are released.
- The first transaction waiting for each lock is unblocked and granted the lock. After the completion of the transaction (commit or abort), all locks are released.
- Updates are propagated to all replicas atomically (updates are received by all active replicas, or by none if the replica that sent the message has crashed) and in total order (all replicas assign the same order to the updates). All replicas validate the updates in the same relative order. The processing of these updates (validation) is deterministic (it behaves like a state machine) and has the same result in all replicas.
- The validation and processing of updates is, in principle, sequential, although transactions that do not conflict with each other can be validated and processed in parallel.
- A transaction Ti does not pass validation if there is a transaction Tj in the system (the set of replicas) that is concurrent with Ti and has committed (MI(Ti) < MC(Tj) < MC(Ti)), and Tj updated some data in common with Ti. Transactions whose propagated changes reach the replica in which they are local without their having been aborted always pass validation. The details of the validation for remote transactions will be described in the next section.
- each updated data (the private version of the transaction) is labeled with the time stamp of the transaction commitment and the version is made public, becoming part of the cache.
- the release of the locks will cause the abortion of the non-validated local transactions that were waiting for these locks of the compromised transaction.
- the transaction in the database is compromised.
- the commitment of the database automatically propagates the updates of the data to the database. It should be noted that the versions of the updated data are kept in memory, in the cache until it is determined that they are unnecessary and deleted.
- the remote processing is described below.
- once a transaction that has made updates has been processed locally, its updates are sent to all replicas, including the local replica (the one in which the transaction was executed).
- This local transaction is a remote transaction at the rest of the replicas. Since all replicas can execute update transactions, each replica must perform validation (detect possible conflicts between concurrent transactions that update the same data) to ensure that snapshot isolation is provided globally (across all replicas). Because the changes are delivered in the same relative order at all replicas, the replicas perform the validation in that order and commit the same transactions in the same order. In this way, the state of the cache and of the database remains consistent across all replicas. That is, all database instances hold the same data with the same values, and the caches hold the versions needed to provide cache and replication transparency.
- Replication transparency means that for any replicated execution there is an equivalent non-replicated execution of the system in which the clients see the same results and the databases end in the same final states.
- Cache transparency means that clients see the same results they would see if the system kept no data in the cache and always read from the database.
- the first step at a remote replica (one that has not executed the transaction locally), after receiving the message that propagates the changes of a transaction, is the validation.
- the validation of the transaction verifies that it has no conflicts with any other concurrent transaction that has previously been validated successfully (and therefore committed), that is, with transactions that were not known before the propagation of the updates of the transaction being validated. If the validation is unsuccessful, the replica discards the changes received. If the validation is successful, a commit mark is assigned to the transaction, as is the case with local transactions, and a transaction is created in the database.
- for each updated datum, it is checked whether there is a lock held by an unvalidated (local and uncommitted) transaction. If so, the local transaction is aborted (it is concurrent, it has a write conflict, and it has not yet been validated).
- a version of each updated datum is created, labeled with the commit mark, and added to the cache (the local cache is updated). The remaining steps are the same as for local transactions: the database transaction commits, the commit counter is incremented, and the transaction is stored in the list of committed transactions. If a remote transaction fails the validation, no further action is necessary other than discarding the message.
- the creation and deletion of data is also handled in the proposed method.
- when a new datum X is created, its private version is created as for an update, except that no previous version exists.
- the lock is also requested on the datum X to prevent other concurrent transactions from creating the same datum X.
- the locks are associated with the data keys (which coincide with the database keys).
- when the transaction commits, the datum is inserted in the database, and the version becomes public and is available to other transactions that begin after the transaction that created it has committed.
- Deletions are handled by creating a tombstone version of the datum. The tombstone version is also a private version of the transaction until it commits. If the transaction tries to access the datum, it will not find it, since the tombstone version indicates that the datum no longer exists. When the transaction commits, the datum is deleted and the tombstone version is made public. It should be noted that, even after the transaction commits, the previous versions of the deleted datum cannot be removed, since there may be active transactions associated with those versions (all transactions that began before the deleting transaction committed). These transactions can still read the datum, which satisfies snapshot isolation.
- each application server of each replica propagates to the rest of the replicas the oldest (lowest-valued) start mark among the active (uncommitted) transactions at that replica. To reduce the cost, this value can be piggybacked on the messages that propagate transaction updates.
- Each replica keeps an up-to-date vector with the oldest start mark of each replica.
- Each replica removes the versions of the data that are older than the oldest (lowest-valued) mark among those of all the replicas. Versions labeled with version -1 require different processing.
- version -1 of the data is also no longer required. If the cache is full of versions that cannot be discarded, they can always be hibernated (stored in persistent storage other than the database, as in the hibernation used by JEE application servers to evict data from the cache). Hibernated data (all versions of the datum) can be brought back into memory at any time by the application server.
- the proposed procedure for the centralized model is a simplification of the local processing proposed for the replicated model.
- the propagation of updates and the global validation phase, which checks for conflicts with remote transactions, are not necessary.
- the centralized model follows the same steps as the local processing of the replicated model.
- the invention is applicable in the industrial sector of multi-layer information systems.
- a representative example of these are application servers used in combination with databases, which provide two layers to which the invention can be applied.
- Application servers generally maintain a cache of data from the database to increase their efficiency.
- application servers guarantee serializability. With the application of the present invention (centralized model) they can offer snapshot isolation.
- Adya99 Adya, A. 1999. Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions. PhD Thesis, Massachusetts Institute of Technology.
- Berenson95 Berenson, H., Bernstein, P., Gray, J., Melton, J., O'Neil, E., and O'Neil, P. 1995. A critique of ANSI SQL isolation levels. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data (San Jose, California, United States, May 22-25, 1995). M. Carey and D. Schneider, Eds. SIGMOD '95. ACM, New York, NY, 1-10.
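The validation rule and the version clean-up described in the items above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `CommittedTx`, `passes_validation` and `prune_versions` are hypothetical, marks are assumed to be integers, and validation is assumed to run sequentially in delivery order, so every already committed Tj trivially satisfies MC(Tj) < MC(Ti). The sketch also keeps the newest version at or below the clean-up threshold, on the assumption that the oldest active transaction may still need to read it; the text above does not spell this out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommittedTx:
    commit_mark: int      # MC(Tj)
    write_set: frozenset  # keys updated by Tj


def passes_validation(start_mark, write_set, committed):
    """Return False if some Tj committed after Ti started and wrote a common key.

    `committed` holds transactions validated before Ti, so MC(Tj) < MC(Ti)
    already holds; only MI(Ti) < MC(Tj) and the write-set overlap are tested.
    """
    for tx in committed:
        if tx.commit_mark > start_mark and tx.write_set & write_set:
            return False
    return True


def prune_versions(version_marks, oldest_start_per_replica):
    """Split the commit marks of one datum into (kept, discarded).

    The threshold is the lowest start mark in the per-replica vector. Versions
    newer than the threshold are kept, and so is the newest version at or
    below it (assumption: the oldest active transaction may still read it).
    """
    threshold = min(oldest_start_per_replica)
    visible = [mark for mark in version_marks if mark <= threshold]
    newest_visible = max(visible) if visible else None
    kept = [m for m in version_marks if m > threshold or m == newest_visible]
    discarded = [m for m in version_marks if m not in kept]
    return kept, discarded
```

For example, with committed transactions at marks 4 (wrote `x`) and 6 (wrote `y`), a transaction that started at mark 5 and wrote `y` fails validation, while one that wrote only `x` passes. With versions at marks 1, 3, 7 and 9 and a start-mark vector of 5 and 8, only the version at mark 1 is discarded.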
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a cache system for application servers in multi-layer transactional systems which holds one or more versions of each datum, corresponding to the values the datum took in each committed transaction. The invention also relates to a method of managing the cache in replicated and non-replicated systems, which provides each executed transaction with an image of the database with the content it had when the transaction started. The method guarantees cache transparency, i.e. it guarantees that a system with the multi-version cache provides the same consistency as a system without a cache which always accesses the data through the data store. The method for a replicated system provides the same consistency as in a non-replicated system, thereby guaranteeing replication transparency.
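As an illustration of how a multi-version cache can give each transaction the image of the data it had at its start, the following Python sketch stores committed versions per key and serves reads by start mark. The class `MultiVersionCache` and its methods are hypothetical names, and the visibility rule (a version is visible when its commit mark is less than or equal to the reader's start mark) is an assumption made for the sketch rather than a quotation of the claimed method.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class MultiVersionCache:
    # key -> list of (commit_mark, value), kept sorted by commit_mark
    versions: dict = field(default_factory=dict)

    def publish(self, key, commit_mark, value):
        """Make a committed version public, keeping versions ordered by mark."""
        entries = self.versions.setdefault(key, [])
        entries.append((commit_mark, value))
        entries.sort(key=lambda entry: entry[0])

    def read(self, key, start_mark):
        """Return the newest value whose commit mark is <= start_mark, or None."""
        entries = self.versions.get(key, [])
        marks = [mark for mark, _ in entries]
        index = bisect_right(marks, start_mark)
        return entries[index - 1][1] if index else None
```

If versions of `x` are published at marks 1 and 5, a transaction that started at mark 3 reads the value committed at mark 1, while a transaction that started at mark 5 or later reads the value committed at mark 5.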
Description
Multi-version cache with relaxed isolation for replicated and non-replicated systems
Technical Sector
The invention falls within the field of data replication for providing availability (fault tolerance) and scalability (increased performance through the addition of new nodes), and is applicable to software systems such as those based on multi-layer architectures and service-oriented architectures. Within multi-layer architectures, it is applicable to stateful layers such as the application server layer and the database server layer.
State of the Art
Transactions are a mechanism proposed to guarantee the consistency of data in databases and other information systems such as application servers, multi-layer architectures, service-oriented architectures and web services. Transactions provide two types of consistency: isolation and failure atomicity. Isolation determines what type of consistency is provided when a set of transactions execute concurrently and may access common data. Failure atomicity guarantees that the transaction executes as a unit: either it executes successfully in its entirety (in which case the transaction is said to commit) or, if there is a failure (in which case the transaction is said to abort), the final result is as if it had never been applied. Transactions provide an additional property known as durability, which dictates that once a transaction has committed, the updates it made cannot be lost, even in the event of system failures (e.g. a crash of the node on which the transactional system runs).
There are different isolation levels, such as serializability, snapshot isolation [Berenson95], read committed, etc. Serializability provides the highest level of consistency by guaranteeing that the concurrent execution of transactions is equivalent to a sequential execution of them. That is, the concurrent system behaves like a sequential system without any concurrency, which greatly simplifies the development of transactional applications. Serializability has an inherent cost, since it implies that writes to a datum conflict with both reads and writes of the same datum. In particular, conflicts between writes and reads, which tend to be very frequent in most systems, greatly restrict the potential concurrency in the system and therefore its maximum performance. For this reason, other, more relaxed isolation levels have been proposed.
Snapshot isolation [Berenson95] is one of the most popular isolation levels, since it provides consistency very close to serializability but eliminates conflicts between reads and writes, retaining only conflicts between writes, which are less frequent than read-write conflicts. This makes it possible to build systems with better performance thanks to greater potential concurrency. Snapshot isolation gives transactions the illusion that the database (and therefore its data and their values) is frozen at the point at which the transaction starts (as if a photograph, or snapshot, of it had been taken). Snapshot isolation does not allow two concurrent transactions to modify data in common, as is also the case with serializability. However, it is possible for one transaction to read a datum A and modify a datum B while another transaction modifies A and reads B, a situation not allowed by serializability. When implemented in databases, snapshot isolation generally employs multiple versions of the data.
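The situation just described, in which one transaction reads A and writes B while a concurrent one reads B and writes A, can be reproduced with a small Python simulation. This is only an illustrative sketch: `t1`, `t2`, `run_serial` and `run_snapshot` are hypothetical names, and the snapshot run models all transactions as starting at the same instant and aborting only on write-write conflicts.

```python
def t1(view):
    """Read A, write its value into B."""
    return {"B": view["A"]}


def t2(view):
    """Read B, write its value into A."""
    return {"A": view["B"]}


def run_serial(initial, transactions):
    """Execute transactions one after another, each seeing the previous writes."""
    state = dict(initial)
    for tx in transactions:
        state.update(tx(state))
    return state


def run_snapshot(initial, transactions):
    """Execute concurrent transactions that all read the same frozen snapshot."""
    snapshot = dict(initial)
    write_sets = [tx(snapshot) for tx in transactions]
    state = dict(initial)
    written = set()
    for writes in write_sets:
        if written & set(writes):  # write-write conflict: this transaction aborts
            continue
        written |= set(writes)
        state.update(writes)
    return state
```

Starting from `{"A": 1, "B": 2}`, the two serial orders end in `{"A": 1, "B": 1}` and `{"A": 2, "B": 2}`, whereas the snapshot run commits both transactions (their write sets are disjoint) and ends in `{"A": 2, "B": 1}`, a state that no serial execution produces.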
Read committed is one of the most relaxed isolation levels. It only guarantees that a transaction always reads the latest committed values of the data. This isolation level exhibits numerous anomalies [Berenson95] that make it difficult to program concurrent applications based on it.
Modern information systems are usually structured in multiple layers (tiers). Each layer is characterized by a type of server specialized in one kind of processing. The two most relevant layers are the data storage layer and the application server layer. The data storage layer is usually represented by a database server, a file server, a persistent object repository, etc. The application server layer provides functionality for deploying the business logic of user applications, typically providing the infrastructure needed to build and deploy them, such as caching of data from the data storage server to increase efficiency, transactional semantics (typically taking charge of relating the transactions of the application server to those of the data storage server, typically a database), connections to data storage servers, session management with the clients of the application, etc.
Session management makes it possible to maintain volatile state between requests from the same client (e.g. the shopping cart in an Internet sales service). Transaction management makes it possible to offer transactional semantics to applications. An important point is the transactional consistency offered when the application server is combined with a database (or other transactional data repository). At present, this combination offers formally characterized consistency only for serializability. Current application servers do not provide formally characterized consistency for lower isolation levels such as snapshot isolation or read committed. The present invention proposes a method for combining an application server and a database (or other transactional data repository) so that together they provide snapshot isolation, this being one of the new contributions of the invention.
There may be multiple application server layers, in which each application server provides functionality specific to a particular field (e.g. dynamic content for web pages). Between two different application server layers, one can always be viewed as the application server and the other as the data storage server. In the description of the present invention, the term application server is used in its most general sense, characterized by its ability to maintain a cache of the data of the data storage layer and by its support for transactions.
Replication is the main technique for providing fault tolerance and scalability (increasing the maximum throughput of the system by increasing the number of machines, or nodes, in the system). Replication consists of running multiple instances of a piece of software, called replicas (generally in a distributed system, with each instance running on a different node), in a coordinated manner, so that requests made to the system by its clients can be served even if some of the replicas fail (the node crashes, the software hangs, the node is disconnected from the network, etc.). In this way, replication is able to tolerate failures by introducing processing redundancy.
Replication can also be used to increase the throughput of a system. By taking advantage of the redundancy of the software in a distributed system, the work to be done can be distributed among the different replicas (instances of the software), so that each node performs a fraction of the overall work of the system. Replication for increased throughput has been applied mainly to stateless services, that is, services whose result is independent of the requests received in the past and depends only on the parameters of the request and, at most, on a persistent read-only state. This approach has been used, for example, by grid systems.
In databases, scalable replication solutions have been developed during the last decade. The different solutions can be classified according to how they execute transactions, when they propagate the updates of transactions, and the isolation level they provide. Transactions can be executed using primary-backup [Plattner04] or update-everywhere [Kemme00, Patiño05, Lin05] techniques. The primary-backup technique allows update transactions (those that modify the database) to execute only at a single replica, called the primary. The rest of the replicas, called backups, can execute only read-only transactions. The drawback of this technique is that the primary becomes a bottleneck, since it has to process all update transactions. In the update-everywhere technique, all replicas can process all types of transactions, including update transactions, which avoids the bottleneck of the primary-backup approach.
Database replication techniques can also be classified according to when and how updates are propagated. This classification is relevant for replication techniques in which the update transaction is processed at a single replica, the local replica (the one to which the client is connected), and the updated data are then propagated to the rest of the replicas (the remote replicas). Update propagation can be eager or lazy. Eager replication propagates the updates atomically with the commit of the transaction at the local replica, so that it is guaranteed that either all active replicas apply the updates of the transaction or none does. Consistency between replicas is also guaranteed, since all of them apply the updates in a coordinated manner. In lazy replication, updates are propagated independently of the commit of the transaction at the local replica, so the state of the replicas can diverge, giving rise to inconsistencies. That is, a client can observe the latest state of a datum and later observe an earlier state or, in general, clients can observe sequences of states that would not be possible in a non-replicated system.
The consistency of a replicated database is based on 1-copy correctness, which requires that the behavior visible to the clients of a replicated system be equivalent to that of a non-replicated system. The first definition of 1-copy correctness is 1-copy-serializability [Bernstein87], which dictates that the replicated system must behave like a non-replicated serializable system. That is, the replicated execution must be equivalent to a non-replicated execution in a system that provides serializability as its isolation level. More recently, the notion of 1-copy-snapshot-isolation [Lin05] has been defined, in which a replicated database behaves equivalently to a database with snapshot isolation.
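The kind of anomaly that lazy propagation permits, and that 1-copy correctness rules out, can be shown with a toy Python model. The class `LazyReplicatedStore` and its methods are hypothetical, the store has exactly two replicas, and propagation is triggered explicitly by `flush()` to stand in for a delay of arbitrary length.

```python
class LazyReplicatedStore:
    """Two replicas; updates reach the second one only when flush() is called."""

    def __init__(self, initial):
        self.replicas = [dict(initial), dict(initial)]
        self.pending = []  # updates committed locally but not yet propagated

    def write(self, key, value):
        """Commit at the local replica (index 0) and queue the update."""
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def flush(self):
        """Propagate queued updates to the remote replica, after the commit."""
        for key, value in self.pending:
            self.replicas[1][key] = value
        self.pending.clear()

    def read(self, replica_index, key):
        return self.replicas[replica_index][key]
```

After `write("x", 1)` on a store initialized with `{"x": 0}`, a client that reads `x` at replica 0 and then at replica 1 observes 1 followed by 0, a sequence that a single non-replicated copy could not produce. Once `flush()` has run, both replicas return 1.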
It is important to note that all these replication techniques apply to the replication of databases. That is, they consider only the replication of the data store and the consistency among its replicas. The present invention proposes a method for replicating multi-layer systems in which a copy of the data, generally called a cache, is maintained in a layer other than the data storage layer. In this different context, the updates to the replicas of the database must be coordinated with the replicas of the system that keeps a copy of the data in a cache (e.g. the application server) in order to guarantee consistency between the data held in the different layers, an issue not addressed by database replication.
Multi-layer system replication has been the subject of research in recent years. The first work was done in the context of CORBA and focused on providing active, semi-active or passive replication of CORBA servers. These are process replication techniques, which aim for the flow of execution to be the same at all replicas. Process replication is more restrictive than data replication, since it requires not only that the replicas have the same state and that a consistent state be returned to the applications, but also that the flow of execution be identical. One of the most significant examples of CORBA replication is the Eternal system [Narasimhan02]. Process replication requires the replicated processes to be deterministic, which restricts the kind of processes that can be executed. One of the main restrictions, that the process to be replicated be purely sequential, has recently been removed thanks to deterministic scheduling of concurrent or multi-threaded servers [Jimenez00, Moser03, Moser03b]. Process replication only tolerates failures, providing availability, but it does not provide scalability, unlike the data replication technique of the present patent, which allows the maximum throughput of the system to be increased by adding computers to it. In leader-follower systems, other sources of non-determinism originating in the operating system are handled by a supervisor that intercepts the operating system calls at the leader replica and communicates the non-deterministic decisions to the supervisors of the follower replicas, which enforce those decisions locally [Bressoud98]. Other process replication techniques aim to provide primary-backup replication transparently to the application [Vigna95].
In other kinds of multi-layer architectures, such as Java™ Enterprise Edition (JEE, formerly J2EE) [JEE], there are solutions that replicate the components without persistent state (session beans) and share the persistent state (the database) [Wu04]. The main problem with this solution is that the shared database becomes both the bottleneck of the system and a single point of failure. This kind of solution has also been applied recently in CORBA [Zhao05]. The present patent solves the shared-database bottleneck by replicating the application server layer and the database layer simultaneously.
Cache systems generally maintain a single version of the data. It has recently been proposed, in the context of web pages and similar systems, that the cache may hold multiple versions for the same identifier (e.g. web pages in different languages) and return to each client the version best suited to its needs [Jacobs04]. This kind of cache versioning has a radically different purpose from the one proposed in the present patent, which applies to multi-layer transactional systems in order to provide snapshot isolation.
Garbage collection methods have also been proposed to remove unnecessary copies from the cache. One of these methods, proposed in [Mattis01] for caches holding multiple fragments of an object, organizes the cache into different zones and determines how to remove fragments of the object from the cache and how to reorganize them after deletion. The present patent, by contrast, focuses on determining when versions can be removed from the cache without violating the properties of snapshot isolation.
Detailed Description of the Invention
The present invention presents a cache system for multi-layer transactional systems. Multi-layer systems with at least two layers are considered: a data store layer (hereinafter, the database) and another layer that maintains a cache of the data store (hereinafter, the application server). The problem addressed by the present invention is how to provide the clients of multi-layer transactional systems with execution of their transactions under snapshot isolation. Both the replicated and the non-replicated cases are addressed.
With the current cache systems of application servers, which hold a single version of each data item, combined with databases that provide snapshot isolation, it is not possible to provide snapshot isolation to the transactions executed by clients unless the cache is disabled. That is, snapshot isolation could be satisfied only if no copies of the data are kept in the cache and the data is always accessed through the database. Unfortunately, disabling the cache results in very substantial performance losses. The present invention proposes a cache system that maintains multiple versions of the data, with the values they had when the transactions committed, and that, together with a procedure for managing this cache in both non-replicated and replicated systems, provides snapshot isolation.
To avoid the aforementioned problem experienced by current application servers, whose cache keeps only one version of each data item (or, at least, does not keep the versions corresponding to the values generated by each committed transaction), the present invention proposes a cache system that maintains multiple versions of each data item, corresponding to the values generated by each committed transaction. The invention also proposes a procedure for managing the data in the cache which guarantees that each transaction observes the state of the database as it was at the instant the transaction started. The basic idea is that each time a transaction reads a data item, the cache returns the version of the item that meets this objective. This procedure achieves cache transparency, that is, the system with a cache behaves the same as a system whose data store provides snapshot isolation and that has no cache, which is the goal of the proposed cache system and of the procedure for managing its data. In addition, the cache management procedure does not allow a transaction to commit if it has modified data in common with other concurrent transactions that have already committed (that is, transactions that committed after the transaction in question started). This avoids states that would not respect cache transparency, that is, states that could never arise if the cache were removed. A procedure for replicating the system is also presented; it guarantees that the replicated system as a whole behaves equivalently to the non-replicated system, thereby providing replication transparency.
One of the difficulties to be solved is that other concurrent transactions may modify the data after a transaction has started. If a single version were used, the transaction could see the modification made by another transaction, violating snapshot isolation. For this reason, the cache system proposed in the present invention maintains multiple versions of each data item, and when a transaction is about to modify an item in the cache, a private version of the item is created that only that transaction can see. In this way, the existing versions make it possible to offer the other running transactions the appropriate image of the database.
The private version of the item is the one that the transaction modifies and reads in subsequent accesses. In this way it can observe its own modifications, a requirement of snapshot isolation. When the transaction is about to commit, it is checked that no already-committed transaction that was concurrent with it has modified data in common with it. If one has, the transaction aborts instead of committing. If the transaction commits and there are other running transactions that have modified the same item or attempt to modify it, they are aborted.
When a transaction commits, its updates become visible to the transactions that start after its commit. To this end, the private versions of the data created by the transaction are made public. To determine which version each transaction must read in order to observe the appropriate image of the database, the proposed procedure records when each transaction starts and when it commits, and also which transaction produced which versions. When the private versions are made public during the commit of a transaction, they are tagged so that they remain associated with that transaction.
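The private-version, commit-check and publication rules described above can be illustrated with a minimal Python sketch. The names `Cache`, `Txn` and `Conflict`, and the use of a logical counter to record when transactions start and commit, are assumptions made for this illustration; the text only requires that start and commit be recorded and that published versions be tagged with the transaction that produced them. The sketch detects conflicts at commit time only and omits the eager abort of other running writers.

```python
class Conflict(Exception):
    """A concurrent, already committed transaction wrote a common item."""


class Cache:
    def __init__(self):
        self.clock = 0        # logical clock (an assumption of this sketch)
        self.versions = {}    # key -> list of (commit_ts, value): public versions

    def begin(self):
        self.clock += 1       # record when the transaction starts
        return Txn(self, self.clock)


class Txn:
    def __init__(self, cache, start_ts):
        self.cache = cache
        self.start_ts = start_ts
        self.private = {}     # private versions, visible only to this transaction

    def write(self, key, value):
        self.private[key] = value

    def read(self, key):
        if key in self.private:          # a transaction sees its own writes
            return self.private[key]
        visible = [v for v in self.cache.versions.get(key, [])
                   if v[0] < self.start_ts]
        return max(visible, key=lambda v: v[0])[1] if visible else None

    def commit(self):
        # Abort if a transaction that committed after we started wrote a common item.
        for key in self.private:
            for commit_ts, _ in self.cache.versions.get(key, []):
                if commit_ts > self.start_ts:
                    raise Conflict(key)
        self.cache.clock += 1            # record when the transaction commits
        for key, value in self.private.items():
            # Publish the private version, tagged with this commit.
            self.cache.versions.setdefault(key, []).append((self.cache.clock, value))
```

In this sketch, two transactions that start concurrently and write the same item cannot both commit: the second one to reach `commit()` raises `Conflict`, while a transaction that starts afterwards reads the value published by the first.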
Since a transaction may read a data item that is not in the cache, it must then access the database to obtain it. For the database to return the appropriate version of the item, the transaction executed in the database is synchronized with the application server transaction. This is achieved by starting an associated transaction in the database at the same time as a transaction is started in the application server. Thus, when data that is not in the cache is read, the database provides the appropriate version of the item, thanks to its snapshot isolation and to the synchronization between the two transactions. Since the database is not assumed to provide information about its possible internal versioning, the application server tags the version read as unknown. When the application server transaction commits, the changes are first propagated to the database in the context of the associated database transaction, which is then committed as well. If the transaction is aborted in the application server, its associated transaction in the database is also aborted.
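The pairing of an application server transaction with an associated database transaction can be sketched as follows. `DbTxn` is a hypothetical in-memory stand-in for a database handle with snapshot isolation (it simply copies the data when it starts); `AppTxn`, `read_miss` and the `UNKNOWN` tag are likewise names chosen for this illustration, not part of any real database API.

```python
UNKNOWN = None  # tag for versions whose producing transaction is not known


class DbTxn:
    """Hypothetical stand-in for a database transaction under snapshot isolation."""

    def __init__(self, db):
        self.db = db
        self.snapshot = dict(db)   # state of the database when the transaction started
        self.pending = {}
        self.state = "active"

    def read(self, key):
        return self.snapshot.get(key)

    def write(self, key, value):
        self.pending[key] = value

    def commit(self):
        self.db.update(self.pending)
        self.state = "committed"

    def abort(self):
        self.pending.clear()
        self.state = "aborted"


class AppTxn:
    def __init__(self, db):
        self.db_txn = DbTxn(db)    # associated transaction, started at the same time
        self.private = {}

    def read_miss(self, key, cache):
        # The item is not cached: read it through the associated transaction
        # and store it in the cache tagged as unknown.
        value = self.db_txn.read(key)
        cache.setdefault(key, []).append((UNKNOWN, value))
        return value

    def write(self, key, value):
        self.private[key] = value

    def commit(self):
        # First propagate the changes within the associated transaction,
        # then commit that transaction in the database.
        for key, value in self.private.items():
            self.db_txn.write(key, value)
        self.db_txn.commit()

    def abort(self):
        self.db_txn.abort()        # aborting in the server aborts in the database
```

Because the stand-in takes its snapshot when `AppTxn` is created, a cache miss returns the value the item had when the application server transaction started, even if the database has since changed.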
When a transaction attempts to read a data item that is in the cache, the system determines which version it must read. It reads the most recent version that corresponds to a transaction that committed before the reading transaction started. If none of the known versions satisfies this condition and there is a version tagged as unknown, that version is returned. Cache management guarantees that the appropriate version always exists for every running transaction by not removing versions from the cache until it is known that no running transaction will need them.
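The read rule above can be written as a small selection function. This is a sketch under stated assumptions: each cached version is represented as a `(commit_ts, value)` pair, timestamps are logical integers, and the unknown tag is represented by `None`; none of these representations comes from the text.

```python
UNKNOWN = None  # tag for a version read from the database (assumed representation)


def select_version(versions, start_ts):
    """Return the value a transaction that started at start_ts must read.

    versions: list of (commit_ts or UNKNOWN, value) pairs for one data item.
    """
    known = [(ts, value) for ts, value in versions
             if ts is not UNKNOWN and ts < start_ts]
    if known:
        # Most recent version committed before the reader started.
        return max(known, key=lambda pair: pair[0])[1]
    for ts, value in versions:
        if ts is UNKNOWN:
            # No known version qualifies: fall back to the unknown version.
            return value
    raise LookupError("no suitable version in the cache")
```

For a reader that started between two commits, the function returns the older of the two versions; for a reader older than every known version, it returns the version tagged as unknown.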
Creating a data item is a special case of update. When an item is created no versions of it exist, so a private version is created without any public versions. Deletion is another special case of update. If all versions of an item were deleted when the transaction that deleted it committed, snapshot isolation would be violated, since transactions that started before that commit must see the state of the database in which the item still existed. Therefore, when an item is deleted by a transaction, a tombstone version of it is created. When the transaction commits, the tombstone version is made public with its corresponding tag. If a transaction reads a tombstone version, the result is as if the item were not found, that is, as if it had been deleted. However, transactions that observe an earlier version do find the item.
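A tombstone version can be modeled as a sentinel value stored like any other version. The sketch below assumes `(commit_ts, value)` pairs with logical timestamps and a `TOMBSTONE` sentinel object, all of which are illustrative choices rather than details given in the text.

```python
TOMBSTONE = object()  # sentinel marking "deleted as of this commit" (assumed)


def read(versions, start_ts):
    """Return the visible value, or None if the item does not exist for this reader."""
    visible = [(ts, value) for ts, value in versions if ts < start_ts]
    if not visible:
        return None
    _, value = max(visible, key=lambda pair: pair[0])
    # Reading a tombstone behaves as if the item had not been found.
    return None if value is TOMBSTONE else value
```

A transaction that started before the deleting transaction committed still finds the earlier version, whereas one that started afterwards sees the item as absent.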
Since the number of versions in the cache keeps growing, the procedure needs, in order to be practical, a mechanism for removing unnecessary copies from the cache, known as garbage collection. It is vital that garbage collection not remove versions whose absence could result in violations of snapshot isolation. For this reason, the proposed procedure deletes a version only when it is certain that the version will not be needed in the future. A version produced by a transaction is removed only when all running transactions started after that transaction committed. When a version of a data item is discarded, the version of that item tagged as unknown, if one exists, can always be discarded as well.
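The removal condition can be sketched as a function over the versions of one item and the start times of the running transactions. The `(commit_ts, value)` representation and logical timestamps are assumptions. The sketch is also deliberately more conservative than the rule as stated: among the versions older than every running transaction, it keeps the newest one so that it remains readable from the cache. That extra retention is an assumption of this illustration, and versions tagged as unknown are not handled.

```python
def collect_garbage(versions, active_starts):
    """Return the versions of one item that must be kept.

    versions: list of (commit_ts, value); active_starts: start timestamps
    of the transactions currently running.
    """
    # Oldest running transaction; with none running, every version is "old".
    horizon = min(active_starts) if active_starts else float("inf")
    old = [pair for pair in versions if pair[0] < horizon]
    if not old:
        return list(versions)      # some running transaction predates every version
    newest_old_ts = max(pair[0] for pair in old)
    # Keep everything a running transaction may not see yet, plus the newest
    # version that all of them can see (conservative assumption of this sketch).
    return [pair for pair in versions
            if pair[0] >= horizon or pair[0] == newest_old_ts]
```

With versions committed at times 2, 5 and 9 and running transactions that started at 7 and 12, only the version from time 2 is dropped in this sketch.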
A complementary alternative to garbage collection is the hibernation of cached data. To free cache memory, it is always possible to evict a data item to another medium (for example, a disk or another server). In that case all versions of the item must be evicted. If a transaction attempts to access a hibernated item, all of its versions must be brought back and the version that corresponds to the transaction determined. Hibernation and dehibernation can be performed at any time.
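The all-or-nothing nature of hibernation can be shown with a short sketch. The class `HibernatingCache` and its `backing` dictionary, which stands in for a disk or another server, are hypothetical names introduced here.

```python
class HibernatingCache:
    def __init__(self):
        self.memory = {}    # key -> list of versions held in the cache
        self.backing = {}   # stand-in for another medium (disk, another server)

    def hibernate(self, key):
        # All versions of the item leave the cache together.
        self.backing[key] = self.memory.pop(key)

    def versions(self, key):
        # Accessing a hibernated item brings back all of its versions;
        # the caller then picks the one that corresponds to its transaction.
        if key not in self.memory and key in self.backing:
            self.memory[key] = self.backing.pop(key)
        return self.memory[key]
```

Moving the whole version list as one unit means that version selection after dehibernation works on the same set of versions as before eviction.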
It should be noted that, in the proposed method, if the application server has enough memory to store all the data (and its versions) used by the clients (that is, the database, or at least all the data accessed simultaneously by the clients, fits in the cache), then the database does not have to provide snapshot isolation: it may provide serializability or read committed isolation, or even no transactional support at all.
When the system is replicated, new challenges arise. On the one hand, snapshot isolation must still be provided; on the other, the replicated execution must be coordinated so that the result is equivalent to that of a non-replicated system (that is, so that replication introduces no anomalies or inconsistencies). Moreover, it is very important that the replicated system be scalable, so that adding new nodes (and new replicas on those nodes) increases the maximum throughput of the system. The replication model proposed in the present invention takes as the unit of replication a pair consisting of an application server and a database. Thus, each application server is connected to a single database (which we call local), and the application servers interact with one another to guarantee consistency among the replicas and of the cache, as well as replication transparency. The cache management procedure in the replicated model is detailed below.
In the replicated model, each client connects to the application server of any one of the replicas. This is the local replica for the transactions the client executes, and the others are remote replicas. Transactions are first executed locally and then undergo remote processing that involves all the replicas. Local processing is very similar to the processing described above for the non-replicated model. The only steps that are handled differently are commit and garbage collection, as detailed below.
When a local read-only transaction wants to commit (respectively, abort), it is committed (respectively, aborted) both in the application server and in the database.
When a local update transaction wants to commit in the replicated model, a message is sent to all the replicas, including the local one, containing the updated data and the start mark of the transaction. This message is sent atomically to all the replicas, so that it reaches either all of them or none. In addition, the message must have the same relative order at all replicas, so that it is processed in that order everywhere. This common relative order can be obtained by any of the methods known in distributed systems, such as total order broadcast, the use of a sequencer, an agreement protocol, etc.
When the message carrying the changes of a transaction is processed by a replica, the transaction is first validated. Validation checks that concurrent execution at different replicas does not violate replication transparency, and thus guarantees that the replicated execution of the committed transactions is equivalent to a non-replicated execution, in addition to satisfying snapshot isolation. Validation checks that the transaction being validated has not modified data in common with other concurrent transactions that have already committed. If this condition holds, the transaction passes validation and is committed; otherwise it is aborted.
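Because every replica processes the same messages in the same order, validation can be a purely local, deterministic check that reaches the same decision everywhere. The following sketch illustrates this under several assumptions that do not come from the text: the start mark is the number of update transactions committed when the transaction started, each message is a `(start_mark, writeset)` pair, and delivery in total order is simulated by iterating over one list.

```python
class Replica:
    def __init__(self):
        self.committed = 0      # update transactions committed so far (logical clock)
        self.last_writer = {}   # key -> commit sequence number of its latest writer
        self.store = {}         # latest committed value per key

    def deliver(self, start_mark, writeset):
        """Validate and, if validation passes, commit. Returns True on commit."""
        for key in writeset:
            # A transaction that committed after start_mark wrote a common item.
            if self.last_writer.get(key, 0) > start_mark:
                return False
        self.committed += 1
        for key, value in writeset.items():
            self.last_writer[key] = self.committed
            self.store[key] = value
        return True


def run(messages, replica_count=2):
    """Deliver the same messages, in the same order, to every replica."""
    replicas = [Replica() for _ in range(replica_count)]
    outcomes = [[r.deliver(mark, ws) for mark, ws in messages] for r in replicas]
    return replicas, outcomes
```

In this sketch, two transactions with the same start mark that both write the same item are resolved identically at every replica: the first one delivered commits and the second one is aborted.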
Committing the transaction involves two cases: the replica is local to the transaction (it sent the message) or it is a remote replica (any other replica). At the local replica, all running transactions that have modified, or are about to modify, data in common are aborted. At a remote replica, the associated transaction is created in the database and the transaction's changes must be applied, generating the corresponding versions of the updated data in the cache. Likewise, all transactions that have modified, or are about to modify, data in common are aborted. In both cases, the changes are applied to the database in the context of the associated transaction, and the transaction is committed in the application server and in the database.
The following describes how to combine the replicated model with session replication. In many cases, application servers maintain conversational state with the client (known as a session: volatile state kept between invocations from the same client), such as stateful session beans in JEE. To provide high availability, session replication must be combined with the data replication proposed above. Thus, if an application maintains conversational state and the replica to which a conversational client is connected fails, the client can reconnect transparently to another replica and continue executing without losing the session.
Session replication is combined with the proposed data replication by propagating the session state to other replicas after each client invocation.
Since the session is replicated only for high availability and not for scalability (higher performance as the number of replicas grows), the session state need not be replicated at all replicas, only at a subset of them.
Fault handling is described next. Clients connect to the application server through an API or proxy that is generally provided by the application server (for example, in JEE through JNDI, the Java Naming and Directory Interface). Because the proxies are generated by the application server, they can incorporate the replication logic in a way that is fully transparent to clients. The proxy uses some method to discover the available replicas, for example a broadcast protocol such as IP multicast, or a server that maintains the list of active replicas. With IP multicast, the proxy sends a discovery message to a multicast IP address associated with the replicated system, so that all replicas receive it. Replicas have distinct identifiers (e.g. 0 to n-1, where n is the number of replicas). Each request sent by a client is uniquely associated with that client and carries a consecutive, increasing request number maintained by the proxy. That is, each client sends requests identified by a unique client identifier and a request identifier that starts at a known initial value (e.g. 0) and grows by one with each request. One or more replicas reply to the discovery message with the list of available replicas, their network addresses (e.g. their IP addresses), and an indication of the load of each replica. To this end, the replicas periodically exchange messages reporting their current load. The proxy then selects a replica at random, with probability inversely proportional to its current load, in order to balance the load across replicas, and connects to the selected replica.
Once the proxy has connected to a replica, it sends all subsequent requests to that replica (the client's requests are local to that replica). If the replica fails, the proxy detects the failure by means of a timer and connects to a new replica, chosen by the same method from those learned in the discovery reply.
On the application server side, each replica contains an application server and database pair. If either of them fails, or the node hosting them fails, the replica is considered failed. Mechanisms for detecting partial failures (only the application server or only the database fails) can be used to force the replica to disconnect from the system.
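The load-aware replica selection performed by the proxy can be sketched as below. This is only an illustration under stated assumptions: the load is taken to be a positive number reported per replica, and the function name `pick_replica` and the small floor used to avoid division by zero are hypothetical, not taken from the source.

```python
import random


def pick_replica(loads, rng=random):
    """Pick a replica id with probability inversely proportional to its reported load.

    loads: dict mapping replica id -> current load (assumed positive).
    rng:   the random module or a random.Random instance.
    """
    ids = list(loads)
    # Assumption: a tiny floor keeps an idle replica (load 0) from dividing by zero.
    weights = [1.0 / max(loads[r], 1e-9) for r in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


# Replica 0 carries a quarter of the load of replica 1, so it should be chosen more often.
rng = random.Random(42)
picks = [pick_replica({0: 1.0, 1: 4.0}, rng) for _ in range(10000)]
```

After a failure, the proxy could call the same function again on the remaining replicas from the discovery reply.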
On the client side, the failure is detected when the timer started at the last request sent to the application server expires without a response having been received. The proxy then reconnects to a new replica and resends the request. There are two possible scenarios: (1) the replica failed before the updates were propagated to the other replicas; (2) the replica successfully propagated the updates to the other replicas before failing.
If there were previous interactions with the client, the new replica to which the proxy connects will hold a copy of the last session state associated with the client. This replica regenerates the session from the last message received from the replica to which the client was previously connected.
In case (1), since requests are uniquely identified by a client identifier and a request identifier, the new replica to which the client connects treats the resent request as one not previously submitted (it knows nothing about it) and processes it normally. Case (2) can also be handled, but at a higher cost. When the session state is propagated, the response to be sent to the client must be propagated as well. Then, if the new replica recognizes the resent request as a duplicate (the last known session state already contains the result of processing it) and the associated transaction committed, the replica returns the stored response to the client. Otherwise, it returns an error message to the client, reporting that the transaction could not be committed because snapshot isolation could not be guaranteed.
Description of the figures
Figure 1 illustrates the replicated model. Each replica (4) consists of an application server (2) and database (3) pair. Each client (1) connects to one of the replicas. The non-replicated model would contain the elements of a single replica, that is, one application server and database pair. The application servers (4) communicate over a communication network to coordinate with one another and guarantee the consistency of the replication.
Figure 2 shows an example of how local processing of the cache evolves (remote processing is ignored). The figure shows the application server (2), the database (3), and the cache maintained by the application server (1). Inside the database, the data items X and Y and the evolution of their versions are shown. The cache is shown evolving from an initial state in which it is empty. For each item (X and Y), its value (4) and version (5) are shown. The execution of two transactions, T1 and T2, is illustrated. T2 has the following steps: (6) Begin transaction, (7) Read(X), (8) Write(Y, d), (9) Read(Y), (10) Commit. T1 has the following steps: (11) Begin transaction, (12) Read(X), (13) Read(Y), (14) Write(X, c), (15) Commit.
The figure starts from a state in which the cache is empty and the replica's timestamp is 10. Transactions T1 and T2 obtain the same start timestamp, 10, and the associated database transaction is created for each of them. T2 reads X. Since there is no version of X in the cache, the item is read from the database and a version of X labeled -1 (denoting an unknown version) is created. The value of X in the cache is a. T1 now reads X and Y. Since version -1 of X is in the cache and -1 ≤ 10, T1 reads version -1 of X. It also reads Y from the database, and Y is labeled as version -1. The value of version -1 of Y is b. T1 now updates X with the value c and Y with the value d. To do so, it creates private versions of X and Y, acquiring the corresponding locks on X and Y.
Finally, T1 requests to commit. It receives the commit timestamp CT(T1) = 11; the private versions of X and Y are labeled with this timestamp and made public in the cache. The associated transaction commits in the database, the writes to X and Y having been applied to the database beforehand in the context of that transaction. Since the database provides snapshot isolation, it internally creates versions of X and Y. When T2 now reads Y, it does not read version 11 of Y, since 11 > ST(T2). Instead, it reads version -1 of Y, that is, the old value of Y, b. Since T2 is read-only, it simply commits in the application server and in the database without being assigned a commit timestamp.
Remote processing is illustrated with an example in Figures 3 and 4. In the example, two replicas, R1 (1) and R2 (2), execute two transactions, T1 and T2, respectively. Each replica contains an application server (4) and database (3) pair. The application server maintains a data cache (5). The transaction processing sequence (6) is also shown. For each data item, its value (7) and version (8) are shown. Transaction T1 consists of the following steps: Begin, Read(X), Write(X, b), Commit. Transaction T2 consists of the following steps: Begin, Read(X), Write(X, c), Commit.
In the illustrated example, T1 and T2 read and update item X concurrently (they have the same start timestamp: ST(T1) = 0, ST(T2) = 0). When they finish executing at their respective local replicas, their changes are propagated. The relative order of the updates is T1, T2, and there is no other concurrent conflicting transaction. When T1 is received at R1, its validation succeeds because no other concurrent transaction has committed. It obtains its commit timestamp (CT(T1) = 1) and commits. T2 is then processed. During the validation of T2, it is determined that T2 is concurrent with, and conflicts with, T1, which has already committed (ST(T2) < CT(T1)). T2 therefore aborts. At replica R2 the transactions are validated in the same order. However, during the validation of T1 a lock on X held by T2 is found; T2 is aborted and T1 commits. Consequently, when T2's changes are received, its validation fails. In this way, both replicas commit the same transactions.
Description of an embodiment of the invention
An embodiment of the invention is presented for the case in which the data store provides snapshot isolation [Berenson95], for both the replicated and the non-replicated system. The proposed embodiment uses timestamps to determine when transactions start and finish, and to label the different versions of the data. It uses locking to detect conflicts between concurrent transactions early.
In the replicated model, multiple instances or replicas of each application server and database pair are maintained. The embodiment is explained in two steps. The first describes the local processing of a request at the replica that receives it, which guarantees consistency at the local level; the second describes the remote processing at the remaining replicas, which guarantees consistency at the global level.
Local processing of transactions is described next. Both the application server and the database have their own transaction management system. The application server maintains the relationship between the transactions of the two systems: when a transaction starts in the application server, a transaction is also started in the database. The application server also maintains the relationship between client identifier and transaction. Each application server transaction is assigned a start timestamp when it starts in the application server and a commit timestamp when it commits. The commit timestamp MC(T) of transaction T (written CT(T) in the description of the figures) is an increasing number that reflects the snapshots through which the data progresses (that is, the number of committed transactions). The start timestamp MI(T) of a transaction T (written ST(T) in the description of the figures) is the highest commit timestamp at the moment T starts. That is, it identifies the last committed transaction T' and indicates that if T reads an item updated (modified, created or deleted) by T', it must read the values written by T'. If the item was not updated by T', T must read the value written by the transaction with the highest commit timestamp. Initially, all replicas start with the same commit timestamp (which is assigned as the start timestamp), e.g. 0. The proposed method guarantees that every update transaction (one that modifies, creates or deletes data) receives the same commit timestamp at all replicas. Conflicting transactions (those that update an item in common) are committed in the database at all replicas in the same relative order.
The application server maintains a cache, that is, a copy of some or all of the data in the database. Initially the application server (cache) holds no data, and the data accessed by a transaction is read from the database. Each item X in the cache is labeled with a timestamp i and is referred to as version i of item X (Xi), where i is the commit timestamp of the transaction that updated and committed that version. Since the database is assumed not to expose information about its internal versioning, the version number of an item X read from the database is unknown. Such an item is therefore labeled with a special version indicating that its version is unknown. Hereafter, the version identifier -1 is used for this purpose.
When a transaction reads an item X, it first checks whether the transaction itself has updated that item; if so, it reads what the transaction wrote (as detailed below, this is available in the transaction's private version of the item). If it has not updated the item, X is looked up in the cache. The version of X that a transaction T accesses is the last version committed at the time the transaction started, that is, the version Xi such that i ≤ MI(T) and there is no version Xj with i < j ≤ MI(T). If X is not in the cache, it is read from the database. Since a database transaction is started whenever a transaction starts in the application server, and the database provides snapshot isolation, the database returns the appropriate version of X. This process guarantees that each transaction observes a snapshot of the database as it was when the transaction began, and therefore satisfies snapshot isolation.
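The read rule can be sketched as a small multi-version lookup. This is a simplified illustration: the class name `VersionedCache`, its fields, and the use of a plain dictionary as the database are assumptions made for the example. In the described system the database read would go through the associated snapshot-isolated database transaction, which is what returns the correct version on a cache miss.

```python
UNKNOWN = -1  # version label for items loaded from the database


class VersionedCache:
    """Hypothetical multi-version cache keyed by item, then by version number."""

    def __init__(self, database):
        self.database = database  # stand-in for the transaction's database snapshot
        self.versions = {}        # key -> {version: value}

    def read(self, key, start_ts, private=None):
        # 1. A transaction sees its own uncommitted writes first.
        if private and key in private:
            return private[key]
        # 2. Otherwise use the latest cached version i with i <= start_ts.
        cached = self.versions.get(key, {})
        visible = [v for v in cached if v <= start_ts]
        if visible:
            return cached[max(visible)]
        # 3. Cache miss: read from the database and label the version as unknown.
        value = self.database[key]
        self.versions.setdefault(key, {})[UNKNOWN] = value
        return value


# Mirrors the Figure 2 narrative: start timestamp 10, then a commit at timestamp 11.
cache = VersionedCache({"X": "a", "Y": "b"})
first_x = cache.read("X", 10)    # miss, loaded as version -1
first_y = cache.read("Y", 10)    # miss, loaded as version -1
cache.versions["Y"][11] = "d"    # another transaction commits Y with timestamp 11
old_y = cache.read("Y", 10)      # still sees version -1
new_y = cache.read("Y", 11)      # a later transaction sees version 11
```

Since -1 is smaller than any commit timestamp, a version loaded from the database is visible to every transaction unless a newer committed version within its snapshot exists.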
Under snapshot isolation, when two concurrent transactions update the same data item, only one of them can commit and the other must abort. Such conflicts can be detected when they occur, when the second of the transactions commits, or at any point in between. The method is described below using early conflict detection based on locks (locking). To update a data item X, a transaction Ti must obtain a lock on X before updating it. This lock prevents other concurrent transactions from updating X. If another local transaction (a transaction from a client connected to that replica) requests a lock on X, it blocks and therefore cannot update X until it is unblocked. Once the lock has been obtained, the following check is made: if the highest version of the item in the cache, Xj, is greater than the start timestamp MI(Ti) of Ti, then that version was created by a concurrent transaction that has already committed, and Ti must abort. Otherwise, Ti can perform the update, that is, create its own version. This version is private and is visible only to Ti until Ti commits. If Ti writes the same item two or more times, the second and subsequent writes access its private version directly, without repeating the checks above. This guarantees that a transaction observes its own updates and prevents other transactions from observing uncommitted changes. If the transaction fails validation, it is aborted in the application server and in the database, the private versions of the updated items are discarded, and its locks are released. The first transaction waiting for each lock is unblocked and granted the lock. After the transaction terminates (commit or abort), all of its locks are released.
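The write path described above can be illustrated with a minimal, single-threaded Python sketch. The names `Cache`, `Txn`, `Abort` and `WouldBlock` are hypothetical and do not come from the patent; where a real implementation would block a waiting transaction, this sketch simply raises `WouldBlock`.

```python
class Abort(Exception):
    """The transaction lost a write-write conflict and must abort."""


class WouldBlock(Exception):
    """Another transaction holds the lock; a real system would block here."""


class Txn:
    def __init__(self, tid, start_ts):
        self.tid = tid
        self.start_ts = start_ts  # MI(Ti), the start timestamp
        self.private = {}         # key -> uncommitted (private) value


class Cache:
    def __init__(self):
        self.versions = {}    # key -> {commit_ts: value}, public versions
        self.lock_owner = {}  # key -> id of the transaction holding the lock

    def write(self, txn, key, value):
        # Second and later writes go straight to the private version.
        if key in txn.private:
            txn.private[key] = value
            return
        # Take the lock if it is free, otherwise report that we would wait.
        owner = self.lock_owner.setdefault(key, txn.tid)
        if owner != txn.tid:
            raise WouldBlock(key)
        # Abort if a concurrent transaction already committed a newer version.
        newest = max(self.versions.get(key, {}), default=-1)
        if newest > txn.start_ts:
            self.release(txn)
            raise Abort(key)
        txn.private[key] = value

    def release(self, txn):
        # Discard private versions and free every lock the transaction holds.
        txn.private.clear()
        for key in [k for k, o in self.lock_owner.items() if o == txn.tid]:
            del self.lock_owner[key]
```

In this sketch the version check happens only on the first write to a key, which mirrors the statement that repeated writes skip the checks.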
When a read-only transaction commits, it is committed in the application server and in the database, and the result is returned to the client. These transactions require no coordination with the other replicas. For a transaction that performs updates, when it commits in the application server, the server propagates the updated data to the other replicas. Updates are propagated to all replicas atomically (they are received by all active replicas, or by none if the replica that sent the message has crashed) and in total order (all replicas see the updates in the same relative order). All replicas validate the updates in the same relative order. Processing these updates (validation) is deterministic (it behaves like a state machine) and yields the same result at every replica. Validation and processing of updates is in principle sequential, although transactions that do not conflict with one another can be validated and processed in parallel. A transaction Ti fails validation if there is a transaction Tj in the system (the set of replicas) that is concurrent with Ti, has committed (MI(Ti) < MC(Tj) < MC(Ti), where MI denotes a start timestamp and MC a commit timestamp), and updated some data item in common with Ti. Transactions whose change propagation reaches the replica where they are local without their having been aborted always pass validation. The details of validation for remote transactions are described in the next section.
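The validation rule can be written as a short predicate. The following Python sketch is an illustration under assumptions: the record type `Committed` and the function `passes_validation` are hypothetical names, and the list of committed transactions is assumed to hold every transaction validated before Ti, so the condition MC(Tj) < MC(Ti) holds automatically and only MI(Ti) < MC(Tj) needs to be tested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Committed:
    commit_ts: int        # MC(Tj), the commit timestamp
    write_set: frozenset  # keys updated by Tj


def passes_validation(start_ts, write_set, committed):
    """Return True if no concurrent committed transaction wrote a common key."""
    for tj in committed:
        concurrent = tj.commit_ts > start_ts  # MI(Ti) < MC(Tj)
        if concurrent and tj.write_set & write_set:
            return False
    return True
```

For example, with committed transactions at timestamps 4 (wrote x) and 9 (wrote y), a transaction that started at timestamp 5 may update x but not y.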
When an update transaction passes validation, it is assigned a commit timestamp, which is incremented with each committed transaction. At the replica where the transaction is local, each updated item (the transaction's private version) is tagged with the transaction's commit timestamp and the version is made public, becoming part of the cache. Releasing the locks causes the abort of any non-validated local transactions that were waiting for those locks held by the committed transaction. Finally, the transaction is committed in the database. The database commit automatically propagates the data updates to the database. Note that the versions of the updated data are kept in memory, in the cache, until they are determined to be unnecessary and are removed.
Because the use of locks can lead to deadlocks, a mechanism for deadlock detection and resolution is necessary (e.g. based on detecting cycles in a wait-for graph). Deadlocks will in any case be rare, since they only involve conflicts between writes.
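As an illustration of cycle detection in a wait-for graph, the sketch below runs a depth-first search over a dictionary mapping each waiting transaction to the transactions whose locks it waits for. The function name `find_deadlock` and the graph representation are assumptions made for this example; the patent does not prescribe a particular algorithm.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}

    def visit(node, path):
        colour[node] = GREY
        path.append(node)
        for nxt in waits_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: the path closes a cycle.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in list(waits_for):
        if colour.get(node, WHITE) == WHITE:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None
```

Once a cycle is found, one of its transactions has to be chosen as the victim and aborted; which one is a policy decision left open here.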
Remote processing is described next. As mentioned under local processing, when a transaction that has performed updates has been processed locally, its updates are sent to all replicas, including the local replica (the one where the transaction executed). This local transaction is a remote transaction at the rest of the replicas. Since all replicas can execute update transactions, each replica must validate (detect possible conflicts between concurrent transactions that update the same data) to guarantee that snapshot isolation is provided globally (across all replicas). Because the changes have the same relative order at all replicas, the replicas perform validation in that order and commit the same transactions in the same order. In this way the state of the cache and of the database remains consistent across all replicas. That is, all database instances hold the same data with the same values, and the caches hold the versions needed to provide cache transparency and replication transparency.
Replication transparency means that for any replicated execution there is an equivalent execution in the non-replicated system, in which clients see the same results and the databases end in the same final states. Cache transparency means that clients see the same results they would see if the system kept no data in the cache and always read from the database.
The first step at a remote replica (one that did not execute the transaction locally), after receiving the message that propagates a transaction's changes, is validation. As with local transactions, the replica checks that no concurrent transaction that updated the same data (that is, one with a write conflict) has committed. Validation checks that the transaction has no conflict with any concurrent transaction that was previously validated successfully (and therefore committed), that is, with transactions that were not known before the updates of the transaction being validated were propagated. If validation fails, the replica discards the received changes. If validation succeeds, the transaction is assigned a commit timestamp, as with local transactions, and a transaction is created in the database. For each updated item, the replica checks whether a lock on it is held by a non-validated (local, uncommitted) transaction; if so, that local transaction is aborted (it is concurrent, has a write conflict, and has not yet validated). Next, a version of each updated item is created, tagged with the commit timestamp, and added to the cache (the local cache is updated). The remaining steps are the same as for local transactions: the database transaction commits, the commit counter is incremented, and the transaction is stored in the list of committed transactions.
If a remote transaction fails validation, no further action is needed other than discarding the message.
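The remote-processing steps can be sketched as a small deterministic state machine. The Python below is a simplified illustration, not the patented implementation: the class `Replica`, the method `deliver` and the field names are hypothetical, the database commit is omitted, and start timestamps are assumed to be values of the commit counter, which is the same at every replica because all replicas process the same messages in the same order.

```python
class Replica:
    def __init__(self):
        self.commit_counter = 0
        self.versions = {}          # key -> {commit_ts: value}
        self.committed = []         # (commit_ts, keys) of validated transactions
        self.lock_owner = {}        # key -> local transaction not yet validated
        self.aborted_local = set()  # local transactions aborted by remote commits

    def deliver(self, start_ts, writes):
        """Process one remote write set received in total order.

        Returns the assigned commit timestamp, or None if validation fails.
        """
        keys = set(writes)
        for commit_ts, other_keys in self.committed:
            if commit_ts > start_ts and keys & other_keys:
                return None  # conflict with a concurrent commit: discard
        self.commit_counter += 1
        ts = self.commit_counter
        for key, value in writes.items():
            holder = self.lock_owner.pop(key, None)
            if holder is not None:
                self.aborted_local.add(holder)  # conflicting local lock holder
            self.versions.setdefault(key, {})[ts] = value
        self.committed.append((ts, keys))
        return ts
```

Feeding the same sequence of messages to two `Replica` objects leaves both with identical `versions` maps, which is the consistency property the total order is meant to provide.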
The proposed method also handles the creation and deletion of data. When a new item X is created, its private version is created as for an update, except that no previous version exists. A lock is also requested on X to prevent other concurrent transactions from creating the same item X. Locks are associated with the keys of the data items (which coincide with the database key). When the transaction commits, the item is inserted into the database and the version becomes public, available to other transactions that start after the creating transaction has committed.
Deletions are handled by creating a tombstone version of the item. The tombstone version is also private to the transaction until it commits. If the transaction subsequently tries to access the item, it will not find it, since the tombstone version indicates that the item no longer exists. When the transaction commits, the item is deleted and the tombstone version is made public. Note that even after the transaction commits, the previous versions of the deleted item cannot be removed, because there may be active transactions associated with those versions (all transactions that started before the deleting transaction committed), which must still be able to read the item in order to satisfy snapshot isolation.
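How a read resolves against private versions, public versions and tombstones can be shown with a short sketch. The function `read`, the sentinel `TOMBSTONE` and the visibility rule used here (a public version is visible when its commit timestamp is at most the reader's start timestamp) are assumptions made for illustration, chosen to be consistent with the abort check applied on writes.

```python
TOMBSTONE = object()  # sentinel marking a deleted item


def read(versions, private, key, start_ts):
    """Return the value visible to a transaction, or None if the item is absent.

    versions: key -> {commit_ts: value} (public versions in the cache)
    private:  key -> value (the reading transaction's own uncommitted writes)
    """
    if key in private:
        # A transaction always sees its own updates, including its deletions.
        value = private[key]
    else:
        visible = [ts for ts in versions.get(key, {}) if ts <= start_ts]
        if not visible:
            return None
        value = versions[key][max(visible)]
    return None if value is TOMBSTONE else value
```

A transaction that started before the deletion committed still finds the older version, while one that started afterwards sees the tombstone and gets nothing back.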
Because memory (the cache) is limited, not all the versions produced over time can be kept in the cache (in memory or even hibernated to disk). They are therefore removed from the cache when they are no longer needed, in order to free space.
This task is handled by the mechanism for removing unnecessary copies, or garbage collection. To perform garbage collection, the application server of each replica propagates to the other replicas the oldest (lowest-valued) start timestamp among the active (uncommitted) transactions at that replica. To reduce cost, this information can be carried inside the messages that propagate transaction updates. Each replica maintains an up-to-date vector with the oldest start timestamp of every replica. Each replica removes the versions of the data that are older than the oldest (lowest-valued) timestamp among those of all replicas. Versions tagged as version -1 require different processing. If a version Xi of an item (i other than -1) is no longer needed (there is no active transaction with a start timestamp lower than i), then version -1 of the item is not needed either. If the cache is full of versions that cannot be discarded, they can always be hibernated (stored in persistent storage other than the database, as in the hibernation that JEE application servers use to evict data from the cache). Hibernated data (all versions of the item) can be brought back into memory at any time by the application server.
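A possible shape for the garbage collector is sketched below in Python. The function `collect_garbage` and its arguments are hypothetical. The sketch makes one conservative assumption beyond the paragraph above: for each item it keeps the newest version at or below the global minimum start timestamp, so that the oldest active transaction can still read its snapshot, in line with the removal condition stated in the claims. Handling of version -1 and of replicas with no active transactions is not modelled.

```python
def collect_garbage(versions, oldest_start_by_replica):
    """Remove versions that no active transaction at any replica can read.

    versions: key -> {commit_ts: value}
    oldest_start_by_replica: replica id -> oldest start timestamp of its
        active transactions (the vector each replica maintains)
    Returns the number of versions removed.
    """
    horizon = min(oldest_start_by_replica.values())
    removed = 0
    for by_ts in versions.values():
        old = sorted(ts for ts in by_ts if ts <= horizon)
        # Keep the newest version at or below the horizon; drop older ones.
        for ts in old[:-1]:
            del by_ts[ts]
            removed += 1
    return removed
```

With versions of x at timestamps 1, 4, 6 and 9 and a global minimum start timestamp of 6, this removes versions 1 and 4 and keeps 6 and 9.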
The procedure proposed for the centralized model is a simplification of the local processing proposed for the replicated model. Since the centralized model has no remote transactions, neither update propagation nor the global validation phase that checks for conflicts with remote transactions is needed. In all other respects the centralized model follows the same steps as the local processing of the replicated model.
Industrial application
The invention is applicable in the industrial sector of multi-layer information systems. A representative example is application servers used in combination with databases, which provide the two layers to which the invention can be applied. Application servers generally maintain a cache of the database data to increase their efficiency. At present, application servers guarantee serializability. By applying the present invention (centralized model) they will be able to offer snapshot isolation.
Within the same field of application servers, replication solutions (also known as clustering) currently exist to provide availability and increase scalability. Scalability under serializability is very limited because of conflicts between reads and writes. The replicated model of the present invention avoids read-write conflicts, which makes it more scalable, while providing snapshot isolation, an isolation level very close to serializability.
References
[Adya99] Adya, A. 1999. Weak Consistency: a Generalized Theory and Optimistic Implementations for Distributed Transactions. PhD Thesis. Massachusetts Institute of Technology.
[Berenson95] Berenson, H., Bernstein, P., Gray, J., Melton, J., O'Neil, E., and O'Neil, P. 1995. A critique of ANSI SQL isolation levels. In Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data (San Jose, California, United States, May 22-25, 1995). M. Carey and D. Schneider, Eds. SIGMOD '95. ACM, New York, NY, 1-10.
[Bernstein87] Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman: Concurrency Control and Recovery in Database Systems. Addison-Wesley 1987.
[Bressoud98] Thomas C. Bressoud, John E. Ahern, Kenneth P. Birman, Robert C. B. Cooper, Bradford B. Glade, Fred B. Schneider, John D. Service. Transparent fault tolerant computer system. United States Patents 5802265 (1998) and 5968185 (1999).
[Jacobs04] Lawrence Jacobs, Xiang Liu, Shehzaad Nakhoda, Zheng Zeng, Rajiv Mishra. Multi-version data caching. United States Patent 6785769.
[JEE] JSR-000244 Java™ Platform, Enterprise Edition 5 Specification. Java Community Process. 8 May 2006.
[Jimenez00] Ricardo Jiménez-Peris, Marta Patiño-Martínez, Sergio Arévalo: Deterministic Scheduling for Transactional Multithreaded Replicas. SRDS 2000: 164-173.
[Kemme00] Bettina Kemme, Gustavo Alonso: Don't Be Lazy, Be Consistent: Postgres-R, A New Way to Implement Database Replication. In Proceedings of VLDB Conf. 2000. pp. 134-143.
[Lin05] Yi Lin, Bettina Kemme, Marta Patiño-Martínez, Ricardo Jiménez-Peris: Middleware based Data Replication providing Snapshot Isolation. SIGMOD Conference 2005: 419-430.
[Mattis01] Peter Mattis, John Plevyak, Adam Beguelin, Brian Totty, David Gourley, Matthew Haines. Garbage collection in an object cache. United States Patent 6209003. 2001.
[Moser03] Moser, Louise, E.; Melliar-Smith, Peter, M. Transparent consistent active replication of multithreaded application programs. European Software Patent EP1495414. 2003.
[Moser03b] Moser, Louise, E.; Melliar-Smith, Peter, M. Transparent consistent semi-active and passive replication of multithreaded application programs. US Patent (WO/2003/084116).
[Narasimhan02] Priya Narasimhan, Louise E. Moser, P. M. Melliar-Smith: Eternal - a component-based framework for transparent fault-tolerant CORBA. Softw., Pract. Exper. 32(8): 771-788 (2002).
[Patiño05] Marta Patiño-Martínez, Ricardo Jiménez-Peris, Bettina Kemme, Gustavo Alonso: MIDDLE-R: Consistent database replication at the middleware level. ACM Trans. Comput. Syst. 23(4): 375-423 (2005).
[Plattner04] Christian Plattner and Gustavo Alonso. Ganymed: Scalable Replication for Transactional Web Applications. In Proceedings of ACM/IFIP/USENIX 5th International Middleware Conference, Toronto, Canada, October 18-22, 2004.
[Vigna95] Del Vigna, Jr.; Paul. System and method for providing a fault tolerant computer program runtime support environment. US Patent 5621885. 1995.
[Wu04] Huaigu Wu, Bettina Kemme, Vance Maverick: Eager Replication for Stateful J2EE Servers. In Proceedings of CoopIS/DOA/ODBASE (2) 2004: 1376-1394.
[Zhao05] Wenbing Zhao, Louise E. Moser, P. M. Melliar-Smith: Unification of Transactions and Replication in Three-Tier Architectures Based on CORBA. IEEE Trans. Dependable Sec. Comput. 2(1): 20-33 (2005).
Claims
1.- Cache system for application servers in multi-layer transactional systems, characterized in that it comprises at least one cache that keeps a copy of the data stored in the database layer, said layer being transactional or non-transactional, and wherein, in addition, at least one version of each data item, or sufficient information to generate said item, is maintained according to the values taken by the items in each committed transaction, and wherein, in addition, said database layer provides snapshot isolation or a generalized version of it such as PL2+.
2.- Data management method for the system described in claim 1, characterized in that it comprises at least the stages of: a. a first stage of providing each executing transaction with an image of the database with the same content it had at the moment the transaction started; and b. a stage of enabling the commit only of transactions that satisfy the following condition: there is no committed transaction that i. is concurrent with the transaction to be committed and ii. modifies data in common.
3.- Method according to claim 2, characterized in that, to remove an unnecessary version from the cache, it performs the following steps: a. if there are transactions in execution, for each data item, a version produced by a committed transaction is removed only when a later version of the item, created by a committed transaction, exists in the cache and all transactions in execution started after that transaction committed; and b. if there are no transactions in execution, all versions except the most recent one are removed.
4. The data management method according to claims 2 and 3, characterized in that the cache evicts (hibernates) data to a data repository other than the database layer.
5. A method for managing replicated data for the system described in claim 1, characterized by:
a. replicating the system, taking application server-database pairs as the unit of replication;
b. executing each transaction locally at any one replica of the system;
c. maintaining the same data and values in the databases of all replicas;
d. providing each executing transaction with an image of the database having the same content it had at the moment the transaction started; and
e. allowing the commit of only those transactions that satisfy the following condition: there is no committed transaction that 1) is concurrent with the transaction to be committed and 2) modifies data in common with it.
6. The method according to claim 5, characterized in that, to remove an unnecessary version from the cache, it performs the following steps:
a. if there are running transactions then, for each data item, versions of the item produced by a committed transaction are removed only when the cache holds a later version of the item created by a committed transaction and all running transactions started after that transaction committed; and
b. if there are no running transactions, all versions except the most recent one are removed.
7. The data management method according to claims 5 and 6, characterized in that the cache evicts (hibernates) data to a data repository other than the database layer.
8. The method according to claims 5 to 7, characterized in that it can be combined with a session replication method.
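The commit condition of claim 2 and the version-removal rule of claim 3 can be illustrated with a minimal, single-process sketch. It is not the implementation described in the application: the names `MultiVersionCache`, `Transaction`, and `collect_garbage` are hypothetical, timestamps are assumed to come from a simple logical counter, and replication (claims 5 to 8) and eviction to a separate repository (claims 4 and 7) are left out.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based hashing so transactions can live in a set
class Transaction:
    start_ts: int                               # logical time of the snapshot
    writes: dict = field(default_factory=dict)  # private, uncommitted updates


class MultiVersionCache:
    """Illustrative multi-version cache; all names here are assumptions."""

    def __init__(self):
        self.clock = 0      # logical commit counter (assumed timestamp source)
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.active = set() # transactions that have started but not finished

    def begin(self):
        tx = Transaction(start_ts=self.clock)
        self.active.add(tx)
        return tx

    def read(self, tx, key):
        # A transaction sees its own writes first, then the newest version
        # that was committed no later than its start (claim 2, step a).
        if key in tx.writes:
            return tx.writes[key]
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= tx.start_ts:
                return value
        return None

    def write(self, tx, key, value):
        tx.writes[key] = value

    def commit(self, tx):
        self.active.discard(tx)
        # Claim 2, step b: refuse the commit if a concurrent transaction
        # has already committed a change to any key this one wrote.
        for key in tx.writes:
            history = self.versions.get(key, [])
            if history and history[-1][0] > tx.start_ts:
                return False
        self.clock += 1
        for key, value in tx.writes.items():
            self.versions.setdefault(key, []).append((self.clock, value))
        return True

    def collect_garbage(self):
        # Oldest snapshot still in use; with no running transactions the
        # horizon is "now", so only the newest version survives (claim 3, b).
        horizon = min((tx.start_ts for tx in self.active), default=self.clock)
        for key, history in self.versions.items():
            keep_from = 0
            for i, (commit_ts, _) in enumerate(history):
                if commit_ts <= horizon:
                    keep_from = i  # newest version visible to every snapshot
            # Older versions are superseded for every running transaction
            # (claim 3, a), so they can be dropped.
            self.versions[key] = history[keep_from:]


if __name__ == "__main__":
    cache = MultiVersionCache()
    setup = cache.begin()
    cache.write(setup, "x", 1)
    cache.commit(setup)

    t1, t2 = cache.begin(), cache.begin()  # concurrent transactions
    cache.write(t1, "x", 10)
    cache.write(t2, "x", 20)
    print(cache.commit(t1), cache.commit(t2))
```

In this sketch the first of two concurrent writers to the same key commits and the second is rejected, which is one way to read the condition "there is no committed transaction that is concurrent and modifies data in common". Reading "that transaction" in claim 3 as the transaction that created the later version is an interpretation made for the sketch.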
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
ESP200703107 | 2007-11-23 | | |
ES200703107A ES2331039A1 (en) | 2007-11-23 | 2007-11-23 | Multi-version cache with relaxed isolation for replicated and non-replicated systems |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009065977A1 (en) | 2009-05-28 |
Family
ID=40667150
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/ES2008/000636 WO2009065977A1 (en) | 2007-11-23 | 2008-10-10 | Multi-version cache with relaxed isolation for replicated and non-replicated systems |
Country Status (2)
Country | Link |
---|---|
ES (1) | ES2331039A1 (en) |
WO (1) | WO2009065977A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6209003B1 (en) * | 1998-04-15 | 2001-03-27 | Inktomi Corporation | Garbage collection in an object cache |
US6289358B1 (en) * | 1998-04-15 | 2001-09-11 | Inktomi Corporation | Delivering alternate versions of objects from an object cache |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6785769B1 (en) * | 2001-08-04 | 2004-08-31 | Oracle International Corporation | Multi-version data caching |
2007
- 2007-11-23: ES application ES200703107A, published as ES2331039A1 (en); not active (Withdrawn)
2008
- 2008-10-10: WO application PCT/ES2008/000636, published as WO2009065977A1 (en); active (Application Filing)
Also Published As
Publication number | Publication date |
---|---|
ES2331039A1 (en) | 2009-12-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Taft et al. | Cockroachdb: The resilient geo-distributed sql database | |
CN109739935B (en) | Data reading method and device, electronic equipment and storage medium | |
US10430298B2 (en) | Versatile in-memory database recovery using logical log records | |
Rao et al. | Using paxos to build a scalable, consistent, and highly available datastore | |
Yan et al. | Carousel: Low-latency transaction processing for globally-distributed data | |
US9009116B2 (en) | Systems and methods for synchronizing data in a cache and database | |
US8117153B2 (en) | Systems and methods for a distributed cache | |
US11132350B2 (en) | Replicable differential store data structure | |
US5966706A (en) | Local logging in a distributed database management computer system | |
US8892509B2 (en) | Systems and methods for a distributed in-memory database | |
EP1840766B1 (en) | Systems and methods for a distributed in-memory database and distributed cache | |
US7779295B1 (en) | Method and apparatus for creating and using persistent images of distributed shared memory segments and in-memory checkpoints | |
Kemme et al. | Database replication: a tale of research across communities | |
US20080077636A1 (en) | Database backup system using data and user-defined routines replicators for maintaining a copy of database on a secondary server | |
US9652346B2 (en) | Data consistency control method and software for a distributed replicated database system | |
US10936576B2 (en) | Replicating storage tables used to manage cloud-based resources to withstand storage account outage | |
Perez-Sorrosal et al. | Elastic SI-Cache: consistent and scalable caching in multi-tier architectures | |
Padhye et al. | Scalable transaction management with snapshot isolation for NoSQL data storage systems | |
CN111444027A (en) | Transaction processing method and device, computer equipment and storage medium | |
CN109783578A (en) | Method for reading data, device, electronic equipment and storage medium | |
US9201685B2 (en) | Transactional cache versioning and storage in a distributed data grid | |
US9430541B1 (en) | Data updates in distributed system with data coherency | |
EP3794458B1 (en) | System and method for a distributed database | |
WO2009065977A1 (en) | Multi-version cache with relaxed isolation for replicated and non-replicated systems | |
Padhye | Transaction and data consistency models for cloud applications |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 08852426; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: PCT application non-entry in European phase | Ref document number: 08852426; Country of ref document: EP; Kind code of ref document: A1 |