HK1179718B - Parallel nested transactions in transactional memory - Google Patents
Parallel nested transactions in transactional memory
- Publication number
- HK1179718B (application HK13106756.6A)
- Authority
- HK
- Hong Kong
- Prior art keywords
- transaction
- write
- lock
- stage
- write lock
Description
This application is a divisional of application No. 200880107300.2 (international application No. PCT/US2008/076563), filed on 2008-09-16 under the title "Parallel nested transactions in transactional memory".
Background
Software Transactional Memory (STM) is a concurrency control mechanism similar to database transactions for controlling access to shared memory in concurrent computation. A transaction in the context of transactional memory is a piece of code that performs a series of reads and writes to shared memory. In other words, a transaction accesses data in one or more objects. An object in the context of transactional memory is a set of connected memory locations that are locked as one entity. An object in this context may also be a static variable, or a set of such variables, or the object may be a set of cache lines.
STM is used as an alternative to traditional locking mechanisms. STM allows concurrent programs to be written more easily. A transaction specifies a sequence of code that should be executed as if it were executing in isolation. This illusion of isolation may be achieved by fine-grained locking of objects, and by executing in a mode that allows the side effects of a transaction to be rolled back if the transaction is found to conflict with some other transaction. If the code generated for a data access is modified to include support for these locking and rollback mechanisms, the access may be said to be "transacted".
Transactions can be nested, and the nesting can be classified as open or closed. If a thread is currently executing a transaction and reaches the beginning of a new atomic block, the atomic block is executed as a closed nested child of the currently executing parent transaction. The nested transaction executes within the same isolation boundary as the enclosing transaction, and its effects become visible only when the enclosing transaction commits, just like other memory accessed in the enclosing transaction. In other words, the parent transaction is effectively suspended, and the closed nested transaction is allowed to run to completion before processing resumes in the parent transaction. When a nested transaction rolls back, its temporary effects are undone and the state of the parent transaction is restored to the point where the nested transaction started.
The "outermost" transactions that a given thread is executing are non-nested; referred to as top-level transactions. The top-level transaction must execute atomically, so the nested transaction becomes a part of it. Nesting can arise if some abstractions a and B each have internal representation invariants that they want to maintain even when used by concurrent threads, and these abstractions thus use atomic blocks when implementing their methods to ensure that concurrent accesses do not violate these invariants. Now assume that some higher level abstraction C uses instances of a and B in its implementation and has some invariants that relate these a and B instances. The method of C may use transactions to ensure that the invariants are not violated. If the A and B methods are used inside the transaction of C, then the transactions in the A and B methods will be nested (in this case).
Existing transactional memory systems do not allow work performed within the isolation boundary of a transaction to be distributed among multiple concurrent threads of execution. In existing systems, a transaction may have only one nested child transaction at a time. The semantics of these systems simply do not allow for such parallelism within a transaction: attempting to execute more than one nested transaction at a time results in out-of-order mixing of nested-transaction log entries in the parent transaction's log, among other errors, and breaks the underlying fine-grained locking protocol that provides the illusion of isolation.
SUMMARY
Various technologies and techniques are disclosed for supporting parallel nested transactions in a transactional memory system. Multiple closed nested transactions are created for a single parent transaction and executed concurrently as parallel nested transactions. Various techniques are used to ensure that the effects of parallel nested transactions are hidden from other transactions outside the parent transaction until the parent transaction commits.
In one implementation, versioned write locks are used with parallel nested transactions. When a transactional memory word changes from a write lock to a versioned write lock, an entry is made in the global versioned write lock map to store a pointer to the write log entry that the versioned write lock replaced. When the versioned write lock is encountered during a transaction, the global versioned write lock map is consulted to convert the versioned write lock to a pointer to a write log entry.
In another implementation, releasing a duplicate write lock for rollback is supported for concurrent transactions. During rollback processing of a parallel nested transaction, a first write log entry representing a write lock is encountered. If the write lock is determined to be a duplicate, a global lock is acquired and used to synchronize access to the global versioned write lock map.
In yet another implementation, optimistic read validation is supported for parallel nested transactions. During optimistic read validation, if the versioned write lock indicates a conflict with a sibling parallel nested transaction, information is consulted to determine whether the parallel nested transaction should be doomed. In one implementation, this information is contained in the versioned write lock and the global versioned write lock map.
In yet other implementations, write lock acquisition is supported for parallel nested transactions. In attempting to acquire a write lock for a parallel nested transaction, the transactional memory word is read and analyzed to determine if the write lock can be acquired. If the transactional memory word indicates a versioned write lock, the global versioned write lock map is accessed to retrieve a write log entry pointer to the first write log entry.
In yet another implementation, pessimistic reads are supported for parallel nested transactions. A pessimistic duplicate detection data structure is created for parallel nested transactions. An entry corresponding to each pessimistic read in the parallel nested transaction is formed into the data structure. Upon committing the parallel nested transaction, a new pessimistic read lock is passed to the immediate parent transaction and an entry is formed into a separate pessimistic duplicate detection data structure for the immediate parent transaction with synchronization between sibling transactions. The pessimistic duplicate detection data structure may also be used for an upgrade from pessimistic read to write lock.
In another implementation, a retry operation is supported for parallel nested transactions. When a transaction that is a parallel nested transaction, or a sub-transaction of one, executes a retry, the transaction's read set is registered for the retry. When it is decided to propagate the retry through the transaction's parallel nested parent, the read set remains registered and becomes part of the parent transaction's read set.
In yet another implementation, a write abort compensation map may be used with parallel nested transactions to detect and handle falsely doomed parent transactions. When a parallel nested transaction rolls back, a write abort compensation map is created, and an entry is created in it corresponding to each new write lock released during the rollback.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Brief Description of Drawings
FIG. 1 is a diagrammatic view of a computer system of one implementation.
FIG. 2 is a diagrammatic view of a transactional memory application of one implementation operating on the computer system of FIG. 1.
Figure 3 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in allowing multiple parallel transactions to be nested under a single parent transaction.
Figure 4 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in allocating parallel nested transaction entries in a parent log when the number of parallel nested transactions is known in advance.
Figure 5 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in allocating parallel nested transaction entries in a parent log when the number of parallel nested transactions is not known in advance.
FIG. 6 illustrates an example structure of a transactional memory word for one implementation.
Figure 7 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in using versioned write locks to ensure that parallel nested transactions are properly nested and composed.
Figures 8 and 9 are process flow diagrams for one implementation of the system of figure 1 illustrating the stages involved in properly handling nested transactions to acquire a write lock.
Figure 10 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in properly handling nested transactions to release any kind of write lock for commit.
Figure 11 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in properly handling nested transactions to release a new write lock in order to rollback.
Figure 12 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in properly handling nested transactions to release a duplicate write lock in order to rollback.
FIG. 13 is a process flow diagram for one implementation of the system of FIG. 1 that illustrates performing an optimistic read lock acquisition.
FIGS. 14A-14D are process flow diagrams for one implementation of the system of FIG. 1 illustrating performing an optimistic read validation.
Figure 15 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in ensuring a correct pessimistic read operation on parallel nested transactions.
Figure 16 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in upgrading a pessimistic read lock to a write lock.
FIGS. 17A-17D are process flow diagrams for one implementation of the system of FIG. 1 illustrating performing pessimistic read lock acquisition.
Figure 18 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in allowing retries to work correctly with respect to parallel nested transactions.
Figure 19 is a process flow diagram for one implementation of the system of figure 1 that illustrates the stages involved in using write abort compensation maps for parallel nested transactions.
Detailed Description
The techniques and methods herein may be described in the general context of a transactional memory system, but the present system also serves other purposes in addition to these. In one implementation, one or more of the techniques described herein may be implemented as features within a framework program such as the Microsoft .NET Framework, or from any other type of program or service that provides platforms for developers to develop software applications. In another implementation, one or more of the techniques described herein are implemented as features of other applications that deal with developing applications that execute in a concurrent environment.
As described in the background section, a nested transaction is considered closed if its effects are part of the same isolation boundary as its containing (i.e., parent) transaction. Using the various techniques and methods described herein, a transaction may have multiple closed nested transactions at the same time. These closed nested transactions are referred to as "parallel nested transactions" (PNTs). All PNTs under a single enclosing transaction are referred to as the "parallel children" of that transaction, and the enclosing transaction is referred to as the "parallel parent". The parallel parent and its children are referred to as a "parallel nest". A "sibling" of a PNT is another PNT nested (at some nesting level) within the same parallel parent transaction. In one implementation, each PNT behaves much like a normal closed nested transaction: its effects are isolated within the enclosing transaction and become visible outside the parallel nest only when the parallel parent commits. However, each PNT is isolated from its siblings as if it were a top-level transaction: the effects of a PNT become visible to its siblings only when it commits.
As shown in FIG. 1, an exemplary computer system for implementing one or more portions of the present system includes a computing device, such as computing device 100. In its most basic configuration, computing device 100 typically includes at least one processing unit 102 and memory 104. Depending on the exact configuration and type of computing device, memory 104 may be volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.) or some combination of the two. This most basic configuration is illustrated in FIG. 1 by dashed line 106.
Additionally, device 100 may also have additional features/functionality. For example, device 100 may also include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated in FIG. 1 by removable storage 108 and non-removable storage 110. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Memory 104, removable storage 108 and non-removable storage 110 are all examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by device 100. Any such computer storage media may be part of device 100.
Computing device 100 includes one or more communication connections 114 that allow computing device 100 to communicate with other computers/applications 115. Device 100 may also have input device(s) 112 such as keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 111 such as a display, speakers, printer, etc. may also be included. These devices are well known in the art and need not be discussed at length here. In one implementation, computing device 100 includes transactional memory application 200. Transactional memory application 200 will be described in more detail in FIG. 2.
Turning now to FIG. 2 with continued reference to FIG. 1, transactional memory application 200 operating on computing device 100 is illustrated. Transactional memory application 200 is one of the application programs that reside on computing device 100. However, it will be understood that transactional memory application 200 can alternatively or additionally be embodied as computer-executable instructions on one or more computers and/or in different variations than shown on FIG. 1. Alternatively or additionally, one or more portions of transactional memory application 200 can be a portion of system memory 104, on other computers and/or applications 115, or other such variations as would occur to one in the computer software art.
Transactional memory application 200 includes program logic 204, which is responsible for performing some or all of the techniques described herein. Program logic 204 includes logic 206 to allow multiple parallel transactions to be nested under a single parent transaction (as described below with reference to FIG. 3); logic 208 for allowing the parallel nested transactions to be properly nested and composed with other closed nested transactions (as described below with reference to FIGS. 7-14); logic 210 for allowing parallel nested transactions to execute with independent logs and pass lock ownership and logs to parent transactions with low contention (as described below with reference to FIGS. 4-5); logic 212 for enabling parallel nested transactions with both optimistic and pessimistic reads within the same transaction (as described below with reference to FIGS. 15-17); logic 214 to enable parallel nested transactions for in-place or buffered writes (as described below with reference to FIGS. 7, 11, and 12); logic 216 to allow retries to work correctly with respect to parallel nested transactions (as described below with reference to FIG. 18); logic 218 to allow write abort compensation maps to work with parallel nested transactions (as described below with reference to FIG. 19); and other logic 220 for operating the application.
Turning now to FIGS. 3-19 with continued reference to FIGS. 1-2, the stages for implementing one or more implementations of transactional memory application 200 are described in more detail. In some implementations, the processes of FIGS. 3-19 are at least partially implemented in the operating logic of computing device 100. FIG. 3 illustrates one implementation of the stages involved in allowing multiple parallel transactions to be nested under a single parent transaction. The process begins at start point 240 with providing functionality or other features that will create a set of parallel nested transactions ready for execution (stage 242). The set of parallel nested transactions is created for a given parent transaction all at once, or lazily as needed (stage 244). Special logic is used to transactionally execute, commit, rollback, re-execute, and retry the parallel nested transactions (stage 246). A parallel nested transaction may be destroyed only after it and all of its parallel siblings have completed (stage 248). The process ends at end point 250.
FIGS. 4 and 5 show how parallel nested transaction entries in a parent log are allocated. Once allocated according to one of the techniques described in FIGS. 4 and 5, parallel nested transactions can execute with independent logs and pass lock ownership, logs, and other data to the parent transaction with low contention. Turning now to FIG. 4, one implementation for allocating parallel nested transaction entries in a parent log when the number of parallel nested transactions is known in advance is shown. The process begins at start point 270 with creating a Parallel Nested Transaction Entry (PNTE) for each Parallel Nested Transaction (PNT) before any PNT begins execution, which is possible because the number of parallel nested transactions is known in advance (stage 272). A parallel nested transaction retrieves its parallel nested transaction entry from the parent transaction with minimal synchronization when needed during rollback or commit processing. In one implementation, the pre-allocated PNTEs are maintained in an array in the parent transaction, and a PNT obtains a pointer to the next available PNTE through a simple compare and swap operation on the next-available index (stage 274). Compare and swap (CAS) is an operation that atomically compares a given value with the contents of a given memory cell and, if they match, stores a given new value in that cell; if they do not match, no action is taken. There are many ways, some in hardware and some in software, to perform CAS operations on different CPUs and operating systems. As used herein, the term CAS covers all such methods.
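As an illustration of stage 274, the following sketch (in Java, with hypothetical names) claims the next pre-allocated PNTE with a single CAS on a shared index; the PnteEntry fields mirror the information described at stage 276 below.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of stages 272-276: PNTEs are pre-allocated in an array in
// the parent transaction, and each PNT claims the next slot with one CAS.
final class ParentLog {
    final PnteEntry[] pntes;                               // one entry per expected PNT
    private final AtomicInteger nextIndex = new AtomicInteger(0);

    ParentLog(int parallelChildCount) {
        pntes = new PnteEntry[parallelChildCount];
        for (int i = 0; i < parallelChildCount; i++) pntes[i] = new PnteEntry();
    }

    // Stage 274: a compare and swap on the next-available index claims a PNTE.
    PnteEntry claimEntry() {
        while (true) {
            int i = nextIndex.get();
            if (i >= pntes.length) throw new IllegalStateException("all PNTEs claimed");
            if (nextIndex.compareAndSet(i, i + 1)) return pntes[i];
        }
    }
}

// Stage 276: the PNT fills its entry with pointers to its log and related tables.
final class PnteEntry {
    Object childLog;                    // the child transaction's log
    Object writeAbortCompensationMap;   // WACM, if any (FIG. 19)
    Object pessimisticReadObjectTable;  // PROT, if any (FIG. 15)
}
```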
The parallel nested transaction populates the parallel nested transaction entries in the parent log with information (stage 276). In one implementation, this information includes a pointer to the log of the child transaction and a pointer to the write abort compensation map and/or pessimistic read object table (as appropriate) of the child transaction. This write abort compensation map is described in more detail in FIG. 19. The pessimistic read object table is described in more detail in FIG. 15. The process ends at end point 278.
Figure 5 illustrates one implementation of the stages involved in allocating parallel nested transaction entries in a parent log when the number of parallel nested transactions is not known in advance. The process begins at start point 290 with allocating space in the parent log for a new parallel nested transaction entry when the parallel nested transaction is created (stage 292). When the parallel nested transaction commits, a parallel nested transaction entry is created in the parent log (stage 294). Accesses to the parent log are synchronized to move the current log point of the parent transaction to obtain access to the next available space of the parallel nested transaction entries (stage 296). The parallel nested transaction populates the parent log with information for the parallel nested transaction entry (stage 298). In one implementation, this information includes a pointer to the log of the child transaction and a pointer to the write abort compensation map and/or pessimistic read object table (as appropriate) of the child transaction. This write abort compensation map is described in more detail in FIG. 19. The pessimistic read object table is described in more detail in FIG. 15. The process ends at end point 300.
FIG. 6 illustrates an example structure 320 of a Transactional Memory Word (TMW) for one implementation, having various bits 322 for marking various lock states. In one implementation, a TMW is associated with each object. In one implementation, a single hardware word representation is used; numerous other representations of the TMW that allow storage of lock information are possible. In the example structures described herein, a TMW indicates whether the associated object is write-locked or not, with one bit in the TMW dedicated to this distinction. If the TMW is not write-locked, it contains a version number and a pessimistic reader count, together called a version/count pair 324 (V/C pair): a certain number of bits record the version number of the object, while the remaining bits count the transactions that currently hold a pessimistic read lock on the object. When a transaction holds a write lock on an object, the lock may be of two types (distinguished by another bit in the TMW). Typically, the remaining bits of a write-locked TMW contain a pointer 326 to an entry in the write log of the locking transaction; this Write Log Entry (WLE) contains other information about the object and the lock. For example, the WLE may contain the TMW value of the object from before the write lock was acquired; in another implementation, the WLE may contain a "shadow copy" of the object to which uncommitted modifications are made. In the other state, a write-locked TMW contains a Versioned Write Lock (VWL). Here, the remaining bits of the TMW (referred to as VWL[V/C] 328) represent the version number of the object and the count of pessimistic readers while the TMW remains write-locked, similar to a V/C pair.
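The following sketch shows one possible 64-bit packing of the TMW states just described. The specific bit widths and masks are assumptions for illustration, not the layout of the patented system; a WLE is referenced by index because Java cannot store a raw pointer in a word.

```java
// Illustrative 64-bit TMW encoding (bit assignments are assumed): one bit says
// "write-locked", a second distinguishes a WLE pointer from a versioned write
// lock, and an unlocked word holds a V/C pair.
final class Tmw {
    static final long WRITE_LOCK_BIT = 1L << 63;
    static final long VWL_BIT        = 1L << 62;
    static final int  COUNT_BITS     = 20;                    // pessimistic reader count
    static final long COUNT_MASK     = (1L << COUNT_BITS) - 1;

    // Unlocked state 324: version number plus count of pessimistic readers.
    static long makeVcPair(long version, long readers) {
        return (version << COUNT_BITS) | (readers & COUNT_MASK);
    }
    // Write-locked state 326: remaining bits reference a WLE (an index here).
    static long makeWleLock(int wleIndex) { return WRITE_LOCK_BIT | (wleIndex & 0xFFFFFFFFL); }
    // Write-locked state 328: VWL[V/C], a version and reader count kept while
    // the word remains write-locked.
    static long makeVwl(long version, long readers) {
        return WRITE_LOCK_BIT | VWL_BIT | (version << COUNT_BITS) | (readers & COUNT_MASK);
    }

    static boolean isWriteLocked(long tmw) { return (tmw & WRITE_LOCK_BIT) != 0; }
    static boolean isVwl(long tmw)         { return isWriteLocked(tmw) && (tmw & VWL_BIT) != 0; }
    static boolean isVcPair(long tmw)      { return !isWriteLocked(tmw); }
    static long version(long tmw)  { return (tmw & ~(WRITE_LOCK_BIT | VWL_BIT)) >>> COUNT_BITS; }
    static long readers(long tmw)  { return tmw & COUNT_MASK; }
    static int  wleIndex(long tmw) { return (int) (tmw & 0xFFFFFFFFL); }
}
```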
Before proceeding to a more detailed discussion of how versioned write locks are used, let us first explore an example that illustrates the need for Versioned Write Locks (VWLs). Assume an in-place STM system in which a top-level transaction Tx1 acquires a write lock on object O1. The TMW of O1 is set to WLE1, a write log entry in Tx1's log that represents Tx1's write lock on O1. Now assume that two PNTs, Tx2 and Tx3, are introduced with Tx1 as their parallel parent. Tx2 and Tx3 are siblings in the parallel nest and execute concurrently. Tx2 and Tx3 may access data locked by Tx1, but they must be isolated from each other. Thus, while both Tx2 and Tx3 may access O1, they must not be allowed to do so at the same time. Now, assume Tx3 wishes to read from O1. It performs an optimistic read operation, creating an optimistic read log entry in its log that records the value of O1's TMW as WLE1. Next, suppose Tx2 writes to O1. Tx2 acquires a write lock on O1 and sets the TMW of O1 to WLE2, a write log entry in Tx2's log. WLE2 records that WLE1 is the previous value of O1's TMW. Tx2 may now write to the fields of O1, and does so in a write-in-place fashion. As Tx3 continues to execute, it reads fields of O1 that contain uncommitted writes from Tx2. Tx3 is doomed by definition and should be rolled back. However, if for any reason Tx2 rolls back before Tx3 attempts to commit, then Tx2 must release its write lock on O1. To do so, Tx2 would normally set the TMW of O1 back to WLE1. But now, when Tx3 tries to commit, it will see that the TMW of O1 contains the same value it saw when it first read O1. Tx3 will appear valid, and the fact that it read an uncommitted write from Tx2 will go unrecognized. Therefore, when Tx2 rolls back, it must set the TMW of O1 to some value other than WLE1, and it must do so in a way that ensures that other transactions in the system (PNT siblings or other top-level transactions) recognize that O1 is still write-locked by Tx1. This is accomplished by setting the TMW of O1 to a Versioned Write Lock (VWL) and forming an entry in the global Versioned Write Lock Map (VWLM) indicating that Tx1 holds a write lock on O1. The details and uses of the VWL and VWLM are described below. This example shows one case where a VWL is necessary; those of ordinary skill in the art will appreciate that there are numerous situations in which VWLs may be used, as will become apparent as the processes for lock acquisition and release are described in the remainder of this section.
Figure 7 illustrates one implementation of the stages involved in using versioned write locks to ensure that parallel nested transactions are properly nested and composed. The process begins at start point 340 with providing a transactional memory word that may hold one of a version/count pair, a Write Log Entry (WLE) pointer, or a Versioned Write Lock (VWL) (stage 342). When a Transactional Memory Word (TMW) becomes a versioned write lock, an entry is formed in a global Versioned Write Lock Map (VWLM), indexed by object address, that records the old write log entry pointer the versioned write lock replaces (stage 344). When a versioned write lock is seen, the global versioned write lock map is consulted to convert the versioned write lock back to a write log entry pointer, which is then processed normally (stage 346). Whether held in a V/C pair or in a versioned write lock, the version number in the transactional memory word is always incremented during commit or abort processing (stage 348). The entry is formed in the global versioned write lock map just prior to placing the versioned write lock in the transactional memory word, and the entry is preserved as long as the transactional memory word contains the versioned write lock (stage 350). Any transaction that accesses the global versioned write lock map must first acquire a global lock (stage 352). The process ends at end point 354.
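A minimal sketch of the VWLM protocol in stages 344-352, continuing the hypothetical types from the earlier sketches; a single global lock guards the map, as stage 352 requires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// The object's lock word; one per transactable object (assumed representation).
final class ObjRecord {
    final AtomicLong tmw = new AtomicLong(Tmw.makeVcPair(1, 0));
}

// Sketch of stages 344-352: a process-wide map from locked object to the WLE
// that its versioned write lock displaced, guarded by one global lock.
final class VersionedWriteLockMap {
    private static final Object GLOBAL_LOCK = new Object();
    private static final Map<ObjRecord, Integer> MAP = new HashMap<>(); // object -> WLE index

    // Stages 344/350: register the displaced WLE just before storing the VWL,
    // and keep the entry for as long as the TMW holds the VWL.
    static void put(ObjRecord obj, int displacedWle) {
        synchronized (GLOBAL_LOCK) { MAP.put(obj, displacedWle); }
    }
    // Stage 346: convert a VWL back to the WLE pointer it replaced (-1 if absent).
    static int resolve(ObjRecord obj) {
        synchronized (GLOBAL_LOCK) { return MAP.getOrDefault(obj, -1); }
    }
    static void remove(ObjRecord obj) {
        synchronized (GLOBAL_LOCK) { MAP.remove(obj); }
    }
}
```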
Figures 8 and 9 illustrate one implementation of the stages involved in properly handling nested transactions to acquire a write lock. A new write lock is the first lock acquired on an object within the transaction nest, while a duplicate write lock is a write lock that a nested transaction may acquire on an object that an ancestor transaction currently has write-locked. The process begins at start point 370 with a transaction in the nest reading a transactional memory word (stage 372). If the transactional memory word is not a V/C pair (decision point 374), the process continues to FIG. 9, stage 404, described below. If the transactional memory word is a V/C pair and C (the count of pessimistic readers) is greater than 0 (decision point 375), the process of FIG. 16 is performed to handle the upgrade of a pessimistic read to a write lock. If the transactional memory word is a V/C pair and C equals 0 (decision point 375), this indicates that the transactional memory word is not locked for reading or writing by any transaction, so a new write lock may be acquired. To do so, the system forms a new write log entry to record the version number and appends the write log entry to the write log of the transaction (stage 376). A compare and swap is then performed to swap the transactional memory word value from the V/C pair to a write log entry pointer (e.g., WLE) (stage 378). If the compare and swap is successful (decision point 380), the lock is successfully acquired (stage 382). If the compare and swap is not successful (decision point 380), then a conflict exists (stage 384) and the lock is not acquired. The process ends at end point 386.
Continuing to FIG. 9, if the TMW is not a V/C pair (decision point 374 on FIG. 8) and the transactional memory word is not a versioned write lock (decision point 404), then it is a pointer to a write log entry WLE_A, and the process continues at stage 408. If the transactional memory word is a versioned write lock (decision point 404), then the global versioned write lock map is used to retrieve the underlying pointer to WLE_A (stage 406). If WLE_A is not owned by any ancestor transaction (decision point 408), then there is a conflict (stage 410) and the process ends at end point 420. If WLE_A is owned by an ancestor (decision point 408), a new write log entry WLE_B is formed to record WLE_A, and WLE_B is appended to the write log of the transaction (stage 412). A compare and swap is then performed to swap the transactional memory word value from WLE_A to WLE_B (stage 414). If the compare and swap is successful (decision point 416), the lock is successfully acquired (stage 418). If the compare and swap is not successful (decision point 416), then there is a conflict (stage 410) and the lock is not acquired. The process ends at end point 420.
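The decision flow of FIGS. 8 and 9 might look as follows in code, continuing the hypothetical Tmw, ObjRecord, and VersionedWriteLockMap sketches. The Tx interface stands in for transaction bookkeeping the text leaves unspecified; all of its methods are assumptions.

```java
// Transaction bookkeeping assumed by the sketch; all methods are hypothetical.
interface Tx {
    int appendNewWle(ObjRecord obj, long savedVcPair);  // stage 376: record prior version
    int appendDuplicateWle(ObjRecord obj, int wleA);    // stage 412: WLE_B records WLE_A
    boolean ownedByAncestor(int wleIndex);              // decision 408: this tx or an ancestor?
}

final class WriteLockAcquire {
    // Returns true if the write lock was acquired; false signals a conflict
    // (or, for a V/C pair with readers, the FIG. 16 upgrade path).
    static boolean acquire(Tx tx, ObjRecord obj) {
        long tmw = obj.tmw.get();                                      // stage 372
        if (Tmw.isVcPair(tmw)) {
            if (Tmw.readers(tmw) > 0) return false;                    // decision 375: FIG. 16 upgrade
            int wle = tx.appendNewWle(obj, tmw);                       // stage 376
            return obj.tmw.compareAndSet(tmw, Tmw.makeWleLock(wle));   // stages 378-384
        }
        int wleA = Tmw.isVwl(tmw) ? VersionedWriteLockMap.resolve(obj) // stage 406
                                  : Tmw.wleIndex(tmw);
        if (wleA < 0 || !tx.ownedByAncestor(wleA)) return false;       // stage 410: conflict
        int wleB = tx.appendDuplicateWle(obj, wleA);                   // stage 412
        return obj.tmw.compareAndSet(tmw, Tmw.makeWleLock(wleB));      // stages 414-418
    }
}
```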
Figure 10 illustrates one implementation of the stages involved in properly handling nested transactions to release any kind of write lock for commit. The process begins at start point 440 with encountering a write log entry WLE_X in the transaction's write log during commit processing (stage 442). Ownership of the write lock and of WLE_X is simply transferred to the immediate parent transaction (stage 444). Because ownership passes from the parallel nested transaction to the parent at commit, sibling transactions will now see that they may acquire the write lock for themselves. Likewise, the act of acquiring a duplicate write lock prevents sibling transactions from acquiring the write lock for themselves while it is held. The process ends at end point 446.
Figure 11 illustrates one implementation of the stages involved in properly handling nested transactions to release a new write lock for rollback. The process begins at start point 460 with encountering a write log entry WLE_X in the transaction's write log during rollback processing (stage 462). The system checks what type of write lock WLE_X represents (stage 464). If WLE_X does not represent a new write lock (decision point 466), then the logic for releasing duplicate write locks is executed (stage 471), as described with reference to one implementation in FIG. 12. If WLE_X represents a new write lock (decision point 466), then the previous version number stored in WLE_X is fetched (stage 468). The transactional memory word location is retrieved from WLE_X, and the write lock is released using a normal store operation to change the transactional memory word value to a V/C pair appropriate for the system type (in-place or buffered) (stage 470). In one implementation of a buffered system, the transactional memory word value is changed back to represent the original version number. In one implementation of an in-place system, the transactional memory word value is changed to represent the original version number plus one. The process ends at end point 472.
Figure 12 illustrates one implementation of the stages involved in properly handling nested transactions to release a duplicate write lock for rollback. In one implementation, this process is only used for systems that increment the version number of an object at rollback, i.e., in-place systems. In some implementations of buffered systems, the version number is not incremented during rollback; in those systems, the process for releasing a new write lock (FIG. 11) may be used to release a duplicate write lock. The process begins at start point 490, where a write log entry WLE_X is encountered in the transaction's write log during rollback processing (stage 492). The system checks what type of write lock WLE_X represents (stage 494). If the lock is not a duplicate write lock (decision point 496), then the logic for releasing a new write lock is executed (stage 511), as described with reference to one implementation in FIG. 11. If the lock is a duplicate write lock, a global lock is acquired to synchronize access to the global versioned write lock map (stage 498). The original write log entry pointer WLE_Y and the location of the transactional memory word are retrieved from WLE_X (stage 500). A new global versioned write lock map entry is formed corresponding to the object that WLE_Y refers to (stage 502). The global lock used to synchronize access to the global versioned write lock map is then released (stage 504). The original version number is retrieved from WLE_X (stage 506), and a new versioned write lock value is formed as the original version number plus one (stage 508). The write lock is released by using a normal store operation to change the transactional memory word value from WLE_X to the new versioned write lock (stage 510). The process ends at end point 512.
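For an in-place system, the duplicate-lock release of FIG. 12 reduces to a few steps under the same assumed types; wleY and originalVersion are taken from WLE_X as in stages 500 and 506, the VWLM handles its own global locking (stages 498-504), and note the plain store at the end.

```java
// Sketch of FIG. 12: roll back a duplicate write lock by reinstating the
// ancestor's lock as a versioned write lock with the version bumped by one.
final class DuplicateLockRollback {
    static void release(ObjRecord obj, int wleY, long originalVersion) {
        // Stages 498-504: form the VWLM entry mapping the object to WLE_Y
        // (the map synchronizes on the global lock internally).
        VersionedWriteLockMap.put(obj, wleY);
        // Stages 506-508: new VWL value = original version + 1, no readers.
        long vwl = Tmw.makeVwl(originalVersion + 1, 0);
        // Stage 510: a normal store, not a CAS, releases the duplicate lock.
        obj.tmw.set(vwl);
    }
}
```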
FIG. 13 is a process flow diagram for one implementation of the system of FIG. 1 that illustrates performing an optimistic read lock acquisition. The process begins at start point 530 with reading a current TMW value of an object when a transaction performs an optimistic read of the object (stage 532). An optimistic read log entry is created (stage 534) and populated with the current TMW value and the location of the TMW (stage 536). The read log entry is appended to the read log of the transaction (stage 538). In one implementation, the process is the same for all types of transactions: top level, simple nested, or parallel nested transactions. The process ends at end point 540.
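FIG. 13 amounts to a snapshot of the TMW into the read log, as in this sketch (same assumed types as above):

```java
import java.util.List;

// Sketch of FIG. 13: an optimistic read logs the TMW value and its location.
final class ReadLogEntry {
    final ObjRecord obj;   // location of the TMW (stage 536)
    final long savedTmw;   // TMW value at the time of the read
    ReadLogEntry(ObjRecord obj, long savedTmw) { this.obj = obj; this.savedTmw = savedTmw; }
}

final class OptimisticRead {
    static void logRead(List<ReadLogEntry> readLog, ObjRecord obj) {
        long tmw = obj.tmw.get();                  // stage 532
        readLog.add(new ReadLogEntry(obj, tmw));   // stages 534-538
    }
}
```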
FIGS. 14A-14D are process flow diagrams for one implementation of the system of FIG. 1 illustrating the performance of optimistic read validation. The process begins at start point 560 with considering each optimistic read log entry in the transaction's read log, in an attempt to commit the transaction or otherwise determine whether the transaction is valid (stage 562). The original TMW value is retrieved from the read log entry (stage 564) and the current TMW value is read (stage 566). Note that in each case, if the system is using a Write Abort Compensation Map (WACM) (described in FIG. 19), the current aggregated WACM (formed during validation or commit processing) is consulted whenever the two version numbers differ. If the original TMW is a V/C pair (decision point 568), the process described in FIG. 14B is performed. If the original TMW is a WLE (decision point 570), the process described in FIG. 14C is performed. If the original TMW is neither a V/C pair nor a WLE, then the original TMW is a VWL (stage 572), and the process of FIG. 14D is performed. Let us look at each of these cases in more detail.
FIG. 14B covers more detail about an exemplary process performed during optimistic read validation in one implementation when the original TMW is a V/C pair (decision point 568). If the current TMW is a V/C pair (decision point 590) and the version numbers in the original TMW and the current TMW match (decision point 592), the transaction is valid (stage 594). If the current TMW is a V/C pair (decision point 590) and the version numbers in the original TMW and the current TMW do not match (decision point 592), the transaction is invalidated (stage 596).
If the current TMW is WLE instead (decision point 598), and if the transaction owns WLE, and the saved version number in WLE matches the old TMW (decision point 600), then the transaction is valid (stage 594). If the current TMW is not a V/C pair (decision point 590) and the current TMW is not WLE (decision point 598), then the TMW is VWL (stage 602). The address of the locked object is used for a synchronous lookup in the VWLM. If a VWLM entry does not exist (decision point 604), the transaction is invalid (stage 596). If an entry exists (decision point 604), the VWLM entry is used to retrieve the WLE replaced by VWL. If the transaction owns the WLE and the saved version number in the WLE matches the old TMW (decision point 606), then the transaction is valid (stage 594). Otherwise, the transaction is invalidated (stage 596). The process ends at end point 608.
Turning now to FIG. 14C, an exemplary process performed during optimistic read validation in one implementation when the original TMW is WLE is shown. If the current TMW is a V/C pair (decision point 620), then the current transaction is invalid (stage 630). If the current TMW is not a V/C pair (decision point 620) but is instead a WLE (decision point 624), the system checks to see if the original and current TMWs match and if the current transaction or any ancestor transaction owns the WLE (decision point 626). If both criteria are met, the transaction is valid (stage 628). Otherwise, the transaction is invalid (stage 630).
If the current TMW is not a V/C pair (decision point 620) and is not WLE (decision point 624), then the current TMW is VWL (stage 632). If the transaction or any ancestor transaction owns the WLE from the original TMW, and if the version number stored in the WLE matches the version number in the VWL (decision point 634), then the transaction is valid (628). Otherwise, the transaction is invalidated (stage 630). The process ends at end point 636.
Turning now to FIG. 14D, an exemplary process performed during optimistic read validation in one implementation when the original TMW is VWL is shown. If the current TMW is a V/C pair (decision point 650), then the transaction is invalidated due to a conflict (stage 660). If the current TMW is not a V/C pair (decision point 650), but is instead a WLE (decision point 654), the system checks to see if the current transaction or any ancestor transaction owns a WLE, and if the version number stored in the WLE matches the version in the VWL (decision point 656). If both criteria are met, the transaction is valid (stage 658). Otherwise, the transaction is invalidated (stage 660).
If the current TMW is not a V/C pair (decision point 650) and the current TMW is not WLE (decision point 654), then the current TMW is VWL (stage 662). A lookup is performed in VWLM to convert VWL to WLE (stage 664). If an entry is not found (decision point 666), the transaction is invalidated (stage 660). Otherwise, if an entry is found (decision point 666), the system checks to see if the version numbers of the original and current VWLs match and if the transaction or any ancestor transaction owns the WLE found in the VWLM that corresponds to the TMW (decision point 668). If both criteria are met, the transaction is valid (stage 658). Otherwise, the transaction is invalidated (stage 660). The process ends at end point 670.
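Collapsing FIGS. 14A-14D into code, the sketch below handles the case where the original TMW was a V/C pair (FIG. 14B); the WLE and VWL originals (FIGS. 14C-14D) follow the same shape, and the WACM adjustment of FIG. 19 is omitted. The TxView methods, like the earlier Tx interface, are assumed bookkeeping, and ReadLogEntry comes from the FIG. 13 sketch.

```java
// Hypothetical view of the validating transaction's lock-ownership state.
interface TxView {
    boolean ownsWle(int wleIndex);         // does this tx (or an ancestor, in 14C/14D) own it?
    long wleSavedVersion(int wleIndex);    // version saved in the WLE when the lock was taken
}

final class OptimisticValidation {
    // FIG. 14B: the original TMW was a V/C pair.
    static boolean validate(TxView tx, ReadLogEntry entry) {
        long orig = entry.savedTmw;
        long cur  = entry.obj.tmw.get();                      // stages 564-566
        if (Tmw.isVcPair(cur))                                // decision 590
            return Tmw.version(cur) == Tmw.version(orig);     // stages 592-596
        int wle = Tmw.isVwl(cur)
                ? VersionedWriteLockMap.resolve(entry.obj)    // decision 604: VWLM lookup
                : Tmw.wleIndex(cur);                          // decision 598: direct WLE
        if (wle < 0) return false;                            // no VWLM entry: invalid
        return tx.ownsWle(wle)                                // decisions 600/606
                && tx.wleSavedVersion(wle) == Tmw.version(orig);
    }
}
```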
A correct pessimistic read operation in a system with simple closed nested transactions requires the use of a duplicate-detection data structure called a Pessimistic Read Object Table (PROT). Each top-level transaction creates a PROT at the beginning of the transaction, or lazily at the first pessimistic read operation. When the transaction, or any descendant transaction, attempts to acquire a pessimistic read lock on an object, it consults the PROT to determine whether the pessimistic read lock has already been acquired. If the object is in the PROT, the object has already been read-locked by the transaction's nest. If the object is not in the PROT, and if the object is not currently write-locked by another transaction, a pessimistic read lock is acquired using a CAS operation to increment C (the count of pessimistic readers) in the V/C pair stored in the object's TMW. If the CAS operation succeeds, an entry is made in the PROT to record the fact that the nest has now locked the object for pessimistic reading. When a pessimistic read lock is released, during top-level commit or rollback, C is decremented (again with a CAS) and the PROT entry is removed.
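A sketch of this single-nest PROT protocol, under the same assumed types (the PROT here is a simple per-nest set):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the PROT protocol described above: the first pessimistic read of an
// object CASes the reader count up and records the object in the nest's PROT.
final class Prot {
    private final Set<ObjRecord> entries = new HashSet<>();

    boolean acquirePessimisticRead(ObjRecord obj) {
        if (entries.contains(obj)) return true;            // duplicate: already read-locked
        while (true) {
            long tmw = obj.tmw.get();
            if (Tmw.isWriteLocked(tmw)) return false;      // FIGS. 17C-17D handle these states
            long bumped = Tmw.makeVcPair(Tmw.version(tmw), Tmw.readers(tmw) + 1);
            if (obj.tmw.compareAndSet(tmw, bumped)) {      // CAS increments C in the V/C pair
                entries.add(obj);                          // record the PROT entry
                return true;
            }                                              // CAS lost a race: reread and retry
        }
    }

    void releasePessimisticRead(ObjRecord obj) {           // at top-level commit or rollback
        while (true) {
            long tmw = obj.tmw.get();
            long dropped = Tmw.makeVcPair(Tmw.version(tmw), Tmw.readers(tmw) - 1);
            if (obj.tmw.compareAndSet(tmw, dropped)) { entries.remove(obj); return; }
        }
    }
}
```

Let us now look at how the PROT is used for parallel nested transactions.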
Figure 15 illustrates one implementation of the stages involved in ensuring correct pessimistic read operations for parallel nested transactions. The process begins at start point 690 with the parallel nested transaction creating a pessimistic duplicate detection data structure (the PROT described above) during initialization, or lazily when the parallel nested transaction acquires its first pessimistic read lock (stage 692). The system also uses this data structure when a transaction attempts to upgrade a pessimistic read lock to a write lock. An entry is formed in the PROT corresponding to the first pessimistic read made of an object by the parallel nested transaction or any descendant transaction (stage 694). At commit time, new pessimistic read locks are passed to the immediate parent transaction, and the appropriate parent PROT entries are formed (stage 696). Duplicate read locks are released when the parallel child transaction commits, to allow sibling transactions to acquire write access (stage 698). The system then destroys the log entries associated with duplicate pessimistic read locks (stage 700). After commit, the PROT of the child transaction may be destroyed along with the rest of the transaction (stage 702). The process ends at end point 704.
FIG. 16 illustrates one implementation of the stages involved in upgrading a pessimistic read lock to a write lock. The process begins at start point 720 with the discovery that an object on which a write lock is required is already open for pessimistic reads (stage 722). The PROT of the child transaction is queried (stage 724) against the current count of readers to decide whether the current transaction can account for all of these reads (stage 726). If it accounts for all pessimistic readers, then the child transaction may attempt to upgrade to the write lock as usual (stage 728). If there are still outstanding readers, then the child transaction must query the PROTs of all ancestor transactions to determine whether it can upgrade to a write lock (stage 730). If any ancestor transaction is a parallel parent, the PROT of the parallel parent and the PROTs of any committed parallel sibling transactions must be considered. The PROTs of these sibling transactions are reachable from the log of the parallel parent via the PNTEs. Proper synchronization is required to ensure contention-free access to the PROTs of these siblings; this is achieved by ensuring that a PNT does not access its PROT after the PROT is placed in the associated PNTE. If the ancestor transactions and any committed parallel siblings account for the additional pessimistic readers, then the child transaction may attempt to upgrade to the write lock as usual (stage 732). In one implementation, the upgrade proceeds as follows. A new write log entry is formed and appended to the log of the PNT. The current TMW value is placed in the WLE for use during rollback and commit processing. If the current TMW value is a VWL, the VWL is first converted to a WLE using the VWLM with the appropriate synchronization. A CAS is used to acquire the write lock. If the CAS succeeds, the upgrade works; otherwise there is a conflict. The process ends at end point 734.
An exemplary process for performing pessimistic read lock acquisition while in a parallel nested transaction, as illustrated in FIGS. 17A-17D, is now described. The process begins in FIG. 17A, where the current TMW is read (stage 752). If the TMW is a V/C pair (decision point 754), the process described in FIG. 17B is performed. If the TMW is a WLE (decision point 756), the process described in FIG. 17C is performed. If the TMW is not a V/C pair (decision point 754) and not a WLE (decision point 756), the TMW is a VWL (stage 758) and the process described in FIG. 17D is performed. Each of these is now described in more detail.
Turning now to FIG. 17B, if the TMW is a V/C pair, the PROT of the transaction is consulted to determine whether the transaction already holds a pessimistic read lock on the object (decision point 769). If a PROT entry exists, the process ends at end point 776. If there is no PROT entry (decision point 769), the count C of pessimistic readers is incremented using a compare and swap (CAS) (stage 770). If the CAS is successful (decision point 772), the lock is acquired and a PROT entry is formed (stage 774). If the CAS fails (decision point 772), the operation is retried by returning to stage 752 of FIG. 17A.
As shown in FIG. 17C, if the TMW is a WLE and the current transaction or any ancestor transaction owns the WLE (decision point 790), the system determines whether the transaction owning the WLE is below or above the parallel parent (decision point 792). If the owning transaction is below the parallel parent (decision point 792), the pessimistic read succeeds (stage 794). If the owning transaction is above the parallel parent (decision point 792), the system switches the TMW to a VWL for coordination with sibling transactions. To this end, a VWLM entry is formed with the appropriate synchronization to record the WLE (stage 796), and a VWL is formed from the version number stored in the WLE, with C (the count of pessimistic readers) set to 1 (stage 798). A CAS is used to set the new VWL in the TMW (stage 800). If the CAS is successful, a PROT entry is formed and the pessimistic read lock is acquired (stage 806). If the CAS is unsuccessful, the VWLM entry is removed (stage 804) and the operation is retried by returning to stage 752 of FIG. 17A. The process ends at end point 810.
As shown in FIG. 17D, if the TMW is a VWL, the VWL is converted to a WLE via the VWLM with proper synchronization (stage 820). If the transaction or any ancestor transaction owns the WLE (decision point 822), a CAS is used to increment C (the count of pessimistic readers) in the VWL in the TMW (stage 824). If the CAS is successful (decision point 826), a PROT entry is formed and the lock is acquired. If the CAS fails, the operation is retried by proceeding to stage 752 of FIG. 17A. If neither the transaction nor any ancestor transaction owns the WLE (decision point 822), then a conflict exists (stage 830). The process ends at end point 832.
Figure 18 illustrates one implementation of the stages involved in allowing retries to work correctly with respect to parallel nested transactions. Before delving into the details of FIG. 18 and discussing how retry works with parallel nested transactions, it is first necessary to provide some background on the retry operation of one implementation. The retry operation allows basic communication between transactions. When a transaction performs a retry operation, the effects of the transaction are rolled back and execution is suspended until something that the transaction read changes. When a change is detected, the transaction is re-executed. Retry operations may be used with some very common data structures, such as a blocking queue. For example, a transaction may check whether the queue is empty and retry if it is, or remove an element if it is not. The transaction will block while the queue remains unchanged and re-execute when the queue state changes, which gives the transaction another chance to complete.
In one implementation, as a transaction performs a retry operation, the system registers a wait on each read in the read set of the retrying transaction. The retrying transaction waits for a notification that something in the read set has changed. A notification is issued by the particular transaction that released a write lock. A writing transaction knows whether notification is required in one of two ways. In the first, if the transactional memory word was seen to contain a waiter bit during write lock acquisition, the transactional memory word is looked up in an object waiter map during release and each waiting transaction is signaled. In the second, if a writing transaction finds that the global count of waiting transactions is greater than zero after releasing all write locks, it uses a transaction waiter map to determine which transactions (if any) were waiting on a written location and need to be signaled. In either case, a normal store operation is used to release the write lock.
In another implementation, a progressive retry operation begins by rolling back only the retrying nested transaction and waiting on its read set. After waiting some particular time, or after some other condition is met, the immediate parent of the retrying transaction is rolled back, thereby increasing the size of the original wait set. This rollback is repeated, adding the read set of each successive parent, until the topmost parent transaction has been rolled back. The aggregate wait set is associated with the topmost parent transaction, and any notification will result in re-execution of the topmost parent transaction.
Referring now to FIG. 18, how retry can be used with parallel nested transactions will now be discussed. The process begins at start point 850: when a retry is performed by a parallel nested transaction or any sequential descendant of one, the system registers the transaction's read set for the retry as usual (stage 852). The system includes heuristics for determining how long to wait on a retrying transaction before expanding the retry operation to include some subset of the transaction's ancestors (stage 854). For parallel nested transactions, these heuristics are adjusted to allow an extended wait as long as other parallel sibling transactions are active. If the extended wait is exceeded, the retry operation is expanded to include the parent transaction (stage 856). The status of sibling transactions is considered before further propagation (stage 858). When it is decided to propagate the retry through a parallel nested parent transaction, all children are doomed and must complete before the parent transaction is rolled back for the retry (stage 860). When it is decided to propagate the retry through the parent transaction, the read set of the parallel nested transaction remains registered and becomes part of the parent transaction's read set (stage 862). Note that the read set of any parallel sibling that has already committed into the parent becomes part of the wait set along with the parent's read set. Any parallel sibling aborted by the decision to propagate the retry to the parent does not contribute to the wait set. The process ends at end point 864.
Figure 19 illustrates one implementation of the stages involved in using a write abort compensation map with parallel nested transactions. Before delving into the details of FIG. 19 and how write abort compensation maps can be used with parallel nested transactions, it is necessary to provide some background information about the write abort compensation maps of an implementation.
The write abort compensation map may be used to detect falsely doomed parent transactions (parents that appear invalid only because a nested child rolled back) in a transactional memory system that uses write-in-place. The write abort compensation map (or other storage mechanism) tracks the number of times each lock was released by each rolled-back nested transaction. The release counts may be used during validation of the parent transaction to determine whether an apparently invalid optimistic read is actually valid.
In one implementation, when the parent transaction's log is processed, any write abort compensation maps seen for nested child transactions are combined into an aggregated write abort compensation map in the parent transaction. If an optimistic read fails to validate due to a version number mismatch, the aggregated write abort compensation map is consulted to retrieve the nested children's write lock release count for the particular variable. If the version number difference exactly matches the release count, then the optimistic read is valid.
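A sketch of the WACM bookkeeping and the validation adjustment just described, continuing the earlier hypothetical types:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a write abort compensation map: per-object counts of write locks
// released by rolled-back nested transactions, merged upward into the parent.
final class Wacm {
    private final Map<ObjRecord, Long> releaseCounts = new HashMap<>();

    void recordRelease(ObjRecord obj) {      // one increment per new lock released on rollback
        releaseCounts.merge(obj, 1L, Long::sum);
    }

    void aggregate(Wacm childMap) {          // combine a child's WACM into the parent's
        childMap.releaseCounts.forEach((o, n) -> releaseCounts.merge(o, n, Long::sum));
    }

    // An apparently failed optimistic read is actually valid when the version
    // difference exactly matches the recorded release count for that object.
    boolean explainsMismatch(ObjRecord obj, long originalVersion, long currentVersion) {
        return currentVersion - originalVersion == releaseCounts.getOrDefault(obj, 0L);
    }
}
```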
Returning now to FIG. 19, let us see how a write abort compensation map is used with parallel nested transactions. The process begins at start point 880 with creating a Write Abort Compensation Map (WACM) when new write locks are released during rollback of a parallel nested transaction (stage 882). As the nested transaction rolls back, it creates a WACM entry for each new write lock released (stage 884). When ownership of the parallel nested transaction's log is transferred to the parent transaction during commit, the WACM is logically placed at the front of that log (stage 886). These PNT WACMs will be used during parent transaction rollback just like the WACM of any other nested transaction, and will ensure correct validation of the parent transaction's optimistic reads (stage 888). The process ends at end point 890.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. All equivalents, changes, and modifications that come within the spirit of the implementations as described herein and/or by the following claims are desired to be protected.
For example, one of ordinary skill in the computer software art will recognize that the examples discussed herein may be organized differently on one or more computers to include fewer or more options or features than are depicted in the examples.
Claims (4)
1. A method for using versioned write locks with parallel nested transactions, the method comprising the steps of:
concurrently executing closed nested transactions of a parent transaction as parallel nested transactions in transactional memory;
when a transactional memory word changes from a write lock to a versioned write lock, forming an entry in a global versioned write lock map to store a pointer to a write log entry that the versioned write lock replaces (344), wherein the versioned write lock includes a version number of an object when the transactional memory word is still write locked; and
when the versioned write lock is encountered during execution of the parallel nested transaction, the global versioned write lock map is consulted to convert the versioned write lock to a pointer to the write log entry (346).
2. The method of claim 1, wherein, using the versioned write lock, a parallel nested transaction is able to acquire and release a lock held by the parent transaction with proper synchronization and without breaking the isolation boundary of the parent transaction (352).
3. The method of claim 1, wherein a version number stored in the transactional memory word is always incremented, whether held in a version count pair or the versioned write lock (348).
4. The method of claim 1, further comprising the steps of:
upon acquiring a pessimistic read, switching the transactional memory word to the versioned write lock to provide space for storing a count of pessimistic readers, the versioned write lock now indicating the possibility that a sibling transaction has performed a pessimistic read (796) of an object that the parent transaction has write-locked.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
US11/901,647 (US7890472B2) | 2007-09-18 | 2007-09-18 | Parallel nested transactions in transactional memory
US11/901,647 | 2007-09-18 | |
Publications (2)
Publication Number | Publication Date
---|---
HK1179718A1 | 2013-10-04
HK1179718B | 2017-01-27