US20020184445A1 - Dynamically allocated cache memory for a multi-processor unit - Google Patents
- Publication number
- US20020184445A1 (application US09/838,921)
- Authority
- US
- United States
- Prior art keywords
- cache memory
- cache
- processor
- partition
- memory partition
- Prior art date
- 2001-04-20
- Legal status
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/084—Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
Description
- 1. Field of Invention
- This invention relates generally to multiprocessor computer systems and specifically to cache memory of multiprocessor computer systems.
- 2. Description of Related Art
- Some manufacturers combine two or more central processing units (CPUs) on a single chip and sell the chip as a multi-processor unit (MPU). The MPU takes advantage of parallel processing to increase performance over a single CPU. An MPU typically includes a cache memory to store data in anticipation of future use by the CPUs. The cache memory is smaller and faster than the MPU's main memory, and thus can transfer data to the CPUs in much less time than data from the main memory. When data requested by the CPUs is in the cache memory, there is a cache hit, and CPU performance approaches the speed of the cache memory. Conversely, when there is a cache miss, the requested data must be retrieved from main memory, and thus CPU performance approaches the speed of main memory. Thus, increased performance may be achieved by maximizing the percentage of cache hits during operation.
- Some MPU architectures include a single cache memory that is shared by each of its CPUs. Since data stored in the shared cache memory is shared by each CPU on the chip, it is not necessary to store duplicate sets of data, which increases cache efficiency. Further, if one of the CPUs on the chip becomes defective, or is otherwise not required for a particular operation, the other CPU(s) may still access the entire cache memory. However, since more than one CPU may access the same cache memory locations, chip-level snoop operations are required between the CPUs on each MPU. These snoop operations are in addition to any system-level snoop operations between MPUs on a common bus. The additional circuitry required to perform the chip-level snoop operations undesirably increases the size and complexity of the associated cache controllers.
- Other MPU architectures include a dedicated cache memory for each of their CPUs. Since only one CPU has access to any given cache memory location, snoop operations between the CPUs on the MPUs may be performed at the system level rather than the chip level. Accordingly, the cache controllers for dedicated cache memories are smaller and simpler than the cache controllers for a shared cache memory. However, if one of the CPUs becomes defective or is otherwise not required for a particular application, its dedicated cache memory is not accessible by the other CPU(s), thereby wasting cache resources.
- Thus, there is a need for better management of cache resources on an MPU without requiring large and complicated cache controllers.
- A method and apparatus are disclosed that overcome problems in the art described above. In accordance with the present invention, the resources of a partitioned cache memory are dynamically allocated between two or more processors on a multi-processor unit (MPU) according to a desired system configuration or to the processing needs of the processors. In some embodiments, the MPU includes first and second processors, and the cache memory includes first and second partitions. In one embodiment, each cache memory partition is a 2-way associative cache memory. A cache access circuit provided between the cache memory and the processors selectively transfers addresses and data between the first and/or second CPUs and the first and/or second cache memory partitions to maximize cache resources.
- In one mode, both processors are set as active, and may simultaneously execute separate instruction threads. In this two-thread mode, the cache access circuit allows each processor to use a corresponding cache memory partition as a dedicated cache. For example, during cache read operations, the cache access circuit provides addresses from the first processor to the first cache memory partition and addresses from the second processor to the second cache memory partition, and returns data from the first cache memory partition to the first processor and data from the second cache memory partition to the second processor. Similarly, during cache write operations, the cache access circuit routes addresses and data from the first processor to the first cache memory partition and routes addresses and data from the second processor to the second cache memory partition. Thus, the first and second processors may use the first and second cache memory partitions, respectively, as dedicated 2-way associative caches.
- In another mode, one processor is set as the active processor, and the other processor is set as the inactive processor. In this one-thread mode, the cache access circuit allows the active processor to use both the first and second cache memory partitions. For example, during cache read operations, the cache access circuit provides addresses from the active processor to both the first and second cache memory partitions, and returns matching data from the first and second cache memory partitions to the active processor. Similarly, during cache write operations, the cache access circuit routes addresses and data from the active processor to the first and second cache memory partitions. In this manner, the active processor may collectively use the first and second cache memory partitions as a 4-way associative cache.
- The ability to dynamically allocate cache resources between multiple processors advantageously allows the entire cache memory to be used, irrespective of whether one or both processors are currently active, thereby maximizing cache resources while allowing for both one-thread and two-thread execution modes. In addition, the present invention may be used to maximize cache resources when one of the on-board processors is defective. For example, if one processor is found to be defective during testing, it may be set as inactive, and the cache access circuit may allocate the entire cache memory to the other processor.
- FIG. 1 is a block diagram of a computer system within which embodiments of the present invention may be implemented;
- FIG. 2 is a block diagram of a multi-processor unit having a dynamically allocated cache memory in accordance with the present invention;
- FIG. 3 is a state diagram illustrating state transitions for the multi-processor unit of FIG. 2; and
- FIG. 4 is a block diagram of one embodiment of the multi-processor unit of FIG. 2.
- Like reference numerals refer to corresponding parts throughout the drawing figures.
- The present invention is described below with reference to an MPU having two processors for simplicity only. It is to be understood that embodiments of the present invention are equally applicable to MPUs having any number of processors. Further, although described as having 2-way associative cache memory partitions, the dynamically allocated cache memory of the present invention may be configured for any desired level of associativity. In addition, the particular logic levels assigned to signals discussed herein are arbitrary and, thus, may be reversed where desirable. Accordingly, the present invention is not to be construed as limited to specific examples described herein but rather includes within its scope all embodiments defined by the appended claims.
- FIG. 1 shows a computer system 10 within which embodiments of the present invention may be implemented. System 10 is shown to include four MPUs 11 connected to each other and to a main memory 12, an input/output (I/O) device 13, and a network 14 via a system bus 15. Main memory 12 is shared by MPUs 11, and may be any suitable random access memory (RAM) such as, for example, DRAM. I/O device 13 allows a user to interact with system 10, and may include, for example, a computer monitor, keyboard, and/or mouse input. Network 14 may be any suitable network such as, for example, a local area network, a wide area network, and/or the Internet. Additional devices may be connected to the system bus 15 as desired.
- FIG. 2 shows an MPU 20 that is one embodiment of MPU 11 of FIG. 1. MPU 20 is shown to include first and second processors such as central processing units (CPUs) 21a-21b, a cache access circuit 22, and a dynamically allocated cache memory 23. CPUs 21a-21b are well-known processing devices. Cache memory 23 is partitioned into first and second cache memory partitions 23a-23b, and is preferably a high speed cache memory device such as SRAM, although other cache devices may be used. For the purpose of discussion herein, each cache memory partition 23a-23b is configured as a 2-way associative cache memory. Of course, in actual embodiments, the cache memory partitions may be configured for other levels of associativity.
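To make the arrangement concrete, the C sketch below models the structures of FIG. 2 as plain arrays. This is an illustrative model only, not the patent's implementation: all identifiers (partition_t, mpu_cache_t, LINES_PER_WAY, and so on) are hypothetical, and the array depth is an assumed value; only the two partitions, the 2-way associativity per partition, and the 32-byte line size (stated later in the description) come from the text.

```c
#include <stdint.h>

#define WAYS_PER_PARTITION 2    /* each partition 23a/23b is 2-way associative */
#define LINES_PER_WAY      128  /* illustrative depth; the patent fixes no size */
#define LINE_BYTES         32   /* 32-byte cache lines, per the description */

/* One cache memory partition: two data RAM arrays with corresponding tag
   arrays (51-52/61-62 for partition 23a, 53-54/63-64 for partition 23b). */
typedef struct {
    uint32_t tag[WAYS_PER_PARTITION][LINES_PER_WAY];
    uint8_t  data[WAYS_PER_PARTITION][LINES_PER_WAY][LINE_BYTES];
    uint8_t  valid[WAYS_PER_PARTITION][LINES_PER_WAY];
    uint8_t  dirty[WAYS_PER_PARTITION][LINES_PER_WAY];
} partition_t;

/* The dynamically allocated cache memory 23 plus the mode signal M. */
typedef struct {
    partition_t part_a;   /* cache memory partition 23a */
    partition_t part_b;   /* cache memory partition 23b */
    int         mode_2t;  /* mode signal M: 0 = one-thread, 1 = two-thread */
} mpu_cache_t;
```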
- Cache access circuit 22 selectively couples the first and/or second CPUs 21a-21b to the first and/or second cache memory partitions 23a-23b. As explained in detail below, cache access circuit 22 allows the resources of cache memory 23 to be dynamically allocated between the first and second CPUs 21a-21b according to each CPU's processing requirements to more efficiently utilize cache resources.
- Referring also to FIG. 1, system 10 includes well-known system operating software that assigns tasks of one or more computer programs running thereon to the various MPUs 20 for execution. The operating software, which is often referred to as the system kernel, also assigns tasks between the CPUs 21a-21b of each MPU 20. For applications that include a single instruction execution thread and are thus best executed using only one CPU 21, e.g., for applications having highly sequential instruction code, the kernel assigns all the tasks to one CPU and idles the other CPU. Conversely, for applications that can be divided into two parallel instruction execution threads, e.g., for applications having parallel execution loops, the kernel may assign different threads to CPUs 21a-21b for simultaneous execution therein.
- FIG. 3 illustrates state transitions of MPU 20 between a one-thread (1T) state and a two-thread (2T) state. In one embodiment, upon power-up of MPU 20, the kernel sets a mode signal M=0 to initialize MPU 20 to the 1T state. The kernel sets one of the CPUs 21 to an active state and sets the other CPU 21 to an inactive state. For purposes of discussion herein, during the 1T state, the kernel sets CPU 21a as the active CPU and sets CPU 21b as the inactive CPU, although in other embodiments the kernel may set CPU 21b as the active CPU and set CPU 21a as the inactive CPU. While in the 1T state, the kernel assigns tasks of the computer program(s) only to the active CPU 21a, while the other CPU 21b remains idle. In response to M=0, cache access circuit 22 couples the first CPU 21a to both the first and second cache memory partitions 23a-23b to allow the first CPU 21a to use all resources of cache memory 23. In this state, the active CPU 21a may use cache memory partitions 23a-23b as a 4-way associative cache memory.
- If, during execution of the computer program(s), the kernel determines that certain tasks may be executed in parallel, and thus may be divided into two threads, the kernel may transition MPU 20 to the 2T state by changing the mode signal to M=1. When M=1, the kernel sets both CPUs 21a-21b to the active state, and thereafter assigns one execution thread to CPU 21a and another execution thread to CPU 21b in a well-known manner. In response to M=1, dirty data in cache memory partition 23b is written back to main memory 12 using a well-known writeback operation, thereby flushing cache memory partition 23b. The cache access circuit 22 couples the first CPU 21a to the first cache memory partition 23a for exclusive access thereto, and couples the second CPU 21b to the second cache memory partition 23b for exclusive access thereto. In this state, CPU 21a may use cache memory partition 23a as a dedicated 2-way associative cache memory, and CPU 21b may use cache memory partition 23b as a dedicated 2-way associative cache memory.
- Thereafter, if the kernel determines that only one of CPUs 21a-21b is necessary for a particular instruction code sequence, the kernel may transition MPU 20 to the 1T state by changing the mode signal to M=0, flushing the second cache memory partition 23b, and then assigning execution of the instruction code sequence to the active CPU 21a.
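Continuing the hypothetical C model above, the FIG. 3 transitions can be sketched as follows. The flush loop stands in for the well-known writeback operation, and, per the description, partition 23b is flushed on the transition in either direction, since it changes owners both times. The function names are assumptions, not terms from the patent.

```c
/* Write back any dirty lines in a partition and invalidate it (a stand-in
   for the writeback/flush described above; memory traffic is omitted). */
static void flush_partition(partition_t *p) {
    for (int w = 0; w < WAYS_PER_PARTITION; w++)
        for (int l = 0; l < LINES_PER_WAY; l++) {
            if (p->valid[w][l] && p->dirty[w][l]) {
                /* write p->data[w][l] back to main memory 12 here */
                p->dirty[w][l] = 0;
            }
            p->valid[w][l] = 0;
        }
}

/* Kernel-driven transition between the 1T and 2T states of FIG. 3. */
static void set_mode(mpu_cache_t *c, int two_thread) {
    if (two_thread != c->mode_2t)
        flush_partition(&c->part_b);  /* 23b changes owners either way */
    c->mode_2t = two_thread;
}
```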
- By dynamically allocating resources of cache memory 23 in response to specific needs of associated CPUs 21a-21b, embodiments of the present invention maximize cache performance by ensuring that both cache memory partitions 23a-23b are utilized, irrespective of whether one or both CPUs 21a-21b are active. Thus, in the 1T state, both cache memory partitions 23a-23b are allocated to the active CPU, and in the 2T state, each cache memory partition 23a and 23b is allocated only to its corresponding CPU 21a and 21b, respectively. Since allocation of cache memory partitions 23a-23b is controlled by cache access circuit 22, cache memory 23 does not require any special hardware, and thus may be of conventional architecture. Further, since cache memory 23 is not shared between CPUs 21a-21b, all snoop operations may be performed at the system level. As a result, the cache controllers (not shown in FIG. 2) in CPUs 21a-21b are much simpler and occupy less silicon area than cache controllers for shared cache memory systems.
- The ability to dynamically allocate cache resources is also useful in situations where portions of MPU 20 are defective. For example, during testing of MPU 20, if CPU 21b is found to be defective or otherwise unusable, the kernel may be configured to maintain MPU 20 in the 1T state, where CPU 21a is the active CPU and has access to both cache memory partitions 23a-23b, and CPU 21b is inactive. Thus, in contrast to MPUs that have dedicated cache memory for each on-board CPU, the failure of one CPU 21 on MPU 20 does not render any part of cache memory 23 inaccessible.
- FIG. 4 shows an MPU 40 that is one embodiment of MPU 20, and includes CPUs 21a-21b, cache access circuit 22, and cache memory partitions 23a-23b. Each CPU 21 is shown to include a CPU core 41 and a cache controller 42. Each cache controller 42, which may be of conventional architecture, transfers address and data between its associated CPU core 41 and cache access circuit 22, and includes (or is associated with) a memory element 43. Memory element 43 may be any suitable memory device including, for example, a register or memory cell. Although shown in FIG. 4 as being internal to cache controller 42, memory element 43 may be external to cache controller 42. CPU core 41 includes other well-known elements of CPU 21 including, for instance, L1 cache memory, instruction units, fetch and decode units, execution units, register files, write cache(s), and so on.
- Cache memory partition 23a includes two data RAM arrays 51-52 having corresponding searchable tag arrays 61-62, respectively, while cache memory partition 23b includes two data RAM arrays 53-54 having corresponding searchable tag arrays 63-64, respectively. Cache memory partition 23a includes a well-known address converter 56a that converts a main memory address received from cache access circuit 22 into a cache address that is used to concurrently address the tag arrays 61-62 and the data arrays 51-52. Similarly, cache partition 23b includes a well-known address converter 56b that converts an address received from cache access circuit 22 into a cache address that is used to concurrently address the tag arrays 63-64 and the data arrays 53-54.
- Data arrays 51-54 each include a plurality of cache lines for storing data retrieved from main memory 12. Cache lines in data arrays 51-54 may be any suitable length. In one embodiment, each cache line of data arrays 51-54 stores 32 bytes of data. Each data array 51-54 also includes a well-known address decoder (not shown for simplicity) that selects a cache line for read and write operations in response to a received cache index. Data arrays 51-52 provide data at a selected cache line to a MUX 57a, and data arrays 53-54 provide data at a selected cache line to a MUX 57b.
- Tag arrays 61-64 each include a plurality of lines for storing tag information for corresponding cache lines in data arrays 51-54, respectively. Tag arrays 61-62 provide tags at the selected cache line to a comparator 58a which, in response to a comparison with a tag address received from address converter 56a, generates a select signal for MUX 57a. Similarly, tag arrays 63-64 provide tags at the selected cache line to a comparator 58b which, in response to a comparison with a tag address received from address converter 56b, generates a select signal for MUX 57b. Comparators 58a and 58b are well-known.
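As a hedged illustration of what address converters 56a-56b do, the sketch below splits a main memory address into a tag address, a cache index, and a byte offset for the model above. The field widths are assumptions: 32-byte lines imply a 5-bit offset, but the index width follows the illustrative LINES_PER_WAY, and the patent does not specify these widths.

```c
#include <stdint.h>

/* Hypothetical cache address produced by address converters 56a/56b. */
typedef struct {
    uint32_t tag;     /* tag address compared by comparators 58a/58b */
    uint32_t index;   /* cache index driving the arrays' address decoders */
    uint32_t offset;  /* byte offset within a 32-byte cache line */
} cache_addr_t;

static cache_addr_t convert_address(uint32_t main_addr) {
    cache_addr_t a;
    a.offset = main_addr & (LINE_BYTES - 1);             /* low 5 bits */
    a.index  = (main_addr / LINE_BYTES) % LINES_PER_WAY; /* line select */
    a.tag    = main_addr / (LINE_BYTES * LINES_PER_WAY); /* high bits */
    return a;
}
```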
- Cache access circuit 22 is shown to include four multiplexers (MUXes) 44-47, two AND gates 48a and 48b, and two comparators 49a and 49b, although after reading this disclosure it will be evident to those skilled in the art that various other logic configurations may be used to selectively route addresses and data between MPU 20 and cache memory 23. MUXes 44-45 selectively provide address information from CPUs 21a-21b to cache memory partitions 23a-23b, respectively, and MUXes 46-47 selectively provide data from cache memory partitions 23a-23b to CPUs 21a-21b, respectively. MUXes 44-45 are controlled by control signals C44 and C45, respectively. MUX 46 is controlled by AND gate 48a, which includes a first input terminal coupled to receive a control signal C46 and a second input terminal coupled to comparator 49a. Comparator 49a includes input terminals coupled to receive select signals from comparators 58a and 58b of cache memory 23. MUX 47 is controlled by AND gate 48b, which includes a first input terminal coupled to receive a control signal C47 and a second input terminal coupled to comparator 49b. Comparator 49b includes input terminals coupled to receive select signals from comparators 58a and 58b of cache memory 23. Comparators 49a and 49b are well-known. Values for signals C44 and C46 may be stored in memory 43a of cache controller 42a, and values for signals C45 and C47 may be stored in memory 43b of cache controller 42b.
- Specifically, MUX 44 selectively provides address and data information to cache memory partition 23a from either CPU 21a or CPU 21b in response to C44, and MUX 45 selectively provides address and data information to cache memory partition 23b from either CPU 21a or CPU 21b in response to C45. MUX 46 selectively returns data to CPU 21a from either cache memory partition 23a or 23b in response to AND gate 48a, and MUX 47 selectively returns data to CPU 21b from either cache memory partition 23a or 23b in response to AND gate 48b.
- For simplicity, MUXes 44-45 are shown in FIG. 4 as routing both address and data information to cache memory partitions 23a-23b, respectively. However, in other embodiments, cache access circuit 22 may include a duplicate set of MUXes to route data to respective cache memory partitions 23a-23b, in which case MUXes 44-45 route only address information to respective cache memory partitions 23a-23b.
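The routing just described can be reduced to three selector functions, again continuing the hypothetical model. The signal names follow FIG. 4, but only the encodings stated in the 1T/2T descriptions below (e.g., C44=0 selects CPU 21a) come from the text; C44=1 selecting CPU 21b is inferred, since that setting appears in neither described state.

```c
typedef enum { CPU_A, CPU_B } cpu_id_t;

/* MUX 44: which CPU drives partition 23a (C44 = 0 selects CPU 21a). */
static cpu_id_t mux44_source(int c44) { return c44 ? CPU_B : CPU_A; }

/* MUX 45: which CPU drives partition 23b (C45 = 0 selects CPU 21b for the
   2T state; C45 = 1 selects CPU 21a for the 1T state). */
static cpu_id_t mux45_source(int c45) { return c45 ? CPU_A : CPU_B; }

/* AND gate 48a controlling MUX 46: with C46 = 0 the output is forced to 0
   and partition 23a always answers CPU 21a; with C46 = 1 the result from
   comparator 49a (hit_in_b, 1 on a partition 23b hit) steers the MUX. */
static int mux46_selects_partition_b(int c46, int hit_in_b) {
    return c46 && hit_in_b;
}
```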
- When MPU 20 is in the 2T state (e.g., when M=1), each CPU 21a-21b is processing its own instruction thread, and the kernel sets signals C44-C47 to logic low (i.e., logic 0) to simultaneously provide CPU 21a with exclusive use of cache memory partition 23a and to provide CPU 21b with exclusive use of cache memory partition 23b. Thus, C44=0 forces MUX 44 to provide an address or data from CPU 21a to cache memory partition 23a, C45=0 forces MUX 45 to provide an address or data from CPU 21b to cache memory partition 23b, C46=0 forces the output of AND gate 48a to logic 0 to force MUX 46 to provide data from cache memory partition 23a to CPU 21a, and C47=0 forces the output of AND gate 48b to logic 0 to force MUX 47 to provide data from cache memory partition 23b to CPU 21b.
- To request data from cache memory partition 23a, CPU 21a provides a main memory address to address converter 56a via MUX 44. Address converter 56a converts the main memory address to a cache address that includes a tag address and a cache index. The cache index is used to select a cache line in data arrays 51-52 and associated tag arrays 61-62. If there is data stored at the selected cache line in data arrays 51 and/or 52, the data is read out to MUX 57a. Also, the tag fields from the selected line of tag arrays 61-62 are read out to comparator 58a, which also receives the tag address from address converter 56a. Comparator 58a compares the tag address with tag fields provided by tag arrays 61-62, and in response thereto provides a select signal to MUX 57a that selects whether data from data array 51 or 52 (or neither, if there is no matching data) is read out to MUX 46 of cache access circuit 22. Since C46=0, MUX 46 provides matching data from cache memory partition 23a to cache controller 42a of CPU 21a.
- CPU 21b may simultaneously request data from cache memory partition 23b in a similar manner. Thus, a main memory address provided by CPU 21b to address converter 56b via MUX 45 is converted into a cache address that includes a tag address and a cache index. The cache index selects a cache line in data arrays 53-54 and associated tag arrays 63-64. If there is data stored at the selected cache line in data arrays 53 and/or 54, the data is read out to MUX 57b. Also, the tag fields from the selected line of tag arrays 63-64 are read out to comparator 58b, which also receives the tag address from address converter 56b. Comparator 58b compares the tag address with tag fields provided by tag arrays 63-64, and in response thereto provides a select signal to MUX 57b that selects whether data from data array 53 or 54 (or neither, if there is no matching data) is read out to MUX 47 of cache access circuit 22. Since C47=0, MUX 47 provides matching data from cache memory partition 23b to cache controller 42b of CPU 21b.
- In this manner, CPU 21a may use cache memory partition 23a as a dedicated 2-way associative cache while CPU 21b simultaneously and independently uses cache memory partition 23b as a dedicated 2-way associative cache.
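Putting those pieces together for the 2T state, one partition's read path (address conversion, parallel tag compare by comparator 58a or 58b, way selection by MUX 57a or 57b) might be modeled as below, continuing the sketch above. The function name and the hit/miss return convention are assumptions.

```c
#include <string.h>

/* 2-way read within one partition: the cache index selects a line in both
   ways, both tag fields are compared, and the matching way's data is
   steered out (MUX 57a/57b). Returns 1 on a cache hit, 0 on a miss. */
static int partition_read(const partition_t *p, uint32_t main_addr,
                          uint8_t out[LINE_BYTES]) {
    cache_addr_t a = convert_address(main_addr); /* converter 56a/56b */
    for (int w = 0; w < WAYS_PER_PARTITION; w++) {
        if (p->valid[w][a.index] && p->tag[w][a.index] == a.tag) {
            memcpy(out, p->data[w][a.index], LINE_BYTES);
            return 1;  /* comparator selects way w */
        }
    }
    return 0;  /* neither tag matched: retrieve from main memory 12 */
}
```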
- When MPU 20 transitions to the 1T state (e.g., M=0), the kernel sets CPU 21a as the active CPU and sets CPU 21b as the inactive CPU (as mentioned earlier, in other embodiments the kernel may set CPU 21b as the active CPU and set CPU 21a as the inactive CPU). The kernel also sets signal C44 to logic low and sets signals C45-C46 to logic high (i.e., logic 1) to provide CPU 21a with use of both cache memory partitions 23a-23b. Thus, C44=0 forces MUX 44 to provide an address or data from CPU 21a to cache memory partition 23a, C45=1 forces MUX 45 to provide the same address or data from CPU 21a to cache memory partition 23b, and C46=1 allows a result signal from comparator 49a to select whether data from cache memory partition 23a or 23b is returned to CPU 21a. Since CPU 21b is inactive, C47 is a don't care (d/c) for M=0.
- To request data from both cache memory partitions 23a-23b, CPU 21a provides a main memory address to address converter 56a via MUX 44 and to address converter 56b via MUX 45. Thus, the cache address is provided to data arrays 51-54 and to tag arrays 61-64. Data arrays 51-52 read out the selected cache line to MUX 57a, and tag arrays 61-62 read out corresponding tag fields to comparator 58a. Comparator 58a compares the tag fields with the tag address received from address converter 56a, and selects which data (if any) MUX 57a forwards to MUX 46. Similarly, data arrays 53-54 read out the selected cache line to MUX 57b, and tag arrays 63-64 read out corresponding tag fields to comparator 58b. Comparator 58b compares the tag fields with the tag address received from address converter 56b, and selects which data (if any) MUX 57b forwards to MUX 46.
- The select signals provided by comparators 58a and 58b are compared in comparator 49a to generate a select signal that is provided to MUX 46 via AND gate 48a to select which data (if any) is returned to CPU 21a. Thus, if there is matching data in either cache memory partition 23a or cache memory partition 23b, it is returned to CPU 21a via MUX 46. In this manner, data arrays 51-54 provide a 4-way associative cache memory for CPU 21a. Values for control signals C44-C47 for the 1T and 2T states are summarized below in Table 1.

TABLE 1

| mode | C44 | C45 | C46 | C47 |
|---|---|---|---|---|
| 1T | 0 | 1 | 1 | d/c |
| 2T | 0 | 0 | 0 | 0 |
- As discussed above, the ability to easily transition between using cache memory 23 as two dedicated 2-way associative cache memories for respective CPUs 21a-21b, and using cache memory 23 as a 4-way associative memory for only one CPU 21a, advantageously allows for use of the entire cache memory 23, irrespective of whether MPU 20 is executing one or two threads, and thereby maximizes the effectiveness of cache memory 23. Further, since CPUs 21a-21b do not simultaneously share access to the same data in cache memory 23, cache controllers 42a and 42b do not need to perform separate chip-level snoop operations, and thus are much simpler and occupy less silicon area than cache controllers for a shared cache memory system.
- While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that changes and modifications may be made without departing from this invention in its broader aspects and, therefore, the appended claims are to encompass within their scope all such changes and modifications as fall within the true spirit and scope of this invention. For example, although described above as having two partitions, in actual embodiments cache memory 23 may have any number of partitions.
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/838,921 US6725336B2 (en) | 2001-04-20 | 2001-04-20 | Dynamically allocated cache memory for a multi-processor unit |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US09/838,921 US6725336B2 (en) | 2001-04-20 | 2001-04-20 | Dynamically allocated cache memory for a multi-processor unit |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20020184445A1 (en) | 2002-12-05 |
| US6725336B2 (en) | 2004-04-20 |
Family
ID=25278397
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US09/838,921 Expired - Lifetime US6725336B2 (en) | 2001-04-20 | 2001-04-20 | Dynamically allocated cache memory for a multi-processor unit |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US6725336B2 (en) |
Cited By (32)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20030046495A1 (en) * | 2001-08-28 | 2003-03-06 | Venkitakrishnan Padmanabha I. | Streamlined cache coherency protocol system and method for a multiple processor single chip device |
| US20030074502A1 (en) * | 2001-10-15 | 2003-04-17 | Eliel Louzoun | Communication between two embedded processors |
| US20060010450A1 (en) * | 2004-07-08 | 2006-01-12 | Culter Bradley G | System and method for soft partitioning a computer system |
| US20060259733A1 (en) * | 2005-05-13 | 2006-11-16 | Sony Computer Entertainment Inc. | Methods and apparatus for resource management in a logically partitioned processing environment |
| WO2006136495A1 (en) * | 2005-06-23 | 2006-12-28 | International Business Machines Corporation | System and method of remote media cache optimization for use with multiple processing units |
| US7380085B2 (en) * | 2001-11-14 | 2008-05-27 | Intel Corporation | Memory adapted to provide dedicated and or shared memory to multiple processors and method therefor |
| US20080294839A1 (en) * | 2004-03-29 | 2008-11-27 | Bell Michael I | System and method for dumping memory in computer systems |
| US20090024793A1 (en) * | 2007-07-17 | 2009-01-22 | Fontenot Nathan D | Method and apparatus for managing data in a hybrid drive system |
| US20090049257A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | System and Method for Implementing a Memory Defect Map |
| US20090049270A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | System and method for using a memory mapping function to map memory defects |
| US20090049335A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | System and Method for Managing Memory Errors in an Information Handling System |
| US20090049351A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | Method for Creating a Memory Defect Map and Optimizing Performance Using the Memory Defect Map |
| US20090083508A1 (en) * | 2004-11-11 | 2009-03-26 | Koninklijke Philips Electronics, N.V. | System as well as method for managing memory space |
| CN100485640C (en) * | 2004-11-30 | 2009-05-06 | 国际商业机器公司 | Cache for an enterprise software system |
| US20090132059A1 (en) * | 2007-11-13 | 2009-05-21 | Schultz Ronald E | Industrial controller using shared memory multicore architecture |
| GB2457341A (en) * | 2008-02-14 | 2009-08-19 | Transitive Ltd | Multiprocessor computing system with multi-mode memory consistency protection |
| US20100199044A1 (en) * | 2009-01-30 | 2010-08-05 | Sony Corporation | Interface apparatus, calculation processing apparatus, interface generation apparatus, and circuit generation apparatus |
| US20110040940A1 (en) * | 2009-08-13 | 2011-02-17 | Wells Ryan D | Dynamic cache sharing based on power state |
| US20120314513A1 (en) * | 2011-06-09 | 2012-12-13 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor memory device and method of driving semiconductor memory device |
| US8724408B2 (en) | 2011-11-29 | 2014-05-13 | Kingtiger Technology (Canada) Inc. | Systems and methods for testing and assembling memory modules |
| US9117552B2 (en) | 2012-08-28 | 2015-08-25 | Kingtiger Technology(Canada), Inc. | Systems and methods for testing memory |
| WO2015152893A1 (en) * | 2014-03-31 | 2015-10-08 | Cfph, Llc | Resource allocation |
| US9223709B1 (en) * | 2012-03-06 | 2015-12-29 | Marvell International Ltd. | Thread-aware cache memory management |
| EP3317769A4 (en) * | 2015-07-28 | 2018-07-04 | Huawei Technologies Co., Ltd. | Advance cache allocator |
| US10223164B2 (en) | 2016-10-24 | 2019-03-05 | International Business Machines Corporation | Execution of critical tasks based on the number of available processing entities |
| US10248464B2 (en) * | 2016-10-24 | 2019-04-02 | International Business Machines Corporation | Providing additional memory and cache for the execution of critical tasks by folding processing units of a processor complex |
| US10248457B2 (en) | 2016-08-10 | 2019-04-02 | International Business Machines Corporation | Providing exclusive use of cache associated with a processing entity of a processor complex to a selected task |
| US10275280B2 (en) | 2016-08-10 | 2019-04-30 | International Business Machines Corporation | Reserving a core of a processor complex for a critical task |
| US20210255972A1 (en) * | 2019-02-13 | 2021-08-19 | Google Llc | Way partitioning for a system-level cache |
| CN114301858A (en) * | 2021-02-05 | 2022-04-08 | 井芯微电子技术(天津)有限公司 | Shared cache system and method, electronic device and storage medium |
| CN114402304A (en) * | 2020-08-19 | 2022-04-26 | 谷歌有限责任公司 | Memory sharing |
| US20230081746A1 (en) * | 2021-09-13 | 2023-03-16 | Apple Inc. | Guaranteed real-time cache carveout for displayed image processing systems and methods |
Families Citing this family (37)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| AU2001286410A1 (en) * | 2000-07-31 | 2002-02-13 | Morphics Technology, Inc. | Method and apparatus for time-sliced and multi-threaded data processing in a communication system |
| EP1451712A2 (en) * | 2001-07-07 | 2004-09-01 | Koninklijke Philips Electronics N.V. | Processor cluster |
| ITRM20020281A1 (en) * | 2002-05-20 | 2003-11-20 | Micron Technology Inc | METHOD AND EQUIPMENT FOR QUICK ACCESS OF MEMORIES. |
| US7076609B2 (en) * | 2002-09-20 | 2006-07-11 | Intel Corporation | Cache sharing for a chip multiprocessor or multiprocessing system |
| US7380114B2 (en) * | 2002-11-15 | 2008-05-27 | Broadcom Corporation | Integrated circuit with DMA module for loading portions of code to a code memory for execution by a host processor that controls a video decoder |
| US7089361B2 (en) * | 2003-08-07 | 2006-08-08 | International Business Machines Corporation | Dynamic allocation of shared cache directory for optimizing performance |
| ATE366985T1 (en) * | 2003-09-04 | 2007-08-15 | Koninkl Philips Electronics Nv | INTEGRATED CIRCUIT AND METHOD FOR CACHE REMAPPING |
| US7856632B2 (en) * | 2004-01-29 | 2010-12-21 | Klingman Edwin E | iMEM ASCII architecture for executing system operators and processing data operators |
| US7617359B2 (en) | 2004-06-10 | 2009-11-10 | Marvell World Trade Ltd. | Adaptive storage system including hard disk drive with flash interface |
| US7788427B1 (en) | 2005-05-05 | 2010-08-31 | Marvell International Ltd. | Flash memory interface for disk drive |
| US20070083785A1 (en) * | 2004-06-10 | 2007-04-12 | Sehat Sutardja | System with high power and low power processors and thread transfer |
| US7702848B2 (en) * | 2004-06-10 | 2010-04-20 | Marvell World Trade Ltd. | Adaptive storage system including hard disk drive with flash interface |
| US7730335B2 (en) | 2004-06-10 | 2010-06-01 | Marvell World Trade Ltd. | Low power computer with main and auxiliary processors |
| US7634615B2 (en) * | 2004-06-10 | 2009-12-15 | Marvell World Trade Ltd. | Adaptive storage system |
| US7941585B2 (en) | 2004-09-10 | 2011-05-10 | Cavium Networks, Inc. | Local scratchpad and data caching system |
| US7594081B2 (en) | 2004-09-10 | 2009-09-22 | Cavium Networks, Inc. | Direct access to low-latency memory |
| DK1794979T3 (en) | 2004-09-10 | 2017-07-24 | Cavium Inc | Selective copying of data structure |
| US7996644B2 (en) * | 2004-12-29 | 2011-08-09 | Intel Corporation | Fair sharing of a cache in a multi-core/multi-threaded processor by dynamically partitioning of the cache |
| US7590804B2 (en) * | 2005-06-28 | 2009-09-15 | Intel Corporation | Pseudo least recently used replacement/allocation scheme in request agent affinitive set-associative snoop filter |
| US7730261B1 (en) | 2005-12-20 | 2010-06-01 | Marvell International Ltd. | Multicore memory management system |
| US20080263324A1 (en) | 2006-08-10 | 2008-10-23 | Sehat Sutardja | Dynamic core switching |
| JP2008097572A (en) * | 2006-09-11 | 2008-04-24 | Matsushita Electric Ind Co Ltd | Arithmetic device, computer system, and portable device |
| US8209702B1 (en) * | 2007-09-27 | 2012-06-26 | Emc Corporation | Task execution using multiple pools of processing threads, each pool dedicated to execute different types of sub-tasks |
| US8176255B2 (en) * | 2007-10-19 | 2012-05-08 | Hewlett-Packard Development Company, L.P. | Allocating space in dedicated cache ways |
| US20090217280A1 (en) * | 2008-02-21 | 2009-08-27 | Honeywell International Inc. | Shared-Resource Time Partitioning in a Multi-Core System |
| US8327198B2 (en) * | 2009-08-14 | 2012-12-04 | Intel Corporation | On-die logic analyzer for semiconductor die |
| US8677371B2 (en) * | 2009-12-31 | 2014-03-18 | International Business Machines Corporation | Mixed operating performance modes including a shared cache mode |
| US8627012B1 (en) | 2011-12-30 | 2014-01-07 | Emc Corporation | System and method for improving cache performance |
| US9009416B1 (en) * | 2011-12-30 | 2015-04-14 | Emc Corporation | System and method for managing cache system content directories |
| US9235524B1 (en) | 2011-12-30 | 2016-01-12 | Emc Corporation | System and method for improving cache performance |
| US9053033B1 (en) * | 2011-12-30 | 2015-06-09 | Emc Corporation | System and method for cache content sharing |
| US8930947B1 (en) | 2011-12-30 | 2015-01-06 | Emc Corporation | System and method for live migration of a virtual machine with dedicated cache |
| US9104529B1 (en) | 2011-12-30 | 2015-08-11 | Emc Corporation | System and method for copying a cache system |
| US9158578B1 (en) | 2011-12-30 | 2015-10-13 | Emc Corporation | System and method for migrating virtual machines |
| US9529719B2 (en) | 2012-08-05 | 2016-12-27 | Advanced Micro Devices, Inc. | Dynamic multithreaded cache allocation |
| US9772881B2 (en) | 2013-10-29 | 2017-09-26 | Huazhong University of Science and Technology | Hardware resource allocation for applications |
| US9558124B2 (en) | 2013-11-08 | 2017-01-31 | Seagate Technology Llc | Data storage system with passive partitioning in a secondary memory |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH0799508B2 (en) * | 1990-10-15 | 1995-10-25 | International Business Machines Corporation | Method and system for dynamically partitioning cache storage |
| US6058456A (en) * | 1997-04-14 | 2000-05-02 | International Business Machines Corporation | Software-managed programmable unified/split caching mechanism for instructions and data |
| US5978888A (en) * | 1997-04-14 | 1999-11-02 | International Business Machines Corporation | Hardware-managed programmable associativity caching mechanism monitoring cache misses to selectively implement multiple associativity levels |
| US6205519B1 (en) * | 1998-05-27 | 2001-03-20 | Hewlett Packard Company | Cache management for a multi-threaded processor |
| JP3358655B2 (en) * | 1998-12-22 | 2002-12-24 | NEC Corporation | Cache memory management method in disk array device |
| US6480941B1 (en) * | 1999-02-23 | 2002-11-12 | International Business Machines Corporation | Secure partitioning of shared memory based multiprocessor system |
| US6493800B1 (en) * | 1999-03-31 | 2002-12-10 | International Business Machines Corporation | Method and system for dynamically partitioning a shared cache |
| US6457102B1 (en) * | 1999-11-05 | 2002-09-24 | Emc Corporation | Cache using multiple LRU's |
| US6446168B1 (en) * | 2000-03-22 | 2002-09-03 | Sun Microsystems, Inc. | Method and apparatus for dynamically switching a cache between direct-mapped and 4-way set associativity |
| US6801208B2 (en) * | 2000-12-27 | 2004-10-05 | Intel Corporation | System and method for cache sharing |
2001
- 2001-04-20: US application US09/838,921 granted as patent US6725336B2 (en); status: not active, Expired - Lifetime
Cited By (64)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6918012B2 (en) * | 2001-08-28 | 2005-07-12 | Hewlett-Packard Development Company, L.P. | Streamlined cache coherency protocol system and method for a multiple processor single chip device |
| US20030046495A1 (en) * | 2001-08-28 | 2003-03-06 | Venkitakrishnan Padmanabha I. | Streamlined cache coherency protocol system and method for a multiple processor single chip device |
| US20030074502A1 (en) * | 2001-10-15 | 2003-04-17 | Eliel Louzoun | Communication between two embedded processors |
| US6925512B2 (en) * | 2001-10-15 | 2005-08-02 | Intel Corporation | Communication between two embedded processors |
| US7380085B2 (en) * | 2001-11-14 | 2008-05-27 | Intel Corporation | Memory adapted to provide dedicated and or shared memory to multiple processors and method therefor |
| US20080294839A1 (en) * | 2004-03-29 | 2008-11-27 | Bell Michael I | System and method for dumping memory in computer systems |
| US8914606B2 (en) * | 2004-07-08 | 2014-12-16 | Hewlett-Packard Development Company, L.P. | System and method for soft partitioning a computer system |
| US20060010450A1 (en) * | 2004-07-08 | 2006-01-12 | Culter Bradley G | System and method for soft partitioning a computer system |
| US20090083508A1 (en) * | 2004-11-11 | 2009-03-26 | Koninklijke Philips Electronics, N.V. | System as well as method for managing memory space |
| CN100485640C (en) * | 2004-11-30 | 2009-05-06 | International Business Machines Corporation | Cache for an enterprise software system |
| US20060259733A1 (en) * | 2005-05-13 | 2006-11-16 | Sony Computer Entertainment Inc. | Methods and apparatus for resource management in a logically partitioned processing environment |
| WO2006136495A1 (en) * | 2005-06-23 | 2006-12-28 | International Business Machines Corporation | System and method of remote media cache optimization for use with multiple processing units |
| US20060294313A1 (en) * | 2005-06-23 | 2006-12-28 | International Business Machines Corporation | System and method of remote media cache optimization for use with multiple processing units |
| US20090024793A1 (en) * | 2007-07-17 | 2009-01-22 | Fontenot Nathan D | Method and apparatus for managing data in a hybrid drive system |
| US7861038B2 (en) * | 2007-07-17 | 2010-12-28 | International Business Machines Corporation | Method and apparatus for managing data in a hybrid drive system |
| US7949913B2 (en) | 2007-08-14 | 2011-05-24 | Dell Products L.P. | Method for creating a memory defect map and optimizing performance using the memory defect map |
| US7945815B2 (en) | 2007-08-14 | 2011-05-17 | Dell Products L.P. | System and method for managing memory errors in an information handling system |
| US20090049257A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | System and Method for Implementing a Memory Defect Map |
| US9373362B2 (en) | 2007-08-14 | 2016-06-21 | Dell Products L.P. | System and method for implementing a memory defect map |
| US20090049270A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | System and method for using a memory mapping function to map memory defects |
| US20090049351A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | Method for Creating a Memory Defect Map and Optimizing Performance Using the Memory Defect Map |
| US7694195B2 (en) | 2007-08-14 | 2010-04-06 | Dell Products L.P. | System and method for using a memory mapping function to map memory defects |
| US20090049335A1 (en) * | 2007-08-14 | 2009-02-19 | Dell Products L.P. | System and Method for Managing Memory Errors in an Information Handling System |
| US8219220B2 (en) * | 2007-11-13 | 2012-07-10 | Rockwell Automation Technologies, Inc. | Industrial controller using shared memory multicore architecture |
| US20090210069A1 (en) * | 2007-11-13 | 2009-08-20 | Schultz Ronald E | Industrial controller using shared memory multicore architecture |
| US8108056B2 (en) * | 2007-11-13 | 2012-01-31 | Rockwell Automation Technologies, Inc. | Industrial controller using shared memory multicore architecture |
| US8219221B2 (en) * | 2007-11-13 | 2012-07-10 | Rockwell Automation Technologies, Inc. | Industrial controller using shared memory multicore architecture |
| US20090210070A1 (en) * | 2007-11-13 | 2009-08-20 | Schultz Ronald E | Industrial controller using shared memory multicore architecture |
| US20090132059A1 (en) * | 2007-11-13 | 2009-05-21 | Schultz Ronald E | Industrial controller using shared memory multicore architecture |
| GB2457341B (en) * | 2008-02-14 | 2010-07-21 | Transitive Ltd | Multiprocessor computing system with multi-mode memory consistency protection |
| GB2457341A (en) * | 2008-02-14 | 2009-08-19 | Transitive Ltd | Multiprocessor computing system with multi-mode memory consistency protection |
| US20100199044A1 (en) * | 2009-01-30 | 2010-08-05 | Sony Corporation | Interface apparatus, calculation processing apparatus, interface generation apparatus, and circuit generation apparatus |
| US8307160B2 (en) * | 2009-01-30 | 2012-11-06 | Sony Corporation | Interface apparatus, calculation processing apparatus, interface generation apparatus, and circuit generation apparatus |
| US20110040940A1 (en) * | 2009-08-13 | 2011-02-17 | Wells Ryan D | Dynamic cache sharing based on power state |
| US9311245B2 (en) * | 2009-08-13 | 2016-04-12 | Intel Corporation | Dynamic cache sharing based on power state |
| US9983792B2 (en) | 2009-08-13 | 2018-05-29 | Intel Corporation | Dynamic cache sharing based on power state |
| US20120314513A1 (en) * | 2011-06-09 | 2012-12-13 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor memory device and method of driving semiconductor memory device |
| KR20120137282A (en) * | 2011-06-09 | 2012-12-20 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor memory device and method of driving semiconductor memory device |
| US8953354B2 (en) * | 2011-06-09 | 2015-02-10 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor memory device and method of driving semiconductor memory device |
| TWI575536B (en) * | 2011-06-09 | 2017-03-21 | 半導體能源研究所股份有限公司 | Semiconductor memory device and method of driving semiconductor memory device |
| KR101993586B1 (en) * | 2011-06-09 | 2019-06-28 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor memory device and method of driving semiconductor memory device |
| US9224500B2 (en) | 2011-11-29 | 2015-12-29 | Kingtiger Technology (Canada) Inc. | Systems and methods for testing and assembling memory modules |
| US8724408B2 (en) | 2011-11-29 | 2014-05-13 | Kingtiger Technology (Canada) Inc. | Systems and methods for testing and assembling memory modules |
| US9223709B1 (en) * | 2012-03-06 | 2015-12-29 | Marvell International Ltd. | Thread-aware cache memory management |
| US9117552B2 (en) | 2012-08-28 | 2015-08-25 | Kingtiger Technology(Canada), Inc. | Systems and methods for testing memory |
| US20210334144A1 (en) * | 2014-03-31 | 2021-10-28 | Cfph, Llc | Resource allocation |
| US12204942B2 (en) * | 2014-03-31 | 2025-01-21 | Cfph, Llc | Resource allocation to avoid slowdown |
| US9928110B2 (en) * | 2014-03-31 | 2018-03-27 | Cfph, Llc | Resource allocation based on processor assignments |
| US11055143B2 (en) | 2014-03-31 | 2021-07-06 | Cfph, Llc | Processor and memory allocation |
| WO2015152893A1 (en) * | 2014-03-31 | 2015-10-08 | Cfph, Llc | Resource allocation |
| EP3317769A4 (en) * | 2015-07-28 | 2018-07-04 | Huawei Technologies Co., Ltd. | Advance cache allocator |
| US10248457B2 (en) | 2016-08-10 | 2019-04-02 | International Business Machines Corporation | Providing exclusive use of cache associated with a processing entity of a processor complex to a selected task |
| US10275280B2 (en) | 2016-08-10 | 2019-04-30 | International Business Machines Corporation | Reserving a core of a processor complex for a critical task |
| US10223164B2 (en) | 2016-10-24 | 2019-03-05 | International Business Machines Corporation | Execution of critical tasks based on the number of available processing entities |
| US10248464B2 (en) * | 2016-10-24 | 2019-04-02 | International Business Machines Corporation | Providing additional memory and cache for the execution of critical tasks by folding processing units of a processor complex |
| US10671438B2 (en) | 2016-10-24 | 2020-06-02 | International Business Machines Corporation | Providing additional memory and cache for the execution of critical tasks by folding processing units of a processor complex |
| US20210255972A1 (en) * | 2019-02-13 | 2021-08-19 | Google Llc | Way partitioning for a system-level cache |
| US11620243B2 (en) * | 2019-02-13 | 2023-04-04 | Google Llc | Way partitioning for a system-level cache |
| CN114402304A (en) * | 2020-08-19 | 2022-04-26 | Google LLC | Memory sharing |
| US20220300421A1 (en) * | 2020-08-19 | 2022-09-22 | Google Llc | Memory Sharing |
| US12013780B2 (en) * | 2020-08-19 | 2024-06-18 | Google Llc | Multi-partition memory sharing with multiple components |
| CN114301858A (en) * | 2021-02-05 | 2022-04-08 | 井芯微电子技术(天津)有限公司 | Shared cache system and method, electronic device and storage medium |
| US20230081746A1 (en) * | 2021-09-13 | 2023-03-16 | Apple Inc. | Guaranteed real-time cache carveout for displayed image processing systems and methods |
| US11875427B2 (en) * | 2021-09-13 | 2024-01-16 | Apple Inc. | Guaranteed real-time cache carveout for displayed image processing systems and methods |
Also Published As
| Publication number | Publication date |
|---|---|
| US6725336B2 (en) | 2004-04-20 |
Similar Documents
| Publication | Title |
|---|---|
| US6725336B2 (en) | Dynamically allocated cache memory for a multi-processor unit |
| US7076609B2 (en) | Cache sharing for a chip multiprocessor or multiprocessing system |
| US7793038B2 (en) | System and method for programmable bank selection for banked memory subsystems |
| US6490655B1 (en) | Data processing apparatus and method for cache line replacement responsive to the operational state of memory |
| US6546471B1 (en) | Shared memory multiprocessor performing cache coherency |
| US4616310A (en) | Communicating random access memory |
| EP0557050B1 (en) | Apparatus and method for executing processes in a multiprocessor system |
| US6816947B1 (en) | System and method for memory arbitration |
| JP4006436B2 (en) | Multi-level cache with overlapping sets of associative sets at different cache levels |
| EP1442374B1 (en) | Multi-core multi-thread processor |
| US7257673B2 (en) | Ternary CAM with software programmable cache policies |
| US4881163A (en) | Computer system architecture employing cache data line move-out queue buffer |
| US20130046934A1 (en) | System caching using heterogenous memories |
| US20040268044A1 (en) | Multiprocessor system with dynamic cache coherency regions |
| US20050021913A1 (en) | Multiprocessor computer system having multiple coherency regions and software process migration between coherency regions without cache purges |
| US8527708B2 (en) | Detecting address conflicts in a cache memory system |
| US7809889B2 (en) | High performance multilevel cache hierarchy |
| US6988167B2 (en) | Cache system with DMA capabilities and method for operating same |
| US6671822B1 (en) | Method and system for absorbing defects in high performance microprocessor with a large n-way set associative cache |
| US6038642A (en) | Method and system for assigning cache memory utilization within a symmetric multiprocessor data-processing system |
| KR20230046356A (en) | Memory device, operating method of memory device, and electronic device including memory device |
| US5893163A (en) | Method and system for allocating data among cache memories within a symmetric multiprocessor data-processing system |
| US6789168B2 (en) | Embedded DRAM cache |
| US6094710A (en) | Method and system for increasing system memory bandwidth within a symmetric multiprocessor data-processing system |
| US20020108021A1 (en) | High performance cache and method for operating same |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: SUN MICROSYSTEMS, INC., CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CHERABUDDI, RAJASEKHAR;REEL/FRAME:011959/0552; Effective date: 20010503 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | FPAY | Fee payment | Year of fee payment: 8 |
| | FPAY | Fee payment | Year of fee payment: 12 |
| | AS | Assignment | Owner name: ORACLE AMERICA, INC., CALIFORNIA; Free format text: MERGER AND CHANGE OF NAME;ASSIGNORS:ORACLE USA, INC.;SUN MICROSYSTEMS, INC.;ORACLE AMERICA, INC.;REEL/FRAME:037278/0757; Effective date: 20100212 |