US20210286732A1 - Multi-way cache memory access - Google Patents
- Publication number
- US20210286732A1 (application US16/817,609)
- Authority
- US
- United States
- Prior art keywords
- tag
- memory
- instruction
- cpu
- address
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F13/1689—Synchronisation and timing concerns (details of memory controller)
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
- G06F12/084—Multiuser, multiprocessor or multiprocessing cache systems with a shared cache
- G06F12/0853—Cache with multiport tag or data arrays
- G06F12/0864—Addressing of a memory level using pseudo-associative means, e.g. set-associative or hashing
- G06F12/0875—Caches with dedicated cache, e.g. instruction or stack
- G06F12/0895—Caches characterised by the organisation or structure of parts of caches, e.g. directory or tag array
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30087—Synchronisation or serialisation instructions
- G06F2212/452—Caching of specific data in cache memory: instruction code
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- FIG. 1 is a schematic illustration of a computer system according to some embodiments.
- FIG. 2 is a schematic illustration of a cache memory according to some embodiments.
- FIG. 3 illustrates a timing diagram schematically illustrating certain timing relationships for various operations of a cache memory.
- FIG. 4 is a schematic illustration of a comparison circuit, which is configured to identify which of the M ways has the instruction requested by a CPU.
- FIG. 5 illustrates a timing diagram schematically illustrating certain timing relationships for various operations of a cache memory and a comparison circuit.
- FIG. 6 is a schematic illustration of an instruction memory portion of a cache memory according to some embodiments.
- FIG. 7 illustrates a timing diagram schematically illustrating certain timing relationships for various operations of an instruction memory portion of a cache memory.
- FIG. 8 illustrates a timing diagram schematically illustrating another embodiment of certain timing relationships for various operations of a cache memory.
- FIG. 9 is a chart illustrating the cache memory area and power improvement achieved by an embodiment of a cache memory using inventive aspects discussed herein, as compared with a traditional cache memory.
- FIG. 1 is a schematic illustration of a computer system 100 .
- Computer system 100 includes CPU 110 , random access memory (RAM) 120 , and cache memory 130 .
- The information stored in cache memory 130 includes instructions which the CPU 110 may need for executing a software application.
- The information stored in the cache memory 130 also includes, for each particular instruction, information identifying the portion or address range of the RAM 120 the particular instruction is stored in. This identifying information is called a tag.
- Other information may additionally be stored in the cache memory, as understood by those of skill in the art.
- Cache memories may be subdivided into multiple ways, where each way is independently written and read.
- To fetch an instruction, the CPU provides an address to the cache memory.
- The CPU address includes a tag portion and an index portion, and may additionally include other information, such as an offset, as understood by those of skill in the art.
- The index portion of the CPU address is used to read one instruction and its corresponding tag from each of the ways. Accordingly, a number of instructions corresponding to the number of ways, along with each of their corresponding tags, are read from the cache memory based on the index portion of the CPU address.
- The tags associated with the instructions are each compared to the tag portion of the CPU address. If one of the tags matches the tag portion of the CPU address, the instruction corresponding with the matching tag is provided to the CPU as the instruction requested by the CPU. If none of the tags matches the tag portion of the CPU address, the instruction requested by the CPU is not located in the cache memory, and must instead be retrieved from RAM.
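The conventional lookup described above can be sketched as a small software model. This is an illustrative sketch only: the field widths, way count, and data layout are assumptions chosen for the example, not values taken from the patent.

```python
# Hypothetical software model of a conventional M-way set-associative
# instruction cache lookup: one (tag, instruction) pair is read from
# every way in parallel, and every tag is compared against the tag
# portion of the CPU address.  All sizes are illustrative assumptions.

M = 4             # number of ways
L = 64            # lines (sets) per way
OFFSET_BITS = 2   # byte offset within a cache line
INDEX_BITS = 6    # log2(L)

def split_address(addr):
    """Split a CPU address into (tag, index, offset) fields."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

def conventional_lookup(ways, addr):
    """Read one (tag, instruction) pair from every way, then compare.

    `ways` is a list of M lists; each entry is a (tag, instruction)
    tuple or None.  Returns the matching instruction, or None on a miss
    (the instruction must then be fetched from RAM).
    """
    cpu_tag, index, _ = split_address(addr)
    # All M ways are read with the same index -- M instruction reads
    # occur even though at most one result is ultimately used.
    candidates = [way[index] for way in ways]
    for entry in candidates:
        if entry is not None and entry[0] == cpu_tag:
            return entry[1]          # hit: tag matched
    return None                      # miss

# Tiny demonstration: fill one line of way 2 and look it up.
ways = [[None] * L for _ in range(M)]
tag, index, _ = split_address(0x1234)
ways[2][index] = (tag, "ADD r0, r1")
assert conventional_lookup(ways, 0x1234) == "ADD r0, r1"
assert conventional_lookup(ways, 0x5678) is None
```

Note that the model makes the area cost visible: every lookup touches the instruction storage of all M ways, which is what the single-instruction-memory architecture described below avoids.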
- FIG. 2 is a schematic illustration of a cache memory 200 according to some embodiments.
- Cache memory 200 may be used in computer system 100 as cache memory 130 .
- Cache memory 200 includes M ways 230 , where each way includes a tag memory portion 210 and an instruction memory portion 220 .
- In some embodiments, the ways of cache memory 200 include one or more other memory portions, as understood by those of skill in the art.
- For example, each way may include a valid bit portion, where each bit in the valid bit portion indicates whether a particular instruction is valid, as understood by those of skill in the art.
- Instruction memory portion 220 is written with data corresponding with CPU instructions, and tag memory portion 210 is written with data corresponding with the portions or address ranges of the RAM the instructions are stored in.
- Instruction memory portion 220 is a single memory circuit, despite being abstractly or conceptually segmented into the M ways. Accordingly, instruction memory portion 220 includes an array of memory cells which receives signals from and provides signals to a number of peripheral circuits which are used to access the memory cells for writing and for reading instruction information. As understood by those of skill in the art, the peripheral circuits may include, for example, an address decoder, sense amplifiers, a column multiplexer, and output buffers. In some embodiments, the peripheral circuits may include one or more other circuits. The memory cells are each constituent to a particular one of the ways. The peripheral circuits, however, may each receive signals from or provide signals to memory cells of all of the ways.
- Tag memory portion 210 includes a single memory circuit for each of the ways. Accordingly, each way includes an array of memory cells which receives signals from and provides signals to a number of peripheral circuits which are used to access the memory cells for writing and for reading tag information.
- The peripheral circuits may include, for example, an address decoder, sense amplifiers, a column multiplexer, and output buffers. In some embodiments, the peripheral circuits may include one or more other circuits. The memory cells and the peripheral circuits are each constituent to a particular one of the ways.
- Cache memory 200 is structured so that, to fetch an instruction therefrom, the CPU (e.g. CPU 110 ) provides an address to the cache memory 200 .
- The CPU address includes a tag portion and an index portion, and may additionally include other information, such as an offset, as understood by those of skill in the art.
- The index portion of the CPU address identifies a memory location in each of the tag memory portions 210(0) to 210(M−1) of the M ways.
- The identified memory locations in the M tag memory portions 210(0) to 210(M−1) are each associated with a memory location in a corresponding one of the instruction memory portions 220(0) to 220(M−1) of the M ways.
- The association of the M tag memory portions 210(0) to 210(M−1) with the instruction memory portions 220(0) to 220(M−1) of the M ways is instantiated in hardware at least by each of the M tag memory portions 210(0) to 210(M−1) and its associated instruction memory portion 220(0) to 220(M−1) having an address partially or wholly identified by the index portion of the CPU address.
- The M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags.
- The M tags are each compared with the tag portion of the CPU address. If one of the M tags matches the tag portion of the CPU address, the way having the matching tag is identified. If none of the tags matches the tag portion of the CPU address, the instruction requested by the CPU is not located in the cache memory, and must be retrieved from RAM.
- The index portion of the CPU address is then used to read an instruction from the instruction memory portion 220(x) of the identified way.
- The instruction read from the instruction memory portion 220(x) of the identified way is returned to the CPU as the instruction requested by the CPU.
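The sequence above can be sketched in software: the M tag memories are read and compared first, and only then is a single read performed on the one shared instruction memory. This is a behavioral sketch under assumed sizes and a flat `way * L + index` layout; the patent does not prescribe this data layout.

```python
# Hypothetical sketch of the two-phase access: tag phase first,
# instruction phase second.  One instruction read replaces the M
# parallel reads of a conventional cache.  Sizes are assumptions.

M, L = 4, 64   # ways, lines per way

tag_mems = [[None] * L for _ in range(M)]   # one tag array per way
instr_mem = [None] * (M * L)                # single shared instruction array

def fetch(cpu_tag, index):
    """Two-phase fetch: tag read/compare, then one instruction read."""
    # Phase 1 (tag clock active): read M tags, compare with the CPU tag.
    tags = [tag_mems[w][index] for w in range(M)]
    way = next((w for w, t in enumerate(tags) if t == cpu_tag), None)
    if way is None:
        return None        # miss: the instruction must come from RAM
    # Phase 2 (instruction clock active): a single read of the shared
    # instruction memory, addressed by the identified way and the index.
    return instr_mem[way * L + index]

# Fill way 1, line 5, then fetch it.
tag_mems[1][5] = 0x2A
instr_mem[1 * L + 5] = "LDR r2, [r3]"
assert fetch(0x2A, 5) == "LDR r2, [r3]"
assert fetch(0x2A, 6) is None
```

The design trade is visible here: the instruction read waits for the tag comparison, so the two steps must both fit in the CPU's fetch timing, but only one instruction array and one set of instruction peripheral circuits are needed.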
- FIG. 3 illustrates a timing diagram 300 schematically illustrating certain timing relationships for various operations of cache memory 200 .
- Timing diagram 300 illustrates CPU clock waveform 310 , tag clock waveform 320 , instruction clock waveform 330 , and instruction waveform 340 .
- The illustrated waveforms correspond with clocks generated by clock generation circuits understood by those of skill in the art.
- CPU clock waveform 310 illustrates a representation of a CPU clock used by the CPU 110 .
- The CPU clock represented by CPU clock waveform 310 may be used by CPU 110, for example, to receive input data, to execute instructions, and to generate output data.
- CPU 110 may use the CPU clock represented by CPU clock waveform 310 to additionally perform other operations.
- CPU 110 may use additional clocks (not shown).
- Tag clock waveform 320 illustrates a representation of a tag clock used by tag memory portion 210 .
- The tag clock represented by tag clock waveform 320 may be used by tag memory portion 210, for example, for writing and reading tag data to and from tag memory portion 210.
- Instruction clock waveform 330 illustrates a representation of an instruction clock used by instruction memory portion 220 .
- The instruction clock represented by instruction clock waveform 330 may be used by instruction memory portion 220, for example, for writing and reading instruction data to and from instruction memory portion 220.
- Instruction waveform 340 illustrates a representation of instruction data.
- The instruction data encodes instructions which are executable by CPU 110, and which are provided to CPU 110 by cache memory 200, for example, for execution by CPU 110.
- During a first portion of the CPU clock period, the tag clock is active (high). While the tag clock is active, the M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags, and the M tags are each compared with the tag portion of the CPU address. If one of the M tags matches the tag portion of the CPU address, the way having the matching tag is identified.
- During a subsequent portion of the CPU clock period, the instruction clock is active (high). While the instruction clock is active, the index portion of the CPU address is used to perform a read operation on the instruction memory portion 220(x) of the identified way to read an instruction therefrom.
- The instruction read from the instruction memory portion 220(x) of the identified way is returned to the CPU 110 as the instruction requested by the CPU 110.
- Timing diagram 300 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 310 , tag clock waveform 320 , instruction clock waveform 330 , and instruction waveform 340 .
- Alternative timing relationships may be used.
- For example, the phase relationship between CPU clock waveform 310 and either or both of tag clock waveform 320 and instruction clock waveform 330 may be modified.
- In some embodiments, the active states of either or both of tag clock waveform 320 and instruction clock waveform 330 are low.
- FIG. 4 is a schematic illustration of a comparison circuit 400 , which is configured to identify which of the M ways has the instruction requested by the CPU 110 .
- Comparison circuit 400 includes tag comparators 410(0) to 410(M−1) and tri-state driver arrays 420(0) to 420(M−1). In some embodiments, alternative comparison circuits are used.
- Comparison circuit 400 receives the tag portion of the CPU address at bus CPUTAG.
- Comparison circuit 400 also receives M tags at tag busses TAG(0) to TAG(M−1). Each of the M tags is generated as the result of reading one of the tag memory portions 210(0) to 210(M−1) of the M ways.
- Each of tag comparators 410(0) to 410(M−1) is configured to compare one of the M tags with the tag portion of the CPU address at bus CPUTAG. At most one of the M tags matches the tag portion of the CPU address.
- Tri-state driver arrays 420(0) to 420(M−1) each have data inputs which receive data identifying one of the M ways.
- The data inputs of each of the tri-state driver arrays 420(0) to 420(M−1) are connected to one of the way identification busses WAY(0) to WAY(M−1).
- Tri-state driver arrays 420(0) to 420(M−1) each receive an indication of whether a particular one of the M tags at tag busses TAG(0) to TAG(M−1) matches the tag portion of the CPU address at bus CPUTAG.
- Collectively, the tri-state driver arrays 420(0) to 420(M−1) perform a multiplexing function which passes data identifying the particular way having tag data matching the tag portion of the CPU address. In some embodiments, alternative circuits performing the multiplexing function may be used.
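The multiplexing behavior of comparison circuit 400 can be modeled behaviorally: each comparator enables at most one tri-state driver array, which then drives its way identifier onto the shared WAY(X) bus. The bus names follow the figure; the modeling style, and the use of 0..M−1 as the way identification inputs, are assumptions for illustration.

```python
# Hypothetical behavioral model of comparison circuit 400: each
# comparator 410(w) checks one way's tag against CPUTAG, and the one
# enabled tri-state driver array 420(w) drives its way identifier
# onto the shared way bus WAY(X).

def way_select(cpu_tag, tags):
    """Return the identifier of the way whose tag matches, else None.

    `tags` holds the M values read from TAG(0)..TAG(M-1); the way
    identification inputs WAY(0)..WAY(M-1) are simply 0..M-1 here.
    """
    drivers = []
    for way_id, tag in enumerate(tags):
        enable = (tag == cpu_tag)      # comparator 410(way_id)
        if enable:
            drivers.append(way_id)     # tri-state driver 420(way_id) enabled
    # In hardware at most one driver may be enabled at a time;
    # multiple matches would contend on the shared bus.
    assert len(drivers) <= 1, "bus contention: duplicate tags"
    return drivers[0] if drivers else None

assert way_select(0x7, [0x1, 0x7, 0x3, 0x4]) == 1    # way 1 hit
assert way_select(0x9, [0x1, 0x7, 0x3, 0x4]) is None  # miss
```

The contention assertion mirrors the text's observation that at most one of the M tags matches: tri-state drivers only implement a multiplexer correctly under that one-hot condition.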
- FIG. 5 illustrates a timing diagram 500 schematically illustrating certain timing relationships for various operations of cache memory 200 and comparison circuit 400 .
- Timing diagram 500 illustrates CPU clock waveform 510 , tag clock waveform 520 , tag data waveform 530 , CPU tag data waveform 540 , and way bus data waveform 550 .
- CPU clock waveform 510 illustrates a representation of a CPU clock used by the CPU 110 , and has characteristics similar or identical to CPU clock waveform 310 of FIG. 3 .
- Tag clock waveform 520 illustrates a representation of a tag clock used by tag memory portion 210 , and has characteristics similar or identical to tag clock waveform 320 of FIG. 3 .
- Tag data waveform 530 illustrates a representation of tag data identifying the M tags at tag busses TAG(0) to TAG(M−1).
- CPU tag data waveform 540 illustrates a representation of the tag portion of the CPU address at bus CPUTAG.
- Way bus data waveform 550 illustrates a representation of way identification data at way bus WAY(X).
- While the tag clock is active (high), the M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags, which are respectively represented by data at tag busses TAG(0) to TAG(M−1).
- The M tags are each compared with the tag portion of the CPU address by comparators 410(0) to 410(M−1).
- Tri-state driver arrays 420(0) to 420(M−1) each receive data identifying one of the M ways from the way identification busses WAY(0) to WAY(M−1), and each receive an indication from a corresponding comparator 410 indicating whether a particular one of the M tags at tag busses TAG(0) to TAG(M−1) matches the tag portion of the CPU address at bus CPUTAG.
- The tri-state driver array receiving the indication that its associated tag matches the tag portion of the CPU address at bus CPUTAG outputs way identification data at way bus WAY(X) identifying the particular way identified at its data input.
- Timing diagram 500 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 510, tag clock waveform 520, tag data waveform 530, CPU tag data waveform 540, and way bus data waveform 550.
- Alternative timing relationships may be used.
- For example, the phase relationship between CPU clock waveform 510 and tag clock waveform 520 may be modified.
- In some embodiments, the active state of tag clock waveform 520 is low.
- FIG. 6 is a schematic illustration of an instruction memory portion 220 of a cache memory 200 according to some embodiments.
- Instruction memory portion 220 includes memory locations for instructions stored in all of the ways of cache memory 200. Accordingly, instruction memory portion 220 stores instructions in memory locations 220(0) to 220(M*L−1), where M is equal to the number of ways, and L is equal to the length (number of instruction memory locations) of each way.
- To be read, instruction memory portion 220 receives an address, and is configured to output the instruction stored in the memory location associated with the received address.
- The received address includes a tag portion and an index portion.
- The tag portion is generated by comparison circuit 400 and is formed by the way data at way bus WAY(X) indicating the particular way identified as having tag data matching the tag portion of the CPU address.
- The index portion of the received address is formed by the index portion of the CPU address.
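The address presented to the single instruction memory can be sketched as a simple concatenation: the way identifier from WAY(X) supplies the high-order field and the index portion of the CPU address supplies the low-order field. The bit widths below are illustrative assumptions, not values from the patent.

```python
# Hypothetical sketch of forming the single instruction memory's
# address from {way, index}: the way identifier acts as the high-order
# field, selecting one way's block of M*L locations.

INDEX_BITS = 6                 # log2(L) for an assumed L = 64 lines per way

def instruction_memory_address(way, index):
    """Concatenate {way, index} into one flat memory location number."""
    assert 0 <= index < (1 << INDEX_BITS)
    return (way << INDEX_BITS) | index

# Way 3, index 13 lands in the fourth way's block of locations:
addr = instruction_memory_address(3, 13)
assert addr == 3 * 64 + 13 == 205
```

Because the way field arrives only after the tag comparison completes, this address cannot be formed until the tag phase of the access is done, which is why the instruction clock follows the tag clock in the timing diagrams.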
- FIG. 7 illustrates a timing diagram 700 schematically illustrating certain timing relationships for various operations of instruction memory portion 220 of cache memory 200.
- Timing diagram 700 illustrates CPU clock waveform 710 , instruction clock waveform 720 , index data waveform 730 , way data waveform 740 , and instruction data waveform 750 .
- CPU clock waveform 710 illustrates a representation of a CPU clock used by the CPU 110 , and has characteristics similar or identical to CPU clock waveform 310 of FIG. 3 .
- Instruction clock waveform 720 illustrates a representation of an instruction clock used by instruction memory portion 220, and has characteristics similar or identical to instruction clock waveform 330 of FIG. 3.
- Index data waveform 730 illustrates a representation of the index portion of the CPU address.
- Way data waveform 740 illustrates a representation of way identification data at way bus WAY(X).
- Instruction data waveform 750 illustrates a representation of the instruction read from instruction memory portion 220 .
- While the instruction clock is active (high), the way identification data and the index portion of the CPU address are used to read an instruction from instruction memory portion 220.
- The instruction read from instruction memory portion 220 is returned to the CPU 110 as the instruction requested by the CPU 110.
- Timing diagram 700 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 710 , instruction clock waveform 720 , index data waveform 730 , way data waveform 740 , and instruction data waveform 750 .
- Alternative timing relationships may be used.
- For example, the phase relationship between CPU clock waveform 710 and instruction clock waveform 720 may be modified.
- In some embodiments, the active state of instruction clock waveform 720 is low.
- FIG. 8 illustrates a timing diagram 800 schematically illustrating another embodiment of certain timing relationships for various operations of cache memory 200 .
- Timing diagram 800 illustrates CPU clock waveform 810, N× CPU clock waveform 815, tag clock waveform 820, instruction clock waveform 830, and instruction waveform 840.
- The illustrated waveforms correspond with clocks generated by clock generation circuits understood by those of skill in the art.
- CPU clock waveform 810 illustrates a representation of a CPU clock used by the CPU 110 .
- The CPU clock represented by CPU clock waveform 810 may be used by CPU 110, for example, to receive input data, to execute instructions, and to generate output data.
- CPU 110 may use the CPU clock represented by CPU clock waveform 810 to additionally perform other operations.
- CPU 110 may use additional clocks (not shown).
- N× CPU clock waveform 815 illustrates a representation of a clock which has a frequency which is a multiple of the frequency of the CPU clock.
- In some embodiments, the clock of N× CPU clock waveform 815 has a frequency which is three times the frequency of the CPU clock.
- The clock of N× CPU clock waveform 815 may be generated based on the CPU clock using circuits known to those of skill in the art.
- Tag clock waveform 820 illustrates a representation of a tag clock used by tag memory portion 210 .
- The tag clock represented by tag clock waveform 820 may be used by tag memory portion 210, for example, for writing and reading tag data to and from tag memory portion 210.
- Instruction clock waveform 830 illustrates a representation of an instruction clock used by instruction memory portion 220 .
- The instruction clock represented by instruction clock waveform 830 may be used by instruction memory portion 220, for example, for writing and reading instruction data to and from instruction memory portion 220.
- Instruction waveform 840 illustrates a representation of instruction data.
- The instruction data encodes instructions which are executable by CPU 110, and which are provided to CPU 110 by cache memory 200, for example, for execution by CPU 110.
- During a first portion of the CPU clock period, the tag clock is active (high). While the tag clock is active, the M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags, and the M tags are each compared with the tag portion of the CPU address. If one of the M tags matches the tag portion of the CPU address, the way having the matching tag is identified.
- During a subsequent portion of the CPU clock period, the instruction clock is active (high). While the instruction clock is active, the index portion of the CPU address is used to perform a read operation on the instruction memory portion 220(x) of the identified way to read an instruction therefrom.
- The instruction read from the instruction memory portion 220(x) of the identified way is returned to the CPU 110 as the instruction requested by the CPU 110.
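One way to picture the role of the N× clock is that it subdivides each CPU clock period so the tag phase and the instruction phase can be assigned to successive sub-periods of a single CPU cycle. The sketch below assumes N = 3 (as in the illustrated embodiment) and an exact one-sub-period-per-step assignment; the patent's actual phase assignment may differ.

```python
# Hypothetical model of FIG. 8 style timing: an N x CPU clock (N = 3
# in the illustrated embodiment) divides each CPU clock period into
# sub-periods, so the tag read/compare and the instruction read can
# both complete within one CPU cycle.  The exact mapping of steps to
# sub-periods below is an assumption for illustration.

N = 3   # fast-clock periods per CPU clock period

def sub_phase(fast_cycle):
    """Map a fast-clock cycle number to its role within the CPU cycle."""
    phase = fast_cycle % N
    if phase == 0:
        return "tag read + compare"      # tag clock active
    elif phase == 1:
        return "instruction read"        # instruction clock active
    else:
        return "instruction to CPU"      # result returned to the CPU

assert [sub_phase(i) for i in range(3)] == [
    "tag read + compare", "instruction read", "instruction to CPU"]
```

Under this picture, the serialized tag-then-instruction access costs no extra CPU cycles: from the CPU's perspective the instruction still arrives within the same CPU clock period.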
- Timing diagram 800 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 810, N× CPU clock waveform 815, tag clock waveform 820, instruction clock waveform 830, and instruction waveform 840.
- Alternative timing relationships may be used.
- For example, the phase relationship between CPU clock waveform 810 and any of N× CPU clock waveform 815, tag clock waveform 820, and instruction clock waveform 830 may be modified.
- In some embodiments, the active states of either or both of tag clock waveform 820 and instruction clock waveform 830 are low.
- FIG. 9 is a chart illustrating the cache memory area and power improvement achieved by an embodiment of a cache memory using inventive aspects discussed herein, as compared with a traditional cache memory.
- In the comparison, the new and old cache memories are each 8 Kbyte, 4-way caches running with a 32 MHz CPU clock.
- The new cache memory uses a 3× CPU clock running at 96 MHz.
- As illustrated, the new cache memory uses 51% less area and 72% less power.
Description
- The present application generally pertains to cache memories, and more particularly to cache memory architectures which use low die area.
- Cache memories are used in computer systems to reduce instruction access time for frequently used instructions. Central Processing Unit (CPU) executable instructions are stored in RAM, and are available for access by the CPU, as needed. Some, but not all, instructions, for example recently used instructions, are additionally stored in the cache memory. Because the cache memory is faster than RAM, the cache memory is preferred, and is used if the instruction needed by the CPU is stored therein. If the instruction needed by the CPU is not stored in the cache memory, the instruction is retrieved from the RAM.
- Conventional cache memories require large amounts of die area to implement. Improved cache memories requiring less area are needed in the art.
- One inventive aspect is a cache memory. The cache memory includes an instruction memory portion having a plurality of instruction memory locations configured to store instruction data encoding a plurality of CPU instructions. The cache memory also includes a tag memory portion having a plurality of tag memory locations configured to store tag data encoding a plurality of RAM memory address ranges the CPU instructions are stored in. The instruction memory portion includes a single memory circuit having an instruction memory array and a plurality of instruction peripheral circuits communicatively connected with the instruction memory array. The tag memory portion includes a plurality of tag memory circuits, where each of the tag memory circuits includes a tag memory array, and a plurality of tag peripheral circuits communicatively connected with the tag memory array.
- Another inventive aspect is a computer system. The computer system includes a CPU configured to execute CPU instructions, a RAM configured to store first representations of the CPU instructions, and a cache memory. The cache memory includes an instruction memory portion having a plurality of instruction memory locations configured to store instruction data encoding a plurality of CPU instructions. The cache memory also includes a tag memory portion having a plurality of tag memory locations configured to store tag data encoding a plurality of RAM memory address ranges the CPU instructions are stored in. The instruction memory portion includes a single memory circuit having an instruction memory array and a plurality of instruction peripheral circuits communicatively connected with the instruction memory array. The tag memory portion includes a plurality of tag memory circuits, where each of the tag memory circuits includes a tag memory array, and a plurality of tag peripheral circuits communicatively connected with the tag memory array.
-
FIG. 1 is a schematic illustration of a computer system according to some embodiments. -
FIG. 2 is a schematic illustration of a cache memory according to some embodiments. -
FIG. 3 illustrates a timing diagram schematically illustrating certain timing relationships for various operations of a cache memory. -
FIG. 4 is a schematic illustration of a comparison circuit, which is configured to identify which of the M ways has the instruction requested by a CPU. -
FIG. 5 illustrates a timing diagram schematically illustrating certain timing relationships for various operations of a cache memory and a comparison circuit. -
FIG. 6 is a schematic illustration of an instruction memory portion of a cache memory according to some embodiments. -
FIG. 7 illustrates a timing diagram schematically illustrating certain timing relationships for various operations of an instruction memory portion of a cache memory. -
FIG. 8 illustrates a timing diagram schematically illustrating another embodiment of certain timing relationships for various operations of a cache memory. -
FIG. 9 is a chart illustrating cache memory area and power improvement achieved using an embodiment of a cache memory using inventive aspects discussed herein as compared with a traditional cache memory. - Particular embodiments of the invention are illustrated herein in conjunction with the drawings.
- Various details are set forth herein as they relate to certain embodiments. However, the invention can also be implemented in ways which are different from those described herein. Modifications can be made to the discussed embodiments by those skilled in the art without departing from the invention. Therefore, the invention is not limited to particular embodiments disclosed herein.
-
FIG. 1 is a schematic illustration of a computer system 100. Computer system 100 includes CPU 110, random access memory (RAM) 120, and cache memory 130. - The information stored in the cache memory 130 includes instructions which the CPU 110 may need for executing a software application. The information stored in the cache memory 130 also includes information for each particular instruction identifying a portion or address range of the RAM 120 the particular instruction is stored in. The identifying information is called a tag. Other information may additionally be stored in the cache memory, as understood by those of skill in the art. - As understood by those of skill in the art, in computer systems, cache memories may be subdivided into multiple ways, where each way is independently written and read. To fetch an instruction from the cache memory, the CPU provides an address to the cache memory. The CPU address includes a tag portion and an index portion. In some embodiments, the CPU address may additionally include other information, such as an offset, as understood by those of skill in the art.
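- The tag/index/offset decomposition of the CPU address described above can be sketched in Python. The field widths here (a 2-bit offset, a 6-bit index) are assumptions chosen for illustration and are not values taken from the patent.

```python
# Illustrative split of a CPU address into tag, index, and offset
# fields; the widths below are assumed, not from the patent.
OFFSET_BITS = 2   # byte offset within a 32-bit instruction word
INDEX_BITS = 6    # selects one of 64 index locations per way

def split_address(addr: int) -> tuple:
    """Return the (tag, index, offset) fields of a CPU address."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset
```

With these widths, `split_address(0x12345)` yields tag `0x123`, index `0x11`, and offset `0x1`.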
- In a conventional cache memory, the index portion of the CPU address is used to read one instruction and its corresponding tag from each of the ways. Accordingly, a number of instructions corresponding to the number of ways, along with each of their corresponding tags, are read from the cache memory based on the index portion of the CPU address.
- In the conventional cache memory, the tags associated with the instructions are each compared to the tag portion of the CPU address. If one of the tags matches the tag portion of the CPU address, the instruction corresponding with the matching tag is provided to the CPU as the instruction requested by the CPU. If none of the tags match the tag portion of the CPU address, the instruction requested by the CPU is not located in the cache memory, and must, instead, be retrieved from RAM.
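- The conventional lookup just described can be modeled minimally in Python: the index selects one entry per way, every way's tag and instruction are read, and the tags are then compared. The data structures and names below are illustrative, not from the patent.

```python
# Minimal behavioral model of a conventional M-way lookup: all M
# tag+instruction pairs are read in parallel before tag comparison.
M = 4  # number of ways (assumed for the example)

def conventional_lookup(ways, cpu_tag, index):
    """ways: list of M dicts mapping index -> (tag, instruction)."""
    entries = [way.get(index) for way in ways]   # M tag+instruction reads
    for entry in entries:
        if entry is not None and entry[0] == cpu_tag:
            return entry[1]                      # hit: matching way's instruction
    return None                                  # miss: fetch from RAM instead
```

Note that all M instruction reads happen even though at most one result is used, which is the overhead the serial scheme below avoids.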
-
FIG. 2 is a schematic illustration of a cache memory 200 according to some embodiments. Cache memory 200 may be used in computer system 100 as cache memory 130. Cache memory 200 includes M ways 230, where each way includes a tag memory portion 210 and an instruction memory portion 220. - In some embodiments, the ways of cache memory 200 include one or more other memory portions, as understood by those of skill in the art. For example, in some embodiments, each way includes a valid bit portion, where each bit in the valid bit portion indicates whether a particular instruction is valid, as understood by those of skill in the art. -
Instruction memory portion 220 is written with data corresponding with CPU instructions. In addition, tag memory portion 210 is written with data corresponding with portions or address ranges of the RAM the instructions are stored in. - Instruction memory portion 220 is a single memory circuit, despite being abstractly or conceptually segmented into the M ways. Accordingly, instruction memory portion 220 includes an array of memory cells which receives signals from and provides signals to a number of peripheral circuits which are used to access the memory cells for writing and for reading instruction information. As understood by those of skill in the art, the peripheral circuits may include, for example, an address decoder, sense amplifiers, a column multiplexer, and output buffers. In some embodiments, the peripheral circuits may include one or more other circuits. The memory cells are each constituent to a particular one of the ways. The peripheral circuits, however, may each receive signals from or provide signals to memory cells of all of the ways. - Tag memory portion 210 includes a single memory circuit for each of the ways. Accordingly, each way includes an array of memory cells which receives signals from and provides signals to a number of peripheral circuits which are used to access the memory cells for writing and for reading tag information. As understood by those of skill in the art, the peripheral circuits may include, for example, an address decoder, sense amplifiers, a column multiplexer, and output buffers. In some embodiments, the peripheral circuits may include one or more other circuits. The memory cells and the peripheral circuits are each constituent to a particular one of the ways. - Cache memory 200 is structured so that, to fetch an instruction therefrom, the CPU (e.g. CPU 110) provides an address to the cache memory 200. The CPU address includes a tag portion and an index portion. In some embodiments, the CPU address may additionally include other information, such as an offset, as understood by those of skill in the art. - The index portion of the CPU address identifies a memory location in each of the tag memory portions 210(0) to 210(M−1) of the M ways. The M tag memory portions 210(0) to 210(M−1) are each associated with a memory location in a corresponding one of the instruction memory portions 220(0) to 220(M−1) of the M ways. The association of the M tag memory portions 210(0) to 210(M−1) and the instruction memory portions 220(0) to 220(M−1) of the M ways is instantiated in hardware at least by each of the M tag memory portions 210(0) to 210(M−1) and its associated instruction memory portion 220(0) to 220(M−1) having an address partially or wholly identified by the index portion of the CPU address.
- The M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags. The M tags are each compared with the tag portion of the CPU address. If one of the M tags matches the tag portion of the CPU address, the way having the matching tag is identified. If none of the tags matches the tag portion of the CPU address, the instruction requested by the CPU is not located in the cache memory, and must be retrieved from RAM.
- The index portion of the CPU address is then used to read an instruction from the instruction memory portion 220(x) of the identified way. The instruction read from the instruction memory portion 220(x) of the identified way is returned to the CPU as the instruction requested by the CPU.
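- The serial two-step flow described above can be sketched as follows: the per-way tag memories are read and compared first, and only then is a single read performed on the one instruction memory circuit. The data layout and names are illustrative assumptions, not the patent's implementation.

```python
# Behavioral sketch of the serial lookup: tags first, then exactly
# one read of the single shared instruction memory.
def serial_lookup(tag_mems, instr_mem, cpu_tag, index, way_len):
    """tag_mems: M per-way dicts mapping index -> tag.
    instr_mem: one flat list of M * way_len instruction slots."""
    # First phase: read and compare the M tags for this index.
    way = next((w for w, tags in enumerate(tag_mems)
                if tags.get(index) == cpu_tag), None)
    if way is None:
        return None                          # miss: retrieve from RAM
    # Second phase: one read of the single instruction memory circuit.
    return instr_mem[way * way_len + index]
```

In contrast to the conventional model, only one instruction slot is ever read per access, regardless of the number of ways.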
-
FIG. 3 illustrates a timing diagram 300 schematically illustrating certain timing relationships for various operations of cache memory 200. Timing diagram 300 illustrates CPU clock waveform 310, tag clock waveform 320, instruction clock waveform 330, and instruction waveform 340. The illustrated waveforms correspond with clocks generated by clock generation circuits understood by those of skill in the art. - CPU clock waveform 310 illustrates a representation of a CPU clock used by the CPU 110. As understood by those of skill in the art, the CPU clock represented by CPU clock waveform 310 may be used by CPU 110, for example, to receive input data, to execute instructions, and to generate output data. CPU 110 may use the CPU clock represented by CPU clock waveform 310 to additionally perform other operations. CPU 110 may use additional clocks (not shown). - Tag clock waveform 320 illustrates a representation of a tag clock used by tag memory portion 210. The tag clock represented by tag clock waveform 320 may be used by tag memory portion 210, for example, for writing and reading tag data to and from tag memory portion 210. - Instruction clock waveform 330 illustrates a representation of an instruction clock used by instruction memory portion 220. The instruction clock represented by instruction clock waveform 330 may be used by instruction memory portion 220, for example, for writing and reading instruction data to and from instruction memory portion 220. - Instruction waveform 340 illustrates a representation of instruction data. The instruction data encodes instructions which are executable by CPU 110, and which are provided to CPU 110 by cache memory 200, for example, for execution by CPU 110.
FIG. 3 , during a first portion of a CPU clock period, the tag clock is active (high). While the tag clock is active, the M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags. In addition, while the tag clock is active, the M tags are each compared with the tag portion of the CPU address. If one of the M tags matches the tag portion of the CPU address, the way having the matching tag is identified. - During a second, subsequent, portion of the CPU clock period, the instruction clock is active (high). During the second portion of the CPU clock period, the index portion of the CPU address is used to perform a read operation on the instruction memory portion 220(x) of the identified way to read an instruction therefrom. Once read, the instruction read from the instruction memory portion 220(x) of the identified way is returned to the
CPU 110 as the instruction requested by theCPU 110. - Timing diagram 300 schematically illustrates an embodiment of certain timing relationships for
CPU clock waveform 310,tag clock waveform 320,instruction clock waveform 330, andinstruction waveform 340. Alternative timing relationships may alternatively be used. For example, the phase relationship betweenCPU clock waveform 310 and either or both oftag clock waveform 320 andinstruction clock waveform 330 may be modified. Additionally or alternatively, in some embodiments, the active states of either or both oftag clock waveform 320 andinstruction clock waveform 330 are low. -
FIG. 4 is a schematic illustration of a comparison circuit 400, which is configured to identify which of the M ways has the instruction requested by the CPU 110. Comparison circuit 400 includes tag comparators 410(0) to 410(M−1) and tri-state driver arrays 420(0) to 420(M−1). In some embodiments, alternative comparison circuits are used. - As illustrated, comparison circuit 400 receives the tag portion of the CPU address at bus CPUTAG. In addition, comparison circuit 400 receives M tags at tag busses TAG(0) to TAG(M−1). Each of the M tags is generated as the result of reading one of the tag memory portions 210(0) to 210(M−1) of the M ways. - Each of tag comparators 410(0) to 410(M−1) is configured to compare one of the M tags with the tag portion of the CPU address at bus CPUTAG. At most one of the M tags matches the tag portion of the CPU address.
- Tri-state driver arrays 420(0) to 420(M−1) each have data inputs which receive data identifying one of the M ways. The data inputs of each of the tri-state driver arrays 420(0) to 420(M−1) are connected to one of the way identification busses WAY(0) to WAY(M−1). In addition, tri-state driver arrays 420(0) to 420(M−1) each receive an indication of whether a particular one of the M tags at tag busses TAG(0) to TAG(M−1) matches the tag portion of the CPU address at bus CPUTAG.
- The tri-state driver arrays receiving indications that the particular one of the M tags associated therewith does not match the tag portion of the CPU address at bus CPUTAG have outputs which are tri-stated, and are high impedance. The tri-state driver array receiving the indication that the particular one of the M tags associated therewith does match the tag portion of the CPU address at bus CPUTAG outputs data at way bus WAY(X) indicating the particular way identified at its data input.
- Accordingly, the tri-state driver arrays 420(0) to 420(M−1) perform a multiplexing function which passes data identifying the particular way having tag data matching the tag portion of the CPU address. As understood by those of skill in the art, alternative circuits performing the multiplexing function may alternatively be used.
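- The comparator/tri-state selection described above can be modeled behaviorally: each comparator enables its driver array only on a tag match, so at most one driver places its way number on the shared WAY(X) bus. This is a functional sketch of the multiplexing behavior, not a circuit-level description.

```python
# Functional model of the way-selection multiplexing performed by
# the comparators and tri-state driver arrays of FIG. 4.
def select_way(way_tags, cpu_tag):
    """way_tags: the M tags read for one index; returns the matching
    way number, or None when no driver is enabled (a cache miss)."""
    enables = [tag == cpu_tag for tag in way_tags]     # comparator outputs
    driven = [w for w, en in enumerate(enables) if en]
    # At most one tag may match, so at most one driver drives the bus.
    assert len(driven) <= 1, "bus contention: duplicate tags in one set"
    return driven[0] if driven else None               # None: bus high-impedance
```

Returning `None` stands in for the all-drivers-tri-stated case, where the requested instruction must be retrieved from RAM.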
-
FIG. 5 illustrates a timing diagram 500 schematically illustrating certain timing relationships for various operations of cache memory 200 and comparison circuit 400. Timing diagram 500 illustrates CPU clock waveform 510, tag clock waveform 520, tag data waveform 530, CPU tag data waveform 540, and way bus data waveform 550. - CPU clock waveform 510 illustrates a representation of a CPU clock used by the CPU 110, and has characteristics similar or identical to CPU clock waveform 310 of FIG. 3. - Tag clock waveform 520 illustrates a representation of a tag clock used by tag memory portion 210, and has characteristics similar or identical to tag clock waveform 320 of FIG. 3. -
Tag data waveform 530 illustrates a representation of tag data identifying the M tags at tag busses TAG(0) to TAG(M−1). - CPU
tag data waveform 540 illustrates a representation of the tag portion of the CPU address at bus CPUTAG. - Way bus data waveform 550 illustrates a representation of way identification data at way bus WAY(X).
- During a first portion of a CPU clock period, the tag clock is active (high). In response to the tag clock being active, the M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags. The M tags are respectively represented by data at tag busses TAG(0) to TAG(M−1). In addition, while the tag clock is active, the M tags are each compared with the tag portion of the CPU address by comparators 410(0) to 410(M−1).
- Furthermore, while the tag clock is active, tri-state driver arrays 420(0) to 420(M−1) each receive data identifying one of the M ways from the way identification busses WAY(0) to WAY(M−1). In addition, tri-state driver arrays 420(0) to 420(M−1) each receive an indication from a
corresponding comparator 410 indicating whether a particular one of the M tags at tag busses TAG(0) to TAG(M−1) matches the tag portion of the CPU address at bus CPUTAG. - In addition, while the tag clock is active, the tri-state driver array receiving the indication that the particular one of the M tags associated therewith matches the tag portion of the CPU address at bus CPUTAG outputs way identification data at way bus WAY(X) identifying the particular way identified at its data input. - Timing diagram 500 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 510, tag clock waveform 520, tag data waveform 530, CPU tag data waveform 540, and way bus data waveform 550. Alternative timing relationships may be used. For example, the phase relationship between CPU clock waveform 510 and tag clock waveform 520 may be modified. Additionally or alternatively, in some embodiments, the active state of tag clock waveform 520 is low. -
FIG. 6 is a schematic illustration of an instruction memory portion 220 of a cache memory 200 according to some embodiments. Instruction memory portion 220 includes memory locations for instructions stored in all of the ways of cache memory 200. Accordingly, instruction memory portion 220 stores instructions in memory locations 220(0) to 220(M*L−1), where M is equal to the number of ways, and L is equal to the length (number of instruction memory locations) in each way. - As illustrated, instruction memory portion 220 receives an address. Instruction memory portion 220 is configured to be read so as to output the instruction stored in the memory location associated with the received address. - The received address includes a tag portion and an index portion. - The tag portion is generated by comparison circuit 400 and is formed by the way data at way bus WAY(X) indicating the particular way identified as having tag data matching the tag portion of the CPU address. - The index portion of the received address is formed by the index portion of the CPU address. -
FIG. 7 illustrates a timing diagram 700 schematically illustrating certain timing relationships for various operations of instruction memory portion 220 of cache memory 200. Timing diagram 700 illustrates CPU clock waveform 710, instruction clock waveform 720, index data waveform 730, way data waveform 740, and instruction data waveform 750. - CPU clock waveform 710 illustrates a representation of a CPU clock used by the CPU 110, and has characteristics similar or identical to CPU clock waveform 310 of FIG. 3. - Instruction clock waveform 720 illustrates a representation of an instruction clock used by instruction memory portion 220, and has characteristics similar or identical to instruction clock waveform 330 of FIG. 3. -
Index data waveform 730 illustrates a representation of the index portion of the CPU address. -
Way data waveform 740 illustrates a representation of way identification data at way bus WAY(X). -
Instruction data waveform 750 illustrates a representation of the instruction read from instruction memory portion 220. - During a second portion of the CPU clock period, subsequent to the first portion of the CPU clock period discussed with reference to FIG. 5, the instruction clock is active (high). In response to the instruction clock being active, the way identification data and the index portion of the CPU address are used to read an instruction from the instruction memory portion 220. Once read, the instruction read from the instruction memory portion 220 is returned to the CPU 110 as the instruction requested by the CPU 110. - Timing diagram 700 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 710, instruction clock waveform 720, index data waveform 730, way data waveform 740, and instruction data waveform 750. Alternative timing relationships may be used. For example, the phase relationship between CPU clock waveform 710 and instruction clock waveform 720 may be modified. Additionally or alternatively, in some embodiments, the active state of instruction clock waveform 720 is low. -
FIG. 8 illustrates a timing diagram 800 schematically illustrating another embodiment of certain timing relationships for various operations of cache memory 200. Timing diagram 800 illustrates CPU clock waveform 810, N×CPU clock waveform 815, tag clock waveform 820, instruction clock waveform 830, and instruction waveform 840. The illustrated waveforms correspond with clocks generated by clock generation circuits understood by those of skill in the art. -
CPU clock waveform 810 illustrates a representation of a CPU clock used by the CPU 110. As understood by those of skill in the art, the CPU clock represented by CPU clock waveform 810 may be used by CPU 110, for example, to receive input data, to execute instructions, and to generate output data. CPU 110 may use the CPU clock represented by CPU clock waveform 810 to additionally perform other operations. CPU 110 may use additional clocks (not shown). - N×
CPU clock waveform 815 illustrates a representation of a clock having a frequency which is a multiple of the frequency of the CPU clock. In this embodiment, the clock of N×CPU clock waveform 815 has a frequency which is three times the frequency of the CPU clock. The clock of N×CPU clock waveform 815 may be generated based on the CPU clock using circuits known to those of skill in the art. -
Tag clock waveform 820 illustrates a representation of a tag clock used by tag memory portion 210. The tag clock represented by tag clock waveform 820 may be used by tag memory portion 210, for example, for writing and reading tag data to and from tag memory portion 210. -
Instruction clock waveform 830 illustrates a representation of an instruction clock used by instruction memory portion 220. The instruction clock represented by instruction clock waveform 830 may be used by instruction memory portion 220, for example, for writing and reading instruction data to and from instruction memory portion 220. -
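- As a worked numeric example of the N×CPU clock relationship above, using the 32 MHz CPU clock and 3× clock cited below with reference to FIG. 9:

```python
# Worked example: with N = 3 and a 32 MHz CPU clock, the fast clock
# runs at 96 MHz and each CPU clock period spans three fast periods,
# within which the tag phase and instruction phase can be scheduled.
CPU_CLK_MHZ = 32
N = 3
fast_clk_mhz = N * CPU_CLK_MHZ          # 96 MHz N x CPU clock
cpu_period_ns = 1000 / CPU_CLK_MHZ      # 31.25 ns per CPU clock period
sub_period_ns = cpu_period_ns / N       # about 10.42 ns per fast-clock period
```

The tag read/compare and the single instruction read described above each fit within a portion of one 31.25 ns CPU clock period.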
Instruction waveform 840 illustrates a representation of instruction data. The instruction data encodes instructions which are executable by CPU 110, and which are provided to CPU 110 by cache memory 200, for example, for execution by CPU 110. - As illustrated in FIG. 8, during a first portion of a CPU clock period, the tag clock is active (high). While the tag clock is active, the M tag memory portions 210(0) to 210(M−1) identified by the index portion of the CPU address are read to retrieve M tags. In addition, while the tag clock is active, the M tags are each compared with the tag portion of the CPU address. If one of the M tags matches the tag portion of the CPU address, the way having the matching tag is identified. - During a second, subsequent, portion of the CPU clock period, the instruction clock is active (high). During the second portion of the CPU clock period, the index portion of the CPU address is used to perform a read operation on the instruction memory portion 220(x) of the identified way to read an instruction therefrom. Once read, the instruction read from the instruction memory portion 220(x) of the identified way is returned to the CPU 110 as the instruction requested by the CPU 110. - Timing diagram 800 schematically illustrates an embodiment of certain timing relationships for CPU clock waveform 810, N×CPU clock waveform 815, tag clock waveform 820, instruction clock waveform 830, and instruction waveform 840. Alternative timing relationships may be used. For example, the phase relationship between CPU clock waveform 810 and any of N×CPU clock waveform 815, tag clock waveform 820, and instruction clock waveform 830 may be modified. Additionally or alternatively, in some embodiments, the active states of either or both of tag clock waveform 820 and instruction clock waveform 830 are low. -
FIG. 9 is a chart illustrating cache memory area and power improvement achieved using an embodiment of a cache memory using inventive aspects discussed herein as compared with a traditional cache memory. The new and old cache memories are each 8-Kbyte, 4-way caches running with a 32 MHz CPU clock. The new cache memory uses a 3×CPU clock running at 96 MHz. - As illustrated in FIG. 9, the new cache memory uses 51% less area and 72% less power. - Though the present invention is disclosed by way of specific embodiments as described above, those embodiments are not intended to limit the present invention. Based on the methods and the technical aspects disclosed herein, variations and changes may be made to the presented embodiments by those of skill in the art without departing from the spirit and the scope of the present invention.
Claims (20)
Priority Applications (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/817,609 US11176051B2 (en) | 2020-03-13 | 2020-03-13 | Multi-way cache memory access |
| CN202180004238.XA CN114072776B (en) | 2020-03-13 | 2021-03-11 | small area cache memory |
| PCT/CN2021/080297 WO2021180186A1 (en) | 2020-03-13 | 2021-03-11 | Low area cache memory |
| CN202410172650.8A CN117950725B (en) | 2020-03-13 | 2021-03-11 | Method and peripheral circuit for processing CPU access cache |
| EP21767323.5A EP3977294B1 (en) | 2020-03-13 | 2021-03-11 | Low area cache memory |
| US17/499,834 US11544199B2 (en) | 2020-03-13 | 2021-10-12 | Multi-way cache memory access |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/817,609 US11176051B2 (en) | 2020-03-13 | 2020-03-13 | Multi-way cache memory access |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/499,834 Continuation US11544199B2 (en) | 2020-03-13 | 2021-10-12 | Multi-way cache memory access |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20210286732A1 true US20210286732A1 (en) | 2021-09-16 |
| US11176051B2 US11176051B2 (en) | 2021-11-16 |
Family
ID=77664734
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US16/817,609 Active US11176051B2 (en) | 2020-03-13 | 2020-03-13 | Multi-way cache memory access |
| US17/499,834 Active US11544199B2 (en) | 2020-03-13 | 2021-10-12 | Multi-way cache memory access |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US17/499,834 Active US11544199B2 (en) | 2020-03-13 | 2021-10-12 | Multi-way cache memory access |
Country Status (4)
| Country | Link |
|---|---|
| US (2) | US11176051B2 (en) |
| EP (1) | EP3977294B1 (en) |
| CN (2) | CN114072776B (en) |
| WO (1) | WO2021180186A1 (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5031141A (en) * | 1990-04-06 | 1991-07-09 | Intel Corporation | Apparatus for generating self-timing for on-chip cache |
| US20030061446A1 (en) * | 2001-07-27 | 2003-03-27 | Samsung Electronics Co., Ltd. | Multi-way set associative cache memory |
| US20120272007A1 (en) * | 2011-04-19 | 2012-10-25 | Freescale Semiconductor, Inc. | Cache memory with dynamic lockstep support |
Family Cites Families (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5353424A (en) | 1991-11-19 | 1994-10-04 | Digital Equipment Corporation | Fast tag compare and bank select in set associative cache |
| US6449693B1 (en) * | 1999-04-05 | 2002-09-10 | International Business Machines Corporation | Method and apparatus for improving caching within a processor system |
| US6385700B2 (en) * | 1999-06-21 | 2002-05-07 | Philips Electronics No. America Corp. | Set-associative cache-management method with parallel read and serial read pipelined with serial write |
| EP1634182A2 (en) * | 2003-06-17 | 2006-03-15 | PACT XPP Technologies AG | Data processing device and method |
| US7395372B2 (en) * | 2003-11-14 | 2008-07-01 | International Business Machines Corporation | Method and system for providing cache set selection which is power optimized |
| US7827356B2 (en) * | 2007-09-10 | 2010-11-02 | Qualcomm Incorporated | System and method of using an N-way cache |
| GB2458295B (en) * | 2008-03-12 | 2012-01-11 | Advanced Risc Mach Ltd | Cache accessing using a micro tag |
| JP5142868B2 (en) * | 2008-07-17 | 2013-02-13 | 株式会社東芝 | Cache memory control circuit and processor |
| JP2010097557A (en) * | 2008-10-20 | 2010-04-30 | Toshiba Corp | Set associative cache apparatus and cache method |
| US8949530B2 (en) | 2011-08-02 | 2015-02-03 | International Business Machines Corporation | Dynamic index selection in a hardware cache |
| US9183155B2 (en) * | 2013-09-26 | 2015-11-10 | Andes Technology Corporation | Microprocessor and method for using an instruction loop cache thereof |
-
2020
- 2020-03-13 US US16/817,609 patent/US11176051B2/en active Active
-
2021
- 2021-03-11 CN CN202180004238.XA patent/CN114072776B/en active Active
- 2021-03-11 WO PCT/CN2021/080297 patent/WO2021180186A1/en not_active Ceased
- 2021-03-11 CN CN202410172650.8A patent/CN117950725B/en active Active
- 2021-03-11 EP EP21767323.5A patent/EP3977294B1/en active Active
- 2021-10-12 US US17/499,834 patent/US11544199B2/en active Active
Patent Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5031141A (en) * | 1990-04-06 | 1991-07-09 | Intel Corporation | Apparatus for generating self-timing for on-chip cache |
| US20030061446A1 (en) * | 2001-07-27 | 2003-03-27 | Samsung Electronics Co., Ltd. | Multi-way set associative cache memory |
| US20120272007A1 (en) * | 2011-04-19 | 2012-10-25 | Freescale Semiconductor, Inc. | Cache memory with dynamic lockstep support |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114072776A (en) | 2022-02-18 |
| CN114072776B (en) | 2024-02-20 |
| EP3977294B1 (en) | 2023-11-29 |
| EP3977294A1 (en) | 2022-04-06 |
| US20220066942A1 (en) | 2022-03-03 |
| CN117950725A (en) | 2024-04-30 |
| CN117950725B (en) | 2025-05-09 |
| US11176051B2 (en) | 2021-11-16 |
| US11544199B2 (en) | 2023-01-03 |
| EP3977294A4 (en) | 2022-07-20 |
| WO2021180186A1 (en) | 2021-09-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11467966B2 (en) | | Cache memory having a programmable number of ways |
| US5752260A (en) | | High-speed, multiple-port, interleaved cache with arbitration of multiple access addresses |
| AU653798B2 (en) | | Refresh control arrangement for dynamic random access memory system |
| US20090094435A1 (en) | | System and method for cache access prediction |
| US6381686B1 (en) | | Parallel processor comprising multiple sub-banks to which access requests are bypassed from a request queue when corresponding page faults are generated |
| US5627988A (en) | | Data memories and method for storing multiple categories of data in latches dedicated to particular category |
| JPH11273365A (en) | | Content addressable memory (CAM) |
| US6581140B1 (en) | | Method and apparatus for improving access time in set-associative cache systems |
| US6823426B2 (en) | | System and method of data replacement in cache ways |
| US6272595B1 (en) | | N-way set-associative cache memory which includes a store hit buffer for improved data access |
| EP0675443A1 (en) | | Apparatus and method for accessing direct mapped cache |
| US8230277B2 (en) | | Storage of data in data stores having some faulty storage locations |
| US20030065891A1 (en) | | Memory controller and a cache for accessing a main memory, and a system and a method for controlling the main memory |
| US11176051B2 (en) | | Multi-way cache memory access |
| US5295253A (en) | | Cache memory utilizing a two-phase synchronization signal for controlling saturation conditions of the cache |
| CN119557245B (en) | | Memory access method, device and related equipment |
| JPH04228187A (en) | | Random-access memory array |
| US11188465B1 (en) | | Cache memory replacement policy |
| JP3997404B2 (en) | | Cache memory and control method thereof |
| CN1838317A (en) | | Means for defining latency in memory circuits |
| US12242736B1 (en) | | Memory controller including row hammer tracking device |
| US20050268021A1 (en) | | Method and system for operating a cache memory |
| US10599583B2 (en) | | Pre-match system and pre-match method |
| JPH08297968A (en) | | Semiconductor memory device |
| JPH0528414B2 (en) | | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: GOODIX TECHNOLOGY INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KAMAND, BASSAM S.;YOUNIS, WALEED;ZUNIGA, RAMON;AND OTHERS;REEL/FRAME:052103/0650. Effective date: 20200312 |
| | FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| | AS | Assignment | Owner name: SHENZHEN GOODIX TECHNOLOGY CO., LTD., CHINA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:GOODIX TECHNOLOGY INC.;REEL/FRAME:055011/0720. Effective date: 20210104 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: AWAITING TC RESP, ISSUE FEE PAYMENT RECEIVED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY. Year of fee payment: 4 |