US20170024320A1 - System and method for adaptation of coherence models between agents - Google Patents
- Publication number
- US20170024320A1 (application US14/970,467)
- Authority
- US
- United States
- Prior art keywords
- agent
- coherence
- translator
- model
- coherency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0817—Cache consistency protocols using directory methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1004—Compatibility, e.g. with legacy hardware
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/62—Details of cache specific to multiprocessor cache arrangements
- G06F2212/621—Coherency control relating to peripheral accessing, e.g. from DMA or I/O device
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/3017—Runtime instruction translation, e.g. macros
Definitions
- the invention is in the field of system-on-chip (SoC) and, more specifically, heterogeneous processing units, including processor cores, graphics processing units, and other components that manipulate or move data.
- a processing unit, or agent, implements a cache for storing a local copy of data, and that agent assumes a particular coherence model, which defines a set of cache states and a set of coherent transactions that effect transitions between cache states.
- a coherence model enforces data coherence among agents that share the same coherence model.
- processing units delivered as intellectual property blocks from different vendors may implement different and incompatible coherence models, making integration of those processing units in an SoC difficult if not impossible.
- the SoC designer is faced with either higher design costs and longer development cycles or selection of compatible, but less optimal, processing units.
- a translator is used to adapt between different coherence models.
- a translator intermediates the exchange of coherency requests and responses between an agent and a coherence controller.
- the same or a different translator intermediates the exchange of snoop requests and responses between a coherence controller and an agent.
- the same or a different translator intermediates the exchange of snoop requests and responses between a coherence controller, the snooped agent, and the initiating agent.
- the coherence controller has a coherence model that includes a cache state model with a state that is not supported by the coherence model of the agent.
- the agent has a coherence model that includes a cache state model with a state that is not supported by the coherence model of the coherence controller.
- the coherence controller has a coherence model that supports a type of snoop request that is not supported by the coherence model of the agent.
- the agent has a coherence model that includes a type of snoop request that is not supported by the coherence model of the coherence controller.
- the translator is configured with awareness of the allowable states and behaviors of the coherence model on each of its interfaces.
- the translator issues one or more requests and responses within the scope of the states and behaviors of the destination that provide for the functionality required by the allowed states and behaviors of the source.
- the translator makes the coherence model of agents and coherence controllers transparent to each other so that each only need be designed according to its coherence protocol, and a correctly functioning system may be implemented.
- FIG. 1 illustrates a system of heterogeneous caching agents with translators to a generic model.
- FIG. 2 illustrates a system of heterogeneous caching agents, one using a generic model with no translation.
- FIG. 3 illustrates a translator between a native and a generic coherence model.
- FIG. 4 illustrates a process of a generic coherency request causing a snoop in the cache of another heterogeneous agent.
- FIG. 5 illustrates a process of a coherency request according to one native coherence model causing a snoop in the cache of another agent with a different coherence model.
- FIG. 6 illustrates a table of native states and equivalent generic representations.
- FIG. 7 illustrates a system of heterogeneous caching agents comprising a proxy between a non-coherent agent and a translator.
- FIG. 8 illustrates a system of three agents, using two different caching models.
- FIG. 9 illustrates a coherency read from an agent of a first coherence model causing snoops to agents of a second coherence model with different agent cache state models.
- FIG. 10 illustrates a coherency read from an agent of a second coherence model causing snoops to an agent of a first coherence model and an agent with the second coherence model but a more restrictive agent cache state model.
- FIG. 11 illustrates a coherency read from an agent of a second coherence model causing snoops to an agent of a first coherence model and an agent with the second coherence model but a less restrictive agent cache state model.
- FIG. 12 illustrates a scenario of coherency responses of dirty data from agents of a second coherence model to an agent of a first coherence model.
- FIG. 13 illustrates a scenario of coherency responses of clean data from agents of a second coherence model to an agent of a first coherence model in a system in which state responses and data responses are separate.
- a processing unit may implement caches and a coherence model, within which the processing unit is known as an agent.
- a coherence model describes the following:
- a coherence model may permit agents to implement a subset of the cache states defined by the cache state model and this subset is known as an agent cache state model.
- Different agent cache state models within a given coherence model are compatible by definition.
- a coherence model may also define additional characteristics and behaviors not described above.
- Cache coherence is maintained by hardware across multiple agents with caches in accordance with the coherence model implemented by those agents; these agents are known as cache-coherent agents.
- a coherence model may define agents that initiate coherent transactions but that implement caches that do not need to be kept coherent with other agents' caches; these agents are known as IO-coherent agents.
- IO-coherent agents may be used to manage caches to be kept coherent by software.
- a coherence model applies to both cache-coherent agents and IO-coherent agents and is relevant whether an individual cache is kept coherent by hardware or software. Both types of agents are collectively called coherent agents.
- the hardware block that enforces coherence is the coherence controller.
- the coherence controller is coupled to each coherent agent.
- the coupling can be direct or indirect, such as through a network-on-chip or any other means of interconnect.
- the coherence controller is distributed and partitioned by cache line address.
- an initiating agent issues coherency requests to the coherence controller, which in turn might issue resulting snoop requests to snooped agents.
- the snooped agents must, in turn, issue snoop responses, the results of which are typically aggregated by the coherence controller, which, in turn, returns a coherency response to the initiating agent.
- the coherence controller implements a snoop filter to track the cache states of a cache line in each agent, and based on the state of the snoop filter, the coherence controller may issue snoop requests to a subset of agents.
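The snoop filter idea above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name `SnoopFilter` and its methods are hypothetical, it tracks only presence rather than full cache states, and it assumes the initiating agent is never snooped.

```python
from collections import defaultdict


class SnoopFilter:
    """Hypothetical snoop filter: per cache line, which agents may hold a copy."""

    def __init__(self):
        # address -> set of agent ids believed to hold the line
        self._holders = defaultdict(set)

    def record_fill(self, address, agent_id):
        """Note that an agent installed a copy of the line."""
        self._holders[address].add(agent_id)

    def record_evict(self, address, agent_id):
        """Note that an agent no longer holds the line."""
        self._holders[address].discard(agent_id)

    def snoop_targets(self, address, initiator_id):
        """Subset of agents to snoop: recorded holders other than the initiator."""
        # .get avoids creating an empty entry for lines never seen before
        return sorted(self._holders.get(address, set()) - {initiator_id})
```

With such a filter, a request to a line that no other agent holds yields an empty target list, so no snoop requests need to be issued at all.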
- each agent uses a particular coherence model, and assumes that all other coherent agents use the same model.
- multiple coherent agents use the same coherence model.
- the coherence controller must accept coherency requests and issue coherency responses to an initiating agent according to its coherence model and issue snoop requests and accept snoop responses each according to the coherence model of the particular snooped agent. To do so, translation must be performed between coherence models.
- translation can be performed in a distinct and/or dedicated translator unit, as a function of the coherence controller, and/or in an interface unit connected to an agent.
- the functionality of a translator unit can be divided into sub-units, such as a request sub-unit and a response sub-unit, or such as a coherency sub-unit that translates coherency requests and responses and a snoop sub-unit that translates snoop requests and responses.
- embodiments disclosed indicate a distinct, dedicated translator, though this should not be construed as limiting the scope of the disclosed invention.
- coherent systems include fully-coherent (FC) agents and/or IO-coherent (IO) agents.
- An agent communicates using its own native protocol or the generic protocol.
- a cache-coherent system includes at least one FC agent; if all agents are IO, then that is similar to all agents being non-coherent.
- agents can act as either a requesting agent or a snooped agent; an FC agent can act as both, but an IO agent can only act as a requesting agent.
- FIG. 1 shows, in accordance with some aspects of the invention, one embodiment of a system according to the invention.
- Initiating agent 100 uses a coherence model that has two states, S and I.
- Agent 100 issues coherency requests to translator 102 according to its native coherence model. Those cause translator 102 to issue coherency requests to coherence controller 104 according to a generic protocol.
- coherence controller 104 issues snoop requests to translator 106 according to the generic coherence model. Those cause translator 106 to issue snoop requests to snooped agent 108 according to its native coherence model.
- Agent 108 supports five states, M, O, E, S, and I.
- the coherence models of agent 100 and agent 108 are different and not directly compatible.
- FIG. 2 shows, in accordance with some aspects of the invention, another embodiment, wherein an agent uses a generic coherence model.
- SI initiating agent 200 issues coherency requests directly to coherence controller 204 according to a generic protocol.
- coherence controller 204 issues snoop requests to translator 206 according to the generic coherence model.
- the snoop requests cause translator 206 to issue snoop requests to MOESI agent 208 according to its native coherence model.
- FIG. 3 shows, in accordance with various aspects of the invention, an embodiment of a translator. It comprises coherency request and response signals and snoop request and response signals on each of an agent native interface and a coherence controller generic interface. Translators may be built with subsets of interface signals, and the embodiment shown should not be construed as limiting.
- the translator receives coherency requests from an agent on native interface signals 301. It reacts by issuing one or more coherency requests to the coherence controller on generic interface signals 303. Multiple requests may occur if, for instance, a transaction must be split, e.g.
- the coherence controller will respond on generic interface signals 305 to each coherency request issued on generic interface signals 303.
- a corresponding coherency response will be issued to the connected agent on native interface signals 307 for each coherency request issued on native interface signals 301.
- the translator gathers and assembles the multiple generic coherency responses into a single native coherency response.
- the connected coherence controller may issue snoop requests to the translator through generic interface signals 311.
- the translator will react by issuing one or more corresponding snoop requests to the connected agent on native interface signals 313.
- the connected agent will issue a snoop response on native interface signals 315.
- Snoop responses may or may not carry data, depending on the type of snoop request and the state of the requested line in the agent cache.
- upon receiving a snoop response from the agent, the translator issues a corresponding snoop response to the coherence controller on generic interface signals 317.
- the translator gathers and assembles the multiple native snoop responses into a single generic snoop response.
- native coherency response signals 307 and native snoop response signals 315 may be split into a state response group and a data response group depending on the requirements of the native coherence model.
- generic coherency response signals 305 and generic snoop response signals 317 may be split into a state response group and a data response group depending on the requirements of the generic coherence model.
- FIG. 4 shows, in accordance with various aspects of the invention, a process of a coherency request from a first agent causing a snoop request at a second agent in a system where the first agent and the second agent both use incompatible native coherence models that differ from a generic coherence model used by a coherence controller, as in the system of FIG. 1.
- the agent 100 issues a coherency request to the translator 102 according to its native coherence model.
- the translator 102 issues a coherency request to the coherence controller 104 according to a generic coherence model.
- the coherence controller 104 issues a corresponding snoop request to the translator 106 according to its generic coherence model.
- the translator 106 issues a snoop request to the agent 108 according to its native coherence model.
- FIG. 5 shows, in accordance with various aspects of the invention, a process of a coherency request from a first agent causing a snoop request at a second agent in a system where the first agent and the coherence controller share a coherence model, as in the system of FIG. 2.
- the agent 200 issues a coherency request to the coherence controller 204.
- the coherence controller 204 issues a corresponding snoop request to the translator 206 according to its generic coherence model.
- the translator 206 issues a snoop request to the agent 208 according to its native coherence model.
- a coherence controller only issues, to any particular agent, snoop requests that put a cache line into a state that is legal according to the snooped agent's coherence model and that is consistent with the state requested by the initiating agent. For example, if the coherence model for the initiating agent does not allow shared cache lines in other agents, a coherence controller will never issue a snoop request to a snooped agent that leaves a shared copy of data in the snooped agent, even if the coherence model for the snooped agent supports such an outcome.
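The selection rule above can be sketched as a small function. The snoop kinds `SnoopShare` and `SnoopInvalidate`, and the table of states each may leave behind, are hypothetical names chosen for illustration; the text does not define them.

```python
# Hypothetical snoop kinds and the cache states each may leave in the snooped agent.
SNOOP_OUTCOMES = {
    "SnoopShare": {"S", "I"},      # snooped agent may keep a shared copy
    "SnoopInvalidate": {"I"},      # snooped agent must drop its copy
}


def pick_snoop(initiator_allows_sharers, snooped_agent_states):
    """Pick the least destructive snoop acceptable to both coherence models.

    initiator_allows_sharers: True if the initiator's model tolerates shared
        copies remaining in other agents.
    snooped_agent_states: set of states legal in the snooped agent's model.
    """
    for kind in ("SnoopShare", "SnoopInvalidate"):
        outcomes = SNOOP_OUTCOMES[kind]
        legal_for_snooped = outcomes <= snooped_agent_states
        ok_for_initiator = initiator_allows_sharers or outcomes == {"I"}
        if legal_for_snooped and ok_for_initiator:
            return kind
    raise ValueError("no snoop kind is legal for this pair of models")
```

For instance, `pick_snoop(False, set("MOESI"))` selects the invalidating snoop even though a MOESI agent could legally keep a shared copy, which mirrors the example in the text.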
- the state of a cache line in any agent is generically represented within the coherence controller by four binary cache line characteristics:
- Common cache state models include MSI, MESI, and MOESI, among others, where each of the letters represents a cache state: Modified (M), Owned (O), Exclusive (E), Shared (S), and Invalid (I).
- An example of a cache state model that is widely used is the cache state model specified by the Advanced Microcontroller Bus Architecture (AMBA) AXI Coherency Extensions (ACE), which defines a five-state cache state model that is effectively equivalent to the MOESI cache state model.
- Some other cache state models may additionally define a Forward (F) state.
- FIG. 6 shows, in accordance with various aspects of the invention, a table of cache states found in the cache state models of many agents, along with the equivalent state of each cache line characteristic of the generic coherence model.
- IXXX, VNSC, VOUC, VOUD, VOSD, and VOSC states are legally recognized by the generic cache state model of the coherence controller.
- the four binary states are encoded in a representation that uses only three bits to identify the six legal states.
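Four binary characteristics allow sixteen combinations, but only six are legal, so three bits suffice. The sketch below shows one possible encoding. The specific bit codes (list index) and the MOESI-to-generic mapping are assumptions for illustration; the text only confirms that S corresponds to VNSC.

```python
# The six generic states named in the text.
LEGAL_STATES = ["IXXX", "VNSC", "VOUC", "VOUD", "VOSD", "VOSC"]

# Assumed 3-bit codes: simply each state's index in the list above.
ENCODE = {name: code for code, name in enumerate(LEGAL_STATES)}
DECODE = {code: name for name, code in ENCODE.items()}

# Assumed native-to-generic mapping for a MOESI agent.
# Only S -> VNSC is stated in the text; the other rows are illustrative guesses.
MOESI_TO_GENERIC = {
    "M": "VOUD",
    "O": "VOSD",
    "E": "VOUC",
    "S": "VNSC",
    "I": "IXXX",
}


def encode_native(state):
    """Return the assumed 3-bit code for a native MOESI state."""
    return ENCODE[MOESI_TO_GENERIC[state]]
```

Every code fits in three bits because the largest index is 5, leaving two of the eight possible codes unused.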
- a generic coherence model may be a superset, subset, or partially inclusive of the cache state models, the transaction processing models, and the coherence granules defined by the native coherence models of the agents in the system.
- a single request according to one coherence model causes a translator to issue multiple requests according to a second coherence model.
- when the translator issues multiple requests into the second coherence model in response to a request from the first, the translator also gathers and assembles the multiple responses from the second coherence model into a single response to the first. This is true for a single request according to one native protocol translated into multiple requests in the generic protocol or for a single request according to a generic protocol translated into multiple requests in a second native protocol.
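A minimal sketch of the split-and-gather behavior follows. It assumes the reason for splitting is a smaller coherence granule on the destination side, and that the request is aligned and a whole multiple of that granule; the function names and response fields are hypothetical.

```python
def split_request(address, size, granule):
    """Split one request into granule-sized sub-requests.

    Assumes address is granule-aligned and size is a multiple of granule.
    Returns a list of (address, size) pairs.
    """
    return [(a, granule) for a in range(address, address + size, granule)]


def gather_responses(responses):
    """Assemble sub-responses, in request order, into a single response.

    Each response is a dict with hypothetical fields 'ok' (bool) and 'data' (bytes).
    """
    return {
        "ok": all(r["ok"] for r in responses),
        "data": b"".join(r["data"] for r in responses),
    }
```

For example, a 128-byte request with a 64-byte destination granule becomes two sub-requests, and the two data payloads are concatenated into the single response returned to the source.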
- a generic coherence model represents fewer than all possible valid states within an agent's native coherence model.
- a coherence controller may be coupled to a system directory.
- a system directory is an integral part of the coherence controller. In any case, according to some aspects and embodiments of the invention, it is not necessary for the system directory to be aware of, or even support the coherence model of agents.
- a system includes a proxy cache for a non-coherent agent.
- FIG. 7 shows, in accordance with various aspects of the invention, one embodiment of a system with a proxy cache according to the invention.
- Caching agent 700 issues coherency requests to translator 702 according to its native coherence model. Those cause translator 702 to issue coherency requests to coherence controller 704 according to a generic protocol.
- coherence controller 704 issues snoop requests to proxy cache 706 according to the generic coherence model.
- Proxy cache 706 acts as proxy cache for cacheless agent 708. However, proxy cache 706 need not issue any request to agent 708.
- Proxy cache 706 autonomously issues appropriate snoop responses to coherence controller 704 according to the generic protocol. No snoop translation is necessary.
- cacheless agent 708 issues non-coherent coherency requests to proxy cache 706.
- Proxy cache 706 issues corresponding coherent coherency requests to coherence controller 704.
- Coherence controller 704 issues snoop requests to caching agent 700, through translator 702.
- Caching agent 700 issues appropriate snoop responses to coherence controller 704, through translator 702.
- the coherence controller issues snoop responses to proxy cache 706, which returns a non-coherent response to cacheless agent 708.
- a system comprises an agent that accords with a self-snooping coherence model and an agent that accords with a model that does not support self-snooping.
- the self-snooping agent issues a coherency request to a coherence controller
- the coherence controller issues a snoop to the initiating agent.
- the non-self-snooping agent issues a coherency request to the coherence controller
- the coherence controller snoops other agents, but does not snoop the initiating agent.
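The difference between the two cases can be sketched as a target-selection function. The function name and the representation of self-snooping agents as a set of ids are assumptions for illustration.

```python
def snoop_targets(initiator, agents, self_snooping):
    """Return the agents the coherence controller snoops for a request.

    agents: ordered iterable of agent ids in the system.
    self_snooping: set of agent ids whose coherence model expects to be
        snooped for their own requests.
    """
    return [a for a in agents if a != initiator or initiator in self_snooping]
```

With agents `["a", "b", "c"]` and only `"a"` self-snooping, a request from `"a"` snoops all three agents, while a request from `"b"` snoops only `"a"` and `"c"`.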
- an initiating agent is required to serialize outstanding writeback coherency requests with respect to inbound snoop requests to the same cache line.
- the initiating agent blocks such snoop requests until the agent receives the corresponding coherency response for the writeback coherency request and then issues the snoop response for the snoop request.
- the coherence controller must guarantee that writeback coherency requests make forward progress so that snoop requests can make forward progress.
- the coherence controller is required to serialize outstanding snoop requests with respect to inbound writeback coherency requests to the same cache line.
- the coherence controller blocks such writeback coherency requests until the snoop responses for the snoop requests have been received.
- the native to generic translators must select a writeback coherency request with semantics in the generic coherence model appropriate to the corresponding native coherence models, and the coherence controller must either block the writeback coherency request or allow the writeback coherency request to proceed as needed.
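The controller-side serialization described above (blocking a writeback while snoops to the same line are outstanding) might look like the following sketch. The class and method names are hypothetical, and the bookkeeping is deliberately simplified to a per-line counter.

```python
from collections import deque


class ControllerLineSerializer:
    """Hypothetical per-line ordering of writebacks behind outstanding snoops."""

    def __init__(self):
        self.pending_snoops = {}   # address -> snoop responses still awaited
        self.blocked = deque()     # (address, agent) writebacks held back

    def issue_snoops(self, address, n_targets):
        """Record that n_targets snoop requests were issued for the line."""
        self.pending_snoops[address] = self.pending_snoops.get(address, 0) + n_targets

    def request_writeback(self, address, agent):
        """Return True if the writeback may proceed now, False if it is blocked."""
        if self.pending_snoops.get(address, 0) > 0:
            self.blocked.append((address, agent))
            return False
        return True

    def snoop_response(self, address):
        """Record one snoop response; return the writebacks released by it."""
        self.pending_snoops[address] -= 1
        if self.pending_snoops[address] > 0:
            return []
        del self.pending_snoops[address]
        released = [wb for wb in self.blocked if wb[0] == address]
        self.blocked = deque(wb for wb in self.blocked if wb[0] != address)
        return released
```

Writebacks to other cache lines are never held back, which is one way to preserve forward progress for unrelated traffic.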
- agents respond with data in the S state (or SharedClean in ACE or VNSC in a generic coherence model).
- a coherence controller that operates according to a generic coherence model, in reaction to coherency requests, issues snoops to multiple agents and, if multiple agents return data from the line in the S state, the coherence controller discards all but the first data response, which it issues in the coherency response to the initiating agent.
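Keeping only the first data response can be sketched as below. The response representation (a dict with an optional `data` field) is an assumption for illustration.

```python
def first_data_response(snoop_responses):
    """Return the data to forward to the initiating agent.

    snoop_responses: responses in arrival order; each is a dict whose
        hypothetical 'data' field is bytes, or None if no data was returned.
    The first response carrying data wins; later data responses are discarded.
    """
    for resp in snoop_responses:
        if resp.get("data") is not None:
            return resp["data"]
    return None
```

Because every agent holding the line in the S state has the same clean data, discarding the later copies loses nothing.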
- Coherence model A defines a single agent cache state model that consists of four cache states, MESI.
- Coherence model B defines multiple agent cache state models, including one that consists of four cache states, MESI, and one that consists of five cache states, MOESI.
- Both coherence models define a basic ReadShared coherency request that fetches a copy of a cache line for the initiating agent to satisfy an internal load operation and allows snooped agents to retain a valid, shared copy of the cache line.
- an initiating MESI agent, i.e. an agent that implements the MESI agent cache state model, issues a ReadShared coherency request to install a cache line in the M, E, or S state based on the coherency response.
- a snooped MESI agent in coherence model A may retain a valid copy, transitioning to the S state, or may invalidate its copy, transitioning to the I state.
- if a snooped MESI agent had a copy in the M state initially and then retained a copy in the S state in response to a ReadShared snoop request, an initiating MESI agent installs the copy in the S state, and memory must be updated with the dirty data transferred from the snooped agent.
- an initiating MOESI agent, i.e. an agent that implements the MOESI agent cache state model, issues a ReadShared coherency request to install a cache line in the M, O, E, or S state based on the coherency response.
- a MESI agent in coherence model B must not issue a ReadShared coherency request.
- coherence model B may define a ReadClean coherency request and snoop request. In this case, an initiating MESI agent issues a ReadClean coherency request to install a cache line in the E or S state based on the coherency response.
- a snooped MOESI agent in coherence model B may retain a copy in the O or S state or may transition to the I state.
- a snooped MESI agent in coherence model B may retain a copy in the S state or may transition to the I state.
- a ReadShared coherency request does not require an update to memory since only MOESI agents issue such a request and those agents can install a cache line in all valid states including O.
- a MESI agent in coherence model B issues a specific coherency request to interoperate with a MOESI agent, while the latter is unaware that a MESI agent may be present in the system.
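The request-selection rule for coherence model B described above can be sketched in a few lines. The function name is hypothetical; the rule itself follows the text, where only agents that implement the O state issue ReadShared and MESI agents issue ReadClean instead.

```python
def read_request_for(agent_states):
    """Choose the read request an initiating agent issues in coherence model B.

    agent_states: set of cache states in the agent cache state model,
        e.g. set("MOESI") or set("MESI").
    """
    # Agents that can hold a line in O may receive dirty shared data,
    # so they may use ReadShared; others must request a clean copy.
    return "ReadShared" if "O" in agent_states else "ReadClean"
```

This keeps the choice local to the initiating agent: a snooped MOESI agent needs no knowledge of which kind of agent sent the request.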
- a generic coherence model, coherence model G, defines an agent cache state model with six cache states, MOESIF.
- Coherence model G also defines a MemRdShd coherency request and a SnpRdShd snoop request with behaviors compatible with the ReadShared requests defined by coherence model B; and defines a MemRdCln coherency request and a SnpRdCln snoop request with behaviors compatible with the ReadShared requests defined by coherence model A and the ReadClean requests defined by coherence model B.
- the translators in the system perform at least two functions: adapting requests and responses between agents in different coherence models; and transmitting requests and responses between agents in the same coherence model. In other embodiments, the translators in the system also adapt requests and responses between agents in the same coherence model.
- each coherency request or snoop request is designated with its corresponding coherence model, e.g. a ReadShared from coherence model A is written as ReadShared-A and a SnpRdCln from coherence model G is written as SnpRdCln-G.
- the first agent uses coherence model A with a MESI agent cache state model.
- the second agent uses coherence model B with a MOESI agent cache state model.
- the third agent uses coherence model B with a MESI agent cache state model.
- FIG. 9 shows, in accordance with various aspects of the invention, a scenario in which a coherency read from the first agent causes snoops to the second agent and the third agent.
- a coherency request in coherence model A is translated into snoop requests in coherence model B.
- a first agent 902 issues a ReadShared-A coherency request to a first agent translator 904.
- the first agent translator 904 translates the coherency request from the first agent 902 into the coherency request in the generic coherence model that is most consistent with the cache states of the first agent 902, i.e. MemRdCln-G.
- the first agent translator 904 issues a MemRdCln-G coherency request to a coherence controller 906.
- the coherence controller 906 issues a SnpRdCln-G snoop request to each of a second agent translator 908 and a third agent translator 912.
- the second agent translator 908 and the third agent translator 912 translate the snoop request in the generic coherence model into the most appropriate snoop request in coherence model B, i.e. ReadClean-B.
- the second agent translator 908 issues a ReadClean-B snoop request to a second agent 910.
- the third agent translator 912 issues a ReadClean-B snoop request to a third agent 914.
- the translators adapt the request from the first agent 902 into requests to the second and third agents.
- FIG. 10 shows, in accordance with various aspects of the invention, a scenario in which a coherency read from the second agent 910 causes snoops to the first agent 902 and the third agent 914.
- a coherency request in coherence model B is translated into a snoop request in coherence model A and into a snoop request in coherence model B.
- the second agent 910 issues a ReadShared-B coherency request to the second agent translator 908.
- the second agent translator 908 translates the coherency request from the second agent 910 into the equivalently defined coherency request in the generic coherence model, i.e. MemRdShd-G.
- the second agent translator 908 issues a MemRdShd-G coherency request to the coherence controller 906.
- the coherence controller 906 issues a SnpRdShd-G snoop request to each of the first agent translator 904 and the third agent translator 912.
- the first agent translator 904 translates the snoop request in the generic coherence model into a snoop request in coherence model A that is consistent with the generic snoop request, i.e. ReadShared-A.
- the first agent translator 904 issues a ReadShared-A snoop request to the first agent 902.
- the third agent translator 912 translates the snoop request in the generic coherence model into the equivalently defined snoop request in coherence model B, i.e. ReadShared-B.
- the third agent translator 912 issues a ReadShared-B snoop request to the third agent 914.
- the translators adapt the request from the second agent 910 into a request to the first agent 902 and simply transmit the request from the second agent 910 to the third agent 914, as those agents share the same coherence model.
- FIG. 11 shows, in accordance with various aspects of the invention, a scenario in which a coherency read from the third agent 914 causes snoops to the first agent 902 and the second agent 910.
- a coherency request in coherence model B is translated into a snoop request in coherence model A and into a snoop request in coherence model B.
- the third agent 914 issues a ReadClean-B coherency request to the third agent translator 912 .
- the third agent translator 912 translates the coherency request from the third agent 914 into the equivalently defined coherency request in the generic coherence model, i.e. MemRdCln-G.
- the third agent translator 912 issues a MemRdCln-G coherency request to the coherence controller 906 .
- the coherence controller 906 issues a SnpRdCln-G snoop request to each of the first agent translator 904 and the second agent translator 910 .
- the first agent translator 904 translates the snoop request in the generic coherence model into a snoop request in coherence model A that is consistent with generic snoop request, i.e. ReadShared-A.
- the first agent translator 904 issues a ReadShared-A snoop request to the first agent 902 .
- the second agent translator 908 translates the snoop request in the generic coherence model into the equivalently defined snoop request in coherence model B, i.e. ReadClean-B.
- the second agent translator 908 issues a ReadClean-B snoop request to the second agent 910 .
- the translators adapt the request from the third agent 914 into a request to the first agent 902 and simply transmit the request from the third agent 914 to the second agent 910 , as those agents share the same coherence model.
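- The request translations in the scenarios of FIG. 10 and FIG. 11 amount to table lookups. The following is a minimal Python sketch that uses only the mappings stated above; the table layout, the function names, and the hard-coded choice of snoop type are illustrative assumptions, not part of the disclosure.

```python
# Illustrative sketch: table and function names are assumptions.
# Only mappings stated in the FIG. 10 and FIG. 11 scenarios are included.

# Native coherency request -> generic coherency request, keyed by coherence model.
COHERENCY_TO_GENERIC = {
    ("B", "ReadClean-B"): "MemRdCln-G",
}

# Generic snoop request -> native snoop request, keyed by coherence model.
# Model A receives ReadShared-A for both generic snoops, as the consistent request.
SNOOP_TO_NATIVE = {
    ("A", "SnpRdShd-G"): "ReadShared-A",
    ("B", "SnpRdShd-G"): "ReadShared-B",
    ("A", "SnpRdCln-G"): "ReadShared-A",
    ("B", "SnpRdCln-G"): "ReadClean-B",
}


def to_generic(model, native_request):
    return COHERENCY_TO_GENERIC[(model, native_request)]


def to_native_snoop(model, generic_snoop):
    return SNOOP_TO_NATIVE[(model, generic_snoop)]


def fig11_read_clean():
    """Trace FIG. 11: the third agent (model B) reads; agents 902 and 910 are snooped."""
    generic_request = to_generic("B", "ReadClean-B")
    # The controller's choice of snoop type is hard-coded for this one scenario.
    generic_snoop = "SnpRdCln-G"
    return generic_request, {
        "first agent 902": to_native_snoop("A", generic_snoop),
        "second agent 910": to_native_snoop("B", generic_snoop),
    }
```

- In this sketch a request between two model B agents still passes through the generic model, but the round trip returns the same request type, which corresponds to the translators simply transmitting the request.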
- snoop responses and coherency responses communicate cache state and data.
- Responses may contain a state portion only or may contain both a state portion and a data portion.
- responses may be divided into state responses, consisting of a state portion only, and data responses, consisting of a data portion only, that travel independently through the system.
- a data response has an associated state response, and a state response may or may not have an associated data response.
- the state portion may be divided into a state response that communicates a subset of the state information and a data response that communicates the remaining subset of the state information along with the data; other divisions are also possible.
- the state portion of a response typically indicates the cache state of one or more agents after the snoop requests have been processed.
- the coherence controller aggregates the state portions from multiple snoop responses and issues a coherency response with a state portion that summarizes the state portions from the snoop responses.
- a translator, typically the translator associated with the initiating agent, may be responsible for performing the aggregation of snoop responses and issuing the summary of the state portions in a coherency response.
- the initiating agent may be responsible for performing the aggregation of snoop responses, if this function is defined by the coherence model, and each snoop response effectively becomes a coherency response. In these latter two examples, the translator or the initiating agent, via the translator, provides a summary to the coherence controller, if necessary.
- multiple snoop responses with data portions may result from issuing snoop requests.
- data aggregation may occur at the coherence controller, at the translator associated with the initiating agent, or at the initiating agent itself in a manner similar to one of those described above for aggregating state responses.
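- The aggregation of state portions can be sketched as a reduction over the snoop responses. The precedence rule used below (dirty over clean over invalid) is an assumption; it is chosen only because it reproduces the outcomes given for the response scenarios of FIG. 12 and FIG. 13. The function name is likewise hypothetical.

```python
# Illustrative sketch: the precedence order is an assumption that matches the
# outcomes described for FIG. 12 (Inv-G + ShdDty-G) and FIG. 13 (two ShdCln-G).
PRECEDENCE = ["Inv-G", "ShdCln-G", "ShdDty-G"]  # later entries take priority


def aggregate_state(snoop_states):
    """Summarize the state portions of several generic snoop responses."""
    if not snoop_states:
        # Assumption: no snoop responses is treated like all-invalid.
        return "Inv-G"
    return max(snoop_states, key=PRECEDENCE.index)
```

- The same function could run in the coherence controller, in the translator of the initiating agent, or in the initiating agent itself, since the rule depends only on the set of state portions.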
- the state portion typically includes an indication of whether an accompanying or associated data portion is clean or dirty with respect to memory.
- a translator uses this indication in conjunction with the type of snoop request or coherency request to determine whether memory must be updated.
- the translator associated with a snooped agent or the translator associated with the initiating agent may update memory, and in others, a translator may direct the coherence controller to update memory. If memory is updated, the state portion of the response is modified to indicate that the data are clean with respect to memory.
- FIG. 12 shows, in accordance with various aspects of the invention, a response scenario, corresponding to the request scenario illustrated by FIG. 9 , in which the second agent 910 has a cache line in the O state and the third agent 914 has the cache line in the I state.
- the second agent 910 issues a ShdDty-B native snoop response and transitions to the S state and the third agent 914 issues an Inv-B native snoop response.
- the second agent translator 908 translates the ShdDty-B native snoop response into a ShdDty-G generic snoop response.
- the third agent translator 912 translates the Inv-B native snoop response into an Inv-G generic snoop response.
- the two generic protocol snoop responses are aggregated by the coherence controller 906 into a single ShdDty-G generic coherency response issued to the first agent translator 904 .
- the first agent translator 904 receives the ShdDty-G generic coherency response.
- the initial native coherency request from the first agent 902 was translated to MemRdCln-G, implying that clean data must be provided to the first agent 902 . Consequently, the first agent translator 904 updates memory and translates the generic coherency response into a ShdCln-A native coherency response to the first agent 902 .
- the second agent translator 908 recognizes that the generic snoop request, SnpRdCln-G, represents a query for clean data and, in reaction to the ShdDty-B native snoop response, updates memory and translates the ShdDty-B native snoop response into a ShdCln-G generic snoop response. In some embodiments, instead of updating memory and translating the snoop response, the second agent translator 908 provides an additional indication to the coherence controller 906 to update memory and translate the dirty generic snoop response into a clean generic coherency response.
- data bypasses the coherence controller 906 , and the coherence controller 906 only aggregates the state portions of the snoop responses.
- instead of the coherence controller 906 , either the second agent translator 908 or the first agent translator 904 updates memory.
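- The handling of dirty data in the FIG. 12 scenario can be sketched as follows for the variant in which the first agent translator 904 updates memory. The `Memory` class and the function signature are hypothetical, and only the one case described above is handled.

```python
# Illustrative sketch: the Memory class and function names are assumptions.


class Memory:
    """Stand-in for main memory, indexed by cache line address."""

    def __init__(self):
        self.lines = {}

    def update(self, address, data):
        self.lines[address] = data


def translate_coherency_response(generic_state, data, address, original_request, memory):
    """First agent translator: generic coherency response -> model A native response."""
    if generic_state == "ShdDty-G" and original_request == "MemRdCln-G":
        # The request implied clean data, so write the dirty line back first.
        memory.update(address, data)
        # After the update the data are clean with respect to memory.
        return "ShdCln-A"
    raise NotImplementedError("only the FIG. 12 case is sketched")
```

- Placing the update in the translator keeps the coherence controller unaware of whether model A can accept dirty data, which is the design point of the scenario.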
- FIG. 13 shows, in accordance with various aspects of the invention, another response scenario, corresponding to a request scenario similar to the one illustrated by FIG. 9 , in which the second agent 910 has a cache line in the S state and the third agent 914 has the cache line in the S state.
- coherence controller 1306 additionally issued a request to memory interface 1316 (requests are not shown in the figure).
- the second agent 910 issues a ShdCln-B native snoop response and the third agent 914 also issues a ShdCln-B native snoop response.
- the second agent translator 1308 splits the ShdCln-B native snoop response into a generic snoop state response to the coherence controller 1306 and a generic data response directly to the first agent translator 1304 .
- the third agent translator 1312 splits the ShdCln-B native snoop response into a generic snoop state response to the coherence controller 1306 and a generic data response directly to the first agent translator 1304 .
- Memory interface 1316 provides a generic data response directly to the first agent translator 1304 .
- the generic snoop state responses are aggregated by the coherence controller 1306 into a single ShdCln-G generic coherency state response to the first agent translator 1304 , the generic coherency state response including an additional indication of the number of expected generic data responses, which is two, in accordance with some aspects of the invention.
- the first agent translator 1304 receives the ShdCln-G generic coherency state response with the number of expected data responses.
- coherence model A requires a single native coherency response to be presented to the first agent 902 . So, the first agent translator 1304 gathers the generic coherency state response and the three generic data responses before issuing the native coherency response to the first agent 902 .
- the first agent translator 1304 issues a native coherency response after the generic coherency state response and the first generic data response have been received, and ignores the second and third generic data responses.
- the generic coherency state response is guaranteed to arrive before the generic data responses.
- the generic coherency state response and the generic data responses may arrive in any order.
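- The gathering performed by the first agent translator 1304 in the FIG. 13 scenario can be sketched as a small state machine that tolerates any arrival order. The class name, the method names, and the `extra_data_responses` parameter are assumptions; the parameter models data responses that are not counted in the controller's indication, such as the one from the memory interface, which is one possible reading of the counts of two and three given above.

```python
# Illustrative sketch: class, method, and parameter names are assumptions.


class ResponseGatherer:
    """Collects one generic state response and its data responses, in any order."""

    def __init__(self, extra_data_responses=0):
        # Data responses not covered by the controller's indication,
        # e.g. one from the memory interface (assumption).
        self.extra = extra_data_responses
        self.state = None
        self.expected = None
        self.data = []

    def on_state(self, state, expected_data):
        self.state = state
        self.expected = expected_data + self.extra
        return self._maybe_finish()

    def on_data(self, payload):
        self.data.append(payload)
        return self._maybe_finish()

    def _maybe_finish(self):
        # Nothing can be issued until the state response announces the count.
        if self.state is None or len(self.data) < self.expected:
            return None
        # Copies of a clean shared line are identical, so the first one is used.
        return (self.state, self.data[0])
```

- The returned pair would still need translation into the single native coherency response that coherence model A requires; that step is omitted here.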
- Machines claimed herein can be embodied in physical machines, such as semiconductor chips; in hardware description language representations of the logical or functional behavior of machines according to the invention as disclosed; and in one or more non-transitory computer readable media arranged to store such hardware description language representations.
- Methods claimed herein can be embodied in the behavior of either one or a combination of humans and machines; in instructions that, if executed by one or more computers, would cause the one or more computers to perform methods according to the invention as disclosed; and in one or more non-transitory computer readable media arranged to store such instructions.
- Inventions claimed herein, the practice of which requires more than one non-transitory computer readable medium, should be construed as embodied by each of the more than one non-transitory computer readable medium.
- a computer and a computing device are articles of manufacture.
- articles of manufacture include: an electronic component residing on a mother board, a server, a mainframe computer, or other special purpose computer each having one or more processors (e.g., a Central Processing Unit, a Graphical Processing Unit, or a microprocessor) that is configured to execute a computer readable program code (e.g., an algorithm, hardware, firmware, and/or software) to receive data, transmit data, store data, or perform methods.
- the article of manufacture includes a non-transitory computer readable medium or storage/memory that may include a series of instructions, such as computer readable program steps or code encoded therein.
- the non-transitory computer readable medium includes one or more data repositories.
- computer readable program code (or code) is encoded in a non-transitory computer readable medium of the computer or computing device.
- the processor or a module executes the computer readable program code to create or amend an existing computer-aided design using a tool.
- module may refer to one or more circuits, components, registers, processors, software subroutines, or any combination thereof.
- the creation or amendment of the computer-aided design is implemented as a web-based software application in which portions of the data related to the computer-aided design or the tool or the computer readable program code are received or transmitted to a computing device of a host.
- An article of manufacture or system, in accordance with various aspects of the invention, is implemented in a variety of ways: with one or more distinct processors or microprocessors, volatile and/or non-volatile memory and peripherals or peripheral controllers; with an integrated microcontroller, which has a processor, local volatile and non-volatile memory, peripherals and input/output pins; with discrete logic which implements a fixed version of the article of manufacture or system; and with programmable logic which implements a version of the article of manufacture or system which can be reprogrammed either through a local or remote interface.
- Such logic could implement a control system either in logic or via a set of commands executed by a processor.
Description
- This application is related to and a continuation-in-part of U.S. Non-Provisional Utility patent application Ser. No. 14/806,786 titled DISTRIBUTED IMPLEMENTATION FOR CACHE COHERENCE filed on Jul. 23, 2015 by FORREST, Craig Stephen et al., the entire disclosure of which is incorporated herein by reference.
- The invention is in the field of system-on-chip (SoC) and, more specifically, heterogeneous processing units, including processor cores, graphics processing units, and other components that manipulate or move data.
- Modern systems-on-chip (SoCs) are often designed with heterogeneous processing units that are selected for their different system characteristics. Typically, a processing unit, or agent, implements a cache for storing a local copy of data, and that agent assumes a particular coherence model, which defines a set of cache states and a set of coherent transactions that effect transitions between cache states. A coherence model enforces data coherence among agents that share the same coherence model.
- However, different processing units, delivered as intellectual property blocks from different vendors, may implement different and incompatible coherence models, making integration of those processing units in a SoC difficult if not impossible. As a result, a SoC designer is faced with either higher design costs and longer development cycles or selection of compatible, but less optimal, processing units.
- The invention described herein directly addresses integrating incompatible coherence models implemented by heterogeneous agents. According to an aspect of the invention, a translator is used to adapt between different coherence models. A translator intermediates the exchange of coherency requests and responses between an agent and a coherence controller. In some embodiments the same or a different translator intermediates the exchange of snoop requests and responses between a coherence controller and an agent. In some embodiments the same or a different translator intermediates the exchange of snoop requests and responses between a coherence controller, the snooped agent, and the initiating agent. In some embodiments the coherence controller has a coherence model that includes a cache state model with a state that is not supported by the coherence model of the agent. In some embodiments the agent has a coherence model that includes a cache state model with a state that is not supported by the coherence model of the coherence controller.
- In some embodiments the coherence controller has a coherence model that supports a type of snoop request that is not supported by the coherence model of the agent. In some embodiments the agent has a coherence model that includes a type of snoop request that is not supported by the coherence model of the coherence controller.
- According to an aspect of the invention, the translator is configured with awareness of the allowable states and behaviors of the coherence model on each of its interfaces. The translator issues one or more requests and responses within the scope of the states and behaviors of the destination that provide for the functionality required by the allowed states and behaviors of the source. According to an aspect of the invention, the translator makes the coherence model of agents and coherence controllers transparent to each other so that each only need be designed according to its coherence protocol, and a correctly functioning system may be implemented.
- FIG. 1 illustrates a system of heterogeneous caching agents with translators to a generic model.
- FIG. 2 illustrates a system of heterogeneous caching agents, one using a generic model with no translation.
- FIG. 3 illustrates a translator between a native and a generic coherence model.
- FIG. 4 illustrates a process of a generic coherency request causing a snoop in the cache of another heterogeneous agent.
- FIG. 5 illustrates a process of a coherency request according to one native coherence model causing a snoop in the cache of another agent with a different coherence model.
- FIG. 6 illustrates a table of native states and equivalent generic representations.
- FIG. 7 illustrates a system of heterogeneous caching agents comprising a proxy between a non-coherent agent and a translator.
- FIG. 8 illustrates a system of three agents, using two different caching models.
- FIG. 9 illustrates a coherency read from an agent of a first coherence model causing snoops to agents of a second coherence model with different agent cache state models.
- FIG. 10 illustrates a coherency read from an agent of a second coherence model causing snoops to an agent of a first coherence model and an agent with the second coherence model but a more restrictive agent cache state model.
- FIG. 11 illustrates a coherency read from an agent of a second coherence model causing snoops to an agent of a first coherence model and an agent with the second coherence model but a less restrictive agent cache state model.
- FIG. 12 illustrates a scenario of coherency responses of dirty data from agents of a second coherence model to an agent of a first coherence model.
- FIG. 13 illustrates a scenario of coherency responses of clean data from agents of a second coherence model to an agent of a first coherence model in a system in which state responses and data responses are separate.
- All publications and patents cited in this specification are herein incorporated by reference as if each individual publication or patent were specifically and individually indicated to be incorporated by reference and are incorporated herein by reference to disclose and describe the methods and/or system in connection with which the publications are cited. The citation of any publication is for its disclosure prior to the filing date and should not be construed as an admission that the invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.
- A processing unit may implement caches and a coherence model, within which the processing unit is known as an agent. According to the various aspects of the invention a coherence model describes the following:
- the cache state model, which defines the set of cache states, the properties of those states, and the agent operations permitted by those states;
- the transaction processing model, which defines the set of transactions, the cache state transitions resulting from those transactions, the serialization and interlocking behavior of transactions, and the resource management performed by transactions; and
- the coherence granules, which define the sizes of cache lines on which coherence is maintained.
- A coherence model may permit agents to implement a subset of the cache states defined by the cache state model and this subset is known as an agent cache state model. Different agent cache state models within a given coherence model are compatible by definition.
- Finally, a coherence model may also define additional characteristics and behaviors not described above.
- Cache coherence is maintained by hardware across multiple agents with caches in accordance with the coherence model implemented by those agents; these agents are known as cache-coherent agents. In addition, a coherence model may define agents that initiate coherent transactions but that implement caches that do not need to be kept coherent with other agents' caches; these agents are known as IO-coherent agents. Furthermore, a coherence model may enable such caches to be kept coherent by software.
- According to an aspect of the invention, a coherence model applies to both cache-coherent agents and IO-coherent agents and is relevant whether an individual cache is kept coherent by hardware or software. Both types of agents are collectively called coherent agents.
- In a system of multiple coherent caching agents, the hardware block that enforces coherence is the coherence controller. The coherence controller is coupled to each coherent agent. The coupling can be direct or indirect, such as through a network-on-chip or any other means of interconnect. In some embodiments the coherence controller is distributed and partitioned by cache line address. In such a system, an initiating agent issues coherency requests to the coherence controller, which in turn might issue resulting snoop requests to snooped agents. The snooped agents must, in turn, issue snoop responses, the results of which are typically aggregated by the coherence controller, which, in turn, returns a coherency response to the initiating agent. In some embodiments, the coherence controller implements a snoop filter to track the cache states of a cache line in each agent, and based on the state of the snoop filter, the coherence controller may issue snoop requests to a subset of agents.
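- The snoop filter mentioned above can be sketched as a per-address record of which agents may hold a line. The class and method names are hypothetical, and the sketch always excludes the initiating agent, so coherence models that require self-snoops are not covered here.

```python
# Illustrative sketch: the SnoopFilter class and its methods are assumptions.


class SnoopFilter:
    """Tracks, per cache line address, which agents may hold a valid copy."""

    def __init__(self):
        self.holders = {}  # address -> set of agent names

    def record_fill(self, address, agent):
        self.holders.setdefault(address, set()).add(agent)

    def record_invalidate(self, address, agent):
        self.holders.get(address, set()).discard(agent)

    def snoop_targets(self, address, initiator):
        # Snoop only agents that may hold the line, never the initiator.
        return sorted(self.holders.get(address, set()) - {initiator})
```

- When the filter reports no holders, the coherence controller can skip snooping entirely and satisfy the request from memory.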
- As described above, each agent uses a particular coherence model, and assumes that all other coherent agents use the same model. In accordance with some aspects and some embodiments of the invention, multiple coherent agents use the same coherence model. In heterogeneous systems of agents with different coherence models, the coherence controller must accept coherency requests and issue coherency responses to an initiating agent according to its coherence model and issue snoop requests and accept snoop responses each according to the coherence model of the particular snooped agent. To do so, translation must be performed between coherence models.
- In accordance with the aspects of the invention, translation can be performed in a distinct and/or dedicated translator unit, as a function of the coherence controller, and/or in an interface unit connected to an agent. The functionality of a translator unit can be divided into sub-units, such as a request sub-unit and a response sub-unit, or such as a coherency sub-unit that translates coherency requests and responses and a snoop sub-unit that translates snoop requests and responses. For clarity, embodiments disclosed indicate a distinct, dedicated translator, though this should not be construed as limiting the scope of the disclosed invention.
- In accordance with the various aspects of the invention, coherent systems include fully-coherent (FC) agents and/or IO-coherent (IO) agents. An agent communicates using its own native protocol or the generic protocol. Furthermore, a cache-coherent system includes at least one FC agent; if all agents are IO, then that is similar to all agents being non-coherent. Additionally, agents can act as either a requesting agent or a snooped agent; an FC agent can act as both, but an IO agent can only act as a requesting agent.
- FIG. 1 shows, in accordance with some aspects of the invention, one embodiment of a system according to the invention. Initiating agent 100 uses a coherence model that has two states, S and I. Agent 100 issues coherency requests to translator 102 according to its native coherence model. Those cause translator 102 to issue coherency requests to coherence controller 104 according to a generic protocol. To control the cache in another agent, coherence controller 104 issues snoop requests to translator 106 according to the generic coherence model. Those cause translator 106 to issue snoop requests to snooped agent 108 according to its native coherence model. Agent 108 supports five states, M, O, E, S, and I. The coherence models of agent 100 and agent 108 are different and not directly compatible.
- FIG. 2 shows, in accordance with some aspects of the invention, another embodiment, wherein an agent uses a generic coherence model. SI initiating agent 200 issues coherency requests directly to coherence controller 204 according to a generic protocol. To control the cache in another agent, coherence controller 204 issues snoop requests to translator 206 according to the generic coherence model. The snoop requests cause translator 206 to issue snoop requests to MOESI agent 208 according to its native coherence model.
- FIG. 3 shows, in accordance with various aspects of the invention, an embodiment of a translator. It comprises coherency request and response and snoop request and response on each of an agent native interface and a coherence controller generic interface. Translators may be built with subsets of interface signals, and the embodiment shown should not be construed as limiting. The translator receives coherency requests from an agent on native interface signals 301. It reacts by issuing one or more coherency requests to the coherence controller on generic interface signals 303. Multiple requests may occur if, for instance, a transaction must be split, e.g. in the case that the transaction address and length exceed the size of a coherence granule in the generic coherence model or cross a defined address range boundary. Eventually the coherence controller will respond on generic interface signals 305 to each coherency request issued on generic interface signals 303. A corresponding coherency response will be issued to the connected agent on native interface signals 307 for each coherency request issued on native interface signals 301. When a native coherency request is split into multiple generic coherency requests, the translator gathers and assembles the multiple generic coherency responses into a single native coherency response.
- Simultaneously, and in response to coherency requests initiated by other agents, the connected coherence controller may issue snoop requests to the translator through generic interface signals 311. The translator will react by issuing one or more corresponding snoop requests to the connected agent on native interface signals 313. In response to each snoop request, the connected agent will issue a snoop response on native interface signals 315. Snoop responses may or may not carry data, depending on the type of snoop request and the state of the requested line in the agent cache. Upon receiving a snoop response from the agent, the translator issues a corresponding snoop response to the coherence controller on generic interface signals 317. When multiple snoop requests are issued on the native snoop interface, the translator gathers and assembles the multiple native snoop responses into a single generic snoop response.
- In some embodiments, native coherency response signals 307 and native snoop response signals 315 may be split into a state response group and a data response group depending on the requirements of the native coherence model. In addition, in some embodiments, generic coherency response signals 305 and generic snoop response signals 317 may be split into a state response group and a data response group depending on the requirements of the generic coherence model.
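- The transaction splitting described for the translator of FIG. 3 can be sketched as follows for the case in which a native request spans more than one coherence granule of the generic model. The function names and the 64-byte default granule are assumptions; splitting at other address range boundaries is not shown.

```python
# Illustrative sketch: function names and the 64-byte granule are assumptions.


def split_by_granule(address, length, granule=64):
    """Split a native request into generic requests that each stay in one granule."""
    if length <= 0 or granule <= 0:
        raise ValueError("length and granule must be positive")
    pieces = []
    end = address + length
    while address < end:
        # End of the coherence granule that contains the current address.
        granule_end = (address // granule + 1) * granule
        piece_end = min(end, granule_end)
        pieces.append((address, piece_end - address))
        address = piece_end
    return pieces


def assemble(responses):
    """Join the data of the per-granule generic responses into one native response."""
    return b"".join(responses)
```

- For example, a 64-byte request that starts at address 0x30 yields two generic requests, one of 16 bytes and one of 48 bytes, and their responses are joined in address order.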
- FIG. 4 shows, in accordance with various aspects of the invention, a process of a coherency request from a first agent causing a snoop request at a second agent in a system where the first agent and the second agent both use incompatible native coherence models that differ from a generic coherence model used by a coherence controller, as in the system of FIG. 1 . In initial step 400, the agent 100 issues a coherency request to the translator 102 according to its native coherence model. In subsequent step 402, the translator 102 issues a coherency request to the coherence controller 104 according to a generic coherence model. In subsequent step 404, the coherence controller 104 issues a corresponding snoop request to the translator 106 according to its generic coherence model. In subsequent step 406, the translator 106 issues a snoop request to the agent 108 according to its native coherence model.
- FIG. 5 shows, in accordance with various aspects of the invention, a process of a coherency request from a first agent causing a snoop request at a second agent in a system where the first agent and the coherence controller share a coherence model, as in the system of FIG. 2 . In initial step 500, the agent 200 issues a coherency request to the coherence controller 204. In subsequent step 502, the coherence controller 204 issues a corresponding snoop request to the translator 206 according to its generic coherence model. In subsequent step 504, the translator 206 issues a snoop request to the agent 208 according to its native coherence model.
- According to one aspect of the invention, a coherence controller only issues, to any particular agent, snoop requests that put a cache line into a state that is legal according to the snooped agent's coherence model and that is consistent with the state requested by the initiating agent. For example, if the coherence model for the initiating agent does not allow shared cache lines in other agents, a coherence controller will never issue a snoop request to a snooped agent that leaves a shared copy of data in the snooped agent, even if the coherence model for the snooped agent supports such an outcome.
- According to some aspects and embodiments of the invention, the state of a cache line in any agent is generically represented within the coherence controller by four binary cache line characteristics:
- valid or invalid (V/I)—if valid, the line is present in the cache, and if invalid, the line is not present in the cache;
- owned or non-owned (O/N)—if owned, the agent must forward data in response to a snoop and may be required to update memory upon replacing the line in its cache, and if non-owned, the agent does not have these responsibilities;
- unique or shared (U/S)—if unique, the line is known to be present only in the agent's cache, and if shared, the line may be present in another agent's cache; and
- clean or dirty (C/D)—if clean, the agent may replace the line without updating main memory, and if dirty the agent must update memory upon replacing the line.
- Common cache state models include MSI, MESI, and MOESI, among others, where each of the letters represents a cache state: Modified (M), Owned (O), Exclusive (E), Shared (S), and Invalid (I). An example of a cache state model that is widely used is the cache state model specified by the Advanced Microcontroller Bus Architecture (AMBA) AXI Coherency Extensions (ACE), which defines a five-state cache state model that is effectively equivalent to the MOESI cache state model. Some other cache state models may additionally define a Forward (F) state.
- FIG. 6 shows, in accordance with various aspects of the invention, a table of cache states found in the cache state models of many agents, along with the equivalent state of each cache line characteristic of the generic coherence model.
- According to one such embodiment, only IXXX, VNSC, VOUC, VOUD, VOSD, and VOSC states are legally recognized by the generic cache state model of the coherence controller. The four binary states are encoded in a representation that uses only three bits to identify the six legal states.
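- A three-bit representation of the six legal generic states can be sketched with an enumeration. The specific code values assigned below are assumptions, since the text states only that three bits suffice; the state names encode the four binary characteristics (V/I, O/N, U/S, C/D) position by position.

```python
# Illustrative sketch: the 3-bit code values are assumptions; the text only
# states that three bits identify the six legal states.
from enum import Enum


class GenericState(Enum):
    IXXX = 0b000
    VNSC = 0b001
    VOUC = 0b010
    VOUD = 0b011
    VOSD = 0b100
    VOSC = 0b101


def characteristics(state):
    """Expand a state name into the four binary cache line characteristics."""
    v, o, u, c = state.name
    if v == "I":
        # An invalid line has no meaningful ownership, sharing, or cleanliness.
        return {"valid": False, "owned": None, "unique": None, "clean": None}
    return {"valid": True, "owned": o == "O", "unique": u == "U", "clean": c == "C"}


def decode(bits):
    """Map a 3-bit code back to a legal state; unused codes raise ValueError."""
    return GenericState(bits)
```

- Four independent bits would allow sixteen combinations; restricting the representation to the six legal states is what lets three bits suffice, with two codes left unused.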
- According to different aspects and embodiments, a generic coherence model may be a superset, subset, or partially inclusive of the cache state models, the transaction processing models, and the coherence granules defined by the native coherence models of the agents in the system.
- According to some aspects and embodiments, a single request according to one coherence model causes a translator to issue multiple requests according to a second coherence model. When the translator issues multiple requests into the second coherence model in response to a request from the first, the translator also gathers and assembles the multiple responses from the second coherence model into a single response to the first. This is true for a single request according to one native protocol translated into multiple requests in the generic protocol or for a single request according to a generic protocol translated into multiple requests in a second native protocol.
- According to some aspects and embodiments, a generic coherence model represents fewer than all possible valid states within an agent's native coherence model.
- A coherence controller may be coupled to a system directory. In some aspects and embodiments, a system directory is an integral part of the coherence controller. In any case, according to some aspects and embodiments of the invention, it is not necessary for the system directory to be aware of, or even support the coherence model of agents.
- According to some aspects and embodiments, a system includes a proxy cache for a non-coherent agent.
FIG. 7 shows, in accordance with various aspects of the invention, one embodiment of a system with a proxy cache according to the invention. Caching agent 700 issues coherency requests to translator 702 according to its native coherence model. Those cause translator 702 to issue coherency requests to coherence controller 704 according to a generic protocol. To control the cache in another agent, coherence controller 704 issues snoop requests to proxy cache 706 according to the generic coherence model. Proxy cache 706 acts as proxy cache for cacheless agent 708. However, proxy cache 706 need not issue any request to agent 708. Proxy cache 706 autonomously issues appropriate snoop responses to coherence controller 704 according to the generic protocol. No snoop translation is necessary.
- According to FIG. 7, cacheless agent 708 issues non-coherent coherency requests to proxy cache 706. Proxy cache 706, in turn, issues corresponding coherent coherency requests to coherence controller 704. Coherence controller 704 issues snoop requests to caching agent 700, through translator 702. Caching agent 700 issues appropriate snoop responses to coherence controller 704, through translator 702. The coherence controller, in turn, issues snoop responses to proxy cache 706, which returns a non-coherent response to cacheless agent 708.
- Some coherence models require self-snoops for purposes such as disambiguating coherency request ordering. Other coherence models do not require self-snoops. According to some aspects and an embodiment of the invention, a system comprises an agent that accords with a self-snooping coherence model and an agent that accords with a model that does not support self-snooping. When the self-snooping agent issues a coherency request to a coherence controller, the coherence controller issues a snoop to the initiating agent. When the non-self-snooping agent issues a coherency request to the coherence controller, the coherence controller snoops other agents, but does not snoop the initiating agent.
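The self-snoop rule above amounts to a choice of snoop targets per request. A minimal sketch, assuming agents are identified by name and that the coherence controller knows which agents follow a self-snooping model (the function name and the `self_snooping` set are hypothetical):

```python
from typing import Iterable, List, Set


def snoop_targets(initiator: str, agents: Iterable[str],
                  self_snooping: Set[str]) -> List[str]:
    """Return the agents the coherence controller snoops for one request."""
    # Every other agent is snooped; the initiator is snooped only if its
    # coherence model requires self-snoops.
    return [a for a in agents if a != initiator or initiator in self_snooping]
```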
- According to some coherence models, an initiating agent is required to serialize outstanding writeback coherency requests with respect to inbound snoop requests to the same cache line. In such a coherence model, the initiating agent blocks such snoop requests until the agent receives the corresponding coherency response for the writeback coherency request and then issues the snoop response for the snoop request. Furthermore, in this coherence model, the coherence controller must guarantee that writeback coherency requests make forward progress so that snoop requests can make forward progress.
- According to some coherence models, the coherence controller is required to serialize outstanding snoop requests with respect to inbound writeback coherency requests to the same cache line. In such a coherence model, the coherence controller blocks such writeback coherency requests until the snoop responses for the snoop requests have been received.
- Consequently, the native to generic translators must select a writeback coherency request with semantics in the generic coherence model appropriate to the corresponding native coherence models, and the coherence controller must either block the writeback coherency request or allow the writeback coherency request to proceed as needed.
- According to some coherence models, agents respond with data in the S state (or SharedClean in ACE or VNSC in a generic coherence model). In one embodiment, a coherence controller that operates according to a generic coherence model, in reaction to coherency requests, issues snoops to multiple agents and, if multiple agents return data from the line in the S state, the coherence controller discards all but the first data response, which it issues in the coherency response to the initiating agent.
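The "keep the first data response" policy can be sketched as below. The representation of a snoop response as an `(agent, data)` pair, with `None` meaning no data was returned, is an assumption made for illustration only.

```python
from typing import List, Optional, Tuple


def first_shared_data(
        snoop_responses: List[Tuple[str, Optional[bytes]]]) -> Optional[bytes]:
    """Return the first data payload received and discard any later ones."""
    for _agent, data in snoop_responses:
        if data is not None:
            return data  # later payloads for the same line are dropped
    return None  # no snooped agent supplied data
```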
- Referring now to FIG. 8, two coherence models are depicted. Coherence model A defines a single agent cache state model that consists of four cache states, MESI. Coherence model B defines multiple agent cache state models, including one that consists of four cache states, MESI, and one that consists of five cache states, MOESI. Both coherence models define a basic ReadShared coherency request that fetches a copy of a cache line for the initiating agent to satisfy an internal load operation and allows snooped agents to retain a valid, shared copy of the cache line.
- In each of the two coherence models, however, the ReadShared coherency request and the corresponding ReadShared snoop request have different meanings. According to coherence model A, an initiating MESI agent, i.e. an agent that implements the MESI agent cache state model, issues a ReadShared coherency request to install a cache line in the M, E, or S state based on the coherency response. In response to a ReadShared snoop request, a snooped MESI agent in coherence model A may retain a valid copy, transitioning to the S state, or may invalidate its copy, transitioning to the I state. If a snooped MESI agent had a copy in the M state initially and then retained a copy in the S state in response to a ReadShared snoop request, an initiating MESI agent installs the copy in the S state, and memory must be updated with the dirty data transferred from the snooped agent.
- On the other hand, according to coherence model B, an initiating MOESI agent, i.e. an agent that implements the MOESI agent cache state model, issues a ReadShared coherency request to install a cache line in the M, O, E, or S state based on the coherency response. Because the coherency response to a ReadShared coherency request may return shared, dirty data to the initiating agent, causing that agent to install a cache line in the O state, a MESI agent in coherence model B must not issue a ReadShared coherency request. Instead, coherence model B may define a ReadClean coherency request and snoop request. In this case, an initiating MESI agent issues a ReadClean coherency request to install a cache line in the E or S state based on the coherency response.
- According to coherence model B, and in response to a ReadShared snoop request or a ReadClean snoop request, a snooped MOESI agent in coherence model B may retain a copy in the O or S state or may transition to the I state. In response to a ReadShared snoop request or a ReadClean snoop request, a snooped MESI agent in coherence model B may retain a copy in the S state or may transition to the I state. If a snooped MOESI agent had a copy in the M or O state initially, or if a snooped MESI agent had a copy in the M state initially, and then retained a copy in the S state in response to a ReadClean snoop request, an initiating MESI agent installs the copy in the S state, and memory must be updated with the dirty data transferred from the snooped agent. In coherence model B, a ReadShared coherency request does not require an update to memory since only MOESI agents issue such a request and those agents can install a cache line in all valid states including O.
- According to various aspects and an embodiment of the invention, as shown in FIG. 8, what distinguishes a MESI agent in coherence model B from a MESI agent in coherence model A is that the former issues a specific coherency request to interoperate with a MOESI agent while the latter is unaware that a MOESI agent may be present in the system.
- A generic coherence model, coherence model G, defines an agent cache state model with six cache states, MOESIF. Coherence model G also defines a MemRdShd coherency request and a SnpRdShd snoop request with behaviors compatible with the ReadShared requests defined by coherence model B; and defines a MemRdCln coherency request and a SnpRdCln snoop request with behaviors compatible with the ReadShared requests defined by coherence model A and the ReadClean requests defined by coherence model B.
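One way to picture the compatibility relationships among models A, B, and G is as a set of lookup tables. The sketch below is illustrative only: the table names and the `translate_read` helper are hypothetical. The mapping of SnpRdShd to ReadShared in model A rests on model A defining no other read request, and the pairing of each generic coherency request with its like-named snoop request is an assumption.

```python
# Native coherency request -> generic coherency request.
NATIVE_TO_GENERIC = {
    ("A", "ReadShared"): "MemRdCln",  # MESI agent in model A cannot hold O
    ("B", "ReadShared"): "MemRdShd",  # MOESI agent in model B
    ("B", "ReadClean"): "MemRdCln",   # MESI agent in model B
}

# Generic coherency request -> generic snoop request (assumed pairing).
REQUEST_TO_SNOOP = {"MemRdShd": "SnpRdShd", "MemRdCln": "SnpRdCln"}

# Generic snoop request -> native snoop request for the snooped agent's model.
GENERIC_SNOOP_TO_NATIVE = {
    ("SnpRdShd", "A"): "ReadShared",
    ("SnpRdShd", "B"): "ReadShared",
    ("SnpRdCln", "A"): "ReadShared",
    ("SnpRdCln", "B"): "ReadClean",
}


def translate_read(model: str, request: str, snooped_models: list) -> tuple:
    """Follow one read through both translation steps."""
    generic_request = NATIVE_TO_GENERIC[(model, request)]
    generic_snoop = REQUEST_TO_SNOOP[generic_request]
    native_snoops = [
        f"{GENERIC_SNOOP_TO_NATIVE[(generic_snoop, m)]}-{m}"
        for m in snooped_models
    ]
    return generic_request, generic_snoop, native_snoops
```

Keeping the generic model in the middle means each translator needs only the entries for its own native model, rather than one table per pair of native models.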
- In accordance with various aspects of the invention and an embodiment of a system with the coherence models described above, the translators in the system perform at least two functions: adapting requests and responses between agents in different coherence models; and transmitting requests and responses between agents in the same coherence model. In other embodiments, the translators in the system also adapt requests and responses between agents in the same coherence model.
- Examples of coherency request and snoop request translation between coherence model A and coherence model B are described below. To clarify the examples, each coherency request or snoop request is designated with its corresponding coherence model, e.g. a ReadShared from coherence model A is written as ReadShared-A and a SnpRdCln from coherence model G is written as SnpRdCln-G.
- Consider an embodiment of the invention, based on the various aspects of the invention, with three agents. The first agent uses coherence model A with a MESI agent cache state model. The second agent uses coherence model B with a MOESI agent cache state model. The third agent uses coherence model B with a MESI agent cache state model.
-
FIG. 9 shows, in accordance with various aspects of the invention, a scenario in which a coherency read from the first agent causes snoops to the second agent and the third agent. In this scenario, a coherency request in coherence model A is translated into snoop requests in coherence model B. A first agent 902 issues a ReadShared-A coherency request to a first agent translator 904. The first agent translator 904 translates the coherency request from the first agent 902 into the coherency request in the generic coherence model that is most consistent with the cache states of the first agent 902, i.e. MemRdCln-G. The first agent translator 904 issues a MemRdCln-G coherency request to a coherence controller 906. The coherence controller 906 issues a SnpRdCln-G snoop request to each of a second agent translator 908 and a third agent translator 912. Next, the second agent translator 908 and the third agent translator 912 translate the snoop request in the generic coherence model into the most appropriate snoop request in coherence model B, i.e. ReadClean-B. The second agent translator 908 issues a ReadClean-B snoop request to a second agent 910. The third agent translator 912 issues a ReadClean-B snoop request to a third agent 914. In effect, the translators adapt the request from the first agent 902 into requests to the second and third agents.
-
FIG. 10 shows, in accordance with various aspects of the invention, a scenario in which a coherency read from the second agent 910 causes snoops to the first agent 902 and the third agent 914. In this scenario, a coherency request in coherence model B is translated into a snoop request in coherence model A and into a snoop request in coherence model B. The second agent 910 issues a ReadShared-B coherency request to the second agent translator 908. The second agent translator 908 translates the coherency request from the second agent 910 into the equivalently defined coherency request in the generic coherence model, i.e. MemRdShd-G. The second agent translator 908 issues a MemRdShd-G coherency request to the coherence controller 906. The coherence controller 906 issues a SnpRdShd-G snoop request to each of the first agent translator 904 and the third agent translator 912. For the first agent 902, the first agent translator 904 translates the snoop request in the generic coherence model into a snoop request in coherence model A that is consistent with the generic snoop request, i.e. ReadShared-A. The first agent translator 904 issues a ReadShared-A snoop request to the first agent 902. For the third agent 914, the third agent translator 912 translates the snoop request in the generic coherence model into the equivalently defined snoop request in coherence model B, i.e. ReadShared-B. The third agent translator 912 issues a ReadShared-B snoop request to the third agent 914. In effect, the translators adapt the request from the second agent 910 into a request to the first agent 902 and simply transmit the request from the second agent 910 to the third agent 914, as those agents share the same coherence model.
-
FIG. 11 shows, in accordance with various aspects of the invention, a scenario in which a coherency read from the third agent 914 causes snoops to the first agent 902 and the second agent 910. Like the previous scenario, a coherency request in coherence model B is translated into a snoop request in coherence model A and into a snoop request in coherence model B. The third agent 914 issues a ReadClean-B coherency request to the third agent translator 912. The third agent translator 912 translates the coherency request from the third agent 914 into the equivalently defined coherency request in the generic coherence model, i.e. MemRdCln-G. The third agent translator 912 issues a MemRdCln-G coherency request to the coherence controller 906. The coherence controller 906 issues a SnpRdCln-G snoop request to each of the first agent translator 904 and the second agent translator 908. For the first agent 902, the first agent translator 904 translates the snoop request in the generic coherence model into a snoop request in coherence model A that is consistent with the generic snoop request, i.e. ReadShared-A. The first agent translator 904 issues a ReadShared-A snoop request to the first agent 902. For the second agent 910, the second agent translator 908 translates the snoop request in the generic coherence model into the equivalently defined snoop request in coherence model B, i.e. ReadClean-B. The second agent translator 908 issues a ReadClean-B snoop request to the second agent 910. In effect, the translators adapt the request from the third agent 914 into a request to the first agent 902 and simply transmit the request from the third agent 914 to the second agent 910, as those agents share the same coherence model.
- In accordance with various aspects of the invention, snoop responses and coherency responses communicate cache state and data. Responses may contain a state portion only or may contain both a state portion and a data portion.
In some aspects and embodiments, responses may be divided into state responses, consisting of a state portion only, and data responses, consisting of a data portion only, that travel independently through the system. In such an embodiment, a data response has an associated state response, and a state response may or may not have an associated data response. In some aspects and embodiments, the state portion may be divided into a state response that communicates a subset of the state information and a data response that communicates the remaining subset of the state information along with the data; other divisions are also possible.
- The state portion of a response typically indicates the cache state of one or more agents after the snoop requests have been processed. In some aspects and embodiments, the coherence controller aggregates the state portions from multiple snoop responses and issues a coherency response with a state portion that summarizes the state portions from the snoop responses. In other aspects and embodiments, a translator, typically the translator associated with the initiating agent, may be responsible for performing the aggregation of snoop responses and issuing the summary of the state portions in a coherency response. In yet other aspects and embodiments, the initiating agent may be responsible for performing the aggregation of snoop responses, if this function is defined by the coherence model, and each snoop response effectively becomes a coherency response. In these latter two examples, the translator or the initiating agent, via the translator, provides a summary to the coherence controller, if necessary.
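The aggregation step can be sketched as a reduction over the state portions of the snoop responses. The source does not define the summary rule, so the rule used here (a copy is shared if any snooped agent retains one, and data are dirty if any response carries dirty data) is an assumption, as are the names `StatePortion` and `summarize`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class StatePortion:
    shared: bool  # a valid shared copy remains after the snoop
    dirty: bool   # accompanying data are dirty with respect to memory


def summarize(portions: Iterable[StatePortion]) -> StatePortion:
    """Combine several snoop state portions into one coherency state portion."""
    portions = list(portions)
    return StatePortion(
        shared=any(p.shared for p in portions),
        dirty=any(p.dirty for p in portions),
    )
```

The same function could run in the coherence controller, in the initiating agent's translator, or in the initiating agent itself, which are the three placements the text describes.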
- In some other aspects and embodiments, multiple snoop responses with data portions may result from issuing snoop requests. In these embodiments, data aggregation may occur at the coherence controller, at the translator associated with the initiating agent, or at the initiating agent itself in a manner similar to one of those described above for aggregating state responses.
- The state portion typically includes an indication of whether an accompanying or associated data portion is clean or dirty with respect to memory. A translator uses this indication in conjunction with the type of snoop request or coherency request to determine whether memory must be updated. In some embodiments, the translator associated with a snooped agent or the translator associated with the initiating agent may update memory, and in others, a translator may direct the coherence controller to update memory. If memory is updated, the state portion of the response is modified to indicate that the data are clean with respect to memory.
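The memory-update decision can be sketched as a small function of the request type and the clean/dirty indication. Treating MemRdCln and SnpRdCln as the requests that require clean data is an assumption based on the generic coherence model G described earlier; the function and set names are hypothetical.

```python
from typing import Tuple

# Generic requests assumed to require data that are clean with respect to memory.
CLEAN_ONLY_REQUESTS = {"MemRdCln", "SnpRdCln"}


def resolve_dirty_data(request: str, data_dirty: bool) -> Tuple[bool, bool]:
    """Return (update_memory, data_dirty_after) for one response."""
    # Memory needs updating only when dirty data answer a clean-only request.
    update_memory = data_dirty and request in CLEAN_ONLY_REQUESTS
    # Once memory has been updated, the response is reported as clean.
    return update_memory, data_dirty and not update_memory
```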
- Examples of snoop response and coherency response translation between coherence model A and coherence model B are described below. In each coherence model, it is assumed that the state portion of a response has an indication of whether an agent has a shared or invalid copy and, if a data portion is associated with the state portion, an indication of whether the data are clean or dirty with respect to memory. These result in six response combinations: invalid state (Inv), invalid state with clean data (InvCln), invalid state with dirty data (InvDty), shared state (Shd), shared state with clean data (ShdCln), and shared state with dirty data (ShdDty). Not all response combinations are allowed in all coherence models, e.g. a ShdDty coherency response is not legal in coherence model A since a MESI agent in coherence model A does not support an O state.
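The six response combinations follow from pairing two state indications with three data indications, which the sketch below enumerates. Only the one restriction stated in the text (no ShdDty coherency response in coherence model A) is encoded; any other per-model restrictions are left out because the source does not list them, and the names used here are hypothetical.

```python
from itertools import product

STATES = ("Inv", "Shd")     # invalid or shared copy
DATA = ("", "Cln", "Dty")   # no data, clean data, dirty data

# All state/data pairings, in the order listed in the text.
COMBINATIONS = [state + data for state, data in product(STATES, DATA)]

# Coherency responses a model does not allow (only the stated case).
ILLEGAL_COHERENCY_RESPONSES = {"A": {"ShdDty"}}


def is_legal_coherency_response(model: str, response: str) -> bool:
    """Check a coherency response name against the model's restrictions."""
    return (response in COMBINATIONS
            and response not in ILLEGAL_COHERENCY_RESPONSES.get(model, set()))
```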
-
FIG. 12 shows, in accordance with various aspects of the invention, a response scenario, corresponding to the request scenario illustrated by FIG. 9, in which the second agent 910 has a cache line in the O state and the third agent 914 has the cache line in the I state. In response to each ReadClean-B native snoop request, the second agent 910 issues a ShdDty-B native snoop response and transitions to the S state and the third agent 914 issues an Inv-B native snoop response. The second agent translator 908 translates the ShdDty-B native snoop response into a ShdDty-G generic snoop response. The third agent translator 912 translates the Inv-B native snoop response into an Inv-G generic snoop response. The two generic protocol snoop responses are aggregated by the coherence controller 906 into a single ShdDty-G generic coherency response issued to the first agent translator 904. The first agent translator 904 receives the ShdDty-G generic coherency response. However, the initial native coherency request from the first agent 902 was translated to MemRdCln-G, implying that clean data must be provided to the first agent 902. Consequently, the first agent translator 904 updates memory and translates the generic coherency response into a ShdCln-A native coherency response to the first agent 902.
- In some aspects and embodiments, the second agent translator 908 recognizes that the generic snoop request, SnpRdCln-G, represents a query for clean data and, in reaction to the ShdDty-B native snoop response, updates memory and translates the ShdDty-B native snoop response into a ShdCln-G generic snoop response. In some embodiments, instead of updating memory and translating the snoop response, the second agent translator 908 provides additional indication to the coherence controller 906 to update memory and translate the dirty generic snoop response into a clean generic coherency response. In some aspects and embodiments, data bypasses the coherence controller 906, and the coherence controller 906 only aggregates the state portions of the snoop responses. In such embodiments, either the second agent translator 908 or the first agent translator 904 updates memory.
-
FIG. 13 shows, in accordance with various aspects of the invention, another response scenario, corresponding to a request scenario similar to the one illustrated by FIG. 9, in which the second agent 910 has a cache line in the S state and the third agent 914 has the cache line in the S state. In the scenario of FIG. 13, coherence controller 1306 additionally issued a request to memory interface 1316 (requests are not shown in the figure). In response to each ReadClean-B native snoop request, the second agent 910 issues a ShdCln-B native snoop response and the third agent 914 also issues a ShdCln-B native snoop response. The second agent translator 1308 splits the ShdCln-B native snoop response into a generic snoop state response to coherence controller 1306 and a generic data response directly to first agent translator 1304. The third agent translator 1312 splits the ShdCln-B native snoop response into a generic snoop state response to the coherence controller 1306 and a generic data response directly to the first agent translator 1304. Memory interface 1316 provides a generic data response directly to the first agent translator 1304. The generic snoop state responses are aggregated by the coherence controller 1306 into a single ShdCln-G generic coherency state response to the first agent translator 1304, the generic coherency state response including an additional indication of the number of expected generic data responses, which is two, in accordance with some aspects of the invention. The first agent translator 1304 receives the ShdCln-G generic coherency state response with the number of expected data responses. However, in accordance with the aspects and embodiment of FIG. 13, coherence model A requires a single native coherency response to be presented to the first agent 902. So, the first agent translator 1304 gathers the generic coherency state response and the three generic data responses before issuing the native coherency response to the first agent 902.
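The gathering behavior of the initiating agent's translator can be sketched as a small accumulator that tolerates state and data responses arriving in any order. The class and method names are hypothetical, the expected count is supplied by the caller rather than taken from the figure, and using the first data payload for the native response is an assumption.

```python
from typing import List, Optional, Tuple


class ResponseGatherer:
    """Collects one state response and N data responses for a single request."""

    def __init__(self) -> None:
        self.state: Optional[str] = None
        self.expected_data: Optional[int] = None
        self.data: List[bytes] = []

    def on_state(self, state: str, expected_data: int) -> None:
        # The state response carries the number of data responses to wait for.
        self.state, self.expected_data = state, expected_data

    def on_data(self, payload: bytes) -> None:
        self.data.append(payload)

    def native_response(self) -> Optional[Tuple[str, bytes]]:
        """Return the single native response once everything has arrived."""
        if self.state is None or len(self.data) < self.expected_data:
            return None  # still waiting for the state or for more data
        # All copies of a clean shared line are identical, so use the first.
        return self.state, (self.data[0] if self.data else b"")
```

A translator would call `native_response()` after each arrival and forward the result to the agent the first time it is not `None`.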
- In some aspects and embodiments, the first agent translator 1304 issues a native coherency response after the generic coherency state response and the first generic data response have been received, and the first agent translator 1304 ignores the second and third generic data responses. In some aspects and embodiments, the generic coherency state response is guaranteed to arrive before the generic data responses. In some aspects and embodiments, the generic coherency state response and the generic data responses may arrive in any order.
- Machines claimed herein can be embodied in physical machines, such as semiconductor chips; in hardware description language representations of the logical or functional behavior of machines according to the invention as disclosed; and in one or more non-transitory computer readable media arranged to store such hardware description language representations. Methods claimed herein can be embodied in the behavior of either one or a combination of humans and machines; in instructions that, if executed by one or more computers, would cause the one or more computers to perform methods according to the invention as disclosed; and in one or more non-transitory computer readable media arranged to store such instructions. Inventions claimed herein, the practice of which requires more than one non-transitory computer readable medium, should be construed as embodied by each of the more than one non-transitory computer readable medium.
- Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The verb “couple”, its gerundial forms, and other variants, should be understood to refer to either direct connections or operative manners of interaction between elements of the invention through one or more intermediating elements, whether or not any such intermediating element is recited.
- Any methods and materials similar or equivalent to those described herein are not considered abstract ideas and are considered to be significant improvements in the art when used in the practice of the invention. Representative illustrative methods and materials are also described. Additionally, it is intended that equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. The scope of the invention, therefore, is not intended to be limited to the exemplary aspects and embodiments shown and described herein.
- In accordance with the various aspects of the invention, a computer and a computing device are articles of manufacture. Other examples of an article of manufacture include: an electronic component residing on a motherboard, a server, a mainframe computer, or other special purpose computer each having one or more processors (e.g., a Central Processing Unit, a Graphical Processing Unit, or a microprocessor) that is configured to execute a computer readable program code (e.g., an algorithm, hardware, firmware, and/or software) to receive data, transmit data, store data, or perform methods.
- The article of manufacture (e.g., computer or computing device) includes a non-transitory computer readable medium or storage/memory that may include a series of instructions, such as computer readable program steps or code encoded therein. In certain aspects of the invention, the non-transitory computer readable medium includes one or more data repositories. Thus, in certain embodiments that are in accordance with any aspect of the invention, computer readable program code (or code) is encoded in a non-transitory computer readable medium of the computer or computing device. The processor or a module, in turn, executes the computer readable program code to create or amend an existing computer-aided design using a tool. The term “module” as used herein may refer to one or more circuits, components, registers, processors, software subroutines, or any combination thereof. In other aspects of the embodiments, the creation or amendment of the computer-aided design is implemented as a web-based software application in which portions of the data related to the computer-aided design or the tool or the computer readable program code are received or transmitted to a computing device of a host.
- An article of manufacture or system, in accordance with various aspects of the invention, is implemented in a variety of ways: with one or more distinct processors or microprocessors, volatile and/or non-volatile memory and peripherals or peripheral controllers; with an integrated microcontroller, which has a processor, local volatile and non-volatile memory, peripherals and input/output pins; discrete logic which implements a fixed version of the article of manufacture or system; and programmable logic which implements a version of the article of manufacture or system which can be reprogrammed either through a local or remote interface. Such logic could implement a control system either in logic or via a set of commands executed by a processor.
- Accordingly, the preceding merely illustrates the various aspects and principles as incorporated in various embodiments of the invention. It will be appreciated that those of ordinary skill in the art will be able to devise various arrangements which, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
- Therefore, the scope of the invention is not intended to be limited to the various aspects and embodiments discussed and described herein. Rather, the scope and spirit of the invention are embodied by the appended claims.
Claims (18)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/970,467 US9542316B1 (en) | 2015-07-23 | 2015-12-15 | System and method for adaptation of coherence models between agents |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/806,786 US20160188472A1 (en) | 2014-12-30 | 2015-07-23 | Distributed implementation for cache coherence |
| US14/970,467 US9542316B1 (en) | 2015-07-23 | 2015-12-15 | System and method for adaptation of coherence models between agents |
Related Parent Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/806,786 Continuation-In-Part US20160188472A1 (en) | 2014-12-30 | 2015-07-23 | Distributed implementation for cache coherence |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US9542316B1 US9542316B1 (en) | 2017-01-10 |
| US20170024320A1 US20170024320A1 (en) | 2017-01-26 |
Family
ID=57705984
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/970,467 Active US9542316B1 (en) | 2015-07-23 | 2015-12-15 | System and method for adaptation of coherence models between agents |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US9542316B1 (en) |
Cited By (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US9990292B2 (en) | 2016-06-29 | 2018-06-05 | Arm Limited | Progressive fine to coarse grain snoop filter |
| US10042766B1 (en) | 2017-02-02 | 2018-08-07 | Arm Limited | Data processing apparatus with snoop request address alignment and snoop response time alignment |
| US20180227382A1 (en) * | 2017-02-08 | 2018-08-09 | Arm Limited | Forwarding responses to snoop requests |
| US10157133B2 (en) * | 2015-12-10 | 2018-12-18 | Arm Limited | Snoop filter for cache coherency in a data processing system |
| US11360906B2 (en) | 2020-08-14 | 2022-06-14 | Alibaba Group Holding Limited | Inter-device processing system with cache coherency |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017123208A1 (en) * | 2016-01-12 | 2017-07-20 | Hewlett Packard Enterprise Development Lp | Partially coherent memory transfer |
| US10503641B2 (en) * | 2016-05-31 | 2019-12-10 | Advanced Micro Devices, Inc. | Cache coherence for processing in memory |
| US10073776B2 (en) * | 2016-06-23 | 2018-09-11 | Advanced Micro Device, Inc. | Shadow tag memory to monitor state of cachelines at different cache level |
| US10528519B2 (en) * | 2017-05-02 | 2020-01-07 | Mellanox Technologies Ltd. | Computing in parallel processing environments |
| US11681624B2 (en) * | 2020-07-17 | 2023-06-20 | Qualcomm Incorporated | Space and time cache coherency |
| US20250240156A1 (en) * | 2022-12-23 | 2025-07-24 | Advanced Micro Devices, Inc. | Systems and methods relating to confidential computing key mixing hazard management |
Family Cites Families (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB8814077D0 (en) * | 1988-06-14 | 1988-07-20 | Int Computers Ltd | Data memory system |
| JPH0680499B2 (en) * | 1989-01-13 | 1994-10-12 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Cache control system and method for multiprocessor system |
| US5659710A (en) * | 1995-11-29 | 1997-08-19 | International Business Machines Corporation | Cache coherency method and system employing serially encoded snoop responses |
| US6321304B1 (en) * | 1999-01-22 | 2001-11-20 | Apple Computer, Inc. | System and method for deleting read-only head entries in multi-processor computer systems supporting cache coherence with mixed protocols |
| US6275900B1 (en) * | 1999-01-27 | 2001-08-14 | International Business Machines Company | Hybrid NUMA/S-COMA system and method |
| US6763434B2 (en) * | 2000-12-30 | 2004-07-13 | International Business Machines Corporation | Data processing system and method for resolving a conflict between requests to modify a shared cache line |
| DE60140859D1 (en) * | 2001-05-01 | 2010-02-04 | Sun Microsystems Inc | Shared Memory Multiprocessor System Using Mixed Broadcast Snoop and Directory Based Coherence Protocols |
| US20040111563A1 (en) * | 2002-12-10 | 2004-06-10 | Edirisooriya Samantha J. | Method and apparatus for cache coherency between heterogeneous agents and limiting data transfers among symmetric processors |
| US7177987B2 (en) * | 2004-01-20 | 2007-02-13 | Hewlett-Packard Development Company, L.P. | System and method for responses between different cache coherency protocols |
| US8352656B2 (en) * | 2010-04-08 | 2013-01-08 | Intel Corporation | Handling atomic operations for a non-coherent device |
- 2015-12-15: US application US14/970,467, patent US9542316B1 (status: Active)
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10157133B2 (en) * | 2015-12-10 | 2018-12-18 | Arm Limited | Snoop filter for cache coherency in a data processing system |
| US9990292B2 (en) | 2016-06-29 | 2018-06-05 | Arm Limited | Progressive fine to coarse grain snoop filter |
| US10042766B1 (en) | 2017-02-02 | 2018-08-07 | Arm Limited | Data processing apparatus with snoop request address alignment and snoop response time alignment |
| US20180227382A1 (en) * | 2017-02-08 | 2018-08-09 | Arm Limited | Forwarding responses to snoop requests |
| US11159636B2 (en) * | 2017-02-08 | 2021-10-26 | Arm Limited | Forwarding responses to snoop requests |
| TWI753093B (en) * | 2017-02-08 | 2022-01-21 | 英商Arm股份有限公司 | Forwarding responses to snoop requests |
| US11360906B2 (en) | 2020-08-14 | 2022-06-14 | Alibaba Group Holding Limited | Inter-device processing system with cache coherency |
Also Published As
| Publication number | Publication date |
|---|---|
| US9542316B1 (en) | 2017-01-10 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9542316B1 (en) | | System and method for adaptation of coherence models between agents |
| Das Sharma et al. | | An introduction to the Compute Express Link (CXL) interconnect |
| DE112013007764B4 (en) | | High-performance interconnect coherence protocol |
| US6976131B2 (en) | | Method and apparatus for shared cache coherency for a chip multiprocessor or multiprocessor system |
| EP2486487B1 (en) | | Notification protocol based endpoint caching of host memory |
| US20140108744A1 (en) | | Simplified controller with partial coherency |
| US7549024B2 (en) | | Multi-processing system with coherent and non-coherent modes |
| US8037253B2 (en) | | Method and apparatus for global ordering to insure latency independent coherence |
| GB2403560A (en) | | Memory bus within a coherent multi-processing system |
| JP2005025726A (en) | | Power control in coherent multiprocessing system |
| CN104115128A (en) | | Integrated circuits with cache-coherency |
| US20040111563A1 (en) | | Method and apparatus for cache coherency between heterogeneous agents and limiting data transfers among symmetric processors |
| US20030023794A1 (en) | | Cache coherent split transaction memory bus architecture and protocol for a multi processor chip device |
| US20080109610A1 (en) | | Selective snooping by snoop masters to locate updated data |
| DE102022119705A1 (en) | | Cluster identifier remap for asymmetrical topologies |
| US7568073B2 (en) | | Mechanisms and methods of cache coherence in network-based multiprocessor systems with ring-based snoop response collection |
| US10452272B2 (en) | | System to reduce directory information storage |
| KR20060087561A (en) | | Method and apparatus for concatenated cache coherency states in multi-interface caches |
| US20160188470A1 (en) | | Promotion of a cache line sharer to cache line owner |
| WO2024073864A1 (en) | | Distributed address translation services |
| US10146615B2 (en) | | Recovery of a system directory after detection of uncorrectable error |
| Dukare et al. | | Development of AMBA ACE protocol |
| CN118779266B (en) | | A server, an acceleration device and an acceleration card |
| US20080082756A1 (en) | | Mechanisms and methods of using self-reconciled data to reduce cache coherence overhead in multiprocessor systems |
| US20240241830A1 (en) | | Cache management using shared cache line storage |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: ARTERIS, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: FORREST, CRAIG STEPHEN; KRUCKEMYER, DAVID A.; REEL/FRAME: 037304/0926. Effective date: 20151216 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | FEPP | Fee payment procedure | Free format text: SURCHARGE FOR LATE PAYMENT, SMALL ENTITY (ORIGINAL EVENT CODE: M2554); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY. Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY. Year of fee payment: 8 |