
CA1301354C - Alias address support - Google Patents

Alias address support

Info

Publication number
CA1301354C
CA1301354C CA000577716A CA577716A
Authority
CA
Canada
Prior art keywords
virtual
cache
virtual memory
memory
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CA000577716A
Other languages
French (fr)
Inventor
William Van Loo
John Watkins
Joseph Moran
Ray Cheng
William Shannon
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Microsystems Inc
Original Assignee
Sun Microsystems Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Microsystems Inc filed Critical Sun Microsystems Inc
Application granted granted Critical
Publication of CA1301354C publication Critical patent/CA1301354C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10Address translation
    • G06F12/1027Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1045Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G06F12/1063Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache the data cache being concurrently virtually addressed
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/65Details of virtual memory and virtual address translation
    • G06F2212/653Page colouring

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

An apparatus and method for resolving the problem of alias addressing in virtual memory systems which employ write back caches. Aliasing occurs when multiple virtual memory addresses may map to the same physical memory address. This condition can result in the data at the common physical address in main memory including only a portion of the modifications made by the central processing unit or input/output device into the several cache locations. The subject application resolves this conflict using one method for addresses located within the operating system and a second method for all other addresses.

Description


SUMMARY OF THE INVENTION

This invention is directed to certain hardware and software improvements in workstations which utilize virtual addressing in multi-user operating systems with write back caches, including operating systems which allow each user to have multiple active processes. In this connection, for convenience the invention will be described with reference to a particular multi-user, multiple active process operating system, namely the Unix operating system. However, the invention is not limited to use in connection with the Unix operating system, nor are the claims to be interpreted as covering an invention which may be used only with the Unix operating system.

In a Unix based workstation, system performance may be improved significantly by including a virtual address write back cache as one of the system elements. However, one problem which arises in such systems is in the support of alias addresses, i.e., two or more virtual addresses which map to the same physical address in real memory.

The problem arises because any data update into a write back cache which is made through one alias address will not be seen through a cache access to another alias address, since the two alias addresses will not match.

More specifically, virtual addressing allows aliasing, i.e., the possibility of multiple virtual addresses mapping to the same physical address. If a direct mapped, virtual address write back cache were used in a system without page mapping restrictions, any two arbitrary virtual addresses could occupy any two arbitrary cache locations and still map to the same physical address. When cache blocks are modified, in general, it is impossible to check between arbitrary cache locations for data consistency. Data can become inconsistent when changes at one cache location are not seen at another cache location. Ultimately, the data at the common physical address in main memory will include only part of the modifications made by the CPU or I/O device into the several cache locations.

In the present invention, the foregoing problem is solved by combining two distinct strategies for handling aliases.

The first strategy is to create alias addresses so that their low order address bits are identical, modulo the size of the cache (as a minimum). This strategy is applicable to all user programs which use alias addresses generated by the kernel, or wholly within the kernel. The alias addresses for this strategy are generated by modifications to the kernel and are invisible to user programs. The alias addresses so generated will map to the same cache block within a direct mapped (one-way set associative) cache, or within the same cache set within a multi-way set associative cache. Alias hardware detection logic is then used to guarantee data consistency within this cache block (or cache set).

The second strategy covers those alias addresses in the operating system, rather than user programs, which cannot be made to match in their low order address bits. These are handled by assigning their pages as "Don't Cache" pages in the memory management unit (MMU) employed by workstations which utilize virtual addressing.
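The low order bit alignment required by the first strategy can be sketched in software as follows. This is a minimal sketch only; the block and byte field widths are illustrative choices, not values taken from the patent.

```c
#include <stdint.h>

/* Illustrative cache geometry (an assumption for this sketch):
 * 2**M = 16 bytes per block and 2**N = 4096 blocks, so the cache
 * spans 2**(N+M) = 64 KB of the virtual address space. */
enum { M_BITS = 4, N_BITS = 12 };
#define CACHE_SPAN (1u << (N_BITS + M_BITS))

/* Direct-mapped block index: the N bits just above the M byte-offset bits. */
static uint32_t cache_block_index(uint32_t vaddr)
{
    return (vaddr >> M_BITS) & ((1u << N_BITS) - 1);
}

/* The kernel's alignment rule: two alias virtual addresses must agree
 * in their low (N+M) bits, i.e. be congruent modulo the cache span,
 * so that they always select the same cache block. */
static int aliases_aligned(uint32_t va1, uint32_t va2)
{
    return (va1 % CACHE_SPAN) == (va2 % CACHE_SPAN);
}
```

Two aliases separated by a multiple of the cache span then index the same block, so the alias detection hardware only ever has to check a single cache location.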

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram showing the main components of a workstation utilizing virtual addresses with a write back cache.

Figure 2a is a schematic diagram of cache "hit" logic 25.

Figure 2b is a schematic diagram of a circuit for detecting a cache protection violation.

Figure 2c is a schematic diagram of a circuit for detecting an MMU protection violation.

Figure 3 is a detailed block diagram showing the address path utilized by the alias detection logic of the present invention.


Figure 4 (4a, 4b) is a flow diagram of a state machine implementation for certain controls related to the addressing of a virtual address write back cache.

Figure 5 is a detailed block diagram showing the data path utilized by the alias detection logic of the present invention.

Figure 6 (6a, 6b) is a flow diagram of a state machine implementation for certain controls related to data transfers to and from a virtual address write back cache (states (…) - (…)).

Figure 7a is a flow diagram of a state machine implementation for the data path when there is a real address match (states (g) - (u)).

Figure 7b is a flow diagram of a state machine implementation for the data path when there is no real address match during a CPU write bus cycle (states (q) - (y)).

Figure 7c is a flow diagram of a state machine implementation for the data path when there is no real address match during a CPU read bus cycle (states (g) - (y)).

Figure 7d is a flow diagram of a state machine implementation for the data path during a CPU write bus cycle when the MMU indicates a Don't Cache Page.

Figure 8 is a flow diagram of a state machine implementation for controlling Write Back bus cycles to memory.

Figure 9a is a timing diagram for the best case timing for a CPU write bus cycle when the MMU indicates a cacheable page.

Figure 9b is a timing diagram for the best case timing of a CPU write bus cycle when the MMU indicates a Don't Cache page.

Figure 10a is a timing diagram for the best case timing for a CPU read bus cycle when the MMU indicates a cacheable page.

Figure 10b is a timing diagram for the best case timing of a CPU read bus cycle when the MMU indicates a Don't Cache page.

Figure 11a is a timing diagram of the memory bus cycle for performing a block read cycle.

Figure 11b is a timing diagram of the memory bus cycle for performing a write back cycle.

Figure 11c is a timing diagram of the memory bus cycle for performing a write to a Don't Cache page.


DETAILED DESCRIPTION OF THE INVENTION

Figure 1 shows the functional blocks in a typical workstation using virtual addresses in which the present invention is implemented.

Specifically, such a workstation includes a microprocessor or central processing unit (CPU) 11, cache data array 19, cache tag array 23, cache hit comparator 25, memory management unit (MMU) 27, main memory 31, write back buffer 39 and workstation control logic 40. Such workstations may, optionally, also include context ID register 32, cache flush logic 33, direct virtual memory access (DVMA) logic 35, and multiplexor 37.

In addition to the foregoing elements, to implement the present invention, also needed are multiplexor 45, alias detect logic 47, alias detect control logic 49 and real address register 51. The foregoing elements support alias addresses without the problems inherent in prior art implementations utilizing a virtual address write back cache.

Each of the foregoing workstation elements will now be described, including changes which must be made to the operating system kernel, with particular emphasis on the components unique to the present invention.


Description of Necessary Elements of Workstation

CPU 11 issues bus cycles to address instructions and data in memory (following address translation) and possibly other system devices. The CPU address itself is a virtual address of (A) bits in size which uniquely identifies bytes of instructions or data within a virtual context. The bus cycle may be characterized by one or more control fields to uniquely identify the bus cycle. In particular, a Read/Write indicator is required, as well as a "Type" field. This field identifies the memory instruction and data address space as well as the access priority (i.e., "Supervisor" or "User" access priority) for the bus cycle. A CPU which may be utilized in a workstation having virtual addressing and capable of supporting a multi-user operating system is the MC68020.

Another necessary element in a virtual address workstation with write back cache shown in Figure 1 is virtual address cache data array 19, which is organized as an array of 2**N blocks of data, each of which contains 2**M bytes. The 2**M bytes within each block are uniquely identified with the low order M address bits. Each of the 2**N blocks is uniquely addressed as an array element by the next lowest N address bits. As a virtual address cache, the (N+M) bits addressing bytes within the cache are from the virtual address space of (A+C) bits. (The (C) bits are context bits from optional context ID register 32 described below.) The (N+M) bits include, in general, the (P) untranslated page bits plus added virtual bits from the (A+C-P) bits defining the virtual page address.
Virtu~l addre6s cache data arxay lg described hereln i5 a "dlrect mapp~d" ~ache, or l'one way set a~soaiat~ve" cache.
~hile thi6 cache organlzat~on ls used to lllustrate the $nvention, it i not meant to restrict the 6cope of the invantion which may also be used in connection with multi way ~et as80ciativ~ caches~

Another required ~lement ~hown in Figure 1 i~ virtual addr~ss cache tag array 23 which has OnQ tag array element for e~ch block o~ dat~ in cache data nrray 19. Th~ tag ~rray thu~ contain~ 2N Qlements, each o~ which has a Valid bit ~V), a Modified bit (~), two protection bits tP) consisting of a Supervisor Protect bit (~upvsr Prot) and Writ~ Allowed bit, and a virtual address ~ield,(VA, and optionally CX) as shown in Figure 3. The contents o~ the 20 virtual address field, together with low order addre~s bit~
u~ed ~o addre6~ the cache tag and data arrays, uniquely identify the ~a~he block within the total virtual addres~
8p~C~ 0~ t~C) bit~. ~hat i~, the tag virtual address ~ield ~ust contain ((~+C~ - (M+~) vixtual addres bits.

Cache "Hit" logic 25 compares virtual access addxesses w~th th~ content~ he virtual address ca~he tag aadress ~L3~1~3S~

1 ~ield. Within the access addre~ he lowest order M bit~
addre~s bytes within ~ block; the next lowest N bit~ addres~
~ block within the cache; and the remaining ((A+C) ~ (M+N)) blt~ ~ompare with the tag virtual ~ddre~ field, ~ part of the cache "hlt" logi~.

~ he ca~he ~h~t" logia ~ust identify, ~or 6y~tem with ~hared operating ~ystem, accesse~ to user instruct~on~ and ~ata, ~nd to ~upervi~or lnstructions ~nd data", A "hit"
definition which 6ati~fies these reguirement6 i illustrated in Figure 2a which comprises comparators ~0, ~ND gate 22, 0 g~ta 24 and AND gate 26.

MMU 27, which translates addresses within the virtual space into a physical address, is another required element. MMU 27 is organized on the basis of pages of size (2**P) bytes, which in turn are grouped as segments of size (2**S) pages. Addressing within a page requires (P) bits. These (P) bits are physical address bits which require no translation. The role of MMU 27 is to translate the virtual page address bits ((A+C-P) or (A-P)) into physical page addresses of (NM) bits. The composite physical address is then (NM) page address bits with (P) bits per page.

MMU 27 is also the locus for protection checking, i.e., comparing the access bus cycle priority with the protection assigned to the page. To illustrate this point, there are two types of protection that may be assigned to a page, namely, a Supervisor/User access designator and a Write Protect/Write Allowed designator. Although the subject invention is not limited to such types of protection, given this page protection, a "Protection Violation" can result if either a "User" priority bus cycle accesses a page with "Supervisor" protection, or if a "Write" bus cycle accesses a page with a "Write Protect" designation.

The application of MMU protection checking through the MMU is shown in Figure 2c, which comprises inverter 28, AND gates 30a and 30b, OR gate 34 and AND gate 35. In addition, with a virtual address write back cache, the concept of protection checking can be extended to cache only CPU cycles which do not access the MMU. Such cache only protection logic is shown in Figure 2b, comprising inverter 42, AND gates 44a and 44b, OR gate 46 and AND gate 48.
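The two violation conditions reduce to a small predicate. The following is a behavioral sketch of that check, not a model of the gate networks in Figures 2b and 2c, and the struct layout is an assumption:

```c
#include <stdbool.h>

/* Page protection as described above: a Supervisor/User designator
 * and a Write Protect/Write Allowed designator. */
struct page_prot {
    bool supervisor_only;   /* "Supervisor" protection */
    bool write_allowed;     /* false means "Write Protect" */
};

/* A violation occurs when a User cycle touches a Supervisor page,
 * or a Write cycle touches a Write Protect page. */
static bool protection_violation(struct page_prot p,
                                 bool supervisor_cycle, bool write_cycle)
{
    if (!supervisor_cycle && p.supervisor_only)
        return true;
    if (write_cycle && !p.write_allowed)
        return true;
    return false;
}
```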

Also shown in Figure 1 is main memory 31, which is addressable within the physical address space; control of main memory access is through workstation control logic 40.

Write back buffer 39 is a register containing one block of cache data loaded from cache data array 19. Write back buffer 39 is loaded whenever an existing cache block is to be displaced. This may be caused by a need to update the cache block with new contents, or because the block must be flushed. In either case, in a write back cache, the state of the cache tags for the existing cache block determines whether this block must be written back to memory. If the tags indicate that the block is valid and modified, as defined below, then the block contents must be written back to memory 31 when the cache block is displaced. Write back buffer 39 temporarily holds such data before it is written to memory.

Workstation control logic 40 controls the overall operation of the workstation elements shown in Figure 1. In the preferred embodiment, control logic 40 is implemented as several state machines, which are shown in Figures 4 and 6 - 8, as will be described more fully below in conjunction with the description of alias detect control logic 49, which is also, in the preferred embodiment, integrated into the workstation control logic.
Description of Optional Elements of Workstation

Context ID register 32 is an optional external address register which contains further virtual address bits to identify a virtual context or process. This register, containing C bits, identifies a total of (2**C) active user processes; the total virtual address space is of size 2**(A+C).

An important component in this virtual address space of 2**(A+C) bits is the address space occupied by the operating system. The operating system is common to all user processes, and so it is assigned to a common address space across all active user processes. That is, the (C) context bits have no meaning in qualifying the addresses of pages within the operating system. Rather, the operating system is assumed to lie within a common, exclusive region at the top of the (2**A) bytes of virtual address space for each active context. No user pages may lie within this region. So the operating system page addresses for two distinct user processes are identical, while the user pages for the two processes are distinct. All pages within the operating system are marked as having "Supervisor" protection.

Workstations of the type in which the present invention may be utilized may also include cache flush logic 33 to remove selected blocks from the virtual cache when virtual addresses are to be reassigned. A complete description of one implementation of cache flush logic may be found in co-pending Canadian patent application Serial No. 577,050, filed September 9, 1988, entitled "Virtual Address Write Back Cache with Address Reassignment and Cache Block Flush". Cache flush logic 33 is described here only to indicate its role as a component in a virtual address, write back cache system. If a range of addresses (a virtual page address, for example) is to be reassigned, then all instances of addresses from within this range must be removed, or "flushed", from the cache before the new address assignment can be made. A cache block is "flushed" by invalidating the valid bit in its tags and writing the block back to memory, if the block has been modified.


In addition to CPU 11 as a source of bus cycles, the workstation may include one or more external Input/Output (I/O) devices such as DVMA logic 35. These external I/O devices are capable of issuing bus cycles which parallel the CPU in accessing one or more "Types" of virtual address spaces. The virtual address from either the CPU 11 or DVMA logic 35, together with the address in context ID register 32, is referred to as the access address.

Another optional element is data bus buffer 37, which in the preferred embodiment is implemented as two buffers to control data flow between a 32 bit bus and a 64 bit bus. Such buffers are needed when the CPU data bus is 32 bits and the cache data array data bus is 64 bits.

Description of Elements Unique to the Invented Workstation

As noted above, in the present invention, two distinct strategies are utilized to solve the data consistency problems resulting from alias addresses. Both strategies require the interaction of the operating system with special cache hardware to ensure consistent data.

The first strategy requires that all alias addresses which map to the same data must match in their low order address bits to ensure that they will use the same cache location, if the data is to be cached. The present invention utilizes alias detection logic 47, which is a real address comparator, to detect alias addresses on memory accesses that "miss" the cache and to control the cache data update to ensure that all alias addresses point to consistent data within the same cache location.

The kernel address operation modules implementing this first strategy force alias addresses to match in their low order address bits, so that alias addresses will be guaranteed to use the same cache location. If the cache is of size 2**N blocks of data, each with 2**M bytes, then at least the low order (N+M) bits of the alias addresses must match. This applies to alias addresses within the same process as well as alias addresses between processes. So long as this requirement is met, in direct mapped caches alias addresses map to the same cache block, and in multi-way set associative caches alias addresses will map to the same cache set.

The second strategy prevents data from being cached through the use of a "Don't Cache" bit which is defined for each page in MMU 27. In other words, each page descriptor in MMU 27 has a "Don't Cache" bit, which controls whether instructions and data from that page may be written into the cache. If this control bit is set for a page, then all data accesses to this page are made directly to and from main memory, bypassing the cache. In bypassing the cache, the virtual cache data consistency problem is avoided. Since alias addressing is possible, if a page is marked "Don't Cache" in one MMU page entry, then it must be marked "Don't Cache" in all its MMU page entries. Data consistency is not guaranteed otherwise.

Alias address generation for user processes is controlled through the kernel, so that all user processes utilize the first strategy to ensure data consistency among alias addresses. Some addresses for the operating system, however, cannot be altered to meet the addressing requirements of the first strategy. These system alias addresses are handled instead by the second strategy, assignment to "Don't Cache" pages.
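The all-mappings rule for the "Don't Cache" bit can be sketched as kernel-style bookkeeping. The page-entry layout and helper name here are hypothetical, not taken from the patent:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for MMU 27's page descriptors: each entry
 * maps a virtual page to a physical page and carries the bit. */
struct page_entry {
    unsigned phys_page;
    bool     dont_cache;
};

/* Marking one mapping of a physical page "Don't Cache" obliges the
 * kernel to mark every mapping of that page, or consistency is not
 * guaranteed; this helper applies the bit to all aliases at once. */
static void mark_dont_cache(struct page_entry *tbl, size_t n, unsigned phys_page)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].phys_page == phys_page)
            tbl[i].dont_cache = true;
}
```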

The following is a functional description of what is needed to produce data consistency in a direct mapped virtual address, write back cache, using a combination of the two strategies.

If a CPU 11 or DVMA 35 memory access cycle "misses" the cache, then the access virtual address will be translated by the MMU. The MMU translation will determine if the accessed page is a "Don't Cache" page and whether the access has a protection violation. If the access is valid and to a cacheable page, then the cache will be updated with the cache block corresponding to the access address.

The current contents of the cache at the location corresponding to the access address must be examined to detect a possible alias address. If the current cache block is valid and modified, then the translated address of the cache block must be compared to the translated access address to determine the source of valid data to update the cache.

The real address comparison performed by alias detection logic 47 takes as inputs the translated bus cycle access address from real address register 51 and the translated cache address from MMU 27.

If the current cache block is valid, and the translated addresses compare, then the access address and cache block address are aliases. If the cache block is modified, then the current cache data is the most current data, and the main memory data at this address is stale.

If the translated addresses compare but the cache block is not modified, then the old cache data and memory data are identical, and either can be used as the source for the cache update.

Once the source of valid block data has been determined, the access cycle can be completed. On Read cycles, the bus cycle returns data either directly from the source or from the cache following the cache update, depending on the implementation. On Write cycles, the access data may be written into the cache. Both the size of the cache update and cache data alignment are implementation dependent.
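The source-selection rule of the preceding paragraphs can be condensed into one function. Names and types are illustrative, and the surrounding bus sequencing is omitted:

```c
#include <stdbool.h>
#include <stdint.h>

enum miss_source { FROM_MEMORY, FROM_CACHE_ALIAS };

/* On a miss to a cacheable page: if the resident block is valid and
 * its translated (real) address equals the translated access address,
 * the two virtual addresses are aliases. A modified alias block is
 * the freshest copy (memory is stale); an unmodified one matches
 * memory, so memory may supply the update either way. */
static enum miss_source pick_source(bool resident_valid, bool resident_modified,
                                    uint32_t resident_ra, uint32_t access_ra)
{
    bool real_address_match = resident_valid && (resident_ra == access_ra);
    return (real_address_match && resident_modified) ? FROM_CACHE_ALIAS
                                                     : FROM_MEMORY;
}
```

This is exactly the decision latched by the Real Address Match flip-flop described below; the hedged sketch simply makes the truth table explicit.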

To guarantee data consistency, any write to a page requires that all references to that page (read or write) adhere to this restriction. No requirement is placed on alias addressing to read only pages.

The preferred embodiment for the address path incorporating alias detection logic 47 is shown in Figure 3. As shown in Figure 3, the address path includes the fundamental elements to support address control in a virtual address write back cache. For alias address support, also needed are a virtual address register 52 (VAR) for the virtual address (CX and VA) and cache block Valid bit (V), multiplexer 45 which multiplexes the virtual address and virtual address register, real address register 51, alias detect logic 47, AND gate 53 (with the Valid bit from the VAR and the alias detect logic output as inputs), and Real Address Match flip-flop 55 which is set when a real address match is detected.

The data path from the cache 19 to main memory 31 is over two 64 bit busses 56 and 58. The CPU data path 60 is 32 bits, indicated as D(31:0). On read bus cycles, the cache address bit A(2) selects which of the two 32 bit buffers 37 may enable data from the 64 bit cache data bus 56 onto the 32 bit CPU data bus 60. Alias detect control logic 49 controls the source of the data on read cycle cache misses (the cache or memory) and whether the cache is updated with memory data on write cycle cache misses, as described in the data state machine, Figures 6 and 7.

In Figures 3 and 5, to avoid unnecessarily cluttering the Figures, not all control lines are shown. However, the control lines necessary for proper operation of the invention can be ascertained from the flow charts of the state machines shown in Figures 4 and 6 - 8. In the flow charts, the following abbreviations are utilized:

Multiplexor 45 Sel - select
VA - virtual address
RA - real address
OE - output enable
Ack - acknowledge
Cache Hit? - Did cache "hit" logic 25 detect a cache hit? (Fig 2a)
Cache Protect Violation? - Did control logic 40 detect a cache protect violation? (Fig 2b)
Memory Busy? - Has Memory Busy been asserted?
MMU Protect Viol? - Did control logic 40 detect an MMU protect violation? (Fig 2c)
RAR - real address register 51
Clk - clock
Adr - address
Mem Adr Strobe - memory 31 address strobe
VAR - virtual address register 52
Mem Adr Ack? - Has a memory address acknowledge been asserted by memory 31?
Mem Data Strobe 0? - Has memory data strobe 0 been asserted?
Mem Data Ack 0? - Has memory data acknowledge 0 been asserted?
Mem Data Strobe 1? - Has memory data strobe 1 been asserted?
Mem Data Ack 1? - Has memory data acknowledge 1 been asserted?
Clk Write Back Buffer - clock write back buffer 39
Real Adr Match? - Has a real address match been detected (flip-flop 55)?
Don't Cache Page? - Has control logic 40 detected a Don't Cache Page from MMU 27?
CPU Read Cycle? - Is the CPU bus cycle a read cycle?
Clk Data Reg - clock data register 61
Valid and Modified Write Back Data? - Has control logic 40 detected the Valid bit (V) and Modified bit (M)?
Write to Don't Cache Page? - Has control logic 40 detected a CPU write to a Don't Cache Page?
Start No Cache Write? - Has control logic 40 asserted Start No Cache Write?
Start Write Back Cycle? - Has control logic 40 asserted Start Write Back Cycle?

Similar abbreviations are used in the timing diagrams of Figures 9 - 11.

The address state machine shown in Figures 4a and 4b defines certain of the controls related to the address handling portion of the cache. The invention is integrated through the clocking of the Real Address Match flip-flop 55 at state (o). The cache tags 23 are written as Valid during state (w), following a successful transfer of all block data from memory 31.

The data state machine shown in Figures 6a and 6b and 7a - 7d defines certain controls related to the data transfer portion of the cache. As illustrated, following state (g), a test is made for a write to a Don't Cache Page; the handling of this write to memory is also described in the path following state (i.dw) in the data state machine. Following state (o), a test is made for a Don't Cache Page access (this time for Read data). The Don't Cache Read control takes the same path as the No Real Address Match path, until states (g.nr) and (u.nr). Here a test for Don't Cache Pages inhibits cache updates in states (u.nr) and (w.nr).
The write back state machine shown in Figure 8 defines the control of the Write Back bus cycle to memory. This cycle may be performed in parallel with CPU cache accesses, since both the Write Back controls and data path are independent of the cache access controls and data path. As described below, the "Memory Busy" signal causes the address and data state machines to wait until a previous Write Back cycle has completed.

The write cache miss timing diagram shown in Figure 9a defines the overall timing of a CPU write bus cycle to a cacheable page in memory which misses the cache. The cache Hit and Protection Check occur in cycle (c) in this diagram.

A part of the miss handling sequence includes the loading of the current cache block which is being replaced into write back buffer 39 in cycles (i) and (m). The translated address for the current cache block is also loaded into real address register 51 in cycle (o). The Real Address Match latch (flip-flop 55) is also clocked at cycle (o). If the current cache block is both Valid and Modified from a previous CPU (or DVMA) write cycle, then this cache block will be written back to memory 31 through a Write Back bus cycle, described in both the Memory Data Bus Timing and the Write Back State Machine, Figures 11b and 8 respectively.

An active Real Address Match latch (flip-flop 55) signifies an alias address match. If there is no alias match, the CPU write data is merged with block data returned from memory on the first data transfer of a Block Read memory bus cycle. During cycles (q) through (u), the CPU Write Output Enable controlling buffers 37 will be active for only those bytes to be written by the CPU, while the Data Register Output Enable controlling data register 61 will be active for all other bytes. During the second data transfer, cycle (w), the Data Register Output Enables for all bytes will be active.

If there is an Alias Match, the CPU data is written into the data cache at state (s), and the data from memory 31 is ignored.

The Write to Don't Cache Page timing shown in Figure 9b defines the overall timing of a CPU write bus cycle to memory for accesses to a Don't Cache Page. The cache Hit, which occurs in cycle (c), will always indicate a miss (no Hit).

The Write to a Don't Cache page case differs from the cache miss case for a write to a cacheable page in that the cache is not updated with either CPU or memory data. The implementation uses a special memory bus cycle, called the Write to Don't Cache Page cycle (Figure 11c), to directly update memory. Note that the Real Address Match latch has no meaning for this case.

~ he read cache mi~s timing diagram 6hown in Figure lOa defines the overall timing of a CPU read bus cycle to ~
eacheable page in memory which mi~6e~ the cache~ The cache Hit and ~rotection Check occur ~n cycle (e) in thi6 diagr~.


A part of the miss handling sequence includes the loading of the current cache block which is being replaced into write back buffer 39 in cycles (i) and (m). The translated address for the current cache block is also loaded into real address register 51 in cycle (o). The Real Address Match latch (flip-flop 55) is also clocked at cycle (o). If the current cache block is both Valid and Modified from a previous CPU (or DVMA) write cycle, then this cache block will be written back to memory 31 through a Write Back bus cycle, described in both the Memory Data Bus Timing and the Write Back State Machine, Figures 11b and 8 respectively.

An active Real Address Match latch (flip-flop 55) signifies an alias address match. If there is no alias address match, data is read to the CPU by simultaneously bypassing the data to the CPU through buffers 37 enabled by control signal CPU Read Output Enable, active in states (q) through (u), and updating the cache, in state (v). The memory is designed to always return the "missing data" on the first 64 bit transfer of a Block Read memory bus cycle and the alternate 64 bits on the subsequent transfer. After the CPU read bus cycle data is returned, the CPU may run internal cycles while the cache is being updated with the second data transfer from memory.


If there is an alias address match, data is read directly from the cache 19 to the CPU 11, and the data from memory 31 is ignored.

The Read from Don't Cache Page timing shown in Figure 10b defines the overall timing of a CPU read bus cycle to memory for accesses to a Don't Cache Page. The cache Hit, which occurs in state (c), will always indicate a miss (no Hit).

The Read from a Don't Cache page case differs from the cache miss case for reading from a cacheable page in that the cache is not updated with memory data. The implementation uses the same Block Read memory bus cycle as the cache miss case (see the Memory Data Bus Timing, below). The Real Address Match latch (flip-flop 55) has no meaning for this case.

The Memory Data Bus Timing shown in Figures 11a - 11c shows the timing of Block Read, Write Back, and Write to Don't Cache Page bus cycles. Since the cache block size is 128 bits, each cache block update requires two data transfers. As indicated above, the 64 bits containing the data addressed by CPU 11 are always returned on the first transfer for Block Read bus cycles. The "Memory Busy" control signal active during the Write Back cycle is used to inhibit the start of the next cache miss cycle until the previous Write Back cycle can complete.

On Write to Don't Cache Page bus cycles, the 8 bit Byte Mark field, sent during the address transfer phase of the cycle, defines which of the 8 bytes of data, sent during the data phase, are to be updated in memory 31.

In addition to the foregoing hardware, the operating system kernel must be modified in two fundamental ways to support alias addressing:

1) The operating system utilities which generate user alias addresses must be modified to guarantee that alias addresses conform to the rule requiring that their low order (N+M) address bits, as a minimum, must match.

2) Instances of alias addresses inside the operating system, which cannot be made to conform to the rule requiring the match of the low order (N+M) bits, must be assigned to "Don't Cache" pages.

The kernel changes needed to support alias addressing for the Unix operating system are shown in Appendix A.


APPENDIX A


		break;
	}
	if (shmn >= shminfo.shmseg) {
		u.u_error = EMFILE;
		return;
	}

	/* NOTE: much of the following is ripped-off from kern_mman.c */
	/* NOTE: the rest is ripped-off from Sun consulting shm driver */

	/* Make sure that the target address range is already mapped */
	size = btoc(sp->shm_segsz);
	fv = btop(uap->addr);
	lv = fv + size - 1;
	if ((lv < fv) || !isadsv(u.u_procp, fv) || !isadsv(u.u_procp, lv)) {
		printf("shmat called with unmapped address range %x - %x\n",
		    ctob(fv), (ctob(lv + 1) - 1));
		u.u_error = EINVAL;
		return;
	}

	/* If first attach, create the shared memory region */
	SHMLOCK(sp);
	if (sp->shm_perm.mode & SHM_INIT) {
		if ((sp->shm_kaddr =
		    (uint)zmemall(vmemall, (int)ctob(size))) == 0) {
			SHMUNLOCK(sp);
			u.u_error = ENOMEM;
			return;
		}
		sp->shm_perm.mode &= ~SHM_INIT;
	}
	SHMUNLOCK(sp);
	pm = SHM_PGFLAG | ((uap->flag & SHM_RDONLY) ? PG_URKR : PG_UW);
	/* flush the cache before changing the mapping */
	vac_flush((caddr_t)uap->addr, sp->shm_segsz);
	for (off = 0; off < size; off++) {
		pte = vtopte(u.u_procp, (fv + off));
		u.u_procp->p_rssize -= vmemfree(pte, 1);
		*(int *)pte = pm |
		    (getkpgmap((caddr_t)(sp->shm_kaddr + ctob(off))) & PG_PFNUM);
		((struct fpte *)pte)->pg_fileno = SHM_FILENO;
	}

	newptes(vtopte(u.u_procp, fv), fv, (int)size);
	*spp = sp;
	/* END ... PRE-VM-REWRITE */

	sp->shm_nattch++;
	sp->shm_atime = (time_t)time.tv_sec;
	sp->shm_lpid = u.u_procp->p_pid;
	u.u_rval1 = uap->addr;
}
/*
 * shmsys - System entry point for shmat, shmctl, shmdt, and shmget
 *	system calls.
 */
shmsys()
{
	register struct a {
		uint id;
	} *uap = (struct a *)u.u_ap;
	int shmat(), shmctl(), shmdt(), shmget();
	/* PRE-VM-REWRITE */
	int shmalignment();
	static int (*calls[])() =
	    {shmat, shmctl, shmdt, shmget, shmalignment};
	/* END PRE-VM-REWRITE */

	if (uap->id > 4) {
		u.u_error = EINVAL;
		return;
	}
	u.u_ap = &u.u_arg[1];
	(*calls[uap->id])();
}

/* PRE-VM-REWRITE */
/*
 * shmalignment - return the current system's shared memory alignment
 *	restrictions.
 */
int
shmalignment()
{
	u.u_rval1 = shm_alignment;	/* macro defined in <machine/param.h> */
}

	tmperr = errno;
	if (free(shmaddr))
		perror("shmat: free(3) error");
	errno = tmperr;
	return ((char *)ret);
}
/* ... END PRE-VM-REWRITE */
/*
 * kernel shmat syscall
 * shmat - attach a shared memory segment
 */
shmat()
{
	register struct a {
		int	shmid;
		uint	addr;
		int	flag;
	} *uap = (struct a *)u.u_ap;
	register struct shmid_ds *sp;	/* shared memory header ptr */
	register struct shmid_ds **spp;
	register int shmn;
	int off;
	uint size;
	uint fv, lv, pm;
	struct pte *pte;

#ifdef notdef
	sysinfo.shm++;		/* bump shared memory count */
#endif notdef
	if ((sp = shmconv(uap->shmid)) == NULL)
		return;
	if (sp->shm_perm.mode & SHM_DEST) {
		u.u_error = EINVAL;
		return;
	}
	if (ipcaccess(&sp->shm_perm, SHM_R))
		return;
	if ((uap->flag & SHM_RDONLY) == 0)
		if (ipcaccess(&sp->shm_perm, SHM_W))
			return;
	/* PRE-VM-REWRITE */
	/* address 0 should be filtered at the syscall library */
	if (uap->addr == 0) {
		printf("shmat called with 0 address?\n");
		u.u_error = EINVAL;
		return;
	}
	/* END ... PRE-VM-REWRITE */
	if (uap->flag & SHM_RND)
		uap->addr &= ~(SHMLBA - 1);
	if (uap->addr & SHMCLOFSET) {
		u.u_error = EINVAL;
		return;
	}

	/* PRE-VM-REWRITE */
	/* look for a slot in the per-process shm list */
	spp = &shm_shmmap[(u.u_procp - proc) * shminfo.shmseg];
	for (shmn = 0; shmn < shminfo.shmseg; shmn++, spp++) {
		if (*spp == NULL)

/*------------ /usr/include/sun3/param.h ------------*/
/*
 * The Virtual Address Cache in Sun-3 requires aliasing addresses be
 * matched in modulo 128K (0x20000) to guarantee data consistency.
 */
#define	shm_alignment \
	((cpu == CPU_SUN3_260) ? 0x20000 : CLBYTES)	/* shared memory alignment */
#define	SHMALIGNMENT	4

/*------------ library shmat call ------------*/
static unsigned _shm_pgsz = 0;
static int _shm_malloc = 1;	/* true if pre-vm-rewrite */
char *malloc_at_addr();
char *
shmat(shmid, shmaddr, shmflg)
	int shmid;
	char *shmaddr;
	int shmflg;
{
	/* PRE-VM-REWRITE */
	struct shmid_ds tmp_shmid;
	register unsigned size;
	register int tmperr, ret;
	/*
	 * First, get the required address alignment.
	 * If this fails, then this is probably a 3.2 binary running
	 * on a post-vm-rewrite kernel, in which case the shmat() is
	 * implemented in the kernel.
	 */
	if (_shm_pgsz == 0) {
		tmperr = errno;
		errno = 0;
		_shm_pgsz = syscall(SYS_shmsys, SHMALIGNMENT);
		if (errno != 0)
			_shm_malloc = 0;	/* must be post-vm-rewrite */
		errno = tmperr;
	}

	/* If post-vm-rewrite, just issue the system call */
	if (!_shm_malloc)
		return ((char *)syscall(SYS_shmsys, SHMAT, shmid,
		    shmaddr, shmflg));
	if (shmctl(shmid, IPC_STAT, &tmp_shmid) == -1) {
		return ((char *)-1);
	}
	size = ((tmp_shmid.shm_segsz + _shm_pgsz - 1) / _shm_pgsz) *
	    _shm_pgsz;
	if (shmaddr != 0) {
		if (shmflg & SHM_RND)
			(unsigned)shmaddr &= ~(_shm_pgsz - 1);
		if (((unsigned)shmaddr & (_shm_pgsz - 1)) ||
		    shmaddr == (char *)0 ||
		    malloc_at_addr(shmaddr, size) != shmaddr) {
			errno = EINVAL;
			return ((char *)-1);
		}
	} else if ((shmaddr = (char *)memalign(_shm_pgsz, size)) == 0)
		return ((char *)-1);
	if ((ret = syscall(SYS_shmsys, SHMAT, shmid, shmaddr,
	    shmflg)) != -1) {
		return ((char *)ret);
	}

Claims (9)

1. In a computer system comprising at least one process being executed, an operating system allocating resources for said processes and utilizing alias addressing, a central processor (CPU) executing said processes and said operating system, a virtually addressed cache tag array (CTA) coupled to said CPU identifying virtual memory blocks being cached, a virtually addressed write back cache data array (CDA) coupled to said CPU caching said identified virtual memory blocks, a memory management unit (MMU) coupled to said CPU translating virtual addresses to physical addresses, a main memory (MM) coupled to said CPU
storing virtual memory pages that are currently resident, a cache hit detector (CHD) coupled to said CPU and said CTA
detecting cache hits, and processor control logic (PCL) coupled to said CPU, CTA, CDA, MMU, MM, and CHD controlling their operations, the improvement comprising:
(a) when a first and second virtual addresses are alias to each other, but not members of a predetermined set of alias addresses, said operating system altering said first and second virtual addresses to equal each other in their n+m low order virtual address bits, thereby causing one cache location to be used for caching a first and second virtual memory locations, said first and second virtual addresses comprising a first and second plurality ordered virtual address bits identifying a first and second virtual memory pages and said first and second virtual memory locations respectively, said first and second virtual memory locations being located within said first and second virtual memory pages, said first and second virtual addresses being translated into the same physical address, said same physical address comprising a plurality of ordered physical address bits identifying a memory page and a memory location of said MM, said memory location being located within said memory page, said predetermined set of alias addresses comprising a predetermined subset of virtual addresses of said operating system's address space, each of said predetermined subset of virtual addresses being alias to at least one other virtual address of said predetermined subset, said CTA having n cache tag entries identifying n virtual memory blocks being cached, said CDA having n cache blocks of m cache locations caching said n identified virtual memory blocks, each of said n virtual memory blocks having m virtual memory locations;
(b) when a virtual memory page comprising at least one virtual memory location identified by one of said predetermined set of virtual addresses, said MMU marking said virtual memory page as a Don't Cache page, thereby inhibiting virtual memory blocks of said marked virtual memory page from being cached, resulting in all data accesses to said marked virtual memory page being made directly to and from said MM, bypassing said CDA.
2. The improvement defined by claim 1 wherein said improvement further comprises:
(a) alias detect logic means coupled to said memory management unit and said cache tag array for detecting a virtual address as being an alias address;
(b) alias detect control logic means coupled to said processor control logic and said alias detect logic means for obtaining data used on read cycle and write cycle cache misses from a selected one of said cache data array and said main memory, and for controlling update of the cache data array on write cycle cache misses.
3. The improvement defined by claim 2, wherein said alias detect logic means comprises:
(a) a real address register coupled to said memory management unit for storing a translated physical address;
(b) a comparator coupled to said memory management unit and said real address register, said comparator generating a logic one when the translated physical address stored in said real address register matches a predetermined cache address in said memory management unit;

(c) a virtual address register coupled to said cache tag array for storing a plurality of virtual address bits and a cache valid bit, said virtual address bits being extracted from a predetermined cache tag entry;
(d) an AND gate having one input coupled to the output of said comparator and a second input coupled to said cache valid bit within said virtual address register;
(e) a flip-flop coupled to the output of said AND
gate, said flip-flop being set when a physical address match is detected as determined by the output of said AND gate.
4. The improvement defined by claim 3 wherein said alias detect control logic means comprises a state machine, said state machine comprising:
an address state where said flip-flop is clocked;
a first plurality of data states where data are obtained from a cacheable virtual memory page during read cycle cache misses with physical address match detected;
a second plurality of data states where data are written to a cacheable virtual memory page during write cycle cache misses with physical address match detected;
a third plurality of data states where data are obtained from a Don't Cache virtual memory page during read cycle cache misses; and a fourth plurality of data states where data are written to a Don't Cache virtual memory page during write cycle cache misses.
5. The improvement defined by claim 1, wherein said memory management unit marks said virtual memory page comprising at least one virtual memory location identified by one of said predetermined set of virtual addresses by setting a bit within a page descriptor for said virtual memory page in said memory management unit, said memory management unit comprising a plurality of virtual memory page descriptors describing a plurality of corresponding virtual memory pages.
6. In a computer system comprising at least one process being executed, an operating system allocating resources for said processes and utilizing alias addressing, a central processor (CPU) executing said processes and said operating system, a virtually addressed cache tag array (CTA) coupled to said CPU identifying virtual memory blocks being cached, a virtually addressed write back cache data array (CDA) coupled to said CPU caching said identified virtual memory blocks, a memory management unit (MMU) coupled to said CPU translating virtual addresses to physical address, a main memory (MM) coupled to said CPU
storing virtual memory pages that are currently resident, a cache hit detector (CHD) coupled to said CPU and said CTA
detecting cache hits, and processor control logic (PCL) coupled to said CPU, CTA, CDA, MMU, MM and CHD controlling their operations, a method for detecting data inconsistencies in said CDA and correcting detected data inconsistencies, said method comprising the steps of:
(a) altering a first and second virtual addresses to equal each other in their n+m lower order virtual address bits, when said first and second virtual addresses are alias to each other, but not members of a predetermined set of alias addresses, thereby causing one cache location to be used for caching a first and second virtual memory locations;
said first and second virtual addresses comprising a first and second plurality ordered virtual address bits identifying a first and second virtual memory pages and said first and second virtual memory locations respectively, said first and second virtual memory locations being located within said first and second virtual memory pages, said first and second virtual addresses being translated into the same physical address, said same physical address comprising a plurality of ordered physical address bits identifying a memory page and a memory location of said MM, said memory location being located within said memory page, said predetermined set of alias addresses comprising a predetermined subset of virtual addresses of said operating system's address space, each of said predetermined subset of virtual addresses being alias to at least one other virtual address of said predetermined subset, said CTA having n cache tag entries identifying n virtual memory blocks being cached, said CDA having n cache blocks of m cache locations caching said n identified virtual memory blocks, each of said n virtual memory blocks having m virtual memory locations;
(b) marking a virtual memory page as a Don't Cache page, when said virtual memory page comprises at least one virtual memory location identified by one of said predetermined set of virtual addresses, thereby inhibiting virtual memory blocks of said marked virtual memory page from being cached, resulting in all data accesses to said marked virtual memory page being made directly to and from said MM, bypassing said CDA.
7. The improvement defined in claim 6, wherein said method further comprises the steps of:
(a) detecting a virtual address as being an alias address;
(b) obtaining data used on read cycle and write cycle cache misses from a selected one of said cache data array and said main memory;
(c) selectively updating the cache data array on write cycle cache misses.
8. The improvement defined by claim 7, wherein said detecting step comprises the steps of:
(a) storing a translated physical address in a real address register;
(b) generating a comparator output which is a logic one when the translated physical address stored in said real address register matches a predetermined cache address in said memory management unit;
(c) storing a plurality of virtual address bits and a cache valid bit in a virtual address register, said plurality of virtual address bits being extracted from a predetermined cache tag entry;
(d) inputting to an AND gate one input coupled to the output of said comparator and a second input coupled to said cache valid bit within said virtual address register;
(e) setting a flip-flop coupled to the output of said AND gate when a physical address match is detected as determined by the output of said AND gate.
9. The improvement defined by claim 6, wherein said marking step comprises setting a bit within a page descriptor for said virtual memory page in said memory management unit, said memory management unit comprising a plurality of virtual memory page descriptors describing a plurality of corresponding virtual memory pages.
CA000577716A 1987-10-02 1988-09-16 Alias address support Expired - Fee Related CA1301354C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10463587A 1987-10-02 1987-10-02
US104,635 1987-10-02

Publications (1)

Publication Number Publication Date
CA1301354C true CA1301354C (en) 1992-05-19

Family

ID=22301527

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000577716A Expired - Fee Related CA1301354C (en) 1987-10-02 1988-09-16 Alias address support

Country Status (7)

Country Link
JP (1) JPH071489B2 (en)
AU (1) AU609519B2 (en)
CA (1) CA1301354C (en)
DE (1) DE3832758C2 (en)
FR (1) FR2621408A1 (en)
GB (1) GB2210479B (en)
HK (1) HK95493A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10761995B2 (en) 2018-04-28 2020-09-01 International Business Machines Corporation Integrated circuit and data processing system having a configurable cache directory for an accelerator

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5276798A (en) * 1990-09-14 1994-01-04 Hughes Aircraft Company Multifunction high performance graphics rendering processor
US5813046A (en) * 1993-11-09 1998-09-22 GMD--Forschungszentrum Informationstechnik GmbH Virtually indexable cache memory supporting synonyms
GB2293670A (en) * 1994-08-31 1996-04-03 Hewlett Packard Co Instruction cache
US6189074B1 (en) 1997-03-19 2001-02-13 Advanced Micro Devices, Inc. Mechanism for storing system level attributes in a translation lookaside buffer
US6446189B1 (en) 1999-06-01 2002-09-03 Advanced Micro Devices, Inc. Computer system including a novel address translation mechanism
US6510508B1 (en) 2000-06-15 2003-01-21 Advanced Micro Devices, Inc. Translation lookaside buffer flush filter
US6665788B1 (en) 2001-07-13 2003-12-16 Advanced Micro Devices, Inc. Reducing latency for a relocation cache lookup and address mapping in a distributed memory system
US6954829B2 (en) * 2002-12-19 2005-10-11 Intel Corporation Non-speculative distributed conflict resolution for a cache coherency protocol
US11853231B2 (en) 2021-06-24 2023-12-26 Ati Technologies Ulc Transmission of address translation type packets

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS54148329A (en) * 1978-05-15 1979-11-20 Toshiba Corp Buffer memory control system and information processor containing buffer memory
JPS595482A (en) * 1982-06-30 1984-01-12 Fujitsu Ltd Cache buffer controlling system
JPS62145341A (en) * 1985-12-20 1987-06-29 Fujitsu Ltd Cache memory system
EP0282213A3 (en) * 1987-03-09 1991-04-24 AT&T Corp. Concurrent context memory management unit

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10761995B2 (en) 2018-04-28 2020-09-01 International Business Machines Corporation Integrated circuit and data processing system having a configurable cache directory for an accelerator
US10846235B2 (en) 2018-04-28 2020-11-24 International Business Machines Corporation Integrated circuit and data processing system supporting attachment of a real address-agnostic accelerator
US11030110B2 (en) 2018-04-28 2021-06-08 International Business Machines Corporation Integrated circuit and data processing system supporting address aliasing in an accelerator
US11113204B2 (en) 2018-04-28 2021-09-07 International Business Machines Corporation Translation invalidation in a translation cache serving an accelerator

Also Published As

Publication number Publication date
AU609519B2 (en) 1991-05-02
GB8819017D0 (en) 1988-09-14
FR2621408B1 (en) 1994-04-22
FR2621408A1 (en) 1989-04-07
HK95493A (en) 1993-09-24
JPH071489B2 (en) 1995-01-11
JPH01108651A (en) 1989-04-25
DE3832758C2 (en) 1996-05-30
AU2242288A (en) 1989-04-06
DE3832758A1 (en) 1989-04-13
GB2210479B (en) 1992-06-17
GB2210479A (en) 1989-06-07

Similar Documents

Publication Publication Date Title
US5119290A (en) Alias address support
CA2026224C (en) Apparatus for maintaining consistency in a multiprocess computer system using virtual caching
KR920005280B1 (en) High speed cache system
US5257361A (en) Method and apparatus for controlling one or more hierarchical memories using a virtual storage scheme and physical to virtual address translation
CA1232970A (en) Data processing system provided with a memory access controller
US5210843A (en) Pseudo set-associative memory caching arrangement
US5510934A (en) Memory system including local and global caches for storing floating point and integer data
KR100190351B1 (en) Apparatus and method for reducing interference in two-level cache memory
US4332010A (en) Cache synonym detection and handling mechanism
JP2618175B2 (en) History table of virtual address translation prediction for cache access
US6446034B1 (en) Processor emulation virtual memory address translation
JPH11161547A (en) Storage device for data processing device and method of accessing storage location
CA1301354C (en) Alias address support
US6073226A (en) System and method for minimizing page tables in virtual memory systems
JPH04320553A (en) Address converting mechanism
US5276850A (en) Information processing apparatus with cache memory and a processor which generates a data block address and a plurality of data subblock addresses simultaneously
US6766434B2 (en) Method for sharing a translation lookaside buffer between CPUs
US6430664B1 (en) Digital signal processor with direct and virtual addressing
US5287482A (en) Input/output cache
Smith Design of CPU cache memories
JP2000339221A (en) System and method for invalidating entry of conversion device
US5784708A (en) Translation mechanism for input/output addresses
US6574698B1 (en) Method and system for accessing a cache memory within a data processing system
JPH083805B2 (en) TLB control method
EP0334479A2 (en) Pseudo set-associative memory cacheing arrangement

Legal Events

Date Code Title Description
MKLA Lapsed