TWI220042B - Non-temporal memory reference control mechanism - Google Patents
- Publication number
- TWI220042B (application number TW091124007A)
- Authority
- TW
- Taiwan
- Prior art keywords
- instruction
- extended
- memory
- item
- patent application
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3824—Operand accessing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
- G06F9/30185—Instruction operation extension or modification according to one or more bits in the instruction, e.g. prefix, sub-opcode
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30181—Instruction operation extension or modification
- G06F9/30189—Instruction operation extension or modification according to execution mode, e.g. mode flag
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Executing Machine-Instructions (AREA)
Abstract
Description
1220042 A7
V. Description of the Invention
(Printed by the Employees' Consumer Cooperative of the Intellectual Property Bureau, Ministry of Economic Affairs)

Cross-Reference to Related Applications

[0001] This application claims priority to the following U.S. application: Serial No. 10/227583, filed August 22, 2002.

[0002] This application is related to the following co-pending U.S. patent applications, all of which have the same applicant and inventors.

Taiwan application no. | Filing date | Docket number | Title
91116957 | 7/30/02 | CNTR:2176 | Apparatus and method for extending a microprocessor instruction set
91116958 | 7/30/02 | CNTR:2186 | Apparatus and method for executing conditional instructions
91124008 | 10/18/02 | CNTR:2187 | Apparatus and method for selectively controlling memory attributes
91116956 | 7/30/02 | CNTR:2188 | Apparatus and method for selectively controlling condition code write-back
91116959 | 7/30/02 | CNTR:2189 | Mechanism for increasing the number of registers in a microprocessor
91124005 | 10/18/02 | CNTR:2190 | Apparatus and method for extending microprocessor data modes
91124006 | 10/18/02 | CNTR:2191 | Apparatus and method for extending microprocessor address modes
(not listed) | (not listed) | CNTR:2192 | Suppression of store checking
(not listed) | (not listed) | CNTR:2193 | Selective interrupt suppression
91116672 | 7/26/02 | CNTR:2198 | Apparatus and method for selectively controlling result write-back
(1) Technical Field of the Invention:

[0003] The present invention relates to the field of microelectronics, and more particularly to a technique for incorporating instruction-level non-temporal memory attribute control into an existing microprocessor instruction set architecture.

(2) Background of the Invention:

[0004] Since their inception in the early 1970s, the use of microprocessors has grown exponentially. Originally applied in the scientific and technical fields, microprocessors have since moved from those specialized fields into commercial consumer fields, which include products such as desktop and laptop computers, video game controllers, and many other common household and commercial devices.

[0005] Along with this explosive growth in use, the art has experienced a corresponding advance in technology, characterized by increasing demand for the following: faster speed, more advanced addressing capabilities, faster memory access, larger operands, more types of general-purpose operations (e.g., floating point, single-instruction multiple-data (SIMD), conditional moves), and added special-purpose operations (e.g., digital signal processing functions and other multimedia operations). This demand has produced remarkable technical advances in the field, all of which have been applied to microprocessor design: extensive pipelining, superscalar architecture, cache structures, out-of-order processing, burst access mechanisms, branch prediction, and speculative execution. Put bluntly, compared with its predecessors of 30 years ago, a present-day microprocessor is astonishingly complex and powerful.
... these features. If an existing instruction set architecture has no spare opcode states, then some of the existing opcode states must be redefined to provide for the new features. Thus, legacy software compatibility must be sacrificed in order to provide the new features.

[0009] One area of concern to present-day microprocessor designers is how efficiently application programs use cache memory structures. As cache technology has evolved, more and more features have been provided that allow system programmers to control when and how the cache memory in a system is used. Early cache control features provided only an on/off capability. By setting an internal register in the microprocessor, or by asserting an external signal pin on its package, a designer could enable caching of memory or designate the entire memory space as uncacheable. Uncacheable memory references (i.e., loads/reads and stores/writes) are always sent to the system memory bus and therefore incur the latency of the external bus architecture. In contrast, references or accesses to cached memory are sent to the system memory bus only when a cache miss occurs (that is, when the target of a memory reference is not valid in the internal cache). Caching yields a substantial improvement in application execution speed, particularly when an application makes repeated references to the same data structures in memory.

[0010] More recent improvements in microprocessor architecture have allowed system designers to control more precisely how cache features are used. These improvements allow a designer to define, for a range of addresses within a microprocessor's address space, properties that govern how the microprocessor performs references to that address range with respect to its cache hierarchy. In general, references to the address range can be designated as uncacheable, write combining, write through, write back, or write protected. These properties are called memory attributes, or memory traits. Accordingly, a store reference to an address having the write-back attribute is sent to the cache and is speculatively allocated to a storage location therein. A store reference to an address having the uncacheable attribute is sent to the system bus and is not speculatively executed.

[0011] An in-depth description of memory attributes, and of how a microprocessor handles them by means of its cache, is beyond the scope of this application. It is sufficient here to understand that the present state of the art allows a designer to assign a memory attribute to a region of memory, and that all subsequent memory references to addresses within that region are handled according to the cache policy associated with the assigned memory attribute.
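As a rough illustration only (it is not part of the disclosure), the region-based behavior described in paragraphs [0009] through [0011] can be sketched in a few lines of Python. The names `Trait`, `Memory`, `trait_of`, and `load` are hypothetical, the cache is tracked per address rather than per cache line, and only the uncacheable and write-back traits are modeled.

```python
from enum import Enum


class Trait(Enum):
    """Two of the memory traits named in the text (hypothetical names)."""
    UNCACHEABLE = "uncacheable"
    WRITE_BACK = "write back"


class Memory:
    """Toy model: every address inherits the trait of the region it falls in."""

    def __init__(self, regions):
        # regions: list of (start, end_exclusive, Trait) tuples
        self.regions = regions
        self.cache = set()      # addresses currently valid in the cache
        self.bus_accesses = 0   # references that reached the memory bus

    def trait_of(self, addr):
        for start, end, trait in self.regions:
            if start <= addr < end:
                return trait
        # Assumption for this sketch: unmapped addresses are uncacheable.
        return Trait.UNCACHEABLE

    def load(self, addr):
        if self.trait_of(addr) is Trait.UNCACHEABLE:
            self.bus_accesses += 1   # always goes to the bus
            return "bus"
        if addr in self.cache:
            return "cache hit"       # served without a bus access
        self.bus_accesses += 1       # miss: fetch over the bus, then cache
        self.cache.add(addr)
        return "cache miss"


memory = Memory([
    (0x0000, 0x1000, Trait.WRITE_BACK),
    (0x1000, 0x2000, Trait.UNCACHEABLE),
])
for address in (0x10, 0x10, 0x1000, 0x1000):
    print(hex(address), memory.load(address))
print("bus accesses:", memory.bus_accesses)
```

In this sketch the trait is a property of the address range, not of the individual load, so the caller has no way to request different handling for a single reference.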
[0012] Although present-day microprocessor designs allow different regions of memory to be assigned different memory traits, these designs remain limited in two significant respects. First, microprocessor instruction set architectures restrict execution of the instructions that define or change memory traits to a privilege level that is not accessible to user-level applications. Hence, when a desktop/laptop microprocessor boots up, its operating system establishes the memory traits of the virtual memory space before any user-level application is launched. User-level applications therefore cannot change the memory traits of the host system. Second, in present-day microprocessors, the finest level of granularity at which memory traits can be established is the page level. In conventional schemes that permit memory paging, the memory traits of each memory page are set by the operating system in page directory/table entries. Thus, every reference to an address within a particular page is subject to the memory attribute established for that page when the corresponding memory access operation is performed.

[0013] For many applications, the control features described above speed up the execution of user-level programs, but the present inventors have observed that their benefit is limited for other applications. This is because present-day memory trait controls cannot be employed at the user level, and also because memory attributes can be established only at page-level granularity. For example, a user program that repeatedly accesses a first data structure suffers in execution efficiency when it makes an occasional reference to a second data structure, if the cache entries for the first data structure must be evicted to make room in the cache for the second data structure. Because an operating system has no advance knowledge of how frequently a user-level application references its data structures, the application's data space is generally assigned the write-back trait as a whole, which sets up the conditions for the conflict just described. Programmers have no tool for changing the traits of the data space so as to force the occasional reference out to the memory bus (for example, by assigning the uncacheable trait to the second data structure) and thereby eliminate the conflict.

[0014] In this field, data that an application accesses repeatedly is called temporal data, and data that is accessed only occasionally is called non-temporal data. Those skilled in the art will appreciate that filling a cache with non-temporal data (i.e., cache pollution) is highly undesirable. Consequently, more recent techniques have added a limited set of non-temporal store instructions to existing instruction sets, which allow an application programmer to move data from internal registers to memory without polluting the cache. At present, however, there is no tool that lets a programmer direct that the memory reference prescribed by an existing instruction (for example, an instruction prescribing an arithmetic or logical operation on one or more operands) be performed in a non-temporal manner, thereby bypassing cache access altogether.

[0015] What is needed, therefore, is an apparatus and method for incorporating instruction-level non-temporal memory reference control features into an existing microprocessor instruction set architecture, where that architecture is completely populated with defined opcodes, and where incorporation of the control features allows a conforming legacy microprocessor to retain the capability to execute legacy applications while also giving programmers the ability to specify non-temporal memory accesses.

(3) Brief Summary of the Invention:

[0016] The present invention, like the other applications noted above, is directed to overcoming these and other problems and disadvantages of the prior art. The present invention provides a superior technique for extending a microprocessor instruction set beyond its current capabilities to provide instruction-level control of non-temporal memory references. In one embodiment, an apparatus is provided for instruction-level control of memory references within a microprocessor. The apparatus includes translation logic and extended execution logic. The translation logic translates an extended instruction into a micro instruction sequence. The extended instruction has an extended prefix and an extended prefix tag. The extended prefix, for the extended instruction, ...
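The cache pollution scenario of paragraphs [0013] and [0014] can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the disclosed mechanism: the cache is a small fully associative LRU cache, each access carries a hypothetical `non_temporal` flag, and a non-temporal access is modeled simply as going to the bus without touching the cache. The names `count_bus_accesses` and `make_trace` are invented for the example.

```python
from collections import OrderedDict


def count_bus_accesses(accesses, capacity):
    """Count references that reach the bus for a trace of (address, non_temporal)."""
    cache = OrderedDict()   # keys are cached addresses, oldest first (LRU order)
    bus_accesses = 0
    for address, non_temporal in accesses:
        if non_temporal:
            bus_accesses += 1            # bypasses the cache entirely
            continue
        if address in cache:
            cache.move_to_end(address)   # hit: refresh LRU position
            continue
        bus_accesses += 1                # miss: fetch over the bus
        cache[address] = True
        if len(cache) > capacity:
            cache.popitem(last=False)    # evict the least recently used entry
    return bus_accesses


def make_trace(second_is_non_temporal, rounds=3):
    """Repeated passes over a first structure, each followed by a second one."""
    first = [0, 1, 2, 3]             # temporal data, reused every round
    second = [100, 101, 102, 103]    # occasionally referenced data
    trace = []
    for _ in range(rounds):
        trace += [(a, False) for a in first]
        trace += [(a, second_is_non_temporal) for a in second]
    return trace


print(count_bus_accesses(make_trace(False), capacity=4))  # everything cached
print(count_bus_accesses(make_trace(True), capacity=4))   # second structure bypasses
```

With a four-entry cache, caching both structures makes every access a miss (24 bus accesses in this trace), whereas marking the second structure non-temporal lets the first structure stay resident after the first round (16 bus accesses).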
... execution logic is coupled to the translation logic and performs the memory reference by means of the non-temporal access.

[0019] A further object of the present invention is to provide a method for extending an existing instruction set architecture to provide instruction-level control of non-temporal memory references. The method includes providing an extended instruction that includes an extended tag and an extended prefix, where the extended tag is a first opcode entry within the existing instruction set architecture; specifying, via the extended prefix, a non-temporal access to be applied to a corresponding memory reference, where the memory reference is prescribed by the remainder of the extended instruction; and applying the non-temporal access to perform the memory reference, where the applying precludes caching of data associated with the memory reference.

(4) Brief Description of the Drawings:

[0020] The foregoing and other objects, features, and advantages of the present invention will be better understood in conjunction with the following description and the accompanying drawings:

[0021] Figure 1 is a block diagram of a related-art microprocessor instruction format;

[0022] Figure 2 is a table depicting how instructions in an instruction set architecture are mapped to logic states of bits in an 8-bit opcode byte within the instruction format of Figure 1;

[0023] Figure 3 is a block diagram of an extended instruction format according to the present invention;

[0024] Figure 4 is a table showing how extended architectural features are mapped, according to the present invention, to logic states of bits in an 8-bit extended prefix embodiment;
[0025] Figure 5 is a block diagram of a pipelined microprocessor illustrating the application of non-temporal memory reference control according to the present invention;
[0026] Figure 6 is a block diagram of one embodiment of an extended prefix according to the present invention for specifying non-temporal access for a programmed memory reference in a microprocessor;

[0027] Figure 7 is a block diagram detailing the translate stage logic within the microprocessor of Figure 5;

[0028] Figure 8 is a block diagram of the extended execute stage logic within the microprocessor of Figure 5; and

[0029] Figure 9 is a flow chart depicting operation of a method according to the present invention for controlling non-temporal memory references in a microprocessor.

Reference numerals:
100 instruction format
101 prefix
102 opcode
103 address specifier
200 8-bit opcode map
201 opcode value
202 opcode F1H
300 extended instruction format
301 prefix
302 opcode
303 address specifier
304 extended instruction tag
400 8-bit prefix map
401 architectural feature
500 pipelined microprocessor
501 fetch logic
502 instruction cache/external memory
503 [...]
504 translation logic
505 extended translation logic
506 micro instruction queue
507 execution logic
508 extended execution logic
600 extended prefix
601 source field
602 destination field
603 spare field
700 translate stage logic
701 boot state signal
702 machine specific register
703 extended feature field
704 instruction buffer
705 translation logic
706 translation controller
707 disable signal
708 escape instruction detector
709 extended prefix decoder
710 instruction decoder
711 control read-only memory (ROM)
712 micro instruction buffer
713 opcode extension field
714 micro opcode field
715 destination field
716 source field
717 displacement field
800 extended execute stage logic
801 micro instruction buffer
802 address buffer
803 address buffer
804 destination operand buffer
805 extended access logic
806 memory trait descriptors
807 cache
808 bus unit
809 access controller
810 store logic
811 load logic
812 bus
813 bus
814 extended micro instruction register
815 source operand buffer
816 non-temporal load buffer
817 write combining buffer
900-932 steps of the method for controlling non-temporal memory references in a microprocessor
ϋ mamam n n^-eJI ϋ an mmm— n l^i I I # 12 1220042 Α7 Β7 經濟部智慧財產局員工消費合作社印製 五、發明說明(A ) (五)發明詳細說明·· [0030] 以下的說明,係在一特定實施例及其必要條件的 脈絡下而提供,可使一般熟習此項技術者能夠利用本發明。 然而’各種對該較佳實施例所作的修改,對熟習此項技術者 而吕乃係顯而易見,並且,在此所討論的一般原理,亦可應 用至其他實施例。因此,本發明並不限於此處所展示與敘述 之特定實施例,而是具有與此處所揭露之原理與新穎特徵相 符之最大範圍。 [0031] 前文已針對今日之微處理器内,如何擴充其架構 特徵,以超越相關指令集能力之技術,作了背景的討論。有 鑑於此,在圖一與圖二,將討論一相關技術的例子。此處的 討論強調了微處理器設計者所一直面對的兩難,即一方面, 他們想將最新開發之架構特徵納入微處理器的設計中,但另 一方面,他們又要保留執行舊有應用程式的能力。在圖二至 二的例子中,一完全佔用之運算碼圖,已把增加新運算碼至 該範娜制可紐齡,_迫使設計者料輯=將新 特徵納入’錢牲某種程度之财倾相雜,要不就將架 構上的,新職—併放棄,讀維持微處理H與舊有應用s 式之相容性。在相關技術的討論後,於圖三至九,將提供對 本發明之討論。藉_用—既有但未制之運細作為;; 伸指令之輕碼標記,本發明可讓題驾設計者克服已完 t使用之指令集轉的關,除了提供程式貝於指令層級ί 定一特定記憶體參照之非暫存記憶體存取的能力,同時也 (請先閱讀背面之注意事項再填寫本頁) FI裝 1_| I 1 1111 訂-— I! — ! 13 A7 五、發明說明(/9 ) 保留執行餘應聰;切需之所有特徵。 [0032]請參閱圖―,爱 1·十負目101-103,母一項目皆貝 成微處理ϋ之-特定指令觸值,合在—起便組 器執行一特定運算寺f令100指示微處理 項 元從記憶舰移至1部暫’或者是將—運算 記憶體。-般而古,“=:,或從該内部暫存器搬移至 要執仃之特疋運异,而選用(〇pti〇nal)之位址指定元項目· 經濟部智慧財產局員工消費合作社印製 ㈣讀,1礙_肖纖謂加資訊, 像=如何執行额算,元錄贿㈣。齡格式刚 ,允井程式貝在-運算碼102前加上_碼項目101。在運 异碼102所指定之特定運算執行時,前置碼⑼用以指示是 =用特疋的木構特徵。一般來說,這些架構特徵能應用於 ^集中任何運算碼1G2所指定運算的大部分。例如,現今 珂置石馬101存在於一些能使用不同大小虛擬位址(如8位元、 I6位70、32位元)執行運算的微處理$中。而當許多此類處 理器被程式化為-預設的紐大小時(比如32位元),在其 1別指令集中所提供之前置碼1(n,仍能使程式員依據各個 指令,選擇性地取代(override)該預設的位址大小(如為了 產生16位it之虛擬位址)。可選擇之位址大小僅是架構特徵 之例,在s午多現代的微處理器中,這些架構特徵能應用於 眾多可由運算碼102加以指定的運算(如加、減、乘、布林 邏輯等)。 14 奉紙張尺度迥用甲國國豕標準(CNS)A4規格(21〇 X 297公釐) 經濟部智慧財產局員工消費合作社印製 1220042 A7 ---------— B7 _ 五、發明說明(/〆) ~ -- 圖一所示之指令格式卿,有一為業界所熟知的 牵巳例,此即滿指令袼式⑽,其為所有現代之秦相容微 採用。更具體地說,x86指令格式卿(也稱為娜 才曰々术架構議)使用了 8位元前置碼1〇1、8位元運算碼ι〇2 以及8位植址狀元⑽。.架構丨⑻亦具有數個前置碼 1〇1其中兩個取代了娜微處理器所預設的位址/資料大小 (即運算碼狀態66Η與㈣),另—個則指示微處理器依據 不同的轉譯規則來解譯其後之運算碼位元組1()2 (即前置碼 值OFH ’其使得轉譯動作是依據所謂的二位元組運算碼規則 來進行),其他的前置碼1〇1則使特殊運算重複執行,直至 重衩條件滿足為止(即REP運算碼:F〇h、F2H及F3H)。 [〇〇34]現請參閱圖二,其顯示一表格2〇〇,用以描述一 指令集架構之指令201如何對應至圖一指令格式内一 8位元 運算碼位元組102之位元值。表格2〇〇呈現了 一 8位元運算 碼圖200的範例,其將一 8位元運算碼項目1〇2所具有之最 多256個值,關聯到對應之微處理器運算碼指令2〇1。表格 200將運算碼項目1〇2之一特定值,譬如〇2H,映射至一對 應之運异碼指令201 (即指令102 201)。在χ86運算碼圖的 例子中,為此領域中人所熟知的是,運算碼值14Η係映射至 x86之進位累加(Add With Carry,ADC)指令,此指令將一 8位元之直接(immediate)運算元加至架構暫存器AL之内 
含值。熟習此領域技術者也將發覺,上文提及之x86前置碼 101 (亦即 66H、67H、OFH、F0H、F2H 及 F3H)係實際的 運算碼值201,其在不同脈絡下,指定要將特定的架構延伸 15 本紙張尺度適用中國國家標準(CNS)A4規格(210 X 297公釐)ϋ mamam nn ^ -eJI ϋ an mmm— nl ^ i II # 12 1220042 Α7 Β7 Printed by the Consumer Cooperatives of the Intellectual Property Bureau of the Ministry of Economic Affairs 5. Description of the invention (A) (5) Detailed description of the invention. [0030] The following description It is provided in the context of a specific embodiment and its necessary conditions, so that those skilled in the art can utilize the present invention. However, various modifications made to the preferred embodiment will be apparent to those skilled in the art, and the general principles discussed herein can also be applied to other embodiments. Therefore, the present invention is not limited to the specific embodiments shown and described herein, but has the widest scope consistent with the principles and novel features disclosed herein. [0031] The foregoing has discussed the background of today's microprocessors on how to expand their architectural features to surpass the capabilities of related instruction sets. In view of this, an example of related technology will be discussed in Figs. 1 and 2. The discussion here highlights the dilemma that microprocessor designers have been facing. On the one hand, they want to incorporate the newly developed architecture features into the design of the microprocessor, but on the other hand, they must keep the old implementation. Application capabilities. In the examples in Figures 2 to 2, a completely occupied opcode diagram has added a new opcode to the Fanny system, but it is compulsory for designers to incorporate new features into 'Qianzai to some extent' Wealth is mixed, otherwise we will change the architecture, new position-and give up, and read to maintain the compatibility of micro-processing H and old applications. 
After a discussion of the related art, a discussion of the present invention will be provided in Figs. Borrowing_existing—existing but unspecified operations ;; extending the light code mark of instructions, the present invention allows the designer to overcome the barrier of instruction set transfers that have been used, in addition to providing programs at the instruction level. Ability to determine the access of non-temporary memory referenced by a specific memory, and also (please read the precautions on the back before filling this page) FI equipment 1_ | I 1 1111 Order-— I! —! 13 A7 V. DESCRIPTION OF THE INVENTION (/ 9) All the features of Yu Yingcong are reserved for execution. [0032] Please refer to the figure, love 1 · 10 negative heads 101-103, the mother and one item are all micro-processed-the specific command touch value, together-the toilet organ executes a specific operation, f 100 orders The micro-processing item element is moved from the memory ship to a temporary 'or will-operation memory. -As usual, "= :, or moved from the internal register to the special operation to be executed, and select (〇pti〇nal) address to specify the meta project · Intellectual Property Bureau, Ministry of Economic Affairs, Consumer Consumption Cooperative Printed and read, 1 hinder_Xiao Xian said to add information, like = how to perform the calculation, Yuan recorded bribes. The age format is just, Yunjing Chengbei added _code item 101 before-operation code 102. When a specific operation specified by code 102 is executed, the preamble ⑼ is used to indicate whether it is a special wooden structure feature. In general, these architectural features can be applied to most of the operations specified by any operation code 1G2 in the set. For example, today Kechi Shima 101 exists in some micro-processing $ that can perform operations using virtual addresses of different sizes (such as 8-bit, I6-bit 70, 32-bit). 
And when many of these processors are programmed as -When the preset button size (such as 32-bit), set the code to 1 (n before it is provided in another instruction set, which can still enable the programmer to selectively override the preset according to each instruction. Address size (such as to generate a 16-bit it virtual address). The selectable address size is only Examples of architectural features. In modern microprocessors, these architectural features can be applied to many operations (such as addition, subtraction, multiplication, Bollinger logic, etc.) that can be specified by opcode 102. 14 Printed in accordance with National Standard A4 (CNS) A4 (21 × 297 mm), printed by the Consumer Cooperative of the Intellectual Property Bureau of the Ministry of Economic Affairs 1220042 A7 ---------— B7 _ V. Description of the invention (/ 〆 ) ~-The instruction format shown in Figure 1 is a well-known example of the industry. This is the full instruction format, which is used by all modern Qin compatible micro. More specifically, the x86 instruction format ( Also known as Nacai's 々 Technical Architecture Discussion) uses 8-bit preambles 101, 8-bit opcodes ι02, and 8-bit addressing champion 架构 ..architecture 丨 ⑻ also has several preambles Two of 〇1 replaced the preset address / data size (namely, the opcode states 66 码 and ㈣) of the Na microprocessor, and the other instructed the microprocessor to interpret the subsequent ones according to different translation rules. Opcode byte 1 () 2 (i.e. the preamble value OFH 'which makes the translation action based on the so-called two bits The operation code rules are used), and the other preambles 101 cause special operations to be repeated until the repetition conditions are satisfied (ie, the REP operation codes: F〇h, F2H, and F3H). [〇〇34] Please Refer to FIG. 
[0034] Referring now to FIG. 2, a table 200 is presented depicting how instructions 201 in an instruction set architecture are mapped to the bit values of an 8-bit opcode byte 102 within the instruction format of FIG. 1. The table 200 presents an exemplary 8-bit opcode map 200 that associates up to 256 values of an 8-bit opcode entity 102 with corresponding microprocessor opcode instructions 201. The table 200 maps a particular value of an opcode entity 102, say 02H, to a corresponding opcode instruction 201 (i.e., instruction I02 201). In the particular case of the x86 opcode map, it is well known in the art that opcode value 14H is mapped to the x86 Add With Carry (ADC) instruction, which directs that an 8-bit immediate operand be added to the contents of architectural register AL. One skilled in the art will also appreciate that the x86 prefixes 101 noted above (i.e., 66H, 67H, 0FH, F0H, F2H, and F3H) are actually opcode values 201 that, in context, specify the application of certain architectural extensions to the operation prescribed by a following opcode entity 102. For example, preceding opcode 14H (normally the ADC opcode discussed above) with prefix 0FH results in an x86 processor executing an Unpack and Interleave Low Packed Single-Precision Floating-Point Values operation instead of the ADC operation. Features such as those described in this x86 example are enabled in a present day microprocessor in part because the instruction translation/decoding logic in the microprocessor interprets the entities 101-103 of an instruction 100 in order. Hence, the use of specific opcode values as prefixes 101 in instruction set architectures has, in the past, allowed microprocessor designers to incorporate a significant number of advanced architectural features into a legacy-compatible microprocessor design without adversely affecting the execution of older programs that do not employ those specific opcode states. For example, a legacy program that never uses x86 opcode 0FH will still run on a present day x86 microprocessor. And a newer application program, by employing x86 opcode 0FH as a prefix 101, can utilize many of the x86 architectural features that have been more recently incorporated, such as single instruction multiple data (SIMD) operations and conditional move operations.

[0035] Architectural features have been provided in the past by designating available/spare opcode values 201 as prefixes 101 (also known as architectural feature tags/indicators 101 or escape instructions 101). Yet many instruction set architectures 100 have run into an obstacle in providing enhancements for a very straightforward reason: all of the available/spare opcode values have been used up, that is, all of the opcode values in the opcode map 200 have been architecturally specified. When all of the available values have been assigned as either opcode entities 102 or prefix entities 101, no opcode values remain for the incorporation of new features. This significant problem exists in many microprocessor architectures today, and it forces designers to choose between adding architectural features and retaining compatibility with legacy programs.

[0036] It is notable that the instructions 201 shown in FIG. 2 are depicted generically (e.g., I24) rather than as specific operations (e.g., Add With Carry, Subtract, Exclusive-OR). This is because fully occupied opcode maps 200 are found in a number of different microprocessor architectures, where they architecturally preclude the incorporation of more recent advances. Although the example of FIG. 2 refers to an 8-bit opcode entity 102, one skilled in the art will appreciate that the specific size of the opcode 102 is irrelevant other than as a specific case for discussing the problem posed by a fully occupied opcode structure 200. Accordingly, a fully occupied 6-bit opcode map would have 64 architecturally specified opcodes/prefixes 201 and would likewise provide no available/spare opcode values for expansion.

[0037] One alternative stops short of entirely discarding an existing instruction set and replacing it with a new format 100 and opcode map 200. Instead, new instruction meanings are substituted for only a subset of the existing opcodes 201, say opcodes 40H through 4FH in FIG. 2. Under this hybrid technique, a microprocessor operates exclusively in one of two modes: a legacy mode, in which opcodes 40H-4FH are interpreted according to legacy rules, or an enhanced mode, in which opcodes 40H-4FH are interpreted according to enhanced architectural rules. This technique does allow designers to incorporate new features into a design. A disadvantage remains, however: when a legacy-compliant microprocessor is running in enhanced mode, it cannot execute any application program that uses opcodes 40H-4FH. Hence, from the standpoint of retaining legacy software compatibility, the legacy-compatible/enhanced mode technique is not acceptable.
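The in-order interpretation described in paragraph [0034] can be illustrated with a minimal sketch. It assumes hypothetical two-entry slices of the one-byte and two-byte opcode maps (`ONE_BYTE_MAP`, `TWO_BYTE_MAP`) and a hypothetical function name `decode_legacy`; a real decoder also handles address specifiers, which are ignored here.

```python
# Hypothetical slices of the x86 one-byte and two-byte opcode maps (one entry each).
ONE_BYTE_MAP = {0x14: "ADC AL, imm8"}
TWO_BYTE_MAP = {0x14: "UNPCKLPS"}
# 66H and 67H override the default operand and address sizes.
SIZE_PREFIXES = {0x66: "operand-size override", 0x67: "address-size override"}
TWO_BYTE_ESCAPE = 0x0F


def decode_legacy(instr: bytes):
    """Read the entities of a legacy-format instruction strictly in order.

    Returns (list of prefix names, mnemonic). Address specifiers are ignored.
    """
    prefixes = []
    pos = 0
    # Leading size-override prefixes modify whatever opcode follows them.
    while pos < len(instr) and instr[pos] in SIZE_PREFIXES:
        prefixes.append(SIZE_PREFIXES[instr[pos]])
        pos += 1
    if pos >= len(instr):
        raise ValueError("instruction has no opcode byte")
    # 0FH switches interpretation of the next byte to the two-byte map.
    if instr[pos] == TWO_BYTE_ESCAPE:
        if pos + 1 >= len(instr):
            raise ValueError("0FH must be followed by an opcode byte")
        return prefixes, TWO_BYTE_MAP.get(instr[pos + 1], "unmapped")
    return prefixes, ONE_BYTE_MAP.get(instr[pos], "unmapped")


print(decode_legacy(bytes([0x14, 0x05])))        # opcode 14H alone
print(decode_legacy(bytes([0x0F, 0x14, 0xC1])))  # opcode 14H preceded by 0FH
```

The same byte value 14H selects a different operation depending only on what precedes it, which is why a program that never emits 0FH is unaffected by the features 0FH unlocks.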
[0038] The present inventors, however, have examined how the opcodes 201 are actually used in instruction sets 200 having fully occupied opcode spaces, across the breadth of application programs executed on legacy-compliant microprocessors. They have observed that some instructions 202, although architecturally specified, are not employed within application programs that are capable of being executed by the microprocessors. Instruction IF1 202 shown in FIG. 2 is one example of this phenomenon. In fact, the very same opcode value 202 (i.e., F1H) maps to a valid instruction 202 in the x86 instruction set architecture that is not presently employed. While this unused x86 instruction 202 is a valid x86 instruction 202 that directs an architecturally specified operation on an x86 microprocessor, it is not employed in any application program that can be executed on a present day x86 microprocessor. This particular x86 instruction 202 is known as In Circuit Emulation Breakpoint (i.e., ICE BKPT, opcode value F1H), and it was formerly employed exclusively in a class of microprocessor emulation equipment that no longer exists. ICE BKPT 202 was never employed in an application program outside of an in-circuit emulator, and the in-circuit emulation equipment that formerly employed ICE BKPT 202 no longer exists. Hence, in the x86 case, the present inventors have identified a means within a completely occupied instruction set architecture 200 whereby a valid, yet unused, opcode 202 can be exploited to allow advanced architectural features to be incorporated in a microprocessor design without sacrificing legacy software compatibility. In a fully occupied instruction set architecture 200, the present invention employs an architecturally specified, yet unused, opcode 202 as an indicator tag for an n-bit prefix that follows, thus allowing microprocessor designers to incorporate up to 2^n more recently developed architectural features into a microprocessor design while retaining complete compatibility with all legacy software.
[0039] The present invention exploits the prefix tag/extended prefix concept by providing an n-bit extended non-temporal access specifier prefix, whereby programmers can prescribe, on an instruction-by-instruction basis, a non-temporal memory access for a corresponding memory reference operation in a microprocessor. During execution of the corresponding memory reference operation, the non-temporal memory access is employed in place of a cache-based memory access according to a default attribute, where the default attribute is specified by memory traits descriptor tables/mechanisms previously established by operating system programs. The present invention will now be discussed with reference to FIGS. 3-9.

[0040] Turning to FIG. 3, a block diagram is presented featuring an extended instruction format 300 according to the present invention. Very much like the format 100 discussed with reference to FIG. 1, the extended instruction format 300 has a variable number of instruction entities 301-305, each set to a specified value, that together make up a specific instruction 300 for a microprocessor. The specific instruction 300 directs the microprocessor to perform a specific operation, such as adding two operands together or moving an operand from memory to a register within the microprocessor. Typically, an opcode entity 302 in the instruction 300 prescribes the specific operation to be performed, and optional address specifier entities 303 follow the opcode 302, prescribing additional information about the specific operation, such as how the operation is to be performed, the registers where the operands are located, and the direct and indirect data to be used in computing memory addresses for source/result operands. The instruction format 300 also allows a programmer to prefix an opcode 302 with existing prefix entities 301 that direct the application of existing architectural features during execution of the specific operation prescribed by the opcode 302.

[0041] The extended instruction 300 according to the present invention, however, is a superset of the instruction format 100 described above with reference to FIG. 1. It has two additional entities 304 and 305, which are optionally provided as an instruction extension and which precede all remaining entities 301-303 in a formatted extended instruction 300. The purpose of the two additional entities 304 and 305 is to enable a programmer to specify a non-temporal memory access for a memory reference prescribed by the extended instruction 300, where the non-temporal memory access corresponding to the memory reference cannot otherwise be specified by the existing instruction set of a legacy-compliant microprocessor. The optional entities 304 and 305 are an extended instruction tag 304 and an extended non-temporal specifier prefix 305. The extended instruction tag 304 is an otherwise architecturally specified opcode within a microprocessor instruction set. In an x86 embodiment, the extended instruction tag 304, or escape tag 304, is opcode state F1H, the formerly used ICE BKPT instruction. The escape tag 304 indicates to microprocessor logic that the extended prefix 305, or extended features specifier 305, follows, where the extended prefix 305 prescribes a non-temporal access corresponding to a specified memory reference (i.e., a load operation, a store operation, or both). In one embodiment, the escape tag 304 indicates that the accompanying parts 301-303 and 305 of a corresponding extended instruction 300 prescribe a memory reference to be performed by the microprocessor. The non-temporal access specifier 305, or extended prefix 305, specifies that the non-temporal access is to be performed for a source operand load operation, a destination operand store operation, or both. Extended execution logic in the microprocessor executes the memory reference by performing the non-temporal memory access, overriding cacheable default memory attributes that have otherwise been specified by other means. These other means include control register bits, memory type registers, page tables, and other types of memory attribute descriptors found in present day microprocessor architectures.
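The superset relationship described in paragraph [0041] can be sketched as a byte layout: an extended instruction is a legacy instruction preceded by the escape tag 304 and the extended prefix 305. The function name `make_extended` and the sample prefix value below are assumptions made for illustration; only the tag value F1H comes from the x86 embodiment described above.

```python
ESCAPE_TAG = 0xF1  # x86 embodiment: the formerly used ICE BKPT opcode


def make_extended(legacy_instr: bytes, extended_prefix: int) -> bytes:
    """Prepend escape tag 304 and an 8-bit extended prefix 305 to a legacy instruction.

    The legacy bytes (entities 301-303) are carried through unchanged.
    """
    if not legacy_instr:
        raise ValueError("a legacy instruction needs at least an opcode byte")
    if not 0 <= extended_prefix <= 0xFF:
        raise ValueError("the extended prefix must fit in 8 bits")
    return bytes([ESCAPE_TAG, extended_prefix]) + legacy_instr


legacy = bytes([0xF3, 0xA4])          # REP MOVSB in the existing instruction set
extended = make_extended(legacy, 0x03)  # 0x03 is an arbitrary example prefix value
print(extended.hex())
```

Because the legacy bytes are untouched, a decoder that recognizes the tag can strip the first two bytes and hand the remainder to its existing decode path.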
[0042] To summarize the non-temporal reference control technique according to the present invention, an extended instruction is configured to prescribe a non-temporal memory access for a memory reference in an existing microprocessor instruction set, where the non-temporal access for the memory reference cannot otherwise be prescribed by instructions of the existing microprocessor instruction set. The extended instruction includes one of the opcodes/instructions 304 in the existing instruction set and an n-bit extended prefix 305. The selected opcode/instruction serves as an indicator 304 that the instruction 300 is an extended features instruction 300 (that is, it prescribes extensions to the microprocessor architecture), and the n-bit features prefix 305 indicates that the non-temporal access is to be employed for a source operand, a destination operand, or both. In one embodiment, the extended prefix 305 is 8 bits in size, providing for the specification of non-temporal access control features in combination with up to 64 other extended features. An n-bit prefix embodiment provides for the specification of up to 2^(n-2) other extended features in addition to the non-temporal access control features.

[0043] Turning now to FIG. 4, a table 400 is presented depicting how non-temporal access control features for a prescribed memory reference are mapped to the bit logic states of an 8-bit extended prefix embodiment according to the present invention. Similar to the opcode map 200 discussed with reference to FIG. 2, the table 400 of FIG. 4 presents an exemplary 8-bit extended prefix map 400 that associates up to 256 values of an 8-bit extended prefix entity 305 with corresponding extended features 401 (e.g., E34) of a legacy-compliant microprocessor, two of which direct non-temporal accesses. In an x86 embodiment, the 8-bit extended features prefix 305 according to the present invention provides instruction-level control of non-temporal memory accesses 401, which cannot be specified at the instruction level by the current x86 instruction set architecture.

[0044] The extended features 401 shown in FIG. 4 are depicted generically rather than specifically because the technique according to the present invention is applicable to a variety of different architectural extensions 401 and specific instruction set architectures. One skilled in the art will appreciate that many different architectural features 401, including those noted above, can be incorporated into an existing instruction set according to the escape tag 304/extended prefix 305 technique described herein. The 8-bit prefix embodiment of FIG. 4 provides for up to 256 different features 401, while an n-bit prefix embodiment allows for programming of up to 2^n different features 401.

[0045] Referring now to FIG. 5, a block diagram is presented illustrating a pipeline microprocessor 500 for performing non-temporal memory reference operations according to the present invention. The microprocessor 500 has three notable stage categories: fetch, translate, and execute. The fetch stage has fetch logic 501 that retrieves instructions from an instruction cache 502 or external memory 502. The retrieved instructions are provided to the translate stage via an instruction queue 503. The translate stage has translation logic 504 that is coupled to a micro instruction queue 506. The translation logic 504 includes extended translation logic 505. The execute stage has execution logic 507 having extended execution logic 508 therein.

[0046] In operation, the fetch logic 501 retrieves instructions formatted according to the present invention from the instruction cache/external memory 502 and places them in the instruction queue 503 in execution order. The instructions are then retrieved from the instruction queue 503 and provided to the translation logic 504. The translation logic 504 translates/decodes each provided instruction into a corresponding sequence of micro instructions that directs the microprocessor 500 to perform the operations prescribed by the instructions. The extended translation logic 505 detects those instructions having an extended prefix tag according to the present invention and provides for translation/decoding of the corresponding extended non-temporal memory reference specifier prefixes. In an x86 embodiment, the extended translation logic 505 is configured to detect an extended prefix tag of value F1H, which is the x86 ICE BKPT opcode. Micro instruction fields are provided in the micro instruction queue 506 to allow for specification of source/destination non-temporal accesses for the associated memory references prescribed by the accompanying parts of the instruction.
[0047] The micro instructions are provided from the micro instruction queue 506 to the execution logic 507, wherein the extended execution logic 508 is configured either to execute a specified memory reference according to a default memory trait (as defined by existing memory traits descriptor means) or to execute a non-temporal memory access programmed at the user level by means of an extended prefix according to the present invention. In the latter case, as specified by the extended micro instruction fields, the non-temporal access overrides the default memory trait and bypasses the cache entirely. In one embodiment, non-temporal stores are handled in the same manner as stores to address ranges having the write combining attribute.

[0048] One skilled in the art will appreciate that the microprocessor 500 described with reference to FIG. 5 is a simplified representation of a present day pipeline microprocessor 500. In fact, a present day pipeline microprocessor 500 comprises up to 20 to 30 different pipeline stages. These stages, however, can be generally categorized into the three stage groups shown in the block diagram, and thus the block diagram 500 of FIG. 5 serves to teach the essential elements required to implement the embodiments of the present invention described above. For clarity, those elements of the microprocessor 500 that are extraneous to the present discussion are not depicted.

[0049] Turning now to FIG. 6, a block diagram is presented featuring one embodiment of an extended prefix 600 for prescribing non-temporal accesses for programmed memory references in a microprocessor according to the present invention. The non-temporal access specifier prefix 600 is 8 bits in size and includes a source field 601, a destination field 602, and a spare field 603. The source field 601 prescribes that a non-temporal access is to be employed for a source operand memory access (i.e., load, read) prescribed by the remaining parts of an associated extended instruction, and the destination field 602 prescribes that a non-temporal access is to be employed for a destination operand memory access (i.e., store, write) prescribed by the remaining parts. One skilled in the art will appreciate that source and destination non-temporal accesses can be prescribed separately, which is particularly useful in combination with repeat string instructions such as the x86 REP MOVS.
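The field layout of paragraph [0049] can be sketched as follows. The description names the source field 601, destination field 602, and spare field 603 but does not give bit positions, so the positions used here (bit 0 for source, bit 1 for destination, bits 2-7 spare) are assumptions, as are the names `NonTemporalPrefix` and `parse_prefix`. Two single-bit fields plus six spare bits is at least consistent with the 64 other feature combinations mentioned in paragraph [0042].

```python
from dataclasses import dataclass

SOURCE_BIT = 0x01       # assumed position of source field 601
DESTINATION_BIT = 0x02  # assumed position of destination field 602
SPARE_SHIFT = 2         # assumed: the remaining six bits form spare field 603


@dataclass(frozen=True)
class NonTemporalPrefix:
    source: bool        # non-temporal access for the source operand load
    destination: bool   # non-temporal access for the destination operand store
    spare: int          # value of the spare bits, 0..63


def parse_prefix(value: int) -> NonTemporalPrefix:
    """Split an 8-bit extended prefix into its three fields."""
    if not 0 <= value <= 0xFF:
        raise ValueError("the extended prefix is 8 bits wide")
    return NonTemporalPrefix(
        source=bool(value & SOURCE_BIT),
        destination=bool(value & DESTINATION_BIT),
        spare=value >> SPARE_SHIFT,
    )


# Under these assumptions, 0x03 requests non-temporal loads and stores,
# the combination paragraph [0049] suggests for a REP MOVS block copy.
print(parse_prefix(0x03))
```

Keeping the two fields independent lets a copy loop bypass the cache for its destination only, for example when the source data will be reused shortly afterward.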
[0050] Turning now to FIG. 7, a block diagram is presented featuring details of translate stage logic 700 within the microprocessor of FIG. 5. The translate stage logic 700 has an instruction buffer 704 that provides an extended instruction according to the present invention to translation logic 705. The translation logic 705 is coupled to a machine specific register 702 that has an extended features field 703. The translation logic 705 has a translation controller 706 that provides a disable signal 707 to an escape instruction detector 708 and an extended decoder 709. The escape instruction detector 708 is coupled to the extended decoder 709 and an instruction decoder 710. The extended decoder 709 and the instruction decoder 710 access a control read-only memory (ROM) 711, in which are stored template micro instruction sequences that correspond to some of the extended instructions. The translation logic 705 also has a micro instruction buffer 712 having an opcode extension field 713, a micro opcode field 714, a destination field 715, a source field 716, and a displacement field 717.

[0051] Operationally, during power-up of the microprocessor, the state of the extended features field 703 within the machine specific register 702 is established via a signal power-up state 701 to indicate whether the particular microprocessor is capable of translating and executing extended instructions according to the present invention for performing instruction-level non-temporal memory references. In one embodiment, the signal 701 is derived from a feature control register (not shown), which in turn reads a fuse array (not shown). The machine specific register 702 provides the state of the extended features field 703 to the translation controller 706. The translation controller 706 controls whether instructions retrieved from the instruction buffer 704 are translated according to extended translation rules or according to conventional translation rules. Such a control feature allows supervisory applications (e.g., BIOS) to enable/disable the extended execution features of the microprocessor. If the extended features are disabled, instructions having the opcode state selected as the extended features tag are translated according to the conventional translation rules. In an x86 embodiment having opcode state F1H selected as the tag, an occurrence of F1H under conventional translation rules results in an illegal instruction exception. With extended translation disabled, the instruction decoder 710 translates/decodes all provided instructions and configures all fields 713-717 of the micro instruction 712. Under extended translation rules, however, an occurrence of the tag is detected by the escape instruction detector 708. The escape instruction detector 708 directs the extended decoder 709 to translate/decode the extended prefix portion of the extended instruction according to the extended translation rules and to configure the opcode extension field 713, thereby directing that the non-temporal memory access be employed for the memory reference prescribed by the remaining parts of the extended instruction. The instruction decoder 710 decodes/translates the remaining parts of the extended instruction and configures the micro opcode field 714, source field 716, destination field 715, and displacement field 717 of the micro instruction 712. Certain instructions cause access to the control ROM 711 to obtain corresponding micro instruction sequence templates. Configured micro instructions 712 are provided to a micro instruction queue (not shown) for subsequent execution by the processor.
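The translation flow of paragraph [0051] can be sketched in a few lines. This is a simplified model under stated assumptions, not the patented logic: `MicroInstruction` keeps only two of the five fields of micro instruction 712, the micro opcode is a placeholder that copies the legacy opcode byte, and the names `translate` and `IllegalInstruction` are hypothetical.

```python
from dataclasses import dataclass

ESCAPE_TAG = 0xF1  # x86 embodiment: the former ICE BKPT opcode


class IllegalInstruction(Exception):
    """Raised when F1H is encountered under conventional translation rules."""


@dataclass
class MicroInstruction:
    opcode_extension: int = 0  # stands in for opcode extension field 713
    micro_opcode: int = 0      # stands in for micro opcode field 714


def translate(instr: bytes, extended_enabled: bool) -> MicroInstruction:
    """Model of translation logic 705 with extended features on or off."""
    if not instr:
        raise ValueError("empty instruction")
    micro = MicroInstruction()
    remaining = instr
    if instr[0] == ESCAPE_TAG:                       # escape instruction detector 708
        if not extended_enabled:                     # disable signal 707 asserted
            raise IllegalInstruction("F1H with extended features disabled")
        if len(instr) < 3:
            raise ValueError("the tag must be followed by a prefix and an opcode")
        micro.opcode_extension = instr[1]            # extended decoder 709
        remaining = instr[2:]
    micro.micro_opcode = remaining[0]                # instruction decoder 710 (placeholder)
    return micro


print(translate(bytes([0xF1, 0x02, 0xA4]), extended_enabled=True))
print(translate(bytes([0xA4]), extended_enabled=False))
```

With the features disabled, legacy instructions still translate normally and only the tag itself faults, which matches the behavior described for the x86 embodiment.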
[0052] Referring now to FIG. 8, a block diagram is presented of the extended execute stage logic 800 within the microprocessor of FIG. 5. The extended execute stage logic 800 has extended access logic 805, which is coupled to a cache 807 and a bus unit 808 via buses 812 and 813, respectively. The bus unit 808 directs memory transactions on a memory bus (not shown). According to the present invention, the extended access logic 805 receives micro instructions from an extended micro instruction buffer 801 in the preceding stage of the microprocessor, two address operands from address buffers 802 and 803, and a destination operand from a destination operand buffer 804. The extended access logic 805 is also coupled to a plurality of memory trait descriptors 806, which are configured according to the architectural conventions of the host microprocessor. The extended access logic 805 includes an access controller 809, store logic 810, and load logic 811. The load logic 811 includes a non-temporal load buffer 816 and outputs a source operand to a source operand buffer 815. The store logic 810 has a write combining buffer 817.

[0053] In operation, the extended execute logic 800 performs memory accesses as directed by the micro instructions in the extended micro instruction buffer 801, reading operands from memory and writing operands to memory. For a read/load operation, the access controller 809 receives one or more memory addresses from the address buffers 802 and 803 and reads the memory trait descriptors 806 to determine the memory attributes associated with the load. In an x86 embodiment, the memory trait descriptors 806 comprise the x86 cache and paging control registers, page directory and page table entries, the memory type range registers (MTRRs), the page attribute table (PAT), and the external signal pins KEN#, WB/WT#, PCD, and PWT. The access controller 809 uses the information obtained from these sources 806 to determine the default memory attribute for the load according to the x86 hierarchy of memory attribute conventions. For a non-x86 embodiment, the access controller 809 uses the information obtained from the memory trait descriptors 806 to determine the default memory attribute for the load according to the hierarchy of memory attribute conventions of the particular host microprocessor architecture. The memory address, together with the attribute for the corresponding access, is provided to the load logic 811. Depending on the attribute provided, the load logic 811 obtains the source operand either from the cache via bus 812 or directly from system memory (not shown) via the bus unit 808. The source operand is provided to the source operand buffer 815 in synchronization with a pipeline clock signal (not shown). The extended micro instruction is likewise piped, in synchronization with the pipeline clock signal, to the extended micro instruction register 814. In this way the source operand is passed to the next stage of the microprocessor.

[0054] For a write/store operation directed by an extended micro instruction, the access controller 809 receives the address information for the operation from the address buffers 802 and 803 and the operand to be stored from buffer 804. The access controller 809 accesses the memory trait descriptors 806 as described above to determine the memory trait corresponding to the store access. The memory trait, the address information, and the destination operand are provided to the store logic 810. Depending on the attribute provided, the store logic 810 writes the destination operand to the cache 807 via bus 812 or directly to system memory via the bus unit 808.

[0055] The store logic 810 and load logic 811 of the present invention are configured to perform store and load references according to the processing requirements of the host processor's memory attribute model, where those requirements include strong/weak ordering conventions (e.g., speculative execution rules) and cache access policies. In one embodiment, load and store operations are performed in different pipeline stages of the host microprocessor.

[0056] For an extended instruction that employs a non-temporal memory reference prefix, a non-temporal operand specifier for the associated memory reference (i.e., load, store, or both load and store) is provided to the access controller 809 via the opcode extension field (not shown) of the extended micro instruction in the extended micro instruction buffer 801. As described above, the access controller 809 determines the default memory trait for the specified memory access from information obtained from the memory trait descriptors 806. If the corresponding default trait permits non-temporal access (i.e., a cacheable trait such as write back), the access controller 809 provides the non-temporal specifier, along with the aforementioned address and/or destination operand, to the store logic 810/load logic 811. If the corresponding default trait does not permit non-temporal access (i.e., an uncacheable trait), the access controller 809 provides the default trait, along with the address and/or destination operand, to the store logic 810/load logic 811.

[0057] For a non-temporal load reference, the load logic 811 first queries the cache 807 to determine whether the cache line corresponding to the load is present and valid (i.e., a load hit). If so, the load is performed according to the default memory trait. If, however, the corresponding cache line is not present in the cache 807, the load logic 811 directs the bus unit 808 to fetch the cache line containing the load operand from memory and retains that cache line in the non-temporal load buffer 816, bypassing the cache 807 entirely. The load operand is thus provided non-temporally to the source operand buffer 815.

[0058] For a non-temporal store reference, the store logic 810 first queries the cache 807 to determine whether a cache line corresponding to the store operand provided via the destination operand buffer 804 is present and valid in the cache 807 (i.e., a store hit). If so, the store is performed according to the default memory trait rather than non-temporally. If, however, the cache line is not present in the cache 807 (i.e., a store miss), the store logic 810 does not allocate space in the cache 807 for that cache line and instead provides the store operand to the write combining buffer 817. The contents of the write combining buffer 817 are subsequently written directly to memory through the bus unit 808, following the processor-specific hierarchy of memory attribute conventions that applies to the write combining memory trait. In an x86 embodiment, the write combining attribute allows writes to memory to be delayed and combined without requiring coherency. The store operand is thus provided to memory in a non-temporal manner.
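The hit and miss handling of paragraphs [0056] through [0058] can be modeled in a few lines of Python. This is a minimal sketch, not the hardware design: the `ExecuteStage` class, the 64-byte line size, the one-value-per-line storage, and the write-allocate behavior on an ordinary store miss are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

LINE_SIZE = 64  # assumed cache-line size in bytes


def line_of(address: int) -> int:
    return address // LINE_SIZE


@dataclass
class ExecuteStage:
    # Each dict maps a line number to one value, standing in for a full line.
    memory: Dict[int, int] = field(default_factory=dict)          # system memory
    cache: Dict[int, int] = field(default_factory=dict)           # cache 807
    nt_load_buffer: Dict[int, int] = field(default_factory=dict)  # buffer 816
    wc_buffer: Dict[int, int] = field(default_factory=dict)       # buffer 817

    def load(self, address: int, cacheable: bool, non_temporal: bool) -> int:
        line = line_of(address)
        if not cacheable:
            return self.memory.get(line, 0)  # default trait wins: no override
        if line in self.cache:
            return self.cache[line]          # load hit: default handling
        value = self.memory.get(line, 0)
        if non_temporal:
            self.nt_load_buffer[line] = value  # miss: keep line out of the cache
        else:
            self.cache[line] = value
        return value

    def store(self, address: int, value: int, cacheable: bool,
              non_temporal: bool) -> None:
        line = line_of(address)
        if not cacheable:
            self.memory[line] = value        # default trait wins: no override
        elif line in self.cache:
            self.cache[line] = value         # store hit: default handling
        elif non_temporal:
            self.wc_buffer[line] = value     # miss: no cache line is allocated
        else:
            self.cache[line] = value         # assumed write-allocate policy

    def drain_write_combining(self) -> None:
        """Write the combined stores to memory, as through the bus unit."""
        self.memory.update(self.wc_buffer)
        self.wc_buffer.clear()
```

The point the model makes is that a non-temporal request only changes behavior on a miss to cacheable memory; hits and uncacheable accesses follow the default trait.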
[0059] Referring now to FIG. 9, a flow chart 900 is presented depicting a method according to the present invention for translating and executing instructions that allow a programmer to override, at the instruction level, how a memory reference is handled within a microprocessor by making it non-temporal. Flow begins at block 902, where a program configured with extended feature instructions is provided to the microprocessor. Flow then proceeds to block 904.

[0060] At block 904, a next instruction is fetched from cache/external memory. Flow then proceeds to decision block 906.
[0061] At decision block 906, the next instruction fetched in block 904 is examined to determine whether it contains an extended escape code according to the present invention. In an x86 embodiment, the examination detects opcode value F1 (ICE BKPT). If the extended escape code is detected, flow proceeds to block 908. If the extended escape code is not detected, flow proceeds to block 912.

[0062] At block 908, the extended prefix portion of the extended instruction is decoded/translated to determine the non-temporal access that has been prescribed to override the default memory attributes of the associated memory reference specified by the instruction fetched in block 904. Flow then proceeds to block 910.

[0063] At block 910, a non-temporal access specifier for the associated memory reference is configured in the extension field of a corresponding micro instruction sequence. Flow then proceeds to block 912.

[0064] At block 912, all remaining parts of the instruction are decoded/translated to determine the prescribed memory reference, the locations of register operands, the memory address specifiers, and the use of existing architectural features prescribed by prefixes according to the existing microprocessor instruction set. Flow then proceeds to block 914.

[0065] At block 914, a micro instruction sequence is configured to specify the prescribed memory reference together with its corresponding opcode extension. Flow then proceeds to block 916.

[0066] At block 916, the micro instruction sequence is provided to a micro instruction queue for execution by the microprocessor. Flow then proceeds to block 918.

[0067] At block 918, the micro instruction sequence is retrieved by address logic according to the present invention. The address logic generates the address for the memory reference and provides it to the extended execute logic. Flow then proceeds to block 920.

[0068] At block 920, the extended execute logic uses the architecture's memory trait descriptors to determine a default memory trait. Flow then proceeds to decision block 922.

[0069] At decision block 922, an evaluation is made to determine whether the cache/memory model of the microprocessor architecture allows the non-temporal access to override the default attribute. If the non-temporal access is allowed, flow proceeds to decision block 926. If the non-temporal access is not allowed, flow proceeds to block 924.

[0070] At block 924, the memory access is performed using the default memory attribute determined in block 920. Flow then proceeds to block 932.

[0071] At decision block 926, an evaluation is made to determine whether the cache line corresponding to the prescribed memory reference is present and valid in the cache. If so, flow proceeds to block 928. If a cache miss occurs, flow proceeds to block 930.

[0072] At block 928, because the cache line corresponding to the memory reference is present and valid in the cache, the memory access is performed through the cache using the default memory attribute determined in block 920. Flow then proceeds to block 932.

[0073] At block 930, the memory reference is performed using the non-temporal facilities (e.g., the non-temporal load buffer and/or the write combining buffer). Flow then proceeds to block 932.

[0074] At block 932, the method completes.
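The branch structure of flow chart 900 can be traced with a short Python function. This is a sketch under stated assumptions: the function name `trace_flow` and its parameters are hypothetical, a detected escape tag is assumed to always request a non-temporal access, and an instruction without the tag is assumed to take the default path at decision block 922.

```python
from typing import List

ESCAPE_TAG = 0xF1  # opcode value F1 (ICE BKPT) in the x86 embodiment


def trace_flow(instruction: bytes, default_cacheable: bool,
               line_in_cache: bool) -> List[int]:
    """Return the flow chart block numbers visited for one instruction."""
    path = [902, 904, 906]
    non_temporal = False
    if instruction and instruction[0] == ESCAPE_TAG:
        path += [908, 910]   # decode the prefix, configure the extension field
        non_temporal = True  # assumption: the prefix requests non-temporal access
    path += [912, 914, 916, 918, 920, 922]
    if not (non_temporal and default_cacheable):
        return path + [924, 932]  # perform the access with the default attribute
    path.append(926)
    path.append(928 if line_in_cache else 930)  # hit: via cache; miss: NT buffers
    path.append(932)
    return path
```

For example, a tagged instruction to cacheable memory that misses the cache ends its trace with blocks 926, 930, and 932, while the same instruction on a hit ends with 926, 928, and 932.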
[0075] Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are encompassed by the invention as well. For example, the present invention has been described in terms of a technique that employs a single, unused opcode state within a completely full instruction set architecture as a tag to indicate that an extended feature prefix follows. The scope of the present invention, however, is not limited in any sense to full instruction set architectures, unused instructions, or single tags. On the contrary, the present invention comprehends instruction sets that are not entirely mapped, embodiments having used opcodes, and embodiments that employ more than one instruction tag. For example, consider an instruction set architecture in which there are no unused opcode states. One embodiment of the present invention comprises selecting an opcode state that is already in use to serve as the escape tag, where the selection criteria are determined according to market-driven factors. Another embodiment comprises employing a particular combination of opcodes as the tag, such as back-to-back occurrences of opcode state 7FH. The essence of the present invention thus lies in the use of a tag sequence followed by an n-bit extended prefix that allows a programmer to specify, at the instruction level, memory attributes for memory accesses that are not otherwise provided by existing instructions of the microprocessor instruction set.

[0076] In addition, although the present invention and its objects, features, and advantages have been explained above using a microprocessor as an example, one skilled in the art will appreciate that the scope of the present invention is not limited to microprocessor architectures and extends to all forms of programmable devices, such as signal processors, industrial controllers, array processors, and the like.

In summary, the foregoing describes only preferred embodiments of the present invention and should not be used to limit the scope in which the invention is practiced. All equivalent changes and modifications made in accordance with the claims of the present invention shall remain within the scope covered by this patent. The examiner's careful consideration and approval are respectfully requested.
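The tag-sequence variants mentioned in paragraph [0075] can be illustrated with a small Python helper. This is a hedged sketch only: `split_extended`, its parameters, and the byte-granular prefix length are hypothetical, chosen to show that the same detection step works for a single tag such as F1H or for a multi-opcode tag such as two consecutive 7FH bytes.

```python
from typing import Optional, Tuple


def split_extended(instruction: bytes, tag: bytes = b"\xF1",
                   prefix_len: int = 1) -> Tuple[Optional[bytes], bytes]:
    """Split off the extended prefix if the instruction starts with the tag.

    Returns (prefix, remainder) for a tagged instruction, or
    (None, instruction) when the tag sequence is absent or incomplete.
    """
    start = len(tag)
    if instruction.startswith(tag) and len(instruction) >= start + prefix_len:
        return instruction[start:start + prefix_len], instruction[start + prefix_len:]
    return None, instruction
```

Calling `split_extended(data, tag=b"\x7F\x7F")` models the back-to-back 7FH embodiment, while the default arguments model the single F1H tag.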
Claims (1)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/227,583 US7328328B2 (en) | 2002-02-19 | 2002-08-22 | Non-temporal memory reference control mechanism |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| TWI220042B (en) | 2004-08-01 |
Family
ID=22853669
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW091124007A TWI220042B (en) | 2002-08-22 | 2002-10-18 | Non-temporal memory reference control mechanism |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN1308813C (en) |
| TW (1) | TWI220042B (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TWI403900B (en) * | 2005-02-07 | 2013-08-01 | Advanced Micro Devices Inc | System for restricted cache access during data transfers and method thereof |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103605496A (en) * | 2013-12-02 | 2014-02-26 | 天津光电通信技术有限公司 | Method for fast analyzing communication instruction based on SCPI protocol |
| US9396056B2 (en) * | 2014-03-15 | 2016-07-19 | Intel Corporation | Conditional memory fault assist suppression |
| CN106503797B (en) * | 2015-10-08 | 2019-03-15 | 上海兆芯集成电路有限公司 | Neural network unit and collective with neural memory will arrange the neural pe array shifted received from the data of neural memory |
| CN108989841B (en) * | 2017-06-02 | 2020-12-18 | 上海数字电视国家工程研究中心有限公司 | Design method and transmission system of data frame suitable for high-speed motion reception |
| CN114691200A (en) * | 2020-12-29 | 2022-07-01 | 上海兆芯集成电路有限公司 | Instruction simulation device and method thereof |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5784607A (en) * | 1996-03-29 | 1998-07-21 | Integrated Device Technology, Inc. | Apparatus and method for exception handling during micro code string instructions |
| CN1190211A (en) * | 1997-06-10 | 1998-08-12 | Lsi罗吉克公司 | Object-oriented multi-media architecture |
| KR100379837B1 (en) * | 2000-06-30 | 2003-04-11 | 주식회사 에이디칩스 | Extended instruction folding system |
- 2002-10-18: TW, application TW091124007A, patent TWI220042B, not active (IP right cessation)
- 2003-01-28: CN, application CNB031030408A, patent CN1308813C, not active (expired, lifetime)
Also Published As
| Publication number | Publication date |
|---|---|
| CN1308813C (en) | 2007-04-04 |
| CN1431586A (en) | 2003-07-23 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | MK4A | Expiration of patent term of an invention patent | |