TWI898651B - Memory device and operation method thereof - Google Patents

Memory device and operation method thereof

Info

Publication number
TWI898651B
Authority
TW
Taiwan
Prior art keywords
data
memory
input data
weight
input
Prior art date
Application number
TW113121728A
Other languages
Chinese (zh)
Other versions
TW202548755A (en)
Inventor
許萓庭
Original Assignee
新加坡商艾沛芯科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 新加坡商艾沛芯科技股份有限公司
Priority to TW113121728A
Application granted
Publication of TWI898651B
Publication of TW202548755A

Landscapes

  • Logic Circuits (AREA)

Abstract

A memory device and an operation method thereof are provided. The memory device includes an input data memory, a weight data memory, an output data memory, and a logic calculating circuit. The logic calculating circuit is coupled to the input data memory, the weight data memory, and the output data memory. The input data memory stores a plurality of input data. The weight data memory stores a plurality of weight values corresponding to the plurality of input data. The logic calculating circuit calculates output data according to at least one input data and at least one weight value corresponding to the at least one input data, and transmits the output data to the output data memory. The output data memory transmits the output data.

Description

Memory device and operating method thereof

The present invention relates to artificial intelligence technology, and more particularly to a memory device and an operating method thereof.

When a Large Language Model (LLM) is used, data is typically processed by a Stacked Static Random Access Memory (Stacked SRAM) device working together with an external arithmetic logic circuit. However, because an LLM handles a very large volume of data, the access latency caused by communication between the Stacked SRAM and the arithmetic logic circuit also increases significantly, which degrades overall performance.

In view of this, the present invention provides a memory device and an operating method thereof. By placing a logic operation circuit inside the memory device, the communication time between the logic operation circuit and the memory device is avoided, which speeds up data operations and reduces access latency.

The memory device of the present invention includes an input data memory, a weight data memory, an output data memory, and a logic operation circuit. The logic operation circuit is coupled to the input data memory, the weight data memory, and the output data memory. The input data memory stores a plurality of input data. The weight data memory stores a plurality of weight values respectively corresponding to the input data. The logic operation circuit calculates output data based on at least one input data and its corresponding at least one weight value, and stores the output data in the output data memory. The output data memory transmits the output data.

The operating method of the memory device of the present invention includes: storing a plurality of input data in an input data memory; storing, in a weight data memory, a plurality of weight values respectively corresponding to the input data; calculating, by a logic operation circuit, output data based on at least one input data and its corresponding at least one weight value, and storing the output data in an output data memory; and transmitting the output data through the output data memory.

Based on the above, the memory device and operating method provided by the present invention use a logic operation circuit located inside the memory device to avoid the communication time between the logic operation circuit and the memory device. This significantly reduces access latency and addresses the need of large language models (LLMs) to process very large volumes of data.

To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.

Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where the same reference symbol appears in different drawings, it denotes the same or a similar element. These embodiments are only a part of the present invention and do not disclose all possible implementations. More precisely, these embodiments are merely examples within the scope of the claims of the present invention.

FIG. 1 is a schematic diagram of a memory device according to an embodiment of the present invention; FIG. 2 is a schematic diagram of the operation of a logic operation circuit according to an embodiment of the present invention. Please refer to FIG. 1 and FIG. 2. In this embodiment, the memory device 10 may be, for example, a Stacked Static Random Access Memory (Stacked SRAM). The memory device 10 includes an input data memory 110, a weight data memory 120, an output data memory 130, and a logic operation circuit 140. The logic operation circuit 140 is coupled to the input data memory 110, the weight data memory 120, and the output data memory 130. The input data memory 110 stores a plurality of input data. The weight data memory 120 stores a plurality of weight values respectively corresponding to the input data.

The logic operation circuit 140 can calculate output data Z based on the input data D1 to DN and the corresponding weight values W1 to WN, and store the output data Z in the output data memory 130. Specifically, the logic operation circuit 140 includes a multiplier 141 and an accumulator 142. The multiplier 141 is coupled to the accumulator 142. As shown in FIG. 2, the multiplier 141 calculates the products of the input data D1 to DN and their corresponding weight values W1 to WN. For example, the multiplier 141 calculates the product Di*Wi of the input data Di and the weight value Wi, where i is any integer from 1 to N. The accumulator 142 then sums the products calculated by the multiplier 141 to obtain the output data Z. In other words, Z = D1*W1 + D2*W2 + … + DN*WN.
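As a minimal sketch of the multiply-accumulate operation described above, the following Python models the multiplier 141 and the accumulator 142 as plain functions. The function names and the use of Python integers are illustrative assumptions; the patent does not specify data widths or any particular implementation.

```python
from typing import List, Sequence


def multiply(inputs: Sequence[int], weights: Sequence[int]) -> List[int]:
    """Model of the multiplier: one product Di*Wi per input/weight pair."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one corresponding weight")
    return [d * w for d, w in zip(inputs, weights)]


def accumulate(products: Sequence[int]) -> int:
    """Model of the accumulator: running sum of all products."""
    total = 0
    for p in products:
        total += p
    return total


def compute_output(inputs: Sequence[int], weights: Sequence[int]) -> int:
    """Z = D1*W1 + D2*W2 + ... + DN*WN."""
    return accumulate(multiply(inputs, weights))
```

For instance, inputs (1, 2, 3) with weights (4, 5, 6) give the products 4, 10, and 18, so Z = 32.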

The logic operation circuit 140 can store the output data Z in the output data memory 130. When the output data Z is needed later, the output data memory 130 can output it to the corresponding component. For example, the output data memory 130 can output the output data Z to a Central Processing Unit (CPU) (not shown). In other words, the CPU can access the output data memory 130 to obtain the required data (that is, the output data Z).

In this way, the memory device 10 of the present invention can perform data operations using the logic operation circuit 140 inside the device, which reduces access latency.

FIG. 3 is a schematic diagram of a memory device according to an embodiment of the present invention; FIG. 4 is a flow chart of an operating method of a memory device according to an embodiment of the present invention. Please refer to FIG. 3 and FIG. 4. The memory device 30 includes an input data memory 310, a weight data memory 320, an output data memory 330, a logic operation circuit 340, a data read logic circuit 350, a control logic circuit 360, an input buffer 370, a weight buffer 380, an output buffer 390, an address bus BA, and a data bus BD. The logic operation circuit 340 includes a multiplier 341 and an accumulator 342. The data read logic circuit 350 includes an address generator 351 and a data extractor 352.

In step S401, the input data memory 310 stores a plurality of input data, and the weight data memory 320 stores a plurality of weight values respectively corresponding to the input data. In step S402, the address generator 351 generates, based on an address control signal CON-A, at least one first address A-D1 to A-DN of at least one input data D1 to DN, and at least one second address A-W1 to A-WN of at least one weight value W1 to WN corresponding to the at least one input data D1 to DN.

Specifically, when a plurality of input data is written into the input data memory 310, the address generator 351 generates a corresponding plurality of addresses so that every input data written into the input data memory 310 has a unique address. Similarly, when the weight values corresponding to the input data are written into the weight data memory 320, the address generator 351 generates a corresponding plurality of addresses so that every weight value written into the weight data memory 320 has a unique address. The control logic circuit 360 generates the address control signal CON-A to indicate which input data D1 to DN and corresponding weight values W1 to WN are currently required. Based on this signal, the address generator 351 generates the at least one first address A-D1 to A-DN and the at least one second address A-W1 to A-WN.
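The address handling described above can be sketched as follows. This is only an illustration under stated assumptions: the class name, the sequential allocation scheme, the base addresses, and the modeling of the address control signal CON-A as a list of indices are all hypothetical and are not taken from the patent.

```python
from typing import List, Tuple


class AddressGenerator:
    """Hypothetical model of the address generator 351."""

    def __init__(self, input_base: int = 0x0000, weight_base: int = 0x1000):
        self.input_base = input_base    # assumed base of the input data memory
        self.weight_base = weight_base  # assumed base of the weight data memory
        self.count = 0                  # input/weight pairs written so far

    def allocate(self) -> Tuple[int, int]:
        """Assign a unique address pair when an input and its weight are written."""
        pair = (self.input_base + self.count, self.weight_base + self.count)
        self.count += 1
        return pair

    def generate(self, con_a: List[int]) -> Tuple[List[int], List[int]]:
        """Map CON-A (modeled as requested indices) to first and second addresses."""
        for i in con_a:
            if not 0 <= i < self.count:
                raise IndexError(f"index {i} has not been written")
        first = [self.input_base + i for i in con_a]
        second = [self.weight_base + i for i in con_a]
        return first, second
```

Because each write increments a single counter, no two inputs share an address and each weight sits at the same offset as its input, which is one simple way to keep the input-to-weight correspondence.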

In step S403, the address generator 351 transmits the at least one first address A-D1 to A-DN and the at least one second address A-W1 to A-WN to the input data memory 310 and the weight data memory 320, respectively, via the address bus BA.

In step S404, the data extractor 352 extracts, based on a data control signal CON-D, the at least one input data D1 to DN and the at least one weight value W1 to WN from the input data memory 310 and the weight data memory 320, respectively. Specifically, the data control signal CON-D generated by the control logic circuit 360 indicates the timing of data extraction. The data extractor 352 extracts the input data D1 to DN and the weight values W1 to WN according to the data control signal CON-D to ensure that the data is read correctly.

As described above, the control logic circuit 360 controls the operation of the address generator 351 and the data extractor 352 through the address control signal CON-A and the data control signal CON-D, respectively, to ensure the correctness and timing of the data read by the data read logic circuit 350.

In step S405, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded. If it succeeded, the process proceeds to step S406. If it failed, the process returns to step S404 so that the data extractor 352 can extract the input data D1 to DN and the weight values W1 to WN again.
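The loop between steps S404 and S405 can be sketched as a fetch-check-retry routine. The callables `fetch` and `check` and the `max_attempts` limit are assumptions made for this illustration; the patent describes returning to step S404 on failure but does not state a retry limit.

```python
from typing import Callable, List, Tuple

Fetched = Tuple[List[int], List[int]]


def fetch_with_retry(
    fetch: Callable[[], Fetched],
    check: Callable[[List[int], List[int]], bool],
    max_attempts: int = 3,  # assumed bound, not specified in the source
) -> Fetched:
    """Repeat step S404 until the check of step S405 passes."""
    for _ in range(max_attempts):
        inputs, weights = fetch()       # step S404: extract data
        if check(inputs, weights):      # step S405: extraction succeeded?
            return inputs, weights      # continue to step S406
    raise RuntimeError("extraction failed after %d attempts" % max_attempts)
```

Bounding the number of attempts is a design choice for the sketch so that a persistent fault surfaces as an error instead of looping indefinitely.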

In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the first addresses A-D1 to A-DN and the second addresses A-W1 to A-WN. Specifically, the control logic circuit 360 verifies whether the first addresses A-D1 to A-DN and the second addresses A-W1 to A-WN all fall within expected ranges. If they do, the addresses are correct and the data extractor 352 can successfully extract the corresponding input data D1 to DN and weight values W1 to WN. If they do not, at least one of the first addresses A-D1 to A-DN and the second addresses A-W1 to A-WN is wrong; that is, the address generator 351 generated an incorrect address, and the data extractor 352 cannot successfully extract the corresponding input data D1 to DN and/or weight values W1 to WN. In other words, the extraction by the data extractor 352 fails.
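A minimal sketch of the address range check, assuming each memory occupies a contiguous address range expressed as a Python `range`. The function name and the example ranges are hypothetical; the patent only says the addresses must fall within an expected range.

```python
from typing import Iterable


def addresses_in_range(
    first_addresses: Iterable[int],
    second_addresses: Iterable[int],
    input_range: range,
    weight_range: range,
) -> bool:
    """Return True only if every address lies inside its memory's expected range."""
    inputs_ok = all(a in input_range for a in first_addresses)
    weights_ok = all(a in weight_range for a in second_addresses)
    return inputs_ok and weights_ok
```

With an input range of `range(0, 16)` and a weight range of `range(4096, 4112)`, a first address of 16 would be rejected, which corresponds to the failure case described above.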

In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the input data D1 to DN and the weight values W1 to WN. Specifically, the control logic circuit 360 determines whether the input data D1 to DN and the weight values W1 to WN are valid data. For example, the control logic circuit 360 checks the valid flag bits of the input data D1 to DN and the weight values W1 to WN. If the valid flag bit is logic 1, the data is valid and the extraction succeeded. If the valid flag bit is logic 0, the data is invalid and the extraction failed. In another embodiment, the polarity is reversed: a valid flag bit of logic 0 indicates valid data and a successful extraction, and a valid flag bit of logic 1 indicates invalid data and a failed extraction.
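The valid flag check, including both polarities, might look like the sketch below. The position of the flag (bit 8, above an assumed 8-bit payload) and the function names are assumptions for illustration; the patent does not say where the flag bit is stored.

```python
from typing import Iterable

VALID_BIT = 1 << 8  # assumed flag position: bit 8 above an 8-bit payload


def is_valid(word: int, active_high: bool = True) -> bool:
    """Interpret the valid flag; active_high=False models the reversed polarity."""
    flag_set = bool(word & VALID_BIT)
    return flag_set if active_high else not flag_set


def extraction_succeeded(words: Iterable[int], active_high: bool = True) -> bool:
    """Extraction counts as successful only if every fetched word is valid."""
    return all(is_valid(w, active_high) for w in words)
```

Making the polarity a parameter lets the same check cover both embodiments without duplicating logic.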

In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the performance of the input buffer 370 and the weight buffer 380. Specifically, the input buffer 370 and the weight buffer 380 need sufficient storage capacity and/or a sufficient transfer rate to store and/or transfer the input data D1 to DN and the weight values W1 to WN extracted by the data extractor 352. If the storage capacity and/or transfer rate of the input buffer 370 and the weight buffer 380 is insufficient, problems such as data delay or data loss can occur, and the control logic circuit 360 determines that the extraction failed. If the input buffer 370 and the weight buffer 380 have sufficient storage capacity and transfer rate, the control logic circuit 360 determines that the extraction succeeded.
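A rough sketch of the capacity side of this check is given below. It models only storage capacity, not transfer rate, and the `Buffer` class, its fields, and `buffers_ready` are hypothetical names introduced for illustration.

```python
class Buffer:
    """Hypothetical fixed-capacity buffer standing in for buffer 370 or 380."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # maximum number of words the buffer can hold
        self.items: list = []     # words currently held

    def can_accept(self, count: int) -> bool:
        """True if `count` more words fit without overflowing the buffer."""
        return len(self.items) + count <= self.capacity


def buffers_ready(input_buffer: Buffer, weight_buffer: Buffer, count: int) -> bool:
    """Both buffers must have room for the extracted data to be kept."""
    return input_buffer.can_accept(count) and weight_buffer.can_accept(count)
```

If either buffer lacks room, the sketch reports failure, mirroring the case in which the control logic circuit treats the extraction as failed.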

In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on whether noise interference or a transmission error occurred. If noise interference or a transmission error occurs while the data extractor 352 is extracting the input data D1 to DN and the weight values W1 to WN, the control logic circuit 360 determines that the extraction failed. If no noise interference or transmission error occurs during the extraction, the control logic circuit 360 determines that the extraction succeeded.

In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the correctness of the address control signal CON-A and the data control signal CON-D. Specifically, the control logic circuit 360 controls the operation of the address generator 351 and the data extractor 352 through the address control signal CON-A and the data control signal CON-D, respectively, to ensure the correctness and timing of the data. The control logic circuit 360 can check the correctness of the address control signal CON-A and the data control signal CON-D that it generates through test methods such as simulation testing and verification testing; that is, it checks whether the two signals meet the requirements. If the address control signal CON-A and/or the data control signal CON-D is incorrect, the control logic circuit 360 determines that the extraction failed. If both signals are correct, the control logic circuit 360 determines that the extraction succeeded.

In step S406, the data extractor 352 transmits the at least one input data D1 to DN and the at least one weight value W1 to WN to the input buffer 370 and the weight buffer 380, respectively, via the data bus BD. The input buffer 370 temporarily stores the input data D1 to DN extracted from the input data memory 310. The weight buffer 380 temporarily stores the weight values W1 to WN extracted from the weight data memory 320.

In step S407, the logic operation circuit 340 obtains the at least one input data D1 to DN and the at least one weight value W1 to WN from the input buffer 370 and the weight buffer 380, respectively, and calculates the output data Z from them. Specifically, a register (not shown) of the input buffer 370 loads the input data D1 to DN into a register (not shown) of the multiplier 341. Similarly, a register (not shown) of the weight buffer 380 loads the weight values W1 to WN into the register of the multiplier 341. The multiplier 341 then calculates the products of the input data D1 to DN and their corresponding weight values W1 to WN. The register of the multiplier 341 loads the products into a register (not shown) of the accumulator 342, and the accumulator 342 sums the products to obtain the output data Z.

In step S408, the logic operation circuit 340 outputs the output data Z to the output buffer 390. In step S409, the output data memory 330 transmits the output data Z held in the output buffer 390. Specifically, a register (not shown) of the output buffer 390 loads the output data Z into a register (not shown) of the output data memory 330. The output data memory 330 can then transmit the output data Z to the corresponding component (for example, a central processing unit).
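The data path from step S404 through step S409 can be summarized in a short simulation. This is a simplified sketch under several assumptions: the memories are modeled as Python dictionaries keyed by address, the buffers as lists, the success check of step S405 is omitted, and the `output_address` parameter is hypothetical.

```python
from typing import Dict, List


def run_steps(
    input_memory: Dict[int, int],
    weight_memory: Dict[int, int],
    output_memory: Dict[int, int],
    first_addresses: List[int],
    second_addresses: List[int],
    output_address: int,  # assumed: where Z is stored in the output data memory
) -> int:
    # S404: the data extractor reads the addressed inputs and weights.
    inputs = [input_memory[a] for a in first_addresses]
    weights = [weight_memory[a] for a in second_addresses]

    # S406: the extracted values are held in the input and weight buffers.
    input_buffer = list(inputs)
    weight_buffer = list(weights)

    # S407: the multiplier forms the products, the accumulator sums them.
    products = [d * w for d, w in zip(input_buffer, weight_buffer)]
    z = sum(products)

    # S408: the result is placed in the output buffer.
    output_buffer = [z]

    # S409: the output data memory takes Z from the output buffer.
    output_memory[output_address] = output_buffer.pop()
    return output_memory[output_address]
```

Every step in the sketch operates on structures that belong to the same device model, which reflects the point of the design: no value leaves the memory device until the final result Z is read out.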

In this way, the memory device 30 of the present invention can perform data operations using the logic operation circuit 340 inside the device, which reduces access latency. In addition, the memory device 30 controls the operation of the data read logic circuit 350 through the address control signal CON-A and the data control signal CON-D generated by the control logic circuit 360, to ensure the correctness and timing of the data. Furthermore, the control logic circuit 360 can determine whether the extraction by the data read logic circuit 350 succeeded, which confirms the correctness of the data a second time. Accordingly, the memory device 30 of the present invention can significantly reduce the communication time between the logic operation circuit 340 and the input data memory 310, the weight data memory 320, and the output data memory 330, while ensuring the correctness and timing of the data, and can therefore meet the need of large language models (LLMs) to process very large volumes of data.

FIG. 5 is a flow chart of an operating method of a memory device according to an embodiment of the present invention. The operating method can be implemented by the memory device 10 of FIG. 1. Please refer to FIG. 1 and FIG. 5. In step S501, a plurality of input data is stored in the input data memory 110. In step S502, a plurality of weight values respectively corresponding to the input data is stored in the weight data memory 120. In step S503, the logic operation circuit 140 calculates output data based on at least one input data and its corresponding at least one weight value, and stores the output data in the output data memory 130. In step S504, the output data is transmitted through the output data memory 130.

In summary, the memory device and operating method provided by the present invention can significantly reduce the communication time between the logic operation circuit and the input data memory, the weight data memory, and the output data memory, while ensuring the correctness and timing of the data extracted by the data read logic circuit, thereby meeting the need of large language models (LLMs) to process very large volumes of data.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. Any person having ordinary skill in the art may make slight changes and refinements without departing from the spirit and scope of the present invention. Therefore, the scope of protection of the present invention shall be defined by the appended claims.

10, 30: Memory device
110, 310: Input data memory
120, 320: Weight data memory
130, 330: Output data memory
140, 340: Logic operation circuit
141, 341: Multiplier
142, 342: Accumulator
350: Data read logic circuit
351: Address generator
352: Data extractor
360: Control logic circuit
370: Input buffer
380: Weight buffer
390: Output buffer
A-D1, A-DN, A-W1, A-WN: Addresses
BA, BD: Buses
CON-A, CON-D: Control signals
D1, D2, Di, DN: Input data
W1, W2, Wi, WN: Weight values
S401, S402, S403, S404, S405, S406, S407, S408, S409, S501, S502, S503, S504: Steps
Z: Output data

FIG. 1 is a schematic diagram of a memory device according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the operation of a logic operation circuit according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a memory device according to an embodiment of the present invention.
FIG. 4 is a flow chart of an operating method of a memory device according to an embodiment of the present invention.
FIG. 5 is a flow chart of an operating method of a memory device according to an embodiment of the present invention.

10: Memory device

110: Input data memory

120: Weight data memory

130: Output data memory

140: Logic operation circuit

141: Multiplier

142: Accumulator

D1, DN: Input data

W1, WN: Weight values

Z: Output data

Claims (14)

一種記憶體裝置,包括: 輸入資料記憶體,用以儲存多筆輸入資料; 權重資料記憶體,用以儲存分別對應於該些輸入資料的多個權重值; 輸出資料記憶體;以及 邏輯運算電路,耦接至該輸入資料記憶體、該權重資料記憶體以及該輸出資料記憶體,其中 該邏輯運算單元根據至少一輸入資料及其對應的至少一權重值計算輸出資料,並將該輸出資料儲存至該輸出資料記憶體, 該輸出資料記憶體傳送該輸出資料, 其中該記憶體裝置更包括: 資料讀取邏輯電路,包括: 地址生成器,用以基於地址控制訊號,產生該至少一輸入資料的至少一第一地址以及對應於該至少一輸入資料的該至少一權重值的至少一第二地址:以及 資料擷取器,用以基於資料控制訊號,自該輸入資料記憶體以及該權重資料記憶體分別提取該至少一輸入資料以及該至少一權重值。 A memory device comprises: an input data memory for storing a plurality of input data; a weight data memory for storing a plurality of weight values corresponding to the input data; an output data memory; and a logic operation circuit coupled to the input data memory, the weight data memory, and the output data memory, wherein the logic operation unit calculates output data based on at least one input data and its corresponding at least one weight value, and stores the output data in the output data memory; the output data memory transmits the output data. The memory device further comprises: The data read logic circuit includes: an address generator for generating, based on an address control signal, at least one first address for the at least one input data and at least one second address corresponding to the at least one weight value of the at least one input data; and a data extractor for extracting, based on the data control signal, the at least one input data and the at least one weight value from the input data memory and the weight data memory, respectively. 如請求項1所述的記憶體裝置,其中該邏輯運算電路包括: 乘法器;以及 累加器,耦接至該乘法器,其中 該乘法器分別計算該至少一輸入資料及其對應的至少一權重值的至少一乘積, 該累加器將該至少一乘積進行加總,以計算該輸出資料。 The memory device of claim 1, wherein the logic operation circuit comprises: a multiplier; and an accumulator coupled to the multiplier, wherein the multiplier calculates at least one product of the at least one input data and its corresponding at least one weight value, respectively, and the accumulator sums the at least one product to calculate the output data. 
如請求項1所述的記憶體裝置,更包括: 地址匯流排;以及 資料匯流排,其中 該地址生成器透過該地址匯流排將該至少一第一地址以及該至少一第二地址分別傳送至該輸入資料記憶體以及該權重資料記憶體, 該資料擷取器透過該資料匯流排傳送該至少一輸入資料以及該至少一權重值。 The memory device of claim 1 further comprises: an address bus; and a data bus, wherein the address generator transmits the at least one first address and the at least one second address to the input data memory and the weight data memory, respectively, via the address bus, and the data extractor transmits the at least one input data and the at least one weight value via the data bus. 如請求項3所述的記憶體裝置,更包括; 控制邏輯電路,用以判斷該資料擷取器是否提取成功, 響應於該控制邏輯電路判斷該資料擷取器提取失敗,該資料擷取器重新自該輸入資料記憶體以及該權重資料記憶體分別提取該至少一輸入資料以及該至少一權重值。 The memory device of claim 3 further comprises: a control logic circuit configured to determine whether the data extractor has successfully extracted the data; in response to the control logic circuit determining that the data extractor has failed to extract the data, the data extractor re-extracts the at least one input data and the at least one weight value from the input data memory and the weight data memory, respectively. 如請求項4所述的記憶體裝置,其中 響應於該控制邏輯電路判斷該資料擷取器提取成功,該資料擷取器透過該資料匯流排將該至少一輸入資料以及該至少一權重值分別傳送至輸入緩衝器以及權重緩衝器。 The memory device of claim 4, wherein: In response to the control logic circuit determining that the data acquisition is successful, the data acquisition transmits the at least one input data and the at least one weight value to the input buffer and the weight buffer, respectively, via the data bus. 如請求項4所述的記憶體裝置,其中 該邏輯運算電路分別自該輸入緩衝器以及該權重緩衝器獲得該至少一輸入資料以及該至少一權重值,以根據該至少一輸入資料以及該至少一權重值計算該輸出資料。 The memory device of claim 4, wherein the logic operation circuit receives the at least one input data and the at least one weight value from the input buffer and the weight buffer, respectively, to calculate the output data based on the at least one input data and the at least one weight value. 
如請求項4所述的記憶體裝置,其中該控制邏輯電路基於該至少一第一地址以及該至少一第二地址,判斷該資料擷取器是否提取成功。The memory device of claim 4, wherein the control logic circuit determines whether the data extractor has successfully extracted the data based on the at least one first address and the at least one second address. 如請求項4所述的記憶體裝置,其中該控制邏輯電路基於該至少一輸入資料以及該至少一權重值,判斷該資料擷取器是否提取成功。The memory device of claim 4, wherein the control logic circuit determines whether the data extractor has successfully extracted the data based on the at least one input data and the at least one weight value. 如請求項4所述的記憶體裝置,其中該控制邏輯電路基於該輸入緩衝器以及該權重緩衝器的性能,判斷該資料擷取器是否提取成功。The memory device of claim 4, wherein the control logic circuit determines whether the data extractor is successful based on the performance of the input buffer and the weight buffer. 如請求項4所述的記憶體裝置,其中該控制邏輯電路基於是否發生雜訊干擾或傳輸錯誤,判斷該資料擷取器是否提取成功。The memory device of claim 4, wherein the control logic circuit determines whether the data acquisition is successful based on whether noise interference or transmission error occurs. 如請求項4所述的記憶體裝置,其中該控制邏輯電路基於該地址控制訊號及該資料控制訊號的正確性,判斷該資料擷取器是否提取成功。The memory device of claim 4, wherein the control logic circuit determines whether the data extractor has successfully extracted the data based on the accuracy of the address control signal and the data control signal. 如請求項1所述的記憶體裝置,其中該邏輯運算電路將該輸出資料輸出至輸出緩衝器。The memory device of claim 1, wherein the logic operation circuit outputs the output data to an output buffer. 如請求項1所述的記憶體裝置,其中該記憶體裝置為堆疊式靜態隨機存取記憶體(Stacked Static Random Access Memory,Stacked SRAM)。The memory device of claim 1, wherein the memory device is a stacked static random access memory (Stacked SRAM). 
A method for operating a memory device, comprising:
storing a plurality of input data by an input data memory;
storing, by a weight data memory, a plurality of weight values respectively corresponding to the plurality of input data;
calculating, by a logic operation unit, output data according to at least one input data and at least one weight value corresponding to the at least one input data, and storing the output data in an output data memory;
transmitting the output data by the output data memory;
generating, by an address generator of a data read logic circuit and based on an address control signal, at least one first address of the at least one input data and at least one second address of the at least one weight value corresponding to the at least one input data; and
extracting, by a data extractor of the data read logic circuit and based on a data control signal, the at least one input data and the at least one weight value from the input data memory and the weight data memory, respectively.
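The steps of the method claim can be followed end to end in a short software model. The Python sketch below is an illustration under assumptions, not the patented implementation: it treats the address control signal as a hypothetical `(start, count)` pair, assumes each weight is stored at the same index as its input, and assumes the logic operation is a multiply-accumulate. None of these details is stated in the claim.

```python
def generate_addresses(start, count):
    """Yield (first_address, second_address) pairs for one operation.

    Assumption: the weight for input i is stored at index i of the weight memory.
    """
    for i in range(start, start + count):
        yield i, i


def run_operation(input_memory, weight_memory, start, count):
    """Model one pass: generate addresses, fetch, compute, store the output."""
    output_memory = []
    accumulator = 0
    for first_addr, second_addr in generate_addresses(start, count):
        x = input_memory[first_addr]    # fetched from the input data memory
        w = weight_memory[second_addr]  # fetched from the weight data memory
        accumulator += x * w            # assumed logic operation: multiply-accumulate
    output_memory.append(accumulator)   # output data stored in the output data memory
    return output_memory


input_memory = [1, 2, 3, 4]
weight_memory = [10, 20, 30, 40]

# Hypothetical address control signal: start at index 1, cover two entries.
output_memory = run_operation(input_memory, weight_memory, start=1, count=2)
transmitted = output_memory[0]  # the output data memory transmits the result
```

Here the two selected inputs (2 and 3) are paired with weights 20 and 30, so the stored and transmitted result is 130. Keeping address generation separate from the fetch loop mirrors the claim's split between the address generator and the data extractor.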

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW113121728A TWI898651B (en) 2024-06-12 2024-06-12 Memory device and operation mothod thereof

Publications (2)

Publication Number Publication Date
TWI898651B true TWI898651B (en) 2025-09-21
TW202548755A TW202548755A (en) 2025-12-16

Family

ID=97832268

Family Applications (1)

Application Number Title Priority Date Filing Date
TW113121728A TWI898651B (en) 2024-06-12 2024-06-12 Memory device and operation mothod thereof

Country Status (1)

Country Link
TW (1) TWI898651B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW318228B (en) * 1996-01-02 1997-10-21 Motorola Inc
US20200285950A1 (en) * 2017-04-04 2020-09-10 Hailo Technologies Ltd. Structured Weight Based Sparsity In An Artificial Neural Network Compiler
US11816045B2 (en) * 2016-10-27 2023-11-14 Google Llc Exploiting input data sparsity in neural network compute units

Similar Documents

Publication Publication Date Title
US20260017134A1 (en) Memory module register access
TWI498913B (en) Ecc implementation in non-ecc components
US9766820B2 (en) Arithmetic processing device, information processing device, and control method of arithmetic processing device
US8239708B2 (en) System on chip (SoC) device verification system using memory interface
CN102918513B (en) For enabling the methods, devices and systems of determinacy interface
CN116256621B (en) Testing method, device, electronic equipment and storage medium of core particles
CN101903956B (en) Self-timed error correcting code evaluation system and method
TWI768435B (en) Semiconductor layered device with data bus inversion
CN103984506B (en) The method and system that data of flash memory storage equipment is write
US11176018B1 (en) Inline hardware compression subsystem for emulation trace data
CN101331464A (en) Storage area allocation system, method and control device
CN112668266A (en) Correction method of time sequence path
CN107797821A (en) Retry read method and the device using this method
CN110310693A (en) In-Line ECC Module with Cache
CN114664366A (en) Memory device and method of reading the same
CN105426314A (en) Process mapping method for FPGA memory
US9891986B2 (en) System and method for performing bus transactions
TWI898651B (en) Memory device and operation mothod thereof
US9047329B1 (en) Method and system for an algorithm and circuit for a high performance exact match lookup function
TW202548755A (en) Memory device and operation mothod thereof
CN121326221A (en) Memory devices and their operation methods
CN210072600U (en) Electronic device
CN111341374B (en) Memory test method and device and readable memory
US9285828B2 (en) Memory system with improved bus timing calibration
US8635566B2 (en) Parity error detection verification