TWI898651B - Memory device and operation method thereof - Google Patents
- Publication number
- TWI898651B (application TW113121728A)
- Authority
- TW
- Taiwan
- Prior art keywords
- data
- memory
- input data
- weight
- input
- Prior art date
Landscapes
- Logic Circuits (AREA)
Description
The present invention relates to artificial intelligence technology, and more particularly to a memory device and an operating method thereof.
When a large language model (LLM) is used, data processing is typically performed by a stacked static random access memory (stacked SRAM) device working together with an external arithmetic logic circuit. However, with the large data volumes of an LLM, the access latency caused by communication between the stacked SRAM and the arithmetic logic circuit also increases significantly, which degrades overall performance.
In view of this, the present invention provides a memory device and an operating method thereof. By placing a logic operation circuit inside the memory device, the communication time between the logic operation circuit and the memory device is avoided, which increases the speed of data operations and reduces access latency.
The memory device of the present invention includes an input data memory, a weight data memory, an output data memory, and a logic operation circuit. The logic operation circuit is coupled to the input data memory, the weight data memory, and the output data memory. The input data memory stores multiple pieces of input data. The weight data memory stores multiple weight values respectively corresponding to the multiple pieces of input data. The logic operation circuit calculates output data based on at least one piece of input data and its corresponding at least one weight value, and stores the output data in the output data memory. The output data memory transmits the output data.
The operating method of the memory device of the present invention includes: storing multiple pieces of input data in an input data memory; storing, in a weight data memory, multiple weight values respectively corresponding to the multiple pieces of input data; calculating, by a logic operation circuit, output data based on at least one piece of input data and its corresponding at least one weight value, and storing the output data in an output data memory; and transmitting the output data from the output data memory.
Based on the above, the memory device and operating method provided by the present invention use a logic operation circuit placed inside the memory device to avoid the communication time between the logic operation circuit and the memory device. This significantly reduces access latency and addresses the need to process the large data volumes of large language models (LLMs).
To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
Some embodiments of the present invention are described in detail below with reference to the accompanying drawings. Where the same reference symbol appears in different drawings, it denotes the same or a similar element. These embodiments are only a part of the present invention and do not disclose every possible implementation. More precisely, these embodiments are merely examples within the scope of the claims of the present invention.
FIG. 1 is a schematic diagram of a memory device according to an embodiment of the present invention; FIG. 2 is a schematic diagram of the operation of a logic operation circuit according to an embodiment of the present invention. Please refer to FIG. 1 and FIG. 2. In this embodiment, the memory device 10 may be, for example, a stacked static random access memory (stacked SRAM). The memory device 10 includes an input data memory 110, a weight data memory 120, an output data memory 130, and a logic operation circuit 140. The logic operation circuit 140 is coupled to the input data memory 110, the weight data memory 120, and the output data memory 130. The input data memory 110 stores multiple pieces of input data. The weight data memory 120 stores multiple weight values respectively corresponding to the multiple pieces of input data.
The logic operation circuit 140 can calculate the output data Z based on the input data D1~DN and the corresponding weight values W1~WN, and store the output data Z in the output data memory 130. Specifically, the logic operation circuit 140 includes a multiplier 141 and an accumulator 142. The multiplier 141 is coupled to the accumulator 142. As shown in FIG. 2, the multiplier 141 calculates the products of the input data D1~DN and the corresponding weight values W1~WN. For example, the multiplier 141 calculates the product Di*Wi of the input data Di and the weight value Wi, where i is any integer from 1 to N. The accumulator 142 then sums the products calculated by the multiplier 141 to obtain the output data Z. In other words, Z = D1*W1 + D2*W2 + … + DN*WN.
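The multiply-accumulate operation described above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function name `multiply_accumulate` and the use of Python floats are assumptions made for the example, and the comments merely indicate which step plays the role of multiplier 141 and accumulator 142.

```python
from typing import Sequence


def multiply_accumulate(inputs: Sequence[float], weights: Sequence[float]) -> float:
    """Return Z = D1*W1 + D2*W2 + ... + DN*WN (hypothetical software model)."""
    if len(inputs) != len(weights):
        raise ValueError("each input datum needs exactly one weight value")
    accumulator = 0.0                      # plays the role of accumulator 142
    for d_i, w_i in zip(inputs, weights):
        product = d_i * w_i                # plays the role of multiplier 141
        accumulator += product
    return accumulator


z = multiply_accumulate([1.0, 2.0, 3.0], [0.5, 0.25, 2.0])
print(z)  # 7.0
```

Here the three products are 0.5, 0.5, and 6.0, so the accumulated output Z is 7.0.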
The logic operation circuit 140 can store the output data Z in the output data memory 130. When the output data Z is needed later, the output data memory 130 can output it to the corresponding component. For example, the output data memory 130 can output the output data Z to a central processing unit (CPU) (not shown). In other words, the CPU can access the output data memory 130 to obtain the required data (that is, the output data Z).
In this way, the memory device 10 of the present invention performs data operations with the logic operation circuit 140 inside the device, which reduces access latency.
FIG. 3 is a schematic diagram of a memory device according to an embodiment of the present invention; FIG. 4 is a flow chart of an operating method of a memory device according to an embodiment of the present invention. Please refer to FIG. 3 and FIG. 4. The memory device 30 includes an input data memory 310, a weight data memory 320, an output data memory 330, a logic operation circuit 340, a data read logic circuit 350, a control logic circuit 360, an input buffer 370, a weight buffer 380, an output buffer 390, an address bus BA, and a data bus BD. The logic operation circuit 340 includes a multiplier 341 and an accumulator 342. The data read logic circuit 350 includes an address generator 351 and a data extractor 352.
In step S401, the input data memory 310 stores multiple pieces of input data, and the weight data memory 320 stores multiple weight values respectively corresponding to the multiple pieces of input data. In step S402, the address generator 351 generates, based on the address control signal CON-A, at least one first address A-D1~A-DN of at least one piece of input data D1~DN and at least one second address A-W1~A-WN of at least one weight value W1~WN corresponding to the at least one piece of input data D1~DN.
Specifically, when multiple pieces of input data are written into the input data memory 310, the address generator 351 generates corresponding addresses so that each piece of input data written into the input data memory 310 has a unique address. Similarly, when the weight values corresponding to the input data are written into the weight data memory 320, the address generator 351 generates corresponding addresses so that each weight value written into the weight data memory 320 has a unique address. The control logic circuit 360 generates the address control signal CON-A to indicate which input data D1~DN and corresponding weight values W1~WN are currently required, and the address generator 351 generates the at least one first address A-D1~A-DN and the at least one second address A-W1~A-WN based on that signal.
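The address bookkeeping described above might be modeled as in the sketch below. It rests on several assumptions that are not in the source: the class name `AddressGeneratorSketch`, the base addresses `0x000` and `0x100`, sequential address assignment, and the representation of the address control signal CON-A as a plain list of requested indices.

```python
class AddressGeneratorSketch:
    """Hypothetical model of address generator 351 (names and layout assumed)."""

    def __init__(self, input_base: int = 0x000, weight_base: int = 0x100):
        self._next_input = input_base      # next free address in the input data memory
        self._next_weight = weight_base    # next free address in the weight data memory
        self.input_address = {}            # index i -> first address A-Di
        self.weight_address = {}           # index i -> second address A-Wi

    def on_write(self, index: int) -> None:
        """Assign a unique address to input datum D<index> and weight W<index>."""
        self.input_address[index] = self._next_input
        self.weight_address[index] = self._next_weight
        self._next_input += 1
        self._next_weight += 1

    def generate(self, requested: list) -> tuple:
        """Model of step S402: map the requested indices to first and second addresses."""
        first = [self.input_address[i] for i in requested]
        second = [self.weight_address[i] for i in requested]
        return first, second


generator = AddressGeneratorSketch()
for i in (1, 2, 3):
    generator.on_write(i)
first, second = generator.generate([1, 3])
print(first, second)  # [0, 2] [256, 258]
```

Sequential assignment is only one simple way to guarantee uniqueness; the source does not state how the addresses are chosen.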
In step S403, the address generator 351 transmits the at least one first address A-D1~A-DN and the at least one second address A-W1~A-WN to the input data memory 310 and the weight data memory 320, respectively, via the address bus BA.
In step S404, the data extractor 352 extracts, based on the data control signal CON-D, the at least one piece of input data D1~DN and the at least one weight value W1~WN from the input data memory 310 and the weight data memory 320, respectively. Specifically, the data control signal CON-D generated by the control logic circuit 360 indicates the timing of the extraction. The data extractor 352 extracts the input data D1~DN and the weight values W1~WN according to the data control signal CON-D to ensure that the data are read correctly.
Accordingly, the control logic circuit 360 controls the address generator 351 with the address control signal CON-A and the data extractor 352 with the data control signal CON-D, which ensures the data correctness and timing of the read operations performed by the data read logic circuit 350.
In step S405, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded. If it succeeded, the process proceeds to step S406. If it failed, the process returns to step S404 so that the data extractor 352 can extract the input data D1~DN and the weight values W1~WN again.
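The S404/S405 loop can be pictured as a fetch-and-check loop. The sketch below is a minimal illustration under stated assumptions: `fetch_with_retry`, its callable parameters, and the `max_attempts` bound are all hypothetical. The source describes returning to step S404 on failure and does not mention any limit on the number of retries; the bound is added here only so the example always terminates.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    is_successful: Callable[[T], bool],
    max_attempts: int = 3,          # assumption: the source gives no retry limit
) -> Tuple[T, int]:
    """Repeat the extraction (S404) until the check (S405) passes."""
    for attempt in range(1, max_attempts + 1):
        data = fetch()              # S404: extract input data and weight values
        if is_successful(data):     # S405: did the extraction succeed?
            return data, attempt    # success: continue with S406
    raise RuntimeError(f"extraction failed after {max_attempts} attempts")


# Simulated extractor that fails once and then returns (inputs, weights).
responses = iter([None, ([1, 2], [3, 4])])
data, attempts = fetch_with_retry(lambda: next(responses), lambda d: d is not None)
print(attempts)  # 2
```

The `is_successful` callable stands in for whichever success criterion an embodiment uses; several candidate criteria are described in the paragraphs that follow.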
In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the first addresses A-D1~A-DN and the second addresses A-W1~A-WN. Specifically, the control logic circuit 360 verifies whether the first addresses A-D1~A-DN and the second addresses A-W1~A-WN all fall within their expected ranges. If they do, the addresses are correct and the data extractor 352 can successfully extract the corresponding input data D1~DN and weight values W1~WN. If they do not, at least one of the first addresses A-D1~A-DN and the second addresses A-W1~A-WN is wrong, meaning that the address generator 351 generated an incorrect address, and the data extractor 352 cannot successfully extract the corresponding input data D1~DN and/or weight values W1~WN. In other words, the extraction by the data extractor 352 has failed.
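A range check of this kind could look like the sketch below. The concrete ranges `INPUT_RANGE` and `WEIGHT_RANGE` and the function names are assumptions chosen for illustration; the source only says that the addresses are compared against an expected range.

```python
from typing import Iterable

# Assumed address ranges; the source does not specify any concrete values.
INPUT_RANGE = range(0x000, 0x100)
WEIGHT_RANGE = range(0x100, 0x200)


def addresses_in_range(addresses: Iterable[int], expected: range) -> bool:
    """True only if every address lies inside the expected range."""
    return all(address in expected for address in addresses)


def addresses_valid(first: Iterable[int], second: Iterable[int]) -> bool:
    """Treat the extraction as successful only if all first and second addresses are in range."""
    return addresses_in_range(first, INPUT_RANGE) and addresses_in_range(second, WEIGHT_RANGE)


print(addresses_valid([0x000, 0x002], [0x100, 0x102]))  # True
print(addresses_valid([0x000, 0x1FF], [0x100, 0x102]))  # False
```

In the second call one first address falls outside the assumed input range, so the whole extraction is treated as failed.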
In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the input data D1~DN and the weight values W1~WN. Specifically, the control logic circuit 360 determines whether the input data D1~DN and the weight values W1~WN are valid data. For example, the control logic circuit 360 checks the valid flag bits of the input data D1~DN and the weight values W1~WN. If the valid flag bit is logic 1, the data is valid and the extraction succeeded. If the valid flag bit is logic 0, the data is invalid and the extraction failed. In another embodiment the polarity is reversed: a valid flag bit of logic 0 indicates valid data and a successful extraction, and a valid flag bit of logic 1 indicates invalid data and a failed extraction.
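Both flag polarities can be covered by one small check, sketched below. The function name `all_valid` and the `active_high` parameter are hypothetical; the flags are modeled as a list of 0/1 integers, one per extracted datum or weight value.

```python
from typing import Iterable


def all_valid(flags: Iterable[int], active_high: bool = True) -> bool:
    """Return True if every valid flag bit marks its data as valid.

    active_high=True  models the embodiment where logic 1 means valid.
    active_high=False models the other embodiment where logic 0 means valid.
    """
    valid_level = 1 if active_high else 0
    return all(flag == valid_level for flag in flags)


print(all_valid([1, 1, 1]))                     # True
print(all_valid([1, 0, 1]))                     # False
print(all_valid([0, 0, 0], active_high=False))  # True
```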
In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the performance of the input buffer 370 and the weight buffer 380. Specifically, the input buffer 370 and the weight buffer 380 need sufficient storage capacity and/or transfer rate to store and/or transfer the input data D1~DN and the weight values W1~WN extracted by the data extractor 352. If the storage capacity and/or transfer rate of the input buffer 370 and the weight buffer 380 is insufficient, problems such as data delay or data loss can occur, and the control logic circuit 360 determines that the extraction failed. If the input buffer 370 and the weight buffer 380 have sufficient storage capacity and transfer rate, the control logic circuit 360 determines that the extraction succeeded.
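The capacity side of this criterion can be illustrated with the sketch below. It is a simplification under stated assumptions: `BufferSketch`, its fields, and `extraction_fits` are hypothetical names, capacity is counted in entries, and the transfer-rate condition mentioned in the text is not modeled at all.

```python
from dataclasses import dataclass


@dataclass
class BufferSketch:
    """Hypothetical buffer model; capacity and used are counted in entries."""
    capacity: int
    used: int = 0

    def can_accept(self, count: int) -> bool:
        """True if `count` more entries fit without overflowing the buffer."""
        return self.used + count <= self.capacity


def extraction_fits(input_buffer: BufferSketch, weight_buffer: BufferSketch, count: int) -> bool:
    """Both buffers must have room for the N extracted values."""
    return input_buffer.can_accept(count) and weight_buffer.can_accept(count)


print(extraction_fits(BufferSketch(8), BufferSketch(8), 4))          # True
print(extraction_fits(BufferSketch(8, used=6), BufferSketch(8), 4))  # False
```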
In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on whether noise interference or a transmission error occurred. If noise interference or a transmission error occurs while the data extractor 352 is extracting the input data D1~DN and the weight values W1~WN, the control logic circuit 360 determines that the extraction failed. If no noise interference or transmission error occurs during the extraction, the control logic circuit 360 determines that the extraction succeeded.
In one embodiment, the control logic circuit 360 determines whether the extraction by the data extractor 352 succeeded based on the correctness of the address control signal CON-A and the data control signal CON-D. Specifically, the control logic circuit 360 controls the address generator 351 with the address control signal CON-A and the data extractor 352 with the data control signal CON-D to ensure data correctness and timing. The control logic circuit 360 can check the correctness of the address control signal CON-A and the data control signal CON-D that it generates through test methods such as simulation tests and verification tests, that is, it checks whether the two signals meet the requirements. If the address control signal CON-A and/or the data control signal CON-D is incorrect, the control logic circuit 360 determines that the extraction failed. If both signals are correct, the control logic circuit 360 determines that the extraction succeeded.
In step S406, the data extractor 352 transmits the at least one piece of input data D1~DN and the at least one weight value W1~WN to the input buffer 370 and the weight buffer 380, respectively, via the data bus BD. The input buffer 370 temporarily stores the input data D1~DN extracted from the input data memory 310. The weight buffer 380 temporarily stores the weight values W1~WN extracted from the weight data memory 320.
In step S407, the logic operation circuit 340 obtains the at least one piece of input data D1~DN and the at least one weight value W1~WN from the input buffer 370 and the weight buffer 380, respectively, and calculates the output data Z from them. Specifically, a register (not shown) of the input buffer 370 loads the input data D1~DN into a register (not shown) of the multiplier 341. Similarly, a register (not shown) of the weight buffer 380 loads the weight values W1~WN into the register of the multiplier 341. The multiplier 341 then calculates the products of the input data D1~DN and the corresponding weight values W1~WN. The register of the multiplier 341 loads the products into a register (not shown) of the accumulator 342, and the accumulator 342 sums the products to obtain the output data Z.
In step S408, the logic operation circuit 340 outputs the output data Z to the output buffer 390. In step S409, the output data memory 330 transmits the output data Z held in the output buffer 390. Specifically, a register (not shown) of the output buffer 390 loads the output data Z into a register (not shown) of the output data memory 330. The output data memory 330 can then transmit the output data Z to the corresponding component (for example, a central processing unit).
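The data path from extraction through output (steps S404 and S406 to S409) can be traced end to end with the sketch below. It is a software analogy only: the memories are modeled as dictionaries keyed by address, the buffers as `collections.deque` queues, and the function name `run_data_path` and the example addresses are assumptions. The success check of step S405 is omitted for brevity.

```python
from collections import deque


def run_data_path(input_memory: dict, weight_memory: dict,
                  first_addresses: list, second_addresses: list) -> int:
    """Hypothetical walk through S404 and S406-S409 for one output value Z."""
    if len(first_addresses) != len(second_addresses):
        raise ValueError("each input address needs a matching weight address")

    # S404: extract the addressed input data and weight values.
    inputs = [input_memory[a] for a in first_addresses]
    weights = [weight_memory[a] for a in second_addresses]

    # S406: place the extracted values in the input buffer and weight buffer.
    input_buffer = deque(inputs)
    weight_buffer = deque(weights)

    # S407: multiply each pair and accumulate the products.
    z = 0
    while input_buffer:
        z += input_buffer.popleft() * weight_buffer.popleft()

    # S408: the result goes to the output buffer.
    output_buffer = deque([z])

    # S409: the output data memory takes the result from the output buffer.
    output_memory = [output_buffer.popleft()]
    return output_memory[0]


z = run_data_path({0x000: 2, 0x001: 3}, {0x100: 4, 0x101: 5},
                  [0x000, 0x001], [0x100, 0x101])
print(z)  # 23
```

With inputs 2 and 3 and weights 4 and 5, the products are 8 and 15, giving Z = 23.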
In this way, the memory device 30 of the present invention performs data operations with the logic operation circuit 340 inside the device, which reduces access latency. In addition, the memory device 30 controls the operation of the data read logic circuit 350 with the address control signal CON-A and the data control signal CON-D generated by the control logic circuit 360, which ensures data correctness and timing. Furthermore, the control logic circuit 360 can determine whether the extraction by the data read logic circuit 350 succeeded, which confirms the correctness of the data a second time. Accordingly, the memory device 30 significantly reduces the communication time between the logic operation circuit 340 and the input data memory 310, the weight data memory 320, and the output data memory 330, ensures data correctness and timing, and can meet the need to process the large data volumes of large language models (LLMs).
FIG. 5 is a flow chart of an operating method of a memory device according to an embodiment of the present invention. The operating method can be implemented by the memory device 10 of FIG. 1. Please refer to FIG. 1 and FIG. 5. In step S501, multiple pieces of input data are stored in the input data memory 110. In step S502, multiple weight values respectively corresponding to the multiple pieces of input data are stored in the weight data memory 120. In step S503, the logic operation circuit 140 calculates output data based on at least one piece of input data and its corresponding at least one weight value, and stores the output data in the output data memory 130. In step S504, the output data is transmitted from the output data memory 130.
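The four steps of FIG. 5 map naturally onto four methods of a small class, as in the sketch below. The class `MemoryDeviceSketch` and its method names are hypothetical, and Python lists stand in for the three memories; it is meant only to show the order of operations, not the hardware.

```python
class MemoryDeviceSketch:
    """Hypothetical software model of memory device 10 and steps S501-S504."""

    def __init__(self):
        self.input_memory = []    # stands in for input data memory 110
        self.weight_memory = []   # stands in for weight data memory 120
        self.output_memory = []   # stands in for output data memory 130

    def store_inputs(self, data):
        """S501: store the input data."""
        self.input_memory = list(data)

    def store_weights(self, weights):
        """S502: store the weight value for each input datum."""
        self.weight_memory = list(weights)

    def compute(self):
        """S503: compute the weighted sum inside the device and store it."""
        if len(self.input_memory) != len(self.weight_memory):
            raise ValueError("input data and weight values must pair up")
        z = sum(d * w for d, w in zip(self.input_memory, self.weight_memory))
        self.output_memory.append(z)

    def transmit(self):
        """S504: hand the most recent output data to the requester (e.g. a CPU)."""
        return self.output_memory[-1]


device = MemoryDeviceSketch()
device.store_inputs([2, 3])
device.store_weights([4, 5])
device.compute()
print(device.transmit())  # 23
```

The point of the design is visible in `compute`: the arithmetic happens next to the stored data, so only the single result Z has to leave the device.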
In summary, the memory device and operating method provided by the present invention significantly reduce the communication time between the logic operation circuit and the input data memory, the weight data memory, and the output data memory, and ensure the correctness and timing of the data extracted by the data read logic circuit, thereby meeting the need to process the large data volumes of large language models (LLMs).
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the present invention. A person having ordinary skill in the art may make minor changes and refinements without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.
10, 30: Memory device
110, 310: Input data memory
120, 320: Weight data memory
130, 330: Output data memory
140, 340: Logic operation circuit
141, 341: Multiplier
142, 342: Accumulator
350: Data read logic circuit
351: Address generator
352: Data extractor
360: Control logic circuit
370: Input buffer
380: Weight buffer
390: Output buffer
A-D1, A-DN, A-W1, A-WN: Addresses
BA, BD: Buses
CON-A, CON-D: Control signals
D1, D2, Di, DN: Input data
W1, W2, Wi, WN: Weight values
S401, S402, S403, S404, S405, S406, S407, S408, S409, S501, S502, S503, S504: Steps
Z: Output data
FIG. 1 is a schematic diagram of a memory device according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the operation of a logic operation circuit according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a memory device according to an embodiment of the present invention.
FIG. 4 is a flow chart of an operating method of a memory device according to an embodiment of the present invention.
FIG. 5 is a flow chart of an operating method of a memory device according to an embodiment of the present invention.
10: Memory device
110: Input data memory
120: Weight data memory
130: Output data memory
140: Logic operation circuit
141: Multiplier
142: Accumulator
D1, DN: Input data
W1, WN: Weight values
Z: Output data
Claims (14)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW113121728A TWI898651B (en) | 2024-06-12 | 2024-06-12 | Memory device and operation method thereof |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TWI898651B (en) | 2025-09-21 |
| TW202548755A (en) | 2025-12-16 |
Family
ID=97832268
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW113121728A TWI898651B (en) | 2024-06-12 | 2024-06-12 | Memory device and operation method thereof |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI898651B (en) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW318228B (en) * | 1996-01-02 | 1997-10-21 | Motorola Inc | |
| US20200285950A1 (en) * | 2017-04-04 | 2020-09-10 | Hailo Technologies Ltd. | Structured Weight Based Sparsity In An Artificial Neural Network Compiler |
| US11816045B2 (en) * | 2016-10-27 | 2023-11-14 | Google Llc | Exploiting input data sparsity in neural network compute units |
-
2024
- 2024-06-12 TW TW113121728A patent/TWI898651B/en active