TWI855303B - Multiply-and-accumulate unit and multiply-and-accumulate methods

Info

Publication number: TWI855303B
Application number: TW111110603A
Authority: TW
Inventors: 沃金奧洛齊亞; 明夏金
Original assignee: 美商聖巴諾瓦系統公司
Priority date: 2021-03-23
Filing date: 2022-03-22
Publication date: 2024-09-11
Also published as: TW202307647A

Abstract

A floating-point multiply-add-accumulate unit supports the BF16 format for multiply-accumulate operations and FP32 single-precision addition complying with the IEEE 754 standard. The multiply-accumulate unit uses a higher radix and a longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition can be performed in carry-save format to avoid long carry propagation and speed up the operation. The circuit uses early exponent comparison to shorten the accumulate pipeline stage. Operations including overflow detection, zero detection and sign extension are adopted for the 2's complement and carry-save format.

Description

Multiply-and-accumulate unit and multiply-and-accumulate methods

The field of this disclosure is the implementation of arithmetic logic circuits, including floating-point multiply-add-accumulate circuits, sometimes referred to as multiply-and-accumulate circuits, for high-speed processors, including processors configured to efficiently perform training and inference.

Reference to priority applications

This application claims priority to U.S. Patent Application No. 17/534,376 filed on November 23, 2021, U.S. Patent Application No. 17/465,558 filed on September 2, 2021, U.S. Patent Application No. 17/397,241 filed on August 9, 2021, U.S. Provisional Patent Application No. 63/190,749 filed on May 19, 2021, U.S. Provisional Patent Application No. 63/174,460 filed on April 13, 2021, U.S. Provisional Patent Application No. 63/166,221 filed on March 25, 2021, U.S. Provisional Patent Application No. 63/165,073 filed on March 23, 2021, and U.S. Provisional Patent Application No. 63/239,384 filed on August 31, 2021. All eight of the above applications are incorporated herein by reference.

Arithmetic logic circuits that include floating-point multiply-and-accumulate units, as implemented in high-performance processors, are relatively complex logic circuits. Multiply-and-accumulate circuits are used in matrix multiplication and other complex mathematical operations that are applied in machine learning and inference engines.

In essence, a multiply-and-accumulate circuit generates the sum S(i) of a sequence of terms A(i)*B(i), usually expressed as follows:

S(i) = S(i-1) + A(i)*B(i)

Here, the sum S(i) at cycle (i) equals the term A(i)*B(i) added to the sum S(i-1), which is the accumulation of the terms A(0)*B(0) through A(i-1)*B(i-1). The final sum S(N-1) is the output of the multiply-and-accumulate operation over N cycles (from 0 to N-1).
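The recurrence can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of the circuit: the function name `multiply_accumulate` and the use of Python floats are assumptions made for the example.

```python
def multiply_accumulate(a, b):
    """Return the running sums S(0)..S(N-1) for the terms A(i)*B(i)."""
    if len(a) != len(b):
        raise ValueError("operand sequences must have the same length")
    sums = []
    s = 0.0  # accumulator before the first cycle
    for a_i, b_i in zip(a, b):
        s = s + a_i * b_i  # S(i) = S(i-1) + A(i)*B(i)
        sums.append(s)
    return sums


# Three cycles: 1*4, then + 2*5, then + 3*6
running = multiply_accumulate([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
final_sum = running[-1]  # S(N-1)
```

The last element of the returned list corresponds to the final sum S(N-1).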

In a floating-point implementation, each cycle multiplies two input floating-point operands A(i) and B(i), each including an exponent value and a significand value, to produce the multiplier output term A(i)*B(i), and then computes the accumulator output sum S(i) by adding the current cycle's multiplier output term A(i)*B(i) to the previous cycle's accumulator output sum S(i-1).

In floating-point encoding formats used to encode floating-point numbers in computing, numbers can be normalized so that the significand includes a one-bit integer to the left of the binary point (always "1" in binary) and a fraction represented by some number of bits to the right of the binary point, and the number is encoded using only the fraction. The binary 1 integer is omitted from the encoding because it is implied by the normalized form. Operations on numbers encoded this way take the integer to the left of the binary point into account; it is called the "implied 1".
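As a minimal sketch of the implied 1, the following Python function splits a normalized value into its fields and restores the omitted integer bit. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits with bias 127, 23 fraction bits); the function name `decode_fp32` is hypothetical.

```python
import struct


def decode_fp32(x):
    """Split a normalized FP32 value into (sign, unbiased exponent, significand)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF  # only the fraction is stored in the encoding
    if biased_exponent in (0, 0xFF):
        raise ValueError("zero, denormal, infinity and NaN are not normalized")
    significand = 1.0 + fraction / 2**23  # restore the implied 1
    return sign, biased_exponent - 127, significand


# 6.5 = 1.625 * 2**2, so the stored fraction encodes 0.625
fields = decode_fp32(6.5)
```

The returned significand always lies in the range 1 <= significand < 2 for normalized inputs.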

Floating-point numbers can be multiplied by adding the exponents, multiplying the significands, and then normalizing the result by shifting the output significand and adjusting the output exponent to account for the shift.
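The three steps can be sketched on integer significands. This is an illustrative model under stated assumptions, not the disclosed circuit: the tuple representation, the names `fp_multiply` and `to_float`, the 8-bit significand width `P`, and the use of truncation instead of rounding are all choices made for the example.

```python
P = 8  # significand width including the implied 1 (assumed, BF16-like)


def to_float(value):
    """Convert a (sign, exponent, significand) triple to a Python float."""
    sign, exponent, significand = value
    return (-1) ** sign * significand * 2.0 ** (exponent - (P - 1))


def fp_multiply(x, y):
    """Multiply two (sign, exponent, significand) triples.

    Each significand is an integer in [2**(P-1), 2**P), i.e. 1.fffffff in
    fixed point with P-1 fraction bits.
    """
    sign = x[0] ^ y[0]
    exponent = x[1] + y[1]        # add the exponents
    product = x[2] * y[2]         # multiply the significands (2P-bit result)
    if product >> (2 * P - 1):    # product is in [2, 4): normalize
        product >>= 1             # shift the significand right by one
        exponent += 1             # and adjust the exponent to match
    significand = product >> (P - 1)  # truncate back to P bits (no rounding)
    return sign, exponent, significand


# 3.0 = 1.5 * 2**1 and 1.5 = 1.5 * 2**0; 1.5 is 192 when scaled by 2**7
result = fp_multiply((0, 1, 192), (0, 0, 192))
```

Because the product of two values in [1, 2) lies in [1, 4), at most one normalization shift is needed after a multiplication.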

Floating-point numbers can be added by first identifying the larger exponent and the difference between the operands' exponents, and then shifting the significand of the operand with the smaller exponent to align it with the larger exponent. The aligned significands are then added. Finally, the result is normalized, which may involve an additional shift of the significand and an adjustment of the exponent.
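A minimal sketch of the alignment and normalization steps follows. It is restricted to two positive operands, so effective subtraction and the left-shift normalization it would require are left out; the pair representation, the name `fp_add`, the width `P`, and truncation of shifted-out bits are assumptions of the example.

```python
P = 8  # significand width including the implied 1 (assumed, BF16-like)


def fp_add(x, y):
    """Add two positive (exponent, significand) pairs.

    Each significand is an integer in [2**(P-1), 2**P).
    """
    (ex, sx), (ey, sy) = x, y
    if ex < ey:                  # make x the operand with the larger exponent
        (ex, sx), (ey, sy) = (ey, sy), (ex, sx)
    shift = ex - ey              # difference between the exponents
    sy >>= shift                 # align the smaller operand (truncating)
    total = sx + sy              # add the aligned significands
    exponent = ex
    if total >> P:               # carry out of the top bit: renormalize
        total >>= 1
        exponent += 1
    return exponent, total


# 3.0 = 1.5 * 2**1 plus 1.5 = 1.5 * 2**0; 1.5 is 192 when scaled by 2**7
result = fp_add((1, 192), (0, 192))
value = result[1] * 2.0 ** (result[0] - (P - 1))
```

For same-sign operands the sum of two aligned significands is below 4, so a single right shift is enough to renormalize.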

Computations that produce numbers a format does not support, such as in a floating-point encoding format, cause exception signals. In dataflow architectures and other architectures that execute complex algorithms, such as machine learning algorithms, these exceptions can cause the algorithm to stop or fail. In real-time systems, exceptions that cause an algorithm to stop or fail can cause system failures or other performance problems.

There is a need for an exception handling system that can be applied in complex data processing settings.

110: Bfloat16

130: IEEE 754 single-precision 32-bit floating point (FP32)

202: Multiplier circuit

210a: Multiplier and adder block

210b: Exponent block

210: Multiplexer

211: Multiplexer

212: Multiplexer

213: Operand A

214: Operand B

215: Base 8 converter

216: Operand C; line

217: BF16 format or FP32 format

218: Line

219: Line

220: Block

221: Line

222: Dual bus

223: Bus

224: Output bus

225: Output bus

226: Bus

227: Bus

228: Bus

229: Bus

230: Carry-save adder

240: Accumulator

240A: Exponent control unit

240B: Exponent comparator unit

240C: Significand part

241: 42-bit fraction carry register

242: 42-bit fraction sum register

250: Carry-save to sign-magnitude conversion block

251: Bus

252: Bus

260: Base 8 to base 2 conversion and normalization block

270: FP32 or BF16 block

300: Simplified block diagram

410: 8x8 BF16 multiplier circuit

420: Register

421: Register

422: Line

423: Line

424: Line

425: Line

426: Line

427: Line

428: 7-bit LSB bus

429: 7-bit LSB bus

430: 7-bit ripple-carry adder

440: Line

450: Register

460: Register

461: Inverter

462: Line

464: Exponent adder circuit

465: Line

466: 10-bit value

467: Special exponent detection block

468: Line

469: Line

470: Register

471a: XNOR gate

471b: XNOR gate

471c: XNOR gate

473: Line

501: Line

502: Significand final adder circuit

503: Line

504: Register

506: Overflow selection circuit

507: Line

508: Sign bit selection circuit

509: Line

510: Exponent selection circuit

511: Line

512: Significand selection circuit

513: Line

514: 8-bit left shifter circuit

515: Bus

516: 2's complement invert-and-add-1 circuit

517: Bus

518: Multiplexer circuit

519: Overflow signal

520: Register

521: Line

522: Exponent overflow detection circuit

523: Input bus; line

524: Exponent exception handling circuit

525: Line

526: Exponent exception detection circuit

527: Line

528: Output exception control signal generation circuit

529: Bus

530: Register

531: Sign bit

532: Register

533: Line

537: Output bus; line

539: Line

541: Line

543: Line

544: AND gate

545: Line

547: Bus

549: Line

551: Line

553: Bus

560: Register Fcin

592: Base 8 converter block

592a: Final addition, significand selection and base 8 conversion sub-block

592b: Exponent exception handling sub-block

602: Product

604: Carry rounding block

605: Overflow detection block

606: LZA circuit

607: Bus

608: Shift register SHR8/16/24 circuit

609: Shifter circuit

610: Shifter circuit

611: Bus

612: Sum rounding block

613: Bus

614: Carry-save adder circuit

617: AND gate

619: Inverter

630: Shifter exponent control signal generation/bypass control circuit

633: Line

634: Line

636: Output pipeline register

644: Line

646: Line

647: Line

648: Line

650: 16-bit status register

652: 16-bit multiplier exponent comparison circuit

654: Exponent accumulator (Eacc) register

660: Incrementer

661: Decrementer

662: Sign extension detection unit

664: AND gate

665: AND gate

667: Carry bus

668: AND gate

669: Sum bus

670: OR gate

671: Line; input

673: Input

675: Input

677: Input

679: Line

681: Line

682: Line

683: Line

684: Simple product rounding block

685: Output

686: AND gate

687: AND gate

688: AND gate

689: Output

692: Bus

693: Bus

700: Final conversion

270: Normalization and conversion to sign-magnitude format block

270a: Conversion from carry-save to sign-magnitude format block

270b: Conversion from base 8 to base 2 floating-point number block

702: Bus

704: Bus

706: Line

708: 43-bit adder circuit

710: Second circuit LZA/LOA

711: Line

712: Line

714: LZA POS selection circuit

715: Bus

716: Bus

717: Bus

718: Invert-and-add-1 circuit

719: Line

720: Significand selection multiplexer circuit

726: Sign bit

728: Significand zero detection circuit

730: Register

735: SHL left shifter circuit

736: Line

738: Bus

739: Signal

740: Exponent adder circuit

742: Line

744: Line

745: Bus

746: Line

747: Line

748: Register

750: Exception control (8-bit) register

751: Exception control register

752: Overflow/underflow detection circuit

753: Line

755: Line

756: Fraction zero register

758: norm_en

760: Underflow detection multiplexer

770: 39-bit significand

788: Line

801: Control line signal Out_FP32

802: First multiplexer

803: Line

805: Bus

807: Bus

809: Bus

810A: Adder sign positive condition circuit

811B: Condition circuit

812A: Circuit

817: Line

819: Line

820: Multiplexer

821: Line

823: Round-to-zero select line

825: Line

827: Line

829: Line

830: Rounding circuit

831: Line

833: Bus

835: Line

837: 39-bit significand register 770 bus fpst_l[38:0]

840: Multiplexer

850: Second rounding increment circuit

860: First rounding increment circuit

907: Line

930: 23-bit significand (IEEE 754) register

953: Eight conditions

957: Line

959: Line

961: Line

962: Infinity control

963: Line

964: EPST_L[9:0] bus

965: Bus

967: Signal line

968: 9-bit Enorm[8:0] bus

969: Line

970: Zero control logic

971: Line

972: Line

974: First multiplexer (zero)

975: Line

976: Multiplexer

979: Exponent signal bus

980: Exponent register

982: Incrementer

983: Signal line

986: Underflow/overflow detection and exponent exception detection circuit

987: OR gate

988: Sign generation and exception handling circuit

990: Register

991: Line

992: Infinity detection circuit

994: Denormal circuit

1000: Floating-point number range

1010: Block

1100: Example high-level architecture block diagram

1102: Multiplier exception handling block

1104: Multiplier exception flag

1106: Bus

1108: Bus multiplier exception condition signal

1110: Multiplier block

1113: Operand A

1114: Operand B

1115: Multiplier exception result

1116: Operand C

1118: Operand C base conversion block

1120: Operand C exception condition signal bus

1122: Bus

1124: Accumulator loop

1126: Exception output control signal generation block

1128: Exception control

1129: Output block

1130: Carry-save adder (CSA) block

1131: Output block

1132: Bus

1134: Adder normalization exception handling block

1138: Bus

1139: Adder exception result

1140: Adder exception flag block

1300: Exception handling structure

1302: Multiplier exception flag generation block

1304: Exception flag generation block

1306: Adder exception flag generation block

1308: Floating-point multiply-accumulator exception block

1310: Multiplier exception result generation block

1312: Exception result generation block

1314: Adder exception result generation block

1320A: Multiplier sign generation condition block

1320B: Block

1320C: Block

1322A: Multiplier exponent generation condition block

1322B: Block

1322C: Block

1322D: Block

1324A: Multiplication fraction generation condition block

1324B: Block

1324C: Block

1326A: Adder sign generation condition block

1326B: Block

1326C: Block

1328A: Adder exponent generation condition block

1328B: Block

1328C: Block

1328D: Block

1330A: Adder fraction generation condition block

1330B: Block

1330C: Block

1381: Multiplier overflow flag condition block

1382: Multiplier underflow flag condition block

1383: Multiplier invalid flag condition block

1387: Adder overflow flag condition block

1388: Adder underflow flag condition block

1389: Adder invalid flag condition block

1400: Implementation

1402: Least significant bit (LSB)

1404: Most significant bit (MSB)

1406: Least significant bit (LSB)

1408: Most significant bit (MSB)

1410: Least significant bit (LSB)

1412: Most significant bit (MSB)

1414: Multiplication operation enable

1418: Multiplication operation enable

1420: AND gate

1430: AND gate

1440: Product exponent AND gate

1442: Output

1444: AND gate

1445: NOR gate

1446: Multiplier overflow flag condition

1450: NOR gate

1460: NOR gate

1470: Product exponent NOR gate

1472: Output

1475: NOR gate

1480: Multiplier underflow AND gate

1482: Multiplier underflow flag condition

1500: Implementation

1501: Multiplication operation enable

1502: Least significant bit (LSB)

1504: Most significant bits (MSB), eight bits

1506: Least significant bit (LSB)

1508: Most significant bit (MSB)

1510: AND gate

1520: NOR gate

1530: AND gate

1540: NOR gate

1550: AND gate

1560: AND gate

1570: OR gate

1580: Multiplier invalid AND gate

1552: First input

1562: Second input

1572: Output

1582: Multiplier invalid flag condition

1600: Implementation

1618: Sign A

1620: EX-OR gate

1622: Sign B

1630: AND gate

1632: Second input

1700: Implementation

1710: OR gate

1720: AND gate

1730: OR gate

1740: NOR gate

1750: Multiplexer

1760: OR gate

1770: Multiplexer

1752: Multiplier exponent

1762: Output

1772: Multiplier fraction bus

1800: Schematic implementation

1810: AND gate

1820: NOR gate

1830: AND gate

1822: Not input exponent infinity

1832: Adder overflow

1850: AND gate

1860: EX-OR gate

1870: AND gate

1880: NOR gate

1890: AND gate

1872: Output exact zero

1882: Not input exact zero

1892: Adder underflow

1900: Implementation

1910: OR gate

1920: EX-OR gate

1930: AND gate

1912: Output

1932: Adder invalid flag

1940: NOR gate

1942: Output

1950: AND gate

1960: EX-OR gate

1965: OR gate

1967: Underflow

1968: Inverter gate

1970: AND gate

1972: Underflow

1980: OR gate

1990: AND gate

1982: Output

1992: Signal adder sign positive bit

2000: Implementation

2005: Inverter gate

2010: NAND gate

2020: AND gate

2030: AND gate

2040: OR gate

2050: OR gate

2060: AND gate

2070: EX-OR gate

2080: OR gate

2012: Output

2022: Output

2032: Output

2042: Adder sign negative condition bit

2052: Determination

2082: Adder exponent all-"0" select

2100: Schematic implementation

2110: OR gate

2120: OR gate

2112: Determination

2122: Adder exponent all-"1" select

2130: OR gate

2131: Selector control

2132: Adder fraction output

2150: Multiplexer

[Figure 1] illustrates the encoding formats of BFloat16 and the IEEE 754 floating-point standard.

[Figure 2] illustrates a high-level block diagram of a floating-point multiply-add-accumulate unit with a carry-save accumulator for the BF16 and FP32 formats.

[Figure 3] illustrates a hierarchical block diagram of a multiplier circuit with two inputs (operand A and operand B).

[Figure 4a] illustrates an example multiplier and adder block containing an 8x8 multiplier partial-product reduction tree.

[Figure 4b] illustrates an example exponent unit with a special exponent detection block.

[Figure 5A] illustrates a hierarchical block diagram of a base 8 converter containing an example final addition, significand selection and base 8 conversion block and an example exponent exception handling block.

[Figure 5B] illustrates an example schematic representation of the final partial-product addition, significand selection and base 8 conversion block.

[Figure 5C] illustrates an example schematic representation of the exception handling block.

[Figure 6] illustrates a high-level hierarchical block diagram of the carry-save accumulation unit.

[Figure 7A] illustrates a high-level hierarchical block diagram of an accumulator containing two hierarchical blocks: an exponent control unit and a significand unit.

[Figure 7B] illustrates an example hierarchical block diagram and schematic of the exponent control unit.

[Figure 7C] illustrates an example hierarchical block diagram and schematic of the significand unit.

[Figure 8A] illustrates an example hierarchical block showing the normalization and conversion to sign-magnitude format block, which contains two sub-blocks: a first sub-block for conversion from carry-save to sign-magnitude and a second sub-block for conversion from base 8 to a base 2 floating-point number.

[Figure 8B] illustrates an example schematic of the conversion from carry-save to sign-magnitude block.

[Figure 8C] illustrates an example schematic of the conversion from base 8 to base 2 floating-point number block.

[Figure 9A] illustrates an example hierarchical block showing the rounding and conversion to BF16 or IEEE 754 32-bit single-precision format sub-block and the exponent and exception handling sub-block.

[Figure 9B] illustrates an example schematic of the rounding and conversion to BF16 or IEEE 754 32-bit SP format block.

[Figure 9C] illustrates an example schematic of the exponent and exception handling block.

[Figure 10] illustrates the floating-point number ranges handled by exception handling in a carry-save accumulation unit for machine learning.

[Figure 11] shows a high-level architecture block diagram depicting the exception handling elements in a carry-save accumulation unit for machine learning.

[Figure 12A] illustrates a high-level block diagram architecture of a first operation mode with input A in BF16 format, input B in BF16 format, and input C in FP32 format, where BF16 designates the 16-bit machine learning floating-point encoding format known as "B-float" (Brain Floating Point), developed by Google, and FP32 designates the 32-bit single-precision IEEE 754 standard representation.

[Figure 12B] illustrates a high-level block diagram architecture of a second operation mode with input A in BF16 format and input B in BF16 format, performing accumulation.

[Figure 12C] illustrates a high-level block diagram architecture of a third operation mode with input A in FP32 format and input C in FP32 format.

[Figure 13] illustrates a high-level block diagram of the exception handling structure.

[Figure 14A] depicts the multiplier overflow flag condition circuit.

[Figure 14B] shows the multiplier underflow flag condition circuit.

[Figure 15] illustrates the multiplier invalid flag condition circuit.

[Figure 16] depicts the multiplication sign generation condition circuit.

[Figure 17A] shows the multiplication exponent generation condition circuit.

[Figure 17B] depicts the multiplication fraction generation condition circuit.

[Figure 18A] illustrates the adder overflow flag condition circuit.

[Figure 18B] shows the adder underflow flag condition circuit.

[Figure 19A] shows the adder invalid flag condition circuit.

[Figure 19B] depicts the adder sign positive condition circuit.

[Figure 20A] depicts the adder sign negative circuit.

[Figure 20B] illustrates the adder exponent generation all-"0" condition circuit.

[Figure 21A] illustrates the adder exponent generation all-"1" condition circuit.

[Figure 21B] shows the addition fraction generation condition circuit.

[Summary of the invention] and [Embodiments]

A detailed description is provided of technology implementing an arithmetic unit with exception handling for configurable and reconfigurable dataflow architectures. An example reconfigurable dataflow architecture is described in U.S. Patent No. 10,831,507, issued November 10, 2020, to Shah et al., which is incorporated by reference as if fully set forth herein. The arithmetic unit can execute a plurality of floating-point arithmetic operations using input operands and generate at least one output operand, where the sources of the input operands, the destinations of the output operands, and the operations are configurable and reconfigurable through configuration data that can remain static during a dataflow operation.

During execution of at least one floating-point arithmetic operation, exceptions related to illegal operations and exceptions related to the generation of results that cannot be represented normally in the floating-point encoding format in use are detected, and the result of the operation is set to a value usable for further processing during the operation, without special interrupt handling by, for example, a runtime processor. As a result, the dataflow operation can complete without being interrupted by at least some exceptions.

In some embodiments, arithmetic operations and arithmetic units used on control flow architectures can implement the exception handling techniques described herein.

Floating Point Carry-Save MAC (FP-CS-MAC)

An FP-CS-MAC is described that can operate in three modes:

Input A (BF16) x Input B (BF16) + accumulation loop

Input A (BF16) x Input B (BF16) + Input C (FP32)

or a single 32-bit floating-point addition:

Input A (FP32) + Input C (FP32)

Operand A can be in any format; in this implementation it is in one of two formats, BF16 or FP32. BF16 is a format containing an 8-bit exponent, 1 sign bit, and a 7-bit fraction with an implied 1 integer bit, for a total of 8 significand bits. FP32 is the single-precision 32-bit format of the IEEE 754 floating-point standard.

Other encoding formats may be used, with the described implementation adapted accordingly.

A tri-mode floating-point carry-save MAC (FP-CS-MAC) unit is described, including circuitry implemented as a pipeline that operates in response to a pipeline clock. In some implementations, the pipeline clock can be on the order of gigahertz (GHz) or faster. While the pipeline clock is running, each cycle of the clock corresponds to one pipeline cycle. Thus, in some embodiments, a pipeline cycle can be less than one nanosecond. In a pipeline, a stage includes an input register or data store that holds the stage input data at a first pipeline clock pulse (e.g., the leading edge of the clock pulse), and an output register or data store that registers the stage output data at the next pipeline clock pulse (e.g., the leading edge of the next clock pulse, defining one pipeline clock cycle). When the first pipeline clock pulse begins a pipeline cycle (i), the output register of the stage holds the stage output data of the previous pipeline cycle (i-1), and the stage output data of one stage in the pipeline is at least part of the stage input data of the next stage. The circuitry of each stage must settle reliably within the pipeline cycle, so a fast pipeline clock poses significant difficulty for timing-critical stages.

One implementation of the tri-mode floating-point carry-save MAC (FP-CS-MAC) unit contains six pipeline stages. Speed can be increased further by adding pipeline stages, and power can be reduced further by removing pipeline stages. In general, the optimal number of pipeline stages depends on the particular technology and design requirements. The first major unit is the BF16 multiplier, which in this example is implemented in two pipeline stages and includes a conversion unit that converts the multiplier result into a 16-bit 2's complement significand and an exponent. The third pipeline stage is the carry-save accumulate stage. The next two stages convert the result from carry-sum format back to a conventional normalized sign-magnitude format, such as BF16 or FP32, as required by the output encoding format.
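The idea behind a carry-save accumulate stage can be sketched with Python integers. The accumulator is kept as a separate sum word and carry word, each new term is absorbed by a bitwise 3:2 compression with no carry propagation, and a single ordinary addition resolves the pair at the end. The function names and the modelling of 2's complement by masking are assumptions of this sketch; the 42-bit width is taken from the 42-bit fraction carry and sum registers 241 and 242 listed above, and the sketch does not model the radix-8 alignment or rounding of the actual unit.

```python
def carry_save_add(a, b, c, width=42):
    """Compress three words into a (sum, carry) pair without carry propagation."""
    mask = (1 << width) - 1
    sum_bits = (a ^ b ^ c) & mask                             # per-bit sum
    carry_bits = (((a & b) | (a & c) | (b & c)) << 1) & mask  # per-bit carry, moved up one place
    return sum_bits, carry_bits


def accumulate_carry_save(terms, width=42):
    """Accumulate terms in carry-save form, then resolve with one final addition."""
    mask = (1 << width) - 1
    s, c = 0, 0
    for term in terms:
        s, c = carry_save_add(s, c, term & mask, width)
    return (s + c) & mask  # the only carry-propagating addition


total = accumulate_carry_save([5, 7, 11, 100])
# A negative term is represented in 2's complement by masking to the word width
difference = accumulate_carry_save([5, -3])
```

Only the final conversion needs a full-width carry chain, which is why the accumulate stage itself can be kept short.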

最後一個管線階段執行正規化和捨入以產生結果。在這種情況下，最終格式為BF16或FP32格式。輸入運算元有效數介於 $1 \le |a| < 2$ 之間，因為它們在十進制點左側包含隱含的1，並且僅包括有效數的小數部分。所述單元不支援非正規化數字並將它們截斷為零。因此，使用BF16或FP32，輸入運算元的範圍為 $\pm 2^{-126}$ 到 $(2-2^{-7})\times 2^{127}$。如果小於 $\pm 2^{-126}$，則超出此範圍的數字截斷為零，或者如果大於 $\pm(2-2^{-7})\times 2^{127}$，則轉換為±無窮大。 The last pipeline stage performs normalization and rounding to produce the result. In this case, the final format is BF16 or FP32. The input operand significands lie in the range $1 \le |a| < 2$, since they include an implicit 1 to the left of the binary point and the encoding stores only the fractional part of the significand. The unit does not support denormalized numbers and flushes them to zero. Therefore, with BF16 or FP32, the range of input operands is $\pm 2^{-126}$ to $(2-2^{-7})\times 2^{127}$. Numbers outside this range are flushed to zero if their magnitude is less than $2^{-126}$, or converted to ±infinity if their magnitude is greater than $(2-2^{-7})\times 2^{127}$.
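
As an illustration of the flush-to-zero behavior described above, the following Python sketch treats a BF16 input whose exponent field is zero as a signed zero. The function name `flush_bf16_subnormal` is hypothetical and does not appear in this disclosure; only the BF16 field layout (1 sign bit, 8 exponent bits, 7 fraction bits) is taken from the text.

```python
def flush_bf16_subnormal(bits16: int) -> int:
    """Return the BF16 bit pattern with subnormal inputs replaced by signed zero.

    Assumed layout: bit 15 = sign, bits 14..7 = exponent, bits 6..0 = fraction.
    """
    exponent = (bits16 >> 7) & 0xFF
    if exponent == 0:
        # Exponent field 0 encodes zero or a subnormal: keep only the sign bit.
        return bits16 & 0x8000
    return bits16 & 0xFFFF
```

Inputs with a nonzero exponent field pass through unchanged; handling of the all-ones exponent (infinity and NaN) is outside the scope of this sketch.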

浮點編碼格式Floating point encoding format

圖1說明了兩種編碼格式的位元模式。第一位元格式的第一範例圖說明了Bfloat16 110。Bfloat16浮點編碼格式(有時「BF16」)是16位元數字格式。BF16保留了IEEE單精確度數的近似動態範圍。說明的BF16格式包括7位元小數、用以完成有效數的「隱含位元」或「隱藏位元」、8位元指數和一個符號位元。 FIG. 1 illustrates the bit patterns of the two encoding formats. The first example diagram of the first bit format illustrates Bfloat16 110. The Bfloat16 floating point encoding format (sometimes "BF16") is a 16-bit numeric format. BF16 preserves the approximate dynamic range of IEEE single-precision numbers. The illustrated BF16 format includes a 7-bit fraction, an "implicit bit" or "hidden bit" to complete the significand, an 8-bit exponent, and a sign bit.

第二張圖說明了IEEE 754單精確度32位元浮點(FP32)130編碼格式。說明的IEEE 754單精確度32位元浮點130包括23位元小數、用以完成有效數的「隱含位元」或「隱藏位元」、8位元指數和和一個符號位元。這兩種編碼格式的特徵是FP32格式中的數字可以透過丟棄23位元部分的16個較低有效數位元來轉換為BF16格式，在一些實施例中進行捨入以選擇較低階位元。 The second figure illustrates the IEEE 754 single precision 32-bit floating point (FP32) 130 encoding format. The illustrated IEEE 754 single precision 32-bit floating point 130 includes a 23-bit fraction, "implicit bits" or "hidden bits" to complete the significand, an 8-bit exponent and a sign bit. The two encoding formats are characterized in that numbers in the FP32 format can be converted to the BF16 format by discarding the 16 less significant bits of the 23-bit portion, and in some embodiments rounding is performed to select the lower-order bits.
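
The relationship between the two formats can be sketched in Python as follows. This is a minimal illustration of the truncating conversion only (no rounding), and the helper names `fp32_bits`, `fp32_to_bf16_truncate` and `split_bf16` are hypothetical names chosen for this example.

```python
import struct


def fp32_bits(x: float) -> int:
    """Return the 32-bit IEEE 754 single-precision bit pattern of x."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def fp32_to_bf16_truncate(bits32: int) -> int:
    """Convert FP32 to BF16 by discarding the 16 low-order fraction bits."""
    return (bits32 >> 16) & 0xFFFF


def split_bf16(bits16: int):
    """Split a BF16 pattern into (sign, 8-bit exponent, 7-bit fraction)."""
    sign = (bits16 >> 15) & 0x1
    exponent = (bits16 >> 7) & 0xFF
    fraction = bits16 & 0x7F
    return sign, exponent, fraction
```

For example, -2.5 has the FP32 pattern 0xC0200000, which truncates to the BF16 pattern 0xC020 with sign 1, biased exponent 128 and fraction 0x20.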

系統方塊圖System Block Diagram

圖2是BF16和FP32格式的具有進位保留累加器的浮點乘加累加單元的高階方塊圖。運算元A 213被說明為BF16格式或FP32格式217。運算元B 214是BF16格式並且是乘法器電路202的第一輸入。第二輸入是BF16運算元A 213。當運算元A和運算元B都為BF16格式時，運算元A和運算元B可以佔用單一32位元暫存器，每個使用16位元，表示乘法器和乘法器的被乘數輸入。乘法器電路210的乘積(A*B)輸出在線218以進位及(Carry-Sum)形式產生，其為方塊220中最終加法器的輸入。方塊220還將結果轉換為2的補數形式，並且包括基底8轉換器電路以支援基底8運算。 FIG. 2 is a high-level block diagram of a floating-point multiply-add-accumulate unit with carry-save accumulators in BF16 and FP32 formats. Operand A 213 is illustrated as being in BF16 format or FP32 format 217. Operand B 214 is in BF16 format and is the first input to multiplier circuit 202. The second input is BF16 operand A 213. When both operand A and operand B are in BF16 format, operand A and operand B can occupy a single 32-bit register, each using 16 bits, representing the multiplier and multiplicand inputs to the multiplier. The product (A*B) output of multiplier circuit 210 is generated in carry-sum form on line 218, which is the input to the final adder in block 220. Block 220 also converts the result to 2's complement form and includes base-8 converter circuitry to support base-8 operations.

當管線以單一32位元加法運算時，(一個運算元)運算元A可以繞過乘法器電路202，而用於加法的第二運算元C來自線216。 When the pipeline operates on a single 32-bit addition, (one operand) operand A can bypass multiplier circuit 202, and the second operand C used for the addition comes from line 216.

在這個範例中，運算元C 216是32位元運算元，其被輸入到基底8轉換器215，其在線219上將結果輸出到多工器210和211之一的第一輸入。多工器210和211的第二輸入為從累加器240的輸出反饋的線224和226上用於進位及總和值C/S-ACC的兩條匯流排(以及未顯示的指數)。多工器211和212輸出指數和有效數作為匯流排223的兩個值。 In this example, operand C 216 is a 32-bit operand that is input to base-8 converter 215, which outputs the result on line 219 to the first input of one of multiplexers 210 and 211. The second inputs to multiplexers 210 and 211 are two buses for the carry and sum value C/S-ACC (and the exponent not shown) on lines 224 and 226 fed back from the output of accumulator 240. Multiplexers 211 and 212 output the exponent and significand as two values on bus 223.

進位保留加法器230在線221上接收方塊220的輸出，並在匯流排223上接收多工器211、212的輸出。進位保留加法器230在進入累加器240的雙匯流排222上輸出總和的指數和C/S值。累加器240在輸出匯流排224和225上以進位保留形式提供C/S-ACC指數和有效數，輸出匯流排224和225反饋給多工器211、多工器212，並在匯流排226上以進位保留形式提供C/S-ACC指數和有效數到進位保留到符號數值轉換方塊250，所述方塊執行匯流排226上的有效數的進位及總和值的最終相加，並在匯流排227上將得到的有效數轉換為符號數值格式。匯流排252和251將來自累加器240的資料傳送到進位保留到符號數值轉換方塊250。 Carry-save adder 230 receives the output of block 220 on line 221 and the outputs of multiplexers 211, 212 on bus 223. Carry-save adder 230 outputs the exponent and the C/S value of the sum on dual bus 222, which enters accumulator 240. Accumulator 240 provides the C/S-ACC exponent and significand in carry-save form on output buses 224 and 225, which are fed back to multiplexers 211, 212, and provides the C/S-ACC exponent and significand in carry-save form on bus 226 to carry-save to sign-magnitude conversion block 250, which performs the final addition of the carry and sum values of the significand on bus 226 and converts the resulting significand to sign-magnitude format on bus 227. Buses 252 and 251 pass data from accumulator 240 to carry-save to sign-magnitude conversion block 250.

基底8到基底2轉換和正規化方塊260在匯流排227上具有輸入，並在匯流排228上將正規化結果輸出以供後正規化、捨入和轉換到FP32或BF16方塊270，其在匯流排229上將輸出轉換為FP32或BF16格式。運算在匯流排229上以32位元FP32格式或16位元BF16格式輸出結果「Z」。 Base 8 to base 2 conversion and normalization block 260 has input on bus 227 and outputs the normalized result on bus 228 for post-normalization, rounding, and conversion to FP32 or BF16 block 270, which converts the output to FP32 or BF16 format on bus 229. The operation outputs the result "Z" on bus 229 in either 32-bit FP32 format or 16-bit BF16 format.

因此，圖2說明了可以實現為多級管線的電路範例，所述多級管線配置成以三種模式執行，包括用於輸入浮點運算元序列的乘法及累加運算。在此範例中，電路可以配置成管線，包括含有具有和及進位輸出的浮點乘法器的第一級、包括用於乘法器的和及進位輸出的乘法器輸出加法器與用以將乘法器加法器輸出轉換為具有2的補數有效數的基底8格式的電路的第二級、包括有效數電路和累加器加法器的指數電路的第三級、將累加器符號位元、累加器指數和累加器有效數和及進位值轉換為符號數值有效數格式的第四級，將符號數值有效數格式從基底8對齊轉換為基底2對齊並產生正規化的指數和有效數的第五級，以及用以執行捨入和轉換為標準浮點表示的第六級。 Thus, FIG. 2 illustrates an example circuit that may be implemented as a multi-stage pipeline configured to execute in three modes, including multiply-and-accumulate operations on a sequence of input floating-point operands. In this example, the circuit may be configured as a pipeline including: a first stage including a floating-point multiplier with sum and carry outputs; a second stage including a multiplier output adder for the sum and carry outputs of the multiplier and a circuit for converting the multiplier adder output to a base-8 format with a 2's complement significand; a third stage including a significand circuit and an exponent circuit of an accumulator adder; a fourth stage for converting the accumulator sign bit, the accumulator exponent, and the accumulator significand sum and carry values to a sign-magnitude significand format; a fifth stage for converting the sign-magnitude significand format from base-8 alignment to base-2 alignment and producing a normalized exponent and significand; and a sixth stage for performing rounding and conversion to a standard floating-point representation.

本文描述的技術提供了一種乘法累加法來計算項A(i)*B(i)的總和S(i)，其中(i)從0到N-1，N為總和中的項數。所述方法可以包含以浮點編碼格式接收運算元A(i)和運算元B(i)的序列，其中(i)從0到N-1；將運算元A(i)和運算元B(i)相乘以產生包含乘法器輸出指數和乘法器輸出有效數的格式的項A(i)*B(i)，並將乘法器輸出有效數轉換為2的補數格式；使用進位保留加法器將項A(i)*B(i)的2的補數格式有效數相加到總和S(i-1)的有效數，並為總和S(i)產生和及進位值；從A(i)*B(i)的乘法器輸出指數與總和S(i-1)的指數中選擇總和S(i)的指數，以產生總和S(i)的指數；以及將和及進位值與總和S(i)的指數轉換為正規化浮點編碼格式。 The technology described herein provides a multiply-accumulate method to calculate a sum S(i) of terms A(i)*B(i), where (i) ranges from 0 to N-1 and N is the number of terms in the sum. The method may include receiving a sequence of operands A(i) and B(i) in a floating point encoding format, where (i) ranges from 0 to N-1; multiplying the operands A(i) and B(i) to produce a term A(i)*B(i) in a format including a multiplier output exponent and a multiplier output significand, and converting the multiplier output significand to a 2's complement format; using a carry-save adder to add the 2's complement format significand of the term A(i)*B(i) to the significand of the sum S(i-1), and generating sum and carry values for the sum S(i); selecting the exponent of the sum S(i) from the multiplier output exponent of A(i)*B(i) and the exponent of the sum S(i-1); and converting the sum and carry values and the exponent of the sum S(i) to a normalized floating point encoding format.
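
The carry-save accumulation step of this method can be sketched in Python as shown below. The sketch is illustrative only: it assumes that the significands have already been aligned to a common exponent, it uses a 34-bit 2's complement width as an assumption borrowed from the significand width described later in this disclosure, and the names `carry_save_add` and `accumulate` are hypothetical.

```python
WIDTH = 34                      # assumed significand width in bits
MASK = (1 << WIDTH) - 1


def carry_save_add(x: int, s: int, c: int):
    """3:2 compression: return (sum, carry) with sum + carry == x + s + c mod 2**WIDTH."""
    new_sum = (x ^ s ^ c) & MASK
    new_carry = (((x & s) | (x & c) | (s & c)) << 1) & MASK
    return new_sum, new_carry


def from_twos_complement(v: int) -> int:
    """Interpret a WIDTH-bit pattern as a signed integer."""
    return v - (1 << WIDTH) if (v >> (WIDTH - 1)) & 1 else v


def accumulate(terms):
    """Accumulate already-aligned signed terms, keeping the running sum in carry-save form."""
    s, c = 0, 0
    for t in terms:
        # No carry propagates across the word inside the loop.
        s, c = carry_save_add(t & MASK, s, c)
    # A single carry-propagate addition is needed only at the very end.
    return from_twos_complement((s + c) & MASK)
```

Keeping the running sum as a separate sum word and carry word is what removes the long carry chain from each accumulation step; the carry-propagate addition is deferred to the final conversion.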

此外，所述方法可以包括以基底8格式提供乘法器輸出指數和項A(i)*B(i)的乘法器輸出有效數，以及在轉換為可以是基底2的正規化浮點編碼格式之前以基底8格式產生和及進位值與總和S(i)的指數。 In addition, the method may include providing the multiplier output exponent and the multiplier output significand of the term A(i)*B(i) in base 8 format, and generating the sum and carry value and the exponent of the sum S(i) in base 8 format before converting to a normalized floating point encoding format that may be base 2.

累加加法階段所需的對齊取決於許多狀況，包括總和S(i-1)有效數溢位、總和S(i-1)符號擴展以及加數：項A(i)*B(i)及總和S(i-1)的指數之間的差異。可以確定這些狀況並將其組合用於在同一管線週期(例如，六級範例中的第三級)中進行對齊，從而實現快速執行和更快的管線時脈。在本文提供的實施例中，所述單元執行計算項A(i)*B(i)的總和S(i)的方法，其中(i)從0到N-1，並且N為總和的項數，所述方法包含：接收浮點編碼格式的運算元A(i)和運算元B(i)序列，其中(i)從0到N-1；將運算元A(i)和運算元B(i)相乘以在第一個管線週期期間產生項A(i)*B(i)，其格式包括項A(i)*B(i)的乘法器輸出指數和項A(i)*B(i)的乘法器輸出有效數，並在第一個管線週期期間將項A(i)*B(i)的乘法器輸出指數與總和S(i-1)的累加器輸出指數進行比較以針對總和S(i)產生比較訊號；將項A(i)*B(i)相加到總和S(i-1)以在下一個管線週期期間產生總和S(i)，其格式包括總和S(i)的累加器輸出指數及總和S(i)的累加器輸出有效數，其中所述加法包括確定總和S(i)的累加器輸出指數，並將總和S(i-1)的累加器輸出有效數和項A(i)*B(i)的乘法器輸出有效數中的一者或兩者移位作為總和S(i)的比較訊號的結果。 The alignment required in the accumulation addition stage depends on many conditions, including overflow of the significand of the sum S(i-1), sign extension of the sum S(i-1), and the difference between the exponents of the addends: the term A(i)*B(i) and the sum S(i-1). These conditions can be determined and combined for alignment in the same pipeline cycle (e.g., the third stage in the six-stage example), thereby achieving fast execution and a faster pipeline clock. In an embodiment provided herein, the unit executes a method for calculating a sum S(i) of terms A(i)*B(i), where (i) ranges from 0 to N-1 and N is the number of terms in the sum, the method comprising: receiving a sequence of operands A(i) and B(i) in a floating point encoding format, where (i) ranges from 0 to N-1; multiplying the operands A(i) and B(i) to generate the term A(i)*B(i) during a first pipeline cycle, in a format that includes a multiplier output exponent of the term A(i)*B(i) and a multiplier output significand of the term A(i)*B(i), and, during the first pipeline cycle, comparing the multiplier output exponent of the term A(i)*B(i) with the accumulator output exponent of the sum S(i-1) to generate a comparison signal for the sum S(i); and adding the term A(i)*B(i) to the sum S(i-1) to generate the sum S(i) during the next pipeline cycle, in a format that includes the accumulator output exponent of the sum S(i) and the accumulator output significand of the sum S(i), wherein the addition includes determining the accumulator output exponent of the sum S(i), and shifting one or both of the accumulator output significand of the sum S(i-1) and the multiplier output significand of the term A(i)*B(i) in response to the comparison signal for the sum S(i).

執行在第一個管線週期期間將項A(i)*B(i)的乘法器輸出指數與總和S(i-1)的累加器輸出指數進行比較的步驟，以產生用於總和S(i)的比較訊號，同時在下一個管線週期(早期指數比較)中執行對運算元的調整使得能夠使用具有較短關鍵時序路徑並且可在較高時脈速度下運算的累加器級的管線。 The step of comparing the multiplier output exponent of the term A(i)*B(i) with the accumulator output exponent of the sum S(i-1) is performed during the first pipeline cycle to generate the comparison signal for the sum S(i), while the adjustment of the operands is performed in the next pipeline cycle. This early exponent comparison enables a pipeline whose accumulator stage has a shorter critical timing path and can operate at a higher clock speed.

浮點乘數Floating point multiplier

浮點乘法器包括指數電路和有效數電路。指數部分執行運算元指數的加法，而有效數部分執行運算元有效數的二進制乘法。進入乘法器的運算元是「正規化」浮點數，其中第一位元是1。因此，運算元有效數(m)介於 $1 \le m < 2$ 之間，即大於或等於1且小於2。因此，兩個運算元有效數的乘積在 $1 \le p < 4$ 的範圍內，並且永遠不會等於或大於4。 A floating-point multiplier consists of an exponent circuit and a significand circuit. The exponent section performs addition of the operand exponents, while the significand section performs binary multiplication of the operand significands. The operands entering the multiplier are "normalized" floating-point numbers, where the first bit is 1. Therefore, the operand significand (m) lies in the range $1 \le m < 2$, that is, greater than or equal to 1 and less than 2. Consequently, the product of the two operand significands lies in the range $1 \le p < 4$ and is never equal to or greater than 4.

如果作為有效數乘法的結果的乘積p在 $2 \le p < 4$ 的範圍內，則指數將遞增，而有效數向右移位一個二進制位置以進行正規化。 If the product p resulting from the significand multiplication is in the range $2 \le p < 4$, the exponent is incremented and the significand is shifted right by one binary position to normalize.
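
The normalization rule for the significand product can be illustrated with a short fixed-point sketch in Python. The function name `multiply_significands` is hypothetical, and the sketch assumes 7-bit BF16 fraction fields with the implicit leading 1 restored. Rather than discarding a bit on the right shift, the sketch keeps all product bits and returns the result with one extra fractional bit, which is a simplification and not a description of the actual circuit.

```python
def multiply_significands(fa: int, fb: int, frac_bits: int = 7):
    """Multiply two significands 1.fa and 1.fb.

    Returns (sig, exp_inc) where sig is a fixed-point value with
    2*frac_bits + 1 fractional bits satisfying 1 <= sig / 2**(2*frac_bits + 1) < 2,
    and exp_inc is 1 when the raw product was in [2, 4), else 0.
    """
    ma = (1 << frac_bits) | fa        # restore the implicit leading 1
    mb = (1 << frac_bits) | fb
    p = ma * mb                       # 2*frac_bits fractional bits, 1 <= p < 4
    two = 2 << (2 * frac_bits)        # fixed-point encoding of 2.0
    if p >= two:
        # Product in [2, 4): move the binary point one place and bump the exponent.
        return p, 1
    # Product in [1, 2): already normalized, rescale to the common format.
    return p << 1, 0
```

For example, 1.5 × 1.5 = 2.25 is returned as the normalized significand 1.125 with an exponent increment of 1.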

第一管線階段使用8x8位元整數乘法器執行指數的加法和運算元有效數的乘法，包括用於部分乘積的進位保留加法器。在使用進位保留加法器對所有部分乘積加總之後，乘法器陣列的結果可以包括兩部分：乘法器陣列的最高有效部分中部分乘積的進位保留加法器的8位元總和與9位元進位，以及乘法器陣列的最低有效部分的8位元乘積。在此範例中，使用漣波進位加法器將最低有效部分中8位元的部分乘積相加，因為這些位元來自部分乘積簡化樹。這種總和可以使用漣波進位加法器來完成，因為來自乘法器的最低有效部分的時間到達分佈是從最低有效數位元(LSB)到最高有效數位元(MSB)及時到達的那部分，足夠做漣波進位加法器。施加漣波進位加法器(RCA)顯著降低了乘法器的複雜度(圖4a)。 The first pipeline stage performs the addition of the exponents and the multiplication of the operand significands using an 8x8-bit integer multiplier that includes carry-save adders for the partial products. After all partial products have been summed using the carry-save adders, the result of the multiplier array can include two parts: the 8-bit sum and 9-bit carry from the carry-save adders for the partial products in the most significant part of the multiplier array, and the 8-bit product of the least significant part of the multiplier array. In this example, the 8 bits of partial products in the least significant part are summed using a ripple-carry adder as they emerge from the partial product reduction tree. A ripple-carry adder is sufficient here because the bits of the least significant part arrive from the multiplier staggered in time, from the least significant bit (LSB) to the most significant bit (MSB), which matches the order in which a ripple-carry adder consumes them. Using a ripple-carry adder (RCA) significantly reduces the complexity of the multiplier (FIG. 4a).

所述級包括響應於在管線時脈上暫存的第一和第二輸入運算元，在管線時脈之前提供乘法器有效數和乘法器指數值的乘法器電路。乘法器電路包括有效數乘法器電路和指數加法器電路，所述有效數乘法器電路具有用以產生進位及總和值的部分乘積以產生乘法器輸出有效數的高階位元的進位保留加法器，和用以產生有效數進位及總和值輸出的低階位元的部分乘積的漣波進位加法器。此外，乘法器電路包括基底8轉換電路，用以將乘法器有效數和乘法器指數值轉換為乘法器輸出指數和有效數的基底8格式；以及2的補數轉換電路，用以將乘法器的有效數值轉換為乘法器輸出有效數的2的補數表示。 The stage includes a multiplier circuit that provides a multiplier significand and a multiplier exponent value prior to the pipeline clock in response to first and second input operands that are temporarily stored on the pipeline clock. The multiplier circuit includes a significand multiplier circuit and an exponent adder circuit, the significand multiplier circuit having a carry-save adder for generating partial products of carry and sum values to generate high-order bits of a multiplier output significand, and a ripple-carry adder for generating partial products of the significand carry and sum value output low-order bits. In addition, the multiplier circuit includes a base-8 conversion circuit for converting the multiplier significand and the multiplier exponent value into a base-8 format of the multiplier output exponent and significand; and a two's complement conversion circuit for converting the multiplier significand value into a two's complement representation of the multiplier output significand.

指數是單獨相加的。兩個指數都為大於零的正數。當加法結果是大於256的數字時，指示是來自指數加法器的進位輸出訊號。如果結果指數等於255，則判定正無窮大指示。如果指數等於0，則根據IEEE 754標準規則將有效數設置為零。在此實現中，如果乘積的指數為0，則結果的有效數強制為0，因此表示+/-零浮點數(圖4b)。在其它實施例中，可以不同地對待次正規數。 The exponents are added separately. Both exponents are positive numbers greater than zero. When the addition result is a number greater than 256, the indication is the carry-out signal from the exponent adder. If the result exponent is equal to 255, a positive infinite indication is determined. If the exponent is equal to 0, the significand is set to zero according to the IEEE 754 standard rules. In this implementation, if the exponent of the product is 0, the significand of the result is forced to 0, thus representing a +/- zero floating point number (Figure 4b). In other embodiments, subnormal numbers may be treated differently.

指數加法需要從結果中減去127，因為在BF16和FP32編碼格式中，兩個運算元都包含127偏置。將結果加129可以加快轉換程序，這是透過反相輸入之一的指數的MSB並將1引入加法器的進位輸入來實現的。這極大地簡化了電路並且可以減少管線階段所需的時間(圖4b)。 Adding the exponent requires subtracting 127 from the result because both operands contain a 127 bias in the BF16 and FP32 encoding formats. Adding 129 to the result speeds up the conversion process, which is accomplished by inverting the MSB of the exponent at one of the inputs and introducing a 1 into the carry input of the adder. This greatly simplifies the circuit and can reduce the time required for pipeline stages (Figure 4b).

我們透過以下方式證明這個程序的正確性：加法致使127的兩個偏置相加，使偏置為254。然而，由於加法器的進位輸出(即256)被忽略，結果偏置將是-2。我們可以透過在運算結果中加上129來得到127。這是透過反相運算元的MSB來實現的，在負運算元的情況下，這相當於加128，因為MSB位置包含零。在MSB等於1的正運算元的情況下，這也相當於加128。進位輸入處的額外1使結果偏置：-2+129，等於所需的127偏置。 We prove the correctness of this procedure in the following way: The addition results in the addition of two biases of 127, resulting in a bias of 254. However, since the carry-out of the adder (i.e., 256) is ignored, the resulting bias will be -2. We can get 127 by adding 129 to the result of the operation. This is done by inverting the MSB of the operand, which in the case of a negative operand is equivalent to adding 128, since the MSB position contains a zero. In the case of a positive operand with an MSB equal to 1, this is also equivalent to adding 128. The extra 1 at the carry-in biases the result: -2+129, which is equal to the desired bias of 127.
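
The argument above can be checked exhaustively with a short Python sketch. The function name `biased_exponent_sum` is hypothetical; the sketch models only an 8-bit adder whose carry-out is discarded, and does not model the wider 10-bit result used for overflow detection.

```python
def biased_exponent_sum(ea: int, eb: int) -> int:
    """Add two biased 8-bit exponents and return the biased 8-bit result.

    Inverting the MSB of one input adds 128 modulo 256, and the carry-in
    adds 1, so the adder computes ea + eb + 129 (mod 256), which equals
    ea + eb - 127 (mod 256).
    """
    eb_msb_inverted = eb ^ 0x80             # invert bit 7 of one operand
    return (ea + eb_msb_inverted + 1) & 0xFF  # carry-in of 1, carry-out dropped
```

Whenever the true result `ea + eb - 127` fits in 8 bits, the function returns exactly that value, so no separate subtraction of the bias is needed.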

相同的管線階段將結果轉換為基底8的數字，其中包含5位元指數和適當地向右移位7個位置的有效數。對於剩餘3指數位元的值表示的數量，轉換成5位元指數需要從第7位置左移。這需要有效數穿過左移位器，所述左移位器將根據8位元指數的3-LSB位元的需求將有效數從0位元向左移位到7位元位置。(圖5b) The same pipeline stage converts the result to a base 8 number consisting of a 5-bit exponent and the significand shifted right by 7 positions as appropriate. Conversion to a 5-bit exponent requires a left shift from the 7th position for the value of the remaining 3 exponent bits to represent the quantity. This requires the significand to pass through a left shifter which will shift the significand left from bit 0 to the 7th position as required by the 3-LSB bits of the 8-bit exponent. (Figure 5b)
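
The split of the 8-bit exponent into a 5-bit exponent and a left-shift amount can be sketched in Python as follows. This is an illustration of the arithmetic relationship only: the name `to_base8` is hypothetical, and exponent bias, the initial right alignment by 7 positions, sign extension and the fixed output width are all omitted.

```python
def to_base8(exp8: int, significand: int):
    """Split an 8-bit exponent into (5-bit exponent, left-shifted significand).

    The three exponent LSBs are removed from the exponent and applied to
    the significand as a left shift of 0 to 7 bit positions, so that
    significand * 2**exp8 == shifted * 2**(8 * exp5).
    """
    exp5 = (exp8 >> 3) & 0x1F      # upper 5 bits become the new exponent
    shift = exp8 & 0x7             # lower 3 bits become the shift amount
    return exp5, significand << shift
```

Because the remaining exponent only changes in steps of 8 bit positions, later alignment shifts in the accumulator can be restricted to multiples of 8.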

乘法器透過識別源自部分乘積簡化樹(PPRT)的訊號到達分佈不均勻來節省計算時間。LSB位元首先到達，接著是下一個位元，對於PPRT的前8個最低有效數位元(LSB)依此類推。由於不相等的到達分佈，LSB部分的相加可以在乘法器陣列的延遲下被屏蔽(「隱藏」)，從而為管線階段(例如，在上面概述的範例中的第二管線階段)提供節省(在時間上)。對LSB部分進行加總使用8位元漣波進位加法器(RCA)，以將使用用於部分乘積的進位保留加法器的進位傳播加法器(CPA)的大小從17位元減少到9位元。用於下一個管線階段的MSB部分，包括只有9位元長的最終加法器。乘積的有效數在管線階段中透過將最終加法器的最高有效數9位元相加並利用先前使用前一個管線階段的漣波進位加法器形成的最低有效數8位元進行擴充來形成(圖4a)。 The multiplier saves computation time by recognizing that the arrival distribution of signals originating from the partial product reduction tree (PPRT) is uneven. The LSB bit arrives first, followed by the next bit, and so on for the first 8 least significant bits (LSBs) of the PPRT. Due to the unequal arrival distribution, the addition of the LSB portion can be masked ("hidden") under the delay of the multiplier array, thereby providing savings (in time) to the pipeline stage (e.g., the second pipeline stage in the example outlined above). Summing the LSB portion uses an 8-bit ripple carry adder (RCA) to reduce the size of the carry-propagate adder (CPA) using the carry-save adder for the partial product from 17 bits to 9 bits. The MSB portion for the next pipeline stage, including the final adder, is only 9 bits long. The significand of the product is formed in the pipeline stage by adding the most significant 9 bits from the final adder and extending it with the least significant 8 bits previously formed using the ripple-carry adder from the previous pipeline stage (Figure 4a).

圖3是具有兩個輸入(線213上的運算元A和線214上的運算元B)的乘法器電路202的簡化方塊圖300。乘法器電路202包含兩個方塊(乘法器及加法器方塊210a和指數方塊210b)。 FIG. 3 is a simplified block diagram 300 of a multiplier circuit 202 having two inputs (operand A on line 213 and operand B on line 214). The multiplier circuit 202 includes two blocks (multiplier and adder block 210a and exponent block 210b).

圖4a說明了乘法器及加法器方塊210a的範例，其顯示8x8乘法器部分乘積簡化樹，具有用於更高有效數位元的部分乘積的進位保留加法器，沒有具有用於較低有效數位元的部分乘積加法的7-LSB漣波進位加法器方塊的最終16位元加法器(在下一階段提供)。運算元A 213儲存在包括三個欄位：Sa、Ea和Fa的暫存器420中。Sa是符號位元。Ea是八個指數位元，而Fa是有效數的小數部分。Fa欄位在線422上施加到8X8 BF16乘法器電路410的第一輸入。運算元B 214儲存在暫存器421中，暫存器421包含三個欄位：Sb、Eb和Fb。Sb是符號位元。Eb是八個指數位元，而Fb是有效數的小數部分。Fb欄位在線423上施加到8X8 BF16乘法器電路410的第二個輸入。在線440上，乘法器電路410的輸入是強制零位元，當為零時，強制8X8 BF16乘法器電路產生零輸出。 FIG4a illustrates an example of a multiplier and adder block 210a showing an 8x8 multiplier partial product reduction tree with carry-save adders for partial products of higher-significant bits without a final 16-bit adder (provided in the next stage) with a 7-LSB ripple carry adder block for partial product addition of lower-significant bits. Operand A 213 is stored in register 420 which includes three fields: Sa, Ea, and Fa. Sa is the sign bit. Ea is the eight exponent bits, and Fa is the fractional portion of the significand. The Fa field is applied to the first input of the 8X8 BF16 multiplier circuit 410 on line 422. Operand B 214 is stored in register 421, which contains three fields: Sb, Eb, and Fb. Sb is the sign bit. Eb is the eight exponent bits, and Fb is the fractional portion of the significand. The Fb field is applied to the second input of the 8X8 BF16 multiplier circuit 410 on line 423. The input to the multiplier circuit 410 on line 440 is the force zero bit, which, when zero, forces the 8X8 BF16 multiplier circuit to produce a zero output.

8X8 BF16乘法器電路410輸出兩個7位元LSB匯流排428和429，其為7位元漣波進位加法器430的輸入。此外，8X8 BF16乘法器電路410輸出8個和位元S8 426和9個進位位元C9 427。在線424上，7位元漣波進位加法器430輸出7個位元，而在線425上，進位輸出位元COUT輸出到暫存器450。暫存器450具有以下映射：線424映射到PL[6：0]，線425上的COUT映射到C7，線426上的S8映射到Sp[14：7]，線427上的C9映射到Cp[14：6]。 The 8X8 BF16 multiplier circuit 410 outputs two 7-bit LSB busses 428 and 429, which are inputs to the 7-bit ripple-carry adder 430. In addition, the 8X8 BF16 multiplier circuit 410 outputs 8 sum bits S8 426 and 9 carry bits C9 427. On line 424, the 7-bit ripple-carry adder 430 outputs 7 bits, and on line 425, the carry output bit COUT is output to the register 450. The register 450 has the following mapping: line 424 maps to PL[6:0], COUT on line 425 maps to C7, S8 on line 426 maps to Sp[14:7], and C9 on line 427 maps to Cp[14:6].
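
The division of the final addition into a low-order ripple-carry part and a high-order part can be sketched in Python as shown below. This is a generic sketch under stated assumptions: the product is assumed to be available as two vectors whose ordinary sum equals the product, the split point is taken as 7 bits to match the 7-bit ripple-carry adder 430, and the exact bit weights of the Sp and Cp fields in register 450 are not modeled. The name `split_final_add` is hypothetical.

```python
LOW_BITS = 7  # width of the low-order ripple-carry addition (assumed split point)


def split_final_add(sum_vec: int, carry_vec: int) -> int:
    """Add two vectors by handling the low and high parts separately.

    The low LOW_BITS bits are added first (modeling the ripple-carry adder)
    and yield PL plus a carry-out C7. The high parts are then added
    together with C7 (modeling the shorter final adder).
    """
    low_mask = (1 << LOW_BITS) - 1
    low = (sum_vec & low_mask) + (carry_vec & low_mask)
    pl = low & low_mask                 # low-order product bits PL[6:0]
    c7 = low >> LOW_BITS                # carry-out of the low-order adder
    high = (sum_vec >> LOW_BITS) + (carry_vec >> LOW_BITS) + c7
    return (high << LOW_BITS) | pl
```

The result equals `sum_vec + carry_vec`, which shows that the low-order addition can be completed in an earlier stage while only the narrower high-order addition remains for the next stage.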

圖4b說明了具有特殊指數檢測方塊467的範例指數單元(例如圖3的210b)。如圖4a中的運算元A 213在暫存器420中，而如圖4a中的運算元B在暫存器421中。線465上的Ea是特殊指數檢測方塊和指數加法器電路464的一個輸入。線462上的Eb是特殊指數檢測方塊的第二輸入。線462上的Eb的7個最低有效數位元被輸入到指數加法器電路464，而第8位元在第8位元位置進入指數加法器電路464之前由反相器461反相。指數加法器電路464的進位值設置為「1」。 FIG. 4b illustrates an example exponent unit (e.g., 210b of FIG. 3 ) with a special exponent detection block 467. Operand A 213 as in FIG. 4a is in register 420, and operand B as in FIG. 4a is in register 421. Ea on line 465 is an input to the special exponent detection block and exponent adder circuit 464. Eb on line 462 is a second input to the special exponent detection block. The 7 least significant digits of Eb on line 462 are input to the exponent adder circuit 464, and the 8th bit is inverted by inverter 461 before entering the exponent adder circuit 464 at the 8th bit position. The carry value of the exponent adder circuit 464 is set to "1".

指數加法器電路464對Ea 465和Eb 462進行運算，將它們加在一起並減去127的偏置值。輸出是到暫存器470的10位元值466。兩個額外位元，超出了編碼指數所需的8位元，用於檢測指數溢位情況。這10位元在指數異常處理電路524中進一步檢查，如圖5C所示。 Exponent adder circuit 464 operates on Ea 465 and Eb 462, adding them together and subtracting the bias value of 127. The output is a 10-bit value 466 to register 470. Two extra bits, beyond the 8 bits required to encode the exponent, are used to detect exponent overflow conditions. These 10 bits are further checked in exponent exception handling circuit 524, as shown in Figure 5C.

輸入指數訊號在特殊指數檢測方塊467中被檢查為零，如線468上的訊號所示，或無效，如線469上的訊號所示。來自暫存器420和421的符號位元Sa及Sb輸入到反互斥或閘471a，其輸出被施加到反互斥或閘471b。此外，線469上的無效訊號被輸入到反互斥或閘471c。如果無效訊號為零，則結果符號是Sa及Sb的互斥或函數。如果Invalid為真(等於「1」)，則乘積符號Sp設置為「零」，如編碼標準中所指定。 The input exponent signals are checked in special exponent detection block 467 for zero, as indicated by the signal on line 468, or for an invalid value, as indicated by the signal on line 469. The sign bits Sa and Sb from registers 420 and 421 are input to exclusive-NOR gate 471a, whose output is applied to exclusive-NOR gate 471b. In addition, the invalid signal on line 469 is input to exclusive-NOR gate 471c. If the invalid signal is zero, the result sign is the exclusive-OR function of Sa and Sb. If Invalid is true (equal to "1"), the product sign Sp is set to "zero" as specified in the encoding standard.
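
The resulting sign behavior can be summarized by a small Python sketch. It models only the input-output behavior stated above, not the gate-level structure of gates 471a to 471c, and the function name `product_sign` is hypothetical.

```python
def product_sign(sa: int, sb: int, invalid: int) -> int:
    """Return the product sign bit Sp.

    Sp is the exclusive-OR of the operand signs unless the invalid
    signal is asserted, in which case Sp is forced to 0.
    """
    if invalid:
        return 0
    return (sa ^ sb) & 1
```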

基底8轉換Base 8 conversion

圖5A是顯示基底8轉換器方塊592(例如圖2的方塊220)的簡化圖。基底8轉換器方塊592包含兩個子方塊，在此範例中，最終加法有效數選擇和基底8轉換子方塊592a和指數異常處理子方塊592b。 FIG5A is a simplified diagram showing a base-8 converter block 592 (e.g., block 220 of FIG2 ). The base-8 converter block 592 includes two sub-blocks, in this example, a final addition significand selection and base-8 conversion sub-block 592a and an exponent exception handling sub-block 592b.

轉換為基底8，2的補數有效數Convert to base 8, 2's complement significand

外部輸入運算元A在第二管線階段轉換為基底8編碼。運算元A有效數被轉換為2的補數有效數。有效數擴展為34位元，包括兩個有效數符號位元。如圖5b所示，得到的管線暫存器520包含5位元指數、34位元有效數和兩個額外的狀態位元，總共41位元。 The external input operand A is converted to base 8 encoding in the second pipeline stage. The operand A significand is converted to a 2's complement significand. The significand is expanded to 34 bits, including two significand sign bits. As shown in FIG. 5b, the resulting pipeline register 520 contains a 5-bit exponent, a 34-bit significand, and two additional status bits, for a total of 41 bits.

使用指數的最後3位元實現到基底8的轉換，以將來自圖4a的暫存器450的24位元運算元有效數對齊為32位元基底8有效數，其中有效數的LSB與32位元有效數的LSB對齊，如果指數的3-LSB等於0(從二進制點向右移位8個位置)。由指數的3-LSB表示的任何值都為有效數向左移位的量(從第8位元位置開始)，以補償從指數中截斷的那些位元。直到二進制點的其餘位元，以及超過的兩位元，都用符號擴展位元填充。在所有三個指數LSB都為b'1的情況下，也就是說，等於十進制7，32位元有效數的第一個有效數位元將是非零位元，也就是說，正規化有效數。由於有效數表示為2的補數，有效數點左側的兩個額外位元將用於儲存符號位元(包括擴展符號位元)。使用額外的第二符號位元，而不是一個，以便保留符號，因為可能的溢位情況會致使2位元整數覆蓋較低符號位元(圖5b)。 The conversion to base 8 is performed using the last 3 bits of the exponent to align the 24-bit operand significand from register 450 of FIG. 4a to a 32-bit base 8 significand, where the LSB of the significand is aligned with the LSB of the 32-bit significand if the 3-LSB of the exponent is equal to 0 (shifted 8 positions right from the binary point). Any value represented by the 3-LSB of the exponent is the amount by which the significand is shifted left (starting at the 8th bit position) to compensate for those bits truncated from the exponent. The remaining bits up to the binary point, and any two bits beyond, are filled with sign-extension bits. In the case where all three exponent LSBs are b'1, that is, equal to decimal 7, the first significand bit of the 32-bit significand will be nonzero, that is, the normalized significand. Since the significand is represented as a 2's complement, the two extra bits to the left of the significand point are used to store the sign bit (including the extended sign bit). The extra second sign bit is used, rather than one, in order to preserve the sign because a possible overflow condition would cause the 2-bit integer to overwrite the lower sign bit (Figure 5b).

根據乘積的符號，有效數被穿過或被反相，以建立有效數的2的補數負表示。此實現不同於IEEE 754，其中有效數可以是正數或負數。此運算是透過在24位元有效數上相加一個符號位元並在符號等於1(負數)時反相這些位元來執行的。 Depending on the sign of the product, the significand is either passed through unchanged or inverted to create a 2's complement negative representation of the significand. Unlike IEEE 754, in this implementation the significand itself can be positive or negative. The operation is performed by adding a sign bit to the 24-bit significand and inverting the bits if the sign equals 1 (negative).
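
The conversion from a sign bit and magnitude to a 2's complement significand can be sketched in Python as follows. The 34-bit width follows the significand width stated above; the function names are hypothetical, and the sketch uses the invert-and-add-1 rule without modeling the shifter that precedes it.

```python
WIDTH = 34                      # 32-bit significand plus two sign bits
MASK = (1 << WIDTH) - 1


def to_twos_complement(sign: int, magnitude: int) -> int:
    """Return the WIDTH-bit 2's complement pattern for (-1)**sign * magnitude."""
    if sign:
        # Invert all bits and add 1; the upper bits fill with sign extension.
        return ((~magnitude) + 1) & MASK
    return magnitude & MASK


def from_twos_complement(pattern: int) -> int:
    """Interpret a WIDTH-bit pattern as a signed integer."""
    return pattern - (1 << WIDTH) if (pattern >> (WIDTH - 1)) & 1 else pattern
```

For a negative value the two uppermost bits are both 1, which corresponds to the repeated sign bits kept to the left of the 32-bit significand.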

指數在-126到126之間的值檢查。如果大於 126，則將其視為無窮大；如果小於-126，則將其視為非正規化數(小於-126)並轉換為零(圖5c)。 The exponent is checked between -126 and 126. If it is greater than 126, it is considered infinite; if it is less than -126, it is considered a denormalized number (less than -126) and converted to zero (Figure 5c).

在某些實現中，這一階段管線的最終暫存器包含正規化浮點乘積，所述乘積具有5位元指數和34位元2的補數有效數，(含有乘積的重複符號，並且沒有隱含的1)和三個指數狀態位元。 In some implementations, the final register at this stage of the pipeline contains the normalized floating-point product with a 5-bit exponent and a 34-bit 2's complement significand (with repeated signs of the product and no implicit 1s) and three exponent status bits.

圖5B說明了最終部分乘積加法、有效數選擇和基底8轉換子方塊592a的範例性示意圖。暫存器450(圖4a)包含PL[6：0]、C7、Sp[14：7]和Cp[14：6]的欄位。暫存器470(圖4b)包含10位元乘積指數(Ep)值的欄位。暫存器504包括狀態位元。 FIG5B illustrates an exemplary schematic diagram of the final partial product addition, significand selection, and base 8 conversion subblock 592a. Register 450 (FIG4a) includes fields for PL[6:0], C7, Sp[14:7], and Cp[14:6]. Register 470 (FIG4b) includes fields for a 10-bit product exponent (Ep) value. Register 504 includes status bits.

在管線以FP32加法模式運行的情況下，運算元A為FP32格式並繞過乘法器。在這種情況下，運算元A源自暫存器460，佔用兩個組合的16位元暫存器420和421。線511上的add_op控制訊號指示何時將管線模式設置為加法(本例中為單精確度浮點)或累加。 In the case where the pipeline is running in FP32 addition mode, operand A is in FP32 format and bypasses the multiplier. In this case, operand A is sourced from register 460 and occupies two combined 16-bit registers 420 and 421. The add_op control signal on line 511 indicates when the pipeline mode is set to addition (single precision floating point in this case) or accumulation.

有效數最終加法器電路502在線503上接收Sp[14：7]、線501上的Cp[14：6]和線507上的進位位元C7作為輸入，將溢位訊號519輸出到溢位選擇電路506。溢位選擇電路506具有輸入匯流排523，其是線509上的PL[6：0]和線521上的有效數最終加法器電路502輸出的組合。反或閘522具有線525上的輸入指數溢位位元及線468上的零強制位元並輸出線527上的訊號。溢位選擇電路506輸出的線527和匯流排529上的訊號路由到及閘544，其在指數溢位的情況下以及在有效數被強制為零的情況下將有效數設置為全零。此外，有效數選擇電路512使用線511上的add_op控制訊號在匯流排515上的旁路有效數Fa[22：0]或匯流排553上的及閘544輸出之間進行選擇。 Significand final adder circuit 502 receives Sp[14:7] on line 503, Cp[14:6] on line 501, and carry bit C7 on line 507 as inputs, and outputs overflow signal 519 to overflow select circuit 506. Overflow select circuit 506 has input bus 523, which is a combination of PL[6:0] on line 509 and the significand final adder circuit 502 output on line 521. NOR gate 522 has input exponent overflow bit on line 525 and zero force bit on line 468 and outputs signal on line 527. Overflow select circuit 506 outputs signals on line 527 and bus 529 that are routed to AND gate 544, which sets the significand to all zeros in the event of exponent overflow and in the event that the significand is forced to zero. Additionally, significand select circuit 512 uses the add_op control signal on line 511 to select between the bypassed significand Fa[22:0] on bus 515 or the AND gate 544 output on bus 553.

指數選擇電路510在8指數位元、線517上的Ep[7：0]或線513上的旁路指數位元Ea[30：23]位元之間進行選擇，並將線533上的選定指數輸出到暫存器520的E_mult欄位。符號位元選擇電路508接收在線473上的Sp符號位元(圖4b)和旁路符號位元Sa作為輸入，並將符號位元531輸出到暫存器520中的S_mult欄位。 The exponent select circuit 510 selects between the 8 exponent bits Ep[7:0] on line 517 and the bypassed exponent bits Ea[30:23] on line 513, and outputs the selected exponent on line 533 to the E_mult field of register 520. The sign bit select circuit 508 receives the Sp sign bit on line 473 (FIG. 4b) and the bypassed sign bit Sa as inputs, and outputs the sign bit 531 to the S_mult field in register 520.

線511上的add_op控制訊號路由到有效數選擇電路512、指數選擇電路510和符號位元選擇電路508作為它們的控制輸入。 The add_op control signal on line 511 is routed to the significand select circuit 512, the exponent select circuit 510, and the sign bit select circuit 508 as their control inputs.

有效數選擇電路512的輸出進入8位元左移位器電路514。來自指數選擇電路510的線533的較低三位元[2：0]在線533上輸出以控制8位元左移位器電路514。8位元左移位器電路514的輸出匯流排537饋入多工器電路518，所述多工器電路518在線537上的輸入(如果有效數為正的情況)和線539上的輸入(有效數為負的情況)之間進行選擇。這由符號位元531選擇。2的補數反相加1電路516在線537上建立移位器輸出的2的補數，並在線539上輸出補數值。線541上的多工器電路518的輸出以34位元F_mult有效數進入管線暫存器520。所述程序將選定的有效數轉換為2的補數表示的有效數，所述有效數為32位元長，具有2個符號位元，儲存在管線暫存器520中。 The output of the significand select circuit 512 enters the 8-bit left shifter circuit 514. The lower three bits [2:0] of the selected exponent on line 533 from the exponent select circuit 510 control the 8-bit left shifter circuit 514. The output bus 537 of the 8-bit left shifter circuit 514 feeds the multiplexer circuit 518, which selects between the input on line 537 (when the significand is positive) and the input on line 539 (when the significand is negative). The selection is made by the sign bit 531. The 2's complement invert-and-add-1 circuit 516 forms the 2's complement of the shifter output on line 537 and outputs the complemented value on line 539. The output of multiplexer circuit 518 on line 541 enters pipeline register 520 as the 34-bit F_mult significand. This process converts the selected significand into a 2's complement significand that is 32 bits long with 2 sign bits and is stored in pipeline register 520.

FIG. 5C illustrates a block diagram of the exponent exception handling sub-block 592b. As described with reference to FIG. 5B, the significand final adder circuit 502 receives inputs Sp[14:7] 503, Cp[14:6] 501, and carry bit C7 507 from register 450. The overflow output of the significand final adder circuit 502 is connected to the exponent exception handling circuit 524. When an overflow condition is detected, the significand final adder circuit 502 asserts the overflow signal 519 as the first input of the exponent exception handling circuit 524. The second input of the exponent exception handling circuit 524 is the exponent bits Ep[9:0] (the sum of the input operand exponents) from register 470 on bus 517. The third input is the output signal of the exponent overflow detection circuit 522 on line 523. The output of the exponent exception handling circuit 524 is then input on line 549 to the exponent exception detection circuit 526 and to the exponent select circuit 510 (described with reference to FIG. 5B).

The exponent bits Ep[9:0] on bus 517 are input to the exponent overflow detection circuit 522, which detects the following overflow conditions:

exp_ovf = Ec[8]: if bit 8 is 1, an exponent overflow is detected.

exp_povf = ~Ec[9] & Ec[8]: if bit 9 is 0 and bit 8 is 1, the overflow is positive.

exp_novf = Ec[9] & Ec[8]: if both bit 9 and bit 8 are 1, the overflow is negative.
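The three overflow conditions are simple bit tests on the 10-bit exponent value. A minimal sketch follows, assuming the exponent is passed in as a plain integer; the function name `exponent_overflow` and the dictionary return type are illustrative choices, not part of the disclosure.

```python
def exponent_overflow(e: int) -> dict:
    """Evaluate the overflow conditions on a 10-bit exponent value e[9:0]."""
    bit8 = (e >> 8) & 1
    bit9 = (e >> 9) & 1
    return {
        "exp_ovf": bit8,                  # any overflow: bit 8 is set
        "exp_povf": (bit9 ^ 1) & bit8,    # positive overflow: bit 9 clear, bit 8 set
        "exp_novf": bit9 & bit8,          # negative overflow: bits 9 and 8 both set
    }
```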

The first output of circuit 522, on line 523, is routed to the exponent exception handling circuit 524; the second output, on line 543, is routed to the output exception control signal generation circuit 528; and the third output comprises the exponent overflow bit on line 525 to gate 522 in FIG. 5B.

The exponent exception detection circuit 526 outputs exceptions comprising the following three bits to register 532, for example via bus 547: of (overflow), uf (underflow), and nv (invalid).

These are raised when the following conditions are detected:

of (overflow): if Ec is 11111111 and infinity is not detected, the condition is interpreted as overflow.

uf (underflow): if Ec is 00000000 and zero (significand) is not signaled, the condition is underflow.

nv (invalid) "1": the result is invalid.
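The overflow and underflow flags above reduce to comparisons against the all-ones and all-zeros exponent codes. The sketch below assumes the infinity, zero and invalid indications arrive as already-decoded booleans; the text does not say how nv is derived, so it is passed through unchanged. All names are hypothetical.

```python
def exception_flags(ec: int, infinity_detected: bool,
                    zero_signaled: bool, invalid: bool) -> dict:
    """Derive the of/uf/nv bits from an 8-bit exponent and pre-decoded status inputs."""
    of = ec == 0b11111111 and not infinity_detected   # all ones, but not a true infinity
    uf = ec == 0b00000000 and not zero_signaled       # all zeros, but not a true zero
    return {"of": int(of), "uf": int(uf), "nv": int(invalid)}
```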

The output exception control signal generation circuit 528 has four inputs. The first input is the add_op control signal on line 511, which indicates accumulate or bypass-add mode. The second input is the status bits on line 509 (infinity, zero, or invalid). The third input, on line 545, is routed from the exponent select circuit 510, which multiplexes between the Ea[30:23] bits of register 460 and the output of the exponent exception handling circuit 524. The fourth input is the second output of the exponent overflow detection circuit 522 on line 543. The output exception control signal generation circuit 528 outputs five bits on line 551, representing exp_mul_zero, exp_mul_inf, exp_zero_en, exp_inf_en, and f_zero_en, which are stored in register 530.

exp_mul_zero - Meaning: the multiplier product exponent is zero.

exp_mul_inf - Meaning: the multiplier product exponent is infinity.

exp_zero_en - Meaning: enabled when (one of the multiplier input exponents is zero, and both multiplier input exponents are not zero) or when the multiplier product exponent has a negative overflow.

exp_inf_en - Meaning: enabled when one of the multiplier input exponents is infinity or when the multiplier product exponent has a positive overflow.

f_zero_en - Meaning: enabled when the exp_zero_en signal is enabled, when the multiplier product exponent overflows (positive or negative), or when the multiplier product exponent is zero.

Carry-Save Accumulation Unit

FIG. 6 illustrates a block diagram 600 of a carry-save accumulator (e.g., 240 of FIG. 2) for the significand. The base-8 converter 215 receives operand C as input and outputs operand C in base-8 format on line 219 to multiplexer 210 and multiplexer 211. Two additional inputs to multiplexers 210 and 211 are buses 224 and 225, fed back from the accumulator sum register 242 and the accumulator carry register 241. The outputs of multiplexers 210 and 211 are routed to shifter circuits 609 and 610, which perform a right shift of 8/16/24 bits or a left shift of 8 bits. The outputs of shifter circuits 609 and 610 are routed to the carry-save adder circuit (CSA) 614. The carry-save adder circuit 614 has a third input from the right-shift 8/16/24 circuit 608, whose input is the product 602, either A*B (BF16) or the A (FP32) operand alone. The outputs of the carry-save adder circuit 614 on lines 667 and 669 are routed to the LZA circuit 606, which provides an output to the S-bit register 636, and to the overflow detection block 605, which provides an output to the O-bit register 634.

The carry-save accumulation unit includes a significand circuit that receives, at a first pipeline clock of cycle (i), the multiplier output significand of term A(i)*B(i) and the significands of the fed-back sum and carry values of the previous accumulator output representing the sum value S(i-1). The significand circuit includes a 2's complement carry-save adder to generate the sum and carry accumulator output significand values for the sum S(i) at a second pipeline clock. The carry-save accumulation unit includes an exponent circuit that receives, at the first pipeline clock, the multiplier output exponent of term A(i)*B(i) and the fed-back exponent value of the previous accumulator output representing the sum value S(i-1), to generate the accumulator output exponent value for the sum value S(i) at the second pipeline clock. The significand circuit includes a significand shifter, responsive to an exponent comparison signal stored at the first pipeline clock, to align the multiplier output significand with the fed-back sum and carry values for addition. The exponent circuit is responsive to the exponent comparison signal stored at the first pipeline clock to generate the accumulator output exponent value. The pipeline includes an exponent comparison circuit that compares, before the first pipeline clock, the multiplier output exponent of term A(i)*B(i) with the fed-back exponent value of the sum S(i-1), to generate the exponent comparison signal stored at the first pipeline clock.

The carry-save accumulation unit in this embodiment includes an overflow detector circuit to generate a first condition signal indicating an overflow condition for at least one of the fed-back sum and carry values at the first pipeline clock, and a leading sign bit detector circuit to generate a second condition signal indicating that at least one of the fed-back sum and carry values has, at the first pipeline clock, a number of extended sign bits greater than or equal to 8. The exponent circuit and the significand circuit are also responsive to the first condition signal and the second condition signal. The overflow and leading-sign-bit adjustments are combined with the exponent comparison adjustment so that the shifter implements them in the same pipeline cycle, as described with reference to Table 1: CSA Unit Control below.

In addition, this stage of the pipeline has an accumulator mode and a summing mode, and includes a selector that provides the fed-back accumulator output in the accumulator mode and provides the third floating-point input operand to the significand circuit and the exponent circuit in the summing mode. The significand circuit may include a significand shifter, responsive to the exponent comparison signal stored at the first pipeline clock, to align the multiplier output significand with the fed-back sum and carry values for addition in the accumulator mode, and to align the multiplier output significand with the significand of the third input operand for addition in the summing mode. The exponent circuit is responsive to the exponent comparison signal stored at the first pipeline clock to generate the accumulator output exponent value. The pipeline includes an exponent comparison circuit that, before the first pipeline clock, compares the multiplier output exponent with the fed-back exponent value in the accumulator mode, and with the exponent of the third input operand in the summing mode, to generate the exponent comparison signal stored at the first pipeline clock.

Significand circuit:

The carry-save adder (CSA) significand stage has two paths: the accumulator path, in which the operand from the accumulator can be shifted right by 8, 16, or 24 bits or shifted left by 8 bits, and the multiplier path, in which the operand from the multiplier can be shifted right by 8, 16, or 24 bits. With base-8 exponents, a right shift of 8, 16, or 24 bits corresponds to an exponent difference of 1, 2, or 3 between the operands. The 8-bit left shift is performed when the carry-save adder outputs a number whose sign extension exceeds 8 bits.

If the difference between the operand exponents is greater than 3, one of the operands would have to be shifted right by more than 24 bits, which aligns it too far to the right to fall within the range of the larger operand. This case is equivalent to adding zero to the larger operand, or to using a bypass multiplexer that simply passes the larger operand unchanged to the accumulator (FIG. 7C).

This implementation eliminates the bypass multiplexer by having the CSA add zero when the exponent difference is greater than 3, which is equivalent to bypassing the operand. The inputs to the CSA come from the multiplier and the accumulator and are gated by AND gates. The shifter and exponent control unit detects this condition and sets the appropriate operand to zero. This implementation saves one multiplexer stage in each path.
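The alignment rule and the zero-gating trick can be illustrated with a small behavioral model. It is a sketch under simplifying assumptions: significands are plain signed Python integers rather than carry-save pairs, the same threshold of 3 is applied to both paths, and the overflow and sign-extension adjustments of Table 1 are ignored. The names `align_and_add` and `MAX_DIFF` are hypothetical.

```python
MAX_DIFF = 3   # largest exponent difference the shifters can align (24 bits)


def align_and_add(f_mult: int, e_mult: int, f_acc: int, e_acc: int):
    """Align two signed significands by their exponent difference and add them.

    One exponent step corresponds to 8 bit positions. When the difference
    exceeds MAX_DIFF, the smaller operand is gated to zero instead of being
    routed around the adder, which models the AND-gate bypass in the text.
    """
    diff = e_mult - e_acc
    if diff > MAX_DIFF:
        f_acc = 0                      # accumulator too small: add zero
    elif diff > 0:
        f_acc >>= 8 * diff             # arithmetic right shift by 8, 16 or 24
    elif diff < -MAX_DIFF:
        f_mult = 0                     # product too small: add zero
    elif diff < 0:
        f_mult >>= 8 * (-diff)
    return f_mult + f_acc, max(e_mult, e_acc)
```

With an exponent difference of 1 the smaller operand is shifted right by 8 bits before the addition; with a difference of 8 the larger operand passes through unchanged.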

Detection of sign extension occurs after the 3:2 carry-save adder stage. If the condition is detected, the sign extension bit S and the overflow bit O are set and are handled at the subsequent pipeline clock. So that the sign bit is not lost on overflow, the computation carries a duplicated sign bit. Additional complexity is introduced to improve precision, which involves extending the accumulator to 36 or 40 bits. In another implementation, the detection logic is arranged to improve both timing and accuracy: it takes its inputs from the three inputs 683, 685, 689 of the carry-save adder circuit (CSA) 614 rather than from the two outputs of the carry-save adder circuit (CSA) 614. That arrangement is the subject of another related disclosure.

Exponent circuit:

The "exponent control unit" evaluates the exponent difference between the first exponent operand, from the multiplier, and the second exponent operand, from the accumulator. The exponent control unit examines the conditions generated by comparing the multiplier and accumulator exponents and selects the operand path according to Table 1. At the same time, the new accumulator exponent is determined and stored in the exponent accumulator (Eacc) register 654 (FIG. 7B).

The exponent section has two branches, a left branch and a right branch. The left branch, formed by inputs 671 and 673 (into the OR gate), selects the larger of the two exponents, which then becomes the result exponent. The selection is made according to the conditions in Table 1. The right branch, formed by inputs 675 and 677 (into the exponent output OR gate), selects Ea+1 or Ea-1 according to the conditions described in Table 1. If a significand overflow is signaled, the accumulator significand is shifted right by 8 bits (SHR_8) and the exponent is incremented by 1.

Overflow (O) detection is performed during the carry-save (CS) addition. If an overflow is detected, the O bit is latched into the output pipeline register. The overflow condition is corrected in the next cycle according to Table 1.

CS-Accumulation Implementation

The functions of the exponent path and the significand path are interdependent: they depend on the exponents and on the state of the "sign extension" (SE) and "overflow" (O) signals generated by the significand section. There are two accumulators, one for the carry and one for the sum. They are added to the product using a 3:2 carry-save adder (CSA) and pass through two separate paths, one for the carry and one for the sum.

The destination register of this pipeline stage is an accumulator containing the carry and the sum (two registers). Conversion to conventional format is performed in the following pipeline stages (pipeline 4 and pipeline 5). The carry-save stage can be the timing-critical stage, so particular attention is paid to the timing and area considerations that guided the design decisions described in this section. The critical path of this pipeline stage contains, in the exponent section: exponent control, three 2:1 multiplexers, a 5-bit subtractor, and a 5-bit subtractor and comparison unit; and, in the significand section: exponent control, a 5-bit incrementer, the 3:2 carry-save adder (CSA), and an AND gate. The critical path can run through both the exponent path and the significand path, as is the case in this design.

Accumulator Design

FIG. 7A illustrates a simplified block diagram 610 of the accumulator 240, which includes three circuit blocks: the exponent control unit 240A, the exponent comparator unit 240B, and the significand section 240C.

FIG. 7B illustrates an exemplary hierarchical block diagram and schematic of the exponent control unit 240A and the exponent comparator unit 240B. The shifter exponent control signal generation/bypass control circuit 630 receives inputs from: accum_ld, exp_zero_en, f_zero_en, e_cin_zero, 551, the csa_ovf bit 634 (O), Signext, which is the S bit 636, and the output of the 16-bit multiplier exponent comparison circuit 652, which is stored in the 16-bit condition register 650 as the sixteen exponent comparison bits*:

*where: emult is the product exponent; eaccu is the accumulator exponent; emmp means emult is larger by more than 3; eamp means eaccu is larger by more than 4.

There are also other control signals:

accum_ld - Meaning: the accumulator receives the input C value.

exp_zero_en - Meaning: set the product exponent to zero.

f_zero_en - Meaning: if the exponent is 0, set the product significand to 0 (because denormals are not allowed).

e_cin_zero - Meaning: the input C exponent equals zero.

The outputs of the shifter exponent control signal generation/bypass control circuit 630 are the control signals: accumulator shifter control on line 638, accumulator bypass control on line 636, multiplier bypass control on line 634, multiplier shifter control on line 632, Ea_sel on line 646, Ea1m_sel on line 642, Em_sel on line 648, and Ea1p_sel on line 645.

The comparison circuit 652 compares the exponents of two operands taken from one of the following pairs: (1) the multiplier exponent E_mult on line 521 and the accumulator exponent on line 679; (2) input A (in bypass mode, the exponent E_mult on line 521) and the accumulator exponent on line 679; or (3) input A (in bypass mode, the exponent E_mult on line 521) and input C (the exponent Ec on line 460). With emult denoting the multiplier exponent and eaccu the accumulator exponent, the comparison circuit 652 generates the following condition bits, which are stored in the 16-bit condition register 650: z_diff: emult equals eaccu; mgrt: emult is greater than eaccu; agrt: eaccu is greater than emult; em1p: emult is larger by 1; em2p: emult is larger by 2; em3p: emult is larger by 3; emmp: emult is larger by more than 3; ea1p: eaccu is larger by 1; ea2p: eaccu is larger by 2; ea3p: eaccu is larger by 3; ea4p: eaccu is larger by 4; eamp: eaccu is larger by more than 4; emz: emult is zero; eaz: eaccu is zero; eminf: emult is infinity; and eainf: eaccu is infinity. The 16-bit condition register 650 interfaces with the shifter exponent control signal generation/bypass control circuit 630 via bus 621. The 16-bit condition register 650 stores the result of the comparison involving Eacc from the sum S(i-1), and, while Eacc for the sum S(i) is being generated, the E_mult register 520 holds the term A(i)*B(i) in the accumulate mode.
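The condition bits are all functions of the signed difference between the two exponents. The following sketch decodes that difference into the named flags. It is an illustration only: the exponents are treated as plain integers, the two infinity flags (eminf, eainf) are omitted because the text does not give the exponent code used for infinity, and the function name is hypothetical.

```python
def compare_exponents(emult: int, eaccu: int) -> dict:
    """Decode the exponent difference into the condition bits named in the text."""
    d = emult - eaccu
    return {
        "z_diff": d == 0,                 # exponents are equal
        "mgrt": d > 0,                    # multiplier exponent is greater
        "agrt": d < 0,                    # accumulator exponent is greater
        "em1p": d == 1, "em2p": d == 2, "em3p": d == 3,
        "emmp": d > 3,                    # emult larger by more than 3
        "ea1p": d == -1, "ea2p": d == -2, "ea3p": d == -3, "ea4p": d == -4,
        "eamp": d < -4,                   # eaccu larger by more than 4
        "emz": emult == 0,
        "eaz": eaccu == 0,
    }
```

For any pair of exponents exactly one of the distance flags (z_diff, em1p through emmp, ea1p through eamp) is set, which is what allows them to drive the shifter controls directly.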

The input of the comparison circuit 652 on line 647 comes from the subtractor circuit 646. The subtractor circuit 646 receives E_mult from pipeline register 520 on line 521 and the output of multiplexer 642. Multiplexer 642 selects between Ec of register 460 and the new exponent output on line 679 of OR gate 670, under control of the accum_en signal on line 665, which indicates the mode (FIG. 7B).

Exp_Zero_En on line 618 is applied to inverter 619, whose output is used as an input to AND gate 617. The E_mult exponent bits from pipeline register 520 are also input to AND gate 617, whose output on line 681 feeds AND gate 668, where the Em_sel bit on line 648 from the shifter exponent control signal generation/bypass control circuit 630 passes or blocks E_mult. The output of AND gate 668 on line 671 is connected to the four-input OR gate 670. OR gate 670 has three other inputs: the output of AND gate 615, where the signal Ea_sel passes or blocks Eaccum, and the output of incrementer 660 on line 616 and the output of decrementer 661 on line 663, which are controlled by Ea1p_sel and Ea1m_sel at AND gates 664 and 665, respectively. According to the select signals 648, 646, 645, 642 (only one of which can be 1), the appropriate exponent is selected as the output of OR gate 670. This output is the Eacc signal, also called the new exponent, which is the input to the exponent accumulator (Eacc) register 654 and also an input to multiplexer 642.
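The AND-OR structure is a one-hot multiplexer: each candidate exponent is masked by its select bit and the results are ORed together. A minimal sketch follows, assuming a 5-bit exponent (the width mentioned elsewhere in the text for this stage); the function name, the argument order, and the wrap-around behavior of the incrementer and decrementer are assumptions.

```python
EXP_BITS = 5                          # assumed exponent width for this stage
EXP_MASK = (1 << EXP_BITS) - 1


def select_new_exponent(e_mult: int, e_acc: int, em_sel: int, ea_sel: int,
                        ea1p_sel: int, ea1m_sel: int, exp_zero_en: int = 0) -> int:
    """One-hot AND-OR selection of the new accumulator exponent."""
    if em_sel + ea_sel + ea1p_sel + ea1m_sel > 1:
        raise ValueError("select signals must be one-hot")

    def gate(value: int, enable: int) -> int:
        # An AND gate per bit: pass the value when enabled, otherwise force zero.
        return value & (EXP_MASK if enable else 0)

    e_mult_gated = gate(e_mult, not exp_zero_en)        # inverter plus first AND gate
    return (gate(e_mult_gated, em_sel)                  # multiplier exponent
            | gate(e_acc, ea_sel)                       # accumulator exponent
            | gate((e_acc + 1) & EXP_MASK, ea1p_sel)    # incremented exponent
            | gate((e_acc - 1) & EXP_MASK, ea1m_sel))   # decremented exponent
```

Because at most one select is active, the OR never mixes two candidates, so no priority logic is needed.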

The output of multiplexer 665 (the new exponent of sum S(i), or the exponent of operand C, depending on the mode) is in this embodiment also recorded in register 460, which is connected on line 644 as the input to incrementer 660 and decrementer 661.

Thus, when the new exponent on line 679 represents the sum S(i), having been produced using the comparison bits generated for the sum S(i-1), the new exponent is compared with the E_mult value of term A(i-1)*B(i-1) in register 520 to generate the comparison signals that are latched together with the sum S(i) and used for shifter control during generation of the sum S(i+1).

FIG. 7C is a schematic diagram of the significand section 240C. The shifter exponent control signal generation/bypass control circuit 630 is shown with four output control signals. The first control signal is the accumulator shifter control on line 638, which is the select signal for the shifter circuits SHR8/16/24/SHL8 609 and 610. The two shifter circuits SHR8/16/24/SHL8 609 and 610 receive their inputs on lines 682 and 683 from the pair of multiplexers 210 and 211. The accum_en signal on line 665 controls multiplexers 210 and 211 to select between the sum on line 224, the carry on line 225, or the value on line 219, originating from register Fcin 560, or a logic "0" as the other input to multiplexer 211. Multiplexers 210 and 211 output the selected values on bus 682 and bus 683 to the shifter circuits SHR8/16/24/SHL8 609 and 610. Shifter circuits 609 and 610 output the shifted values on buses 692 and 693. Bus 692 may interface directly to bus 613 or may pass through the optional carry rounding block 604, which adds a rounding bit to bus 613. Bus 693 may interface directly to bus 611 or may pass through the optional sum rounding block 612, which adds a rounding bit to 693. Buses 611 and 613 are the inputs to AND gates 687 and 688, whose outputs on lines 689 and 685 are used as inputs to the carry-save adder circuit 614 (the AND symbol denotes a multiplicity of AND gates, one for each signal line on buses 613, 611, and 607). The inputs to AND gate 688 and AND gate 686 are selected by the control signals on line 633 and line 634, respectively.

The F_mult value in pipeline register 520 is input to the simple product rounding block 684, whose output on line 603 is input directly to the right-shift SHR 8/16/24 circuit 608. The select signal of the SHR 8/16/24 circuit 608 is the multiplier shifter control on line 632, which selects between the F_mult input on line 601 and the rounded product on line 603. The output is the 42-bit (34+8) product on bus 607, which is applied to the input of AND gate 686, whose output is an input to the carry-save adder circuit 614.

The 42-bit 3:2 carry-save adder circuit 614 has three inputs: output 683 of AND gate 686, output 689 of AND gate 687, and output 685 of AND gate 688. The outputs of the 42-bit 3:2 carry-save adder circuit 614 are two buses, the sum bus 669 and the carry bus 667. Output 669 enters the 42-bit fractional sum register 242 via bus 669, and output 667 enters the 42-bit fractional carry register 241 via bus 667. Bus 669 and bus 667 are also inputs to the overflow detection block 605 and the sign extension detection unit 662. These two blocks provide outputs to the O bit 634, which is the csa_ovf signal, and to the S bit 636, which is the sign extension signal. The sign extension detection unit 662 has an enable input, the accum_en signal on line 665, which is set to logic "1" when the operation is an accumulation. The sign extension detection unit operates only when the accum_en signal is enabled.
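A 3:2 carry-save adder reduces three operands to a sum word and a carry word without propagating carries across bit positions. The sketch below shows the standard bitwise formulation on 42-bit words; it illustrates the general technique and is not a description of the internal design of circuit 614. The function name is hypothetical.

```python
WIDTH = 42
MASK = (1 << WIDTH) - 1


def csa_3to2(a: int, b: int, c: int):
    """Reduce three WIDTH-bit words to a (sum, carry) pair with no carry propagation."""
    s = (a ^ b ^ c) & MASK                               # per-bit sum
    carry = (((a & b) | (a & c) | (b & c)) << 1) & MASK  # per-bit majority, weighted by 2
    return s, carry
```

The pair is a redundant encoding of the result: adding the sum and carry words modulo 2^42 gives the same value as adding the three inputs, which is what the later 43-bit adder stage resolves.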

Three operating modes are available: input A (BF16) x input B (BF16) + input C (FP32); input A (BF16) x input B (BF16) + accumulation loop (sum); and input A (FP32) + input C (FP32).

The "accum_en" signal is enabled only in the second mode (accumulation). In the addition modes, sign extension detection is not needed. It is needed only in accumulation mode, because the gradual growth of sign extension bits can occur only during accumulation.

Sign extension detection unit 662:

According to some aspects, the sign extension detection unit 662 is attached to the accumulator outputs for both the sum and the carry. When a 10-bit sign is detected (the two sign bits plus 8 additional bits in the first byte of the sum or the carry), the output is shifted left in the next cycle (SHL_8) to preserve the precision of the operand. Without sign extension detection, the significant bits of the operand would gradually shift to the right during normal operation until they were replaced by extended sign bits, causing a loss of precision. In this implementation, each time at least 10 leading sign bits are detected in one of the operands, an adjustment shifts the operand left by 8 bit positions. The exponent is adjusted accordingly by decrementing its value by one, which is done in the same cycle. When S is detected in the carry or sum portion of the accumulator, the S bit is latched in the output pipeline register 636 for correction in the next cycle. The correction shifts the accumulator left by 8 bit positions (SHL_8). Sometimes this is cancelled by the next action (one requiring SHR_8), in which case the value is generally left unchanged, as shown in Table 1.
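The detection and correction rule can be modeled on a single 2's complement word. The sketch below is a simplification: it works on one non-redundant 42-bit word rather than on the sum and carry pair, and it applies the correction immediately instead of latching the S bit for the next cycle. The function names are hypothetical.

```python
WIDTH = 42
MASK = (1 << WIDTH) - 1
SIGN_RUN = 10          # two sign bits plus one full byte of sign extension


def leading_sign_bits(word: int) -> int:
    """Count how many leading bits of a WIDTH-bit word equal its sign bit."""
    sign = (word >> (WIDTH - 1)) & 1
    count = 0
    for pos in range(WIDTH - 1, -1, -1):
        if (word >> pos) & 1 != sign:
            break
        count += 1
    return count


def renormalize(word: int, exponent: int):
    """Apply SHL_8 and decrement the exponent when at least SIGN_RUN sign bits lead."""
    if leading_sign_bits(word) >= SIGN_RUN:
        return (word << 8) & MASK, exponent - 1
    return word, exponent
```

Since at least two sign bits remain after the 8-bit left shift, the signed value is preserved and only its scale changes, which the exponent decrement compensates.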

Normalization and conversion to sign-magnitude format

FIG. 8A illustrates the normalization and conversion to sign-magnitude format block 270, which contains two sub-blocks: the first is the conversion from carry-save to sign-magnitude format block 270a, and the second is the conversion from base-8 to base-2 floating-point block 270b.

FIG. 8B illustrates an exemplary schematic of the conversion from carry-save to sign-magnitude format block 270a. Two registers, the 42-bit fractional sum register 242 and the 42-bit fractional carry register 241, output the shifted carry [42:0] bus 704 and the sign-extended sum [42:0] bus 702 as inputs to the 43-bit adder circuit 708. A second circuit, LZA/LOA 710, receives input bus 702 and bus 704. The second circuit LZA/LOA 710 outputs two buses, POS_P[5:0] on line 711 and POS_N[5:0] on line 712, to a third circuit, the LZA POS select circuit 714. The output of the LZA POS select circuit 714, for example via bus 715, is POS[5:0], which is mapped to register 730 as a 6-bit position specifying the amount of left shift required to normalize the significand.

The 43-bit adder circuit 708 outputs the signal SIGN on line 719 to control the LZA POS select circuit 714, routes bus 716 to the "0" branch input of the significand select multiplexer circuit 720, and routes bus 716 to the input of the negation path, the invert-and-add-1 circuit 718. The "1" branch of the 2's complement significand select multiplexer circuit 720 receives bus 717, which carries the negative significand converted to a positive one. The sign 719 controls the significand select multiplexer so that output 738 always contains a positive significand. The output of the 2's complement select multiplexer circuit 720 is bus 738, which is mapped to register 730 as a 41-bit positive significand. The 5-bit exponent on line 706 maps directly to register 730, along with the sign bit 726. This step completes the conversion of the accumulator significand from carry-save format to sign-magnitude base-8 format.

At this stage, the two values on sum bus 702 and carry bus 704 (which together represent the significand in carry-save format) are added in the 43-bit adder circuit 708 to produce the significand in sign-magnitude format. The leading-zero/leading-one predictor (second LZA/LOA circuit) 710 computes two numbers: the count of leading zeros 711 (used when significand 716 is positive) and the count of leading ones 712 (used when significand 716 is negative). Based on the significand sign bit 719, the multiplexer LZA POS select circuit 714 selects the correct position and stores it in register 730. The LZ and LO positions, POS_P and POS_N, are both 6-bit numbers, sized to cover the case of 32 leading zeros or ones.

If the significand at the output of the 43-bit adder circuit 708 is negative, it is converted to a positive number, because IEEE 754 uses sign-magnitude representation, that is, a positive significand with a separate sign bit. A 2's complement converter 718 is used for this purpose. The sign bit 719 controls multiplexer circuit 720: if the number is positive, it is stored directly in the 41-bit significand field of register 730. If the number is negative, the output on line 717, that is, the value on bus 716 converted to a positive value, is passed to register 730 on line 738.
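As an illustration only, the conversion described above can be modeled in software. The following Python sketch is not the disclosed circuit. The function names are hypothetical, and the leading-bit count is computed exactly after the addition, whereas the LZA/LOA circuit 710 predicts it in parallel with the adder.

```python
WIDTH = 43  # adder width taken from the text; all names here are hypothetical


def carry_save_to_sign_magnitude(sum_bits: int, carry_bits: int, width: int = WIDTH):
    """Return (sign, magnitude) for a two's-complement carry-save pair."""
    mask = (1 << width) - 1
    total = (sum_bits + carry_bits) & mask   # carry-propagate addition (adder 708)
    sign = (total >> (width - 1)) & 1        # most significant bit acts as SIGN
    if sign:
        magnitude = (~total + 1) & mask      # invert and add one (circuit 718)
    else:
        magnitude = total                    # "0" branch of the multiplexer
    return sign, magnitude


def leading_run_length(value: int, width: int, bit: int) -> int:
    """Count consecutive most significant bits of `value` that equal `bit`."""
    count = 0
    for position in range(width - 1, -1, -1):
        if (value >> position) & 1 != bit:
            break
        count += 1
    return count
```

For a positive total the run of leading zeros gives the normalization distance; for a negative total the run of leading ones plays the same role before negation.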

The predicted 6-bit position of the significand is added to the 5-bit exponent to produce a new 8-bit exponent that conforms to the standard floating-point representation, and the significand is aligned to the floating-point significand using the same 6-bit predicted position (FIG. 8C).

FIG. 8C illustrates an exemplary schematic of block 270b, which converts a floating-point number from base 8 to base 2. Register 730 is connected to the SHL left shifter circuit 735 via the 41-bit bus 731 and to the significand zero detection circuit 728 via the 41-bit bus 734. The 6-bit position field of register 730 provides Pos[5:0] on line 723 to control the SHL left shifter circuit 735. Pos[5:0] on line 723 is also an input to the exponent adder circuit 740. The 5-bit exponent field of register 730 is the second input to the exponent adder circuit 740, on line 721. The exponent adder circuit 740 adjusts (increases) the exponent by the number of positions, indicated by POS[5:0], by which the significand is shifted left, and provides its output on line 736 to register 748. However, because the predictor may be off by one position, the output of the shifter is passed to the overflow/underflow detection circuit 752, which reports the error by asserting signal 739. Signal 739 is applied to the carry input of the exponent adder circuit 740 and to the control input of the under-detect multiplexer 760. The under-detect multiplexer 760 has an input on line 746, where the significand on line 742 from shifter circuit 735 is in the same position (no error detected), and an input on line 747, where the significand on line 742 is shifted left by one bit (error detected). If signal 739 indicates an under-detection, the corrected output is latched into register 770 via bus 745. The error in the exponent adjustment is corrected by feeding a 1 into the adder through the carry input. The sign value is copied from register 730 to register 747.
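The one-position correction can be sketched as follows. This is a minimal Python model under stated assumptions, not the disclosed circuit: the function name is hypothetical, a 41-bit significand is assumed, and a zero significand is assumed to be handled separately by the zero detection circuit.

```python
def normalize_with_correction(significand: int, predicted_shift: int, width: int = 41):
    """Shift left by the predicted amount, then fix a one-position under-prediction.

    Returns (normalized_significand, total_shift).
    """
    mask = (1 << width) - 1
    shifted = (significand << predicted_shift) & mask
    msb_set = (shifted >> (width - 1)) & 1
    correction = 0 if msb_set else 1         # plays the role of signal 739
    if correction:
        shifted = (shifted << 1) & mask      # the input shifted one more bit left
    # The same correction bit is what the text feeds into the exponent adder carry input.
    return shifted, predicted_shift + correction
```

Whether the predictor is exact or one position short, the returned significand has its most significant bit set and the returned shift count is the true normalization distance.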

The sign bit in register 730 is transferred to register 747 via line 744.

The exception control (8-bit) register 750 passes its value to exception control register 751 on line 724. The exception control register bits have the following meanings:

exp_inf_en: operand A is infinity or operand B is infinity

z_diff: accumulator exponent equals product exponent

s_mult: product sign

s_cin: input C sign

e_mul_zero: product exponent = 0

e_cin_zero: input C exponent = 0

e_mul_inf: product exponent equals infinity

e_cin_inf: input C exponent equals infinity

Bit six 726 of the exception control (8-bit) register 750 is "z_diff", which indicates the result of the exponent comparison between the accumulator and the product. When it equals 1, the accumulator exponent equals the product exponent. When "z_diff" = 0, the accumulator exponent is less than or equal to the product exponent. Bit [6] "z_diff" is the first input to the significand zero detection circuit 728, on line 726. The significand zero detection circuit 728 outputs a 1-bit signal on line 753, which replaces bit [6] "z_diff" and becomes "pos_zero" 753 in exception control register 751, indicating that the result significand is zero. Once the accumulation operation is complete and the operation proceeds to normalization, the second output of the significand zero detection circuit 728 provides a signal on line 755 to the fraction zero register 756.

FIG. 9A illustrates block 270, which performs the final conversion 700 to BF16 or IEEE 754 32-bit single-precision format. It includes sign-magnitude format block 270a, which performs rounding and conversion to BF16 or IEEE 754 32-bit single-precision format, and sub-block 270b, which performs exponent and exception handling.

Rounding and conversion to FP32 format

According to some aspects, the last stage is pipeline 6, which rounds the result to a standard floating-point sign-magnitude number consisting of a sign bit, an 8-bit exponent, and a 23-bit normalized significand (plus 1 implicit integer bit). While the 31-bit significand is converted to a 24-bit normalized significand with one implicit bit, the result is rounded from 31 bits to 24 bits. This implementation provides two rounding modes: round toward zero (RTZ), which is truncation, and round to nearest even (RNE). However, any other rounding mode, for example round to nearest odd (RNO), can easily be incorporated.

According to some aspects, the rounding logic examines the last 15 least significant bits of the 39 significand bits from register 770 (not counting the GRS bits, which bring the total to 42 bits: 39 + 3 GRS bits) and determines whether the remaining 24 bits need to be rounded, according to the rule applied (RNE or RTZ). The incrementer required for RNE is included in the rounding block. The three GRS bits carried over from the accumulator (CSA) operation are ignored in this implementation. They can be incorporated into the final rounding in other possible implementations.

Rounding is done in one of several ways: (a) during the CS accumulation operation, applying round to nearest odd (RNO) either to the sum signal only, with the rounding bit inserted into the open LSB position of the carry, or separately to each of the sum and carry signals; (b) during CS accumulation and also in the pipeline 6 stage (final rounding); and (c) only in the pipeline 6 stage, with the CSA disabled.

Each rounding mode is applied according to the accuracy and the specific requirements imposed by the particular application.

The output of pipeline 6 is either FP32 or BF16, as required. The significand length is therefore either 24 bits (23 + implicit bit) or 8 bits (7 + implicit bit). This is controlled by the "Out_FP32" signal applied to the first multiplexer. If rounding produces a 25-bit significand, the significand is shifted right by one position and the exponent is incremented by 1.
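The two rounding modes and the 25-bit overflow case can be sketched in Python as follows. This is an illustrative model only: the function name and parameters are hypothetical, the 39-bit input and 24-bit output widths are taken from the text, and the GRS bits are ignored, as in the described implementation.

```python
def round_significand(sig: int, in_bits: int = 39, out_bits: int = 24, mode: str = "RNE"):
    """Round the top `out_bits` of an `in_bits`-wide significand.

    Returns (rounded_significand, exponent_increment).
    """
    drop = in_bits - out_bits                # 15 discarded LSBs for 39 -> 24
    kept = sig >> drop
    if mode == "RNE":
        remainder = sig & ((1 << drop) - 1)
        half = 1 << (drop - 1)
        # Round up above the halfway point, or at the halfway point when kept is odd.
        if remainder > half or (remainder == half and (kept & 1)):
            kept += 1
    elif mode != "RTZ":                      # RTZ is plain truncation
        raise ValueError("unsupported rounding mode: " + mode)
    if kept >> out_bits:                     # rounding carried out into a 25th bit
        return kept >> 1, 1                  # shift right, bump the exponent
    return kept, 0
```

Passing `out_bits=8` models the BF16 case under the same assumptions.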

The correctly normalized and rounded result is stored in the output register of pipeline 6 either as a BF16 number, consisting of a 1-bit sign, an 8-bit exponent, and a 7-bit fraction, or as an FP32 number, consisting of a 1-bit sign, an 8-bit exponent, and a 23-bit fraction.
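The two output layouts can be written out as bit-field packing. The sketch below is illustrative only, with hypothetical helper names; it uses Python's standard `struct` module to reinterpret the packed bits as a float so the layouts can be checked.

```python
import struct


def pack_fp32(sign: int, biased_exp: int, fraction23: int) -> int:
    """Pack 1 sign bit, 8 exponent bits and 23 fraction bits into 32 bits."""
    return (sign << 31) | (biased_exp << 23) | fraction23


def pack_bf16(sign: int, biased_exp: int, fraction7: int) -> int:
    """Pack 1 sign bit, 8 exponent bits and 7 fraction bits into 16 bits."""
    return (sign << 15) | (biased_exp << 7) | fraction7


def fp32_bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

A BF16 pattern occupies the upper 16 bits of the corresponding FP32 pattern, so shifting it left by 16 bits yields an FP32 pattern with the same value.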

FIG. 9B illustrates an exemplary schematic showing rounding and conversion to BF16 or IEEE 754 32-bit SP format. The 39-bit significand register 770 bus fpst_l[38:0] 837 provides fpst_l[32:0] 819 or fpst_l[16:0] 819 to the rounding circuit 830, which includes the guard, round, and sticky bits. The Out_FP32 control on line 819 selects the portion of the significand to be rounded by the rounding circuit 830. When the 32-bit SP format is selected, 3 rounding bits are added to the upper 24 bits [38:16]. Multiplexer 840 selects between a "0" input on line 835 and the rounding circuit 830 output on line 825, and is controlled by the round-to-zero select line 823. This case occurs when the exponent goes beyond -126 and the significand becomes denormal, which in this implementation causes rounding to zero. The output on line 827 routes the 23 bits (plus one implicit bit) and 3 rounding bits to the first round increment circuit 860, which properly rounds the significand to IEEE 754 SP 32-bit format on line 819.

When the BF16 output format is selected, the second round increment circuit 850 is used to round to BF16 format. The conversion of the 39-bit significand fpst_l[38:0] on line 817 to BF16 output is performed in round increment circuit 850, yielding 7 bits (plus one implicit bit), extended by 1 rounding decision bit and 16 additional zeros. This is the significand at the output on line 821, which is one of the inputs to multiplexer 802.

The first multiplexer 802 uses the control signal Out_FP32 801 to select FP32 or BF16 output. When Out_FP32 801 is asserted, the multiplexer outputs an FP32-format significand on line 805. When Out_FP32 801 is not asserted, the output on line 805 is a BF16-format significand. The output of the first multiplexer 802 is bus 805, which splits into bus 807 and bus 809 and enters the second multiplexer 810. Multiplexer 810 is controlled by the 24th bit of the output bus on line 805, the frnd[23] signal on line 803. As described in [0093], when rounding produces a 25-bit significand, the 24th bit is 1. In that case, the frnd[23] signal on line 803 selects input bus 807, which is bus 805 shifted right by one bit position. (The frnd[23] signal also increments the exponent of the result by 1 to compensate for the right shift.) If frnd[23] equals 0, no right shift is needed, and bus 805 passes directly to line 831 through the selected input bus 809.

The third multiplexer (zero) 820 selects between all "0"s on input line 829 and fnorm[22:0] on line 831. If the output exponent is infinity or denormal, the output significand is forced to zero by the control signal Zero_Sel on line 788, which selects the all-"0" input 829. If there is no exception, the normalized significand bus 831 is routed to bus 833 and mapped to the 23-bit significand (IEEE 754) register 930.

FIG. 9C illustrates an exemplary schematic of the exponent and exception handling block 270b. The 8-bit exponent 748 drives the EPST_L[9:0] bus 964 and is incremented by 1 if frnd[23]=1. This is done by routing frnd[23] to the carry input of incrementer 982. The output of the exponent incrementer 982 is the 9-bit Enorm[8:0] bus 968, which is the first input of the first multiplexer (zero) 974. The second input of the first multiplexer (zero) 974 is the all-"0" bus 829. The purpose of the first multiplexer (zero) 974 is to set the exponent to all "0"s when an exception requires it, as indicated by the exception control (8-bit) register 750 through the zero control logic 970.

The exception control (8-bit) register 750 operates on the following eight conditions 953: exp_inf_en, pos_zero, s_mult, s_cin, e_mul_zero, e_cin_zero, e_mul_inf, and e_cin_inf.

The zero control logic 970 has three inputs: sign_diff from the exclusive-OR gate on line 961, e_cin_zero on line 959, and e_mul_zero from the exception control (8-bit) register 750 on line 957. Its output on line 972 controls the first multiplexer (zero) 974. The output of the first multiplexer (zero) 974 on line 975 passes through multiplexer 976 to provide the exponent signal bus 979, which is stored in the exponent register 980. If the infinity control 962 asserts an infinity signal on line 963, the all-"1" input on line 907 passes through multiplexer 976 instead, setting all exponent bits to "1", as recommended by the IEEE 754 standard.

The signals have the following meanings (explained in paragraph 089):

exp_inf_en: operand A is infinity or operand B is infinity

pos_zero: result significand is zero

s_mult: product sign

s_cin: input C sign

e_mul_zero: product exponent = 0

e_cin_zero: input C exponent = 0

e_mul_inf: product exponent equals infinity

e_cin_inf: input C exponent equals infinity

In addition, the signal "sign_diff" indicates that the sign of the product, "s_mult", differs from the sign of input C, "s_cin". The signal is obtained by applying an exclusive-OR (XOR) function to the s_mult and s_cin signals taken from the exception control (8-bit) register 750.

The exception control (8-bit) register 750 provides the following signals on bus 965 to the sign generation and exception handling circuit 988 and to the underflow/overflow detection and exponent exception detection circuit 986: s_cin, s_mult, exp_inf_en, e_cin_inf, e_mul_inf, pos_zero, and sign_diff. The control signals for circuit 986 are norm_en 758 on line 969 and the fraction zero register 756 on line 971. The outputs of circuit 986 are the three signals ov (overflow), uf (underflow), and nv (invalid). A fourth output, on line 991, is sent from the infinity detection circuit 992 to the sign generation and exception handling circuit 988 to indicate overflow.

The signal abbreviations are as follows: s_cin (input C sign), s_mult (product sign), e_mul_zero (product exponent zero), e_cin_zero (input C exponent zero), e_mul_inf (product exponent infinity), and e_cin_inf (input C exponent infinity).

Two related events cause underflow. One is the creation of a tiny non-zero result between ±2^-126 (where -126 is the minimum exponent value), which, because it is so small, may later cause some other exception, such as overflow on division. The other is an unusual loss of accuracy while approximating such a small number. Loss of accuracy may be detected when the delivered result differs from the result that would have been computed with unbounded exponent range and precision. IEEE Standard 754 does not track accuracy beyond requiring single and double precision. In the implementation disclosed here, denormal numbers are not used, and any value with an exponent of -126 and a significand less than 1 is converted to zero. Zero is represented by setting all significand bits to zero and the exponent value to zero, which in the disclosed implementation is handled by the exception handling circuit.

The sign generation and exception handling circuit 988 receives inputs from the exception control (8-bit) register 750 via bus 965 and from the infinity detection circuit 992. The output of the sign generation and exception handling circuit 988 is a sign bit, which is stored in register 990 via signal line 983.

The infinity detection circuit 992 operates on the exponent signal bus 979. If it detects that all exponent bits are 1, it provides a "1" to OR gate 987, which in turn sets its output 788 to "1". This asserts the ZERO-SEL signal 788, which sets the significand to all zeros (multiplexer 820, FIG. 9B).

When the exponent value on the exponent signal bus 979 is out of range, the denormal circuit 994 detects this condition and signals an underflow condition on signal line 967. The condition is also signaled to OR gate 987, which generates the ZERO-SEL signal on line 788. The ZERO-SEL signal on line 788 (FIG. 9B) directs multiplexer 820 to insert all "0"s into the significand, creating the correct IEEE 754 "zero" representation (exponent and significand both all "0"s).
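The combined effect of the infinity and denormal paths on the output fields can be modeled as below. This is a sketch under assumptions, not the disclosed logic: the function name is hypothetical, the exponent is passed as a biased integer that may fall at or below zero to model an out-of-range result, and the threshold tests stand in for circuits 992 and 994.

```python
EXP_ALL_ONES = 0xFF  # 8-bit exponent with every bit set


def apply_output_exceptions(sign: int, biased_exp: int, fraction: int):
    """Return (sign, exponent, fraction, underflow_flag) after exception muxing."""
    is_inf = biased_exp >= EXP_ALL_ONES      # infinity detection (all exponent bits 1)
    is_denormal = biased_exp <= 0            # below the smallest normal exponent
    zero_sel = is_inf or is_denormal         # OR gate driving ZERO-SEL
    if is_inf:
        biased_exp = EXP_ALL_ONES
    elif is_denormal:
        biased_exp = 0                       # flush to zero: exponent all "0"s
    if zero_sel:
        fraction = 0                         # significand forced to all "0"s
    return sign, biased_exp, fraction, is_denormal
```

In both exceptional cases the fraction is cleared, so the output is either a signed infinity or a signed zero; a normal result passes through unchanged.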

A floating-point multiply-add-accumulate unit has been described that uses carry-save addition and accumulation with base-8 exponents. This balances the critical timing of the exponent unit against the critical timing of the significand unit. In addition, unlike the sign-magnitude representation specified in the IEEE-754 floating-point standard, a 2's complement system is used to represent both positive and negative significands, which also carry the operand sign. This avoids the otherwise unnecessary significand subtraction that the IEEE-754 representation requires to determine the larger of the two operands when the exponents are equal. Introducing the 2's complement representation requires novel leading-zero (leading-one) detectors (predictors) that work for both positive and negative numbers. The same applies to overflow (OV) detection. It is also necessary to determine when adding the carry and the sum will produce a long sign extension (SE), which requires novel design features.

A floating-point multiply-add-accumulate unit has been described that supports the BF16 format for multiply-accumulate operations and FP32 single-precision addition compliant with the IEEE 754 standard. The multiply-accumulate unit uses higher internal precision and a longer accumulator, converting the operands to a higher radix and a longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. Addition is performed in carry-save format to avoid long carry propagation and speed up the operation. Operations such as overflow detection, zero detection, and sign extension are adapted to the 2's complement and carry-save formats. The handling of overflow and sign extension allows fast operation that is relatively independent of the accumulator size. Rounding suited to machine learning is introduced into the accumulation operation without affecting timing, which greatly improves the accuracy of the computation.

Exception handling

FIGS. 10 to 21 relate to exception handling in the circuits and methods described above. Depending on the specific encoding format used to represent the significand and exponent, floating-point numbers can take special-case values such as positive or negative infinity, zero, and denormal numbers.

FIG. 10 illustrates the floating-point number range 1000, shown in block 1010, as a horizontal number line divided into regions of interest by a number of terms. The terms are defined in Table 1.

Floating point special numbers

The list shown in Table 1 contains three columns. The first column lists the definitions of the special floating-point numbers. The second column lists the values in the BF16 floating-point encoding format. The third column lists the values in the FP32 floating-point encoding format.

In the Notes column of Table 1, the term (+)NaN includes the value 7F81 in BF16 and the value 7F800001 in FP32 as two conventions for representing NaN. The term (+)Norm is listed as (+)Pi (3.14...) and the term (-)Norm is listed as (-)Pi, for testing purposes. The terms (+)Denorm and (-)Denorm are the smallest representable values. (+)Zero and (-)Zero have two representations that differ in the most significant bit, the sign bit. The same holds for all other terms that involve the sign bit.

The BFloat16 floating-point format, also known as the brain floating-point format (and sometimes called "BF16"), is a 16-bit number encoding format. BF16 preserves the approximate dynamic range of IEEE single-precision numbers. The BF16 format consists of a 7-bit fraction (also called the mantissa or significand), an "implicit bit" or "hidden bit", an 8-bit exponent, and a sign bit. Single-precision floating-point values can be converted to BF16 to accelerate machine learning. The dynamic range is the same as that of single-precision FP32 (8-bit exponent), but with 8 bits of precision instead of a 24-bit significand. BFloat16 can reduce memory requirements, reduce storage requirements, and increase the computation speed of machine learning algorithms. BF16 is a truncated 16-bit version of the 32-bit single-precision IEEE 754 format, intended to accelerate machine learning.
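The relationship between the two formats can be checked with a short Python sketch. The helper names are hypothetical, and truncation is only one way to convert; the rounding modes described elsewhere in this disclosure are not modeled here. The sketch uses Python's standard `struct` module.

```python
import struct


def float_to_fp32_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of `value`."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def fp32_bits_to_bf16_truncate(bits32: int) -> int:
    """Keep the sign, the 8 exponent bits and the top 7 fraction bits."""
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Widen a BF16 pattern to FP32 by appending 16 zero bits."""
    return struct.unpack(">f", struct.pack(">I", bits16 << 16))[0]
```

Under these assumptions, pi (FP32 pattern 0x40490FDB) truncates to the BF16 pattern 0x4049, which widens back to 3.140625: the exponent is unchanged and only fraction bits are lost.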

The second number format is IEEE 754 single-precision 32-bit floating point (FP32). It consists of a 23-bit fraction, an "implicit bit" or "hidden bit", an 8-bit exponent, and a sign bit.

Table 2 lists the BFloat16 terms and their numeric definitions.

Table 3 lists additional BFloat16 terms and their numeric definitions in binary format. Positive and negative infinity are defined as all exponent bits equal to one and all fraction bits equal to zero. Positive and negative NaN (Not a Number) are defined as all exponent bits equal to one and not all fraction bits equal to zero. Positive and negative denormals are defined as all exponent bits equal to zero and not all fraction bits equal to zero. Whether an infinity, NaN, or denormal is positive or negative depends on the sign bit.
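These definitions translate directly into a classifier over the 16-bit pattern. The Python sketch below is illustrative; the function name and the label strings are hypothetical and are not taken from the tables.

```python
def classify_bf16(bits: int) -> str:
    """Label a BF16 bit pattern using its sign, exponent and fraction fields."""
    sign = "-" if (bits >> 15) & 1 else "+"
    exponent = (bits >> 7) & 0xFF            # 8 exponent bits
    fraction = bits & 0x7F                   # 7 fraction bits
    if exponent == 0xFF:
        kind = "Inf" if fraction == 0 else "NaN"
    elif exponent == 0:
        kind = "Zero" if fraction == 0 else "Denorm"
    else:
        kind = "Norm"
    return "(" + sign + ")" + kind
```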

In some embodiments, the exception handling unit for machine learning operations does not support denormal or NaN operations. Denormal numbers are treated as zero, and NaN numbers are treated as infinity.
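One way to picture this simplification is as a filter on incoming BF16 operands. The sketch below is an assumption about how such a filter could look, not the disclosed circuit; the function name is hypothetical.

```python
def sanitize_bf16_operand(bits: int) -> int:
    """Map NaN to signed infinity and denormals to signed zero."""
    exponent = (bits >> 7) & 0xFF
    if exponent == 0xFF or exponent == 0:
        # Clearing the 7 fraction bits turns NaN into infinity and a denormal
        # into zero, and leaves true infinities and zeros unchanged.
        return bits & 0xFF80
    return bits
```

Because only the fraction field is cleared, the sign bit of the original operand is preserved in both cases.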

Exceptions

Chapter 7 of the IEEE Standard 754-2019 specification describes the five categories of floating-point exceptions listed below. According to one embodiment, three of these categories are implemented: (1) invalid operation, (3) overflow, and (4) underflow. This embodiment does not support division by zero or inexact.

1) Invalid operation

2) Division by zero

3) Overflow

4) Underflow

5) Inexact

According to some other embodiments, four categories are implemented: (1) invalid operation, (2) division by zero, (3) overflow, and (4) underflow. According to still other embodiments, all five categories are implemented.

Invalid operation

The IEEE Standard 754-2019 specification describes the following as invalid operations: a) any general-computational operation on a signaling NaN; b) multiplication: multiplication(0, ∞) or multiplication(∞, 0); c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c); d) addition, subtraction, or fusedMultiplyAdd: magnitude subtraction of infinities, such as addition(+∞, -∞); e) division: division(0, 0) or division(∞, ∞); f) remainder: remainder(x, y) when y is zero or x is infinite, and neither is NaN; g) square root when the operand is negative; and h) quantize when the result does not fit in the destination format or when one operand is finite and the other is infinite.

According to one embodiment, exception handling in the carry-save accumulation unit implements invalid operations a, b, c, and d listed above. According to another embodiment, any combination of categories (a) to (h) is possible.
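Cases (b) through (d) can be expressed as a simple predicate over the three operands of a multiply-add. The Python sketch below is illustrative only: the function name is hypothetical, Python floats stand in for the hardware operands, and case (a) is omitted because Python floats do not distinguish signaling NaNs.

```python
import math


def fma_is_invalid(a: float, b: float, c: float) -> bool:
    """Detect 0 x infinity and infinity - infinity in a*b + c."""
    # Cases (b) and (c): zero multiplied by infinity, in either operand order.
    if (a == 0 and math.isinf(b)) or (math.isinf(a) and b == 0):
        return True
    # Case (d): an infinite product added to an infinity of the opposite sign.
    if math.isinf(a) or math.isinf(b):
        product_sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return math.isinf(c) and math.copysign(1.0, c) != product_sign
    return False
```

A finite product added to an infinite `c` is not flagged, since the result is simply that infinity.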

Invalid operation exception

Table 4 lists the invalid operations that generate an exception.

Overflow exception

Table 5 lists overflow exceptions for one example in which both operands are the maximum normal value (Max Norm multiplied by Max Norm). These overflow exceptions produce the result "signed infinity". The overflow exception occurs when the result is greater in magnitude than the signed Max Norm, and only when no infinite value is present on the operand inputs.
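The Max Norm example can be reproduced numerically. The sketch below is a simplified model with hypothetical names: it uses Python double-precision floats as a wide intermediate, compares the unrounded product against the FP32 maximum normal value as the text describes, and does not model rounding at the boundary.

```python
import math

FP32_MAX_NORM = (2.0 - 2.0 ** -23) * 2.0 ** 127
BF16_MAX_NORM = (2.0 - 2.0 ** -7) * 2.0 ** 127   # BF16 bit pattern 0x7F7F


def multiply_with_overflow(a: float, b: float):
    """Return (result, overflow_flag) for a product delivered in FP32 range."""
    product = a * b                      # a double is wide enough to hold this product
    if math.isinf(a) or math.isinf(b):
        return product, False            # an infinite input is not an overflow
    if abs(product) > FP32_MAX_NORM:
        return math.copysign(math.inf, product), True
    return product, False
```

The sign of the infinity follows the sign of the product, which matches the "signed infinity" result named above.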

Underflow exception

In one embodiment, there are several cases in which the result is smaller in magnitude than the signed Min Norm. This exception occurs when no exact zero value is present on the operand inputs. In the adder normalize/round case, (+)Norm + (-)Norm, the computed result is a denormal value (not an exact zero), but the delivered result is a "signed zero". See Table 6, which shows an example in which both operands are the minimum normal value (Min Norm multiplied by Min Norm).
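The Min Norm example can be sketched in the same spirit. The names below are hypothetical, Python double-precision floats serve as the wide intermediate, and the flush-to-zero behavior follows the embodiment described here in which denormal results are not produced.

```python
import math

MIN_NORM = 2.0 ** -126   # smallest positive normal value for an 8-bit exponent


def multiply_with_underflow(a: float, b: float):
    """Return (result, underflow_flag), flushing tiny results to signed zero."""
    product = a * b                      # 2**-252 is still representable in a double
    if a == 0 or b == 0:
        return product, False            # an exact zero input is not an underflow
    if abs(product) < MIN_NORM:
        return math.copysign(0.0, product), True
    return product, False
```

The delivered zero keeps the sign of the exact product, so a negative tiny result becomes negative zero.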

In some embodiments, exception handling can be divided into "exception flag generation" and "exception result generation". Exceptions are handled, for example, in the floating-point multiplier block 1110 and the floating-point carry-save adder block 1130 of FIG. 12A.

Exception flag generation

In some embodiments, floating-point multiplier exception flags are provided for: (1) overflow, (2) underflow, and (3) invalid.

In some embodiments, the following floating-point adder exception flags are provided: (1) overflow, (2) underflow, and (3) invalid.

Exception result generation

The operations for exception handling are explained in Chapter 6 of IEEE Standard 754-2019. The following two tables summarize one embodiment and implementation of the multiplication and addition operations.

Multiplier operation

According to one embodiment, Table 7 lists the multiplication operations, annotated for invalid, underflow, and overflow operations.

Addition operation

According to one embodiment, Table 8 lists the adder operations, annotated for invalid and overflow operations.

An exception processing circuit is described herein for detecting, according to a floating-point encoding format, at least one invalid operation or result in a multiplier and at least one invalid operation or result in an adder, and for setting the output operand of the multiplier or adder to a value that can be used for further processing.

High-level architecture

FIG. 11 illustrates an example high-level architectural block diagram 1100 depicting the components of exception handling in a carry-save accumulate unit for machine learning.

In one embodiment, exception handling in the carry-save accumulate unit design includes three different input signals. These are operand A 1113, operand B 1114, and operand C 1116. Operand A 1113 can be in BF16 or FP32 format. Operand B 1114 is in BF16 format, and operand C 1116 is in FP32 format. Operands are also referred to as inputs.
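The two operand formats named above can be illustrated with a short Python sketch of their bit fields. The layouts are the standard ones (BF16: 1 sign bit, 8 exponent bits, 7 fraction bits; FP32: 1 sign bit, 8 exponent bits, 23 fraction bits). The function names are hypothetical and are not taken from the disclosure.

```python
import struct


def bf16_fields(bits: int) -> tuple:
    """Split a 16-bit BF16 pattern into (sign, exponent, fraction)."""
    sign = (bits >> 15) & 0x1
    exponent = (bits >> 7) & 0xFF   # eight exponent bits
    fraction = bits & 0x7F          # seven fraction bits
    return sign, exponent, fraction


def fp32_fields(bits: int) -> tuple:
    """Split a 32-bit FP32 pattern into (sign, exponent, fraction)."""
    sign = (bits >> 31) & 0x1
    exponent = (bits >> 23) & 0xFF  # eight exponent bits
    fraction = bits & 0x7FFFFF      # twenty-three fraction bits
    return sign, exponent, fraction


def fp32_bits(value: float) -> int:
    """Return the FP32 bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def bf16_bits_truncated(value: float) -> int:
    """BF16 pattern obtained by truncating FP32 (no rounding; an assumption)."""
    return fp32_bits(value) >> 16
```

Because both formats share the same eight-bit exponent field, the exponent checks for zero (0x00) and infinity (0xFF) described later apply to either format.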

The multiplier exception handling block 1102 receives operand A 1113 and operand B 1114. The output of the multiplier exception handling block 1102 is connected to the following: (1) the multiplier exception flags 1104 via bus 1106, (2) the exception output control signal generation 1126 via the multiplier exception condition signal bus 1108, and (3) the multiplier exception result 1115.

Operand C 1116, in FP32 format, enters the operand C basic conversion block 1118 and is output to: (1) the exception output control signal generation 1126 via the operand C exception condition signal bus 1120, and (2) the carry-save adder block 1130 via bus 1122.

The carry-save adder (CSA) block 1130 processes two inputs via bus 1122: (1) the multiplier exception result 1115, and (2) the output from the operand C basic conversion block 1118. In one embodiment, the CSA block 1130 has an accumulator loop 1124 and outputs data only at the end of the loop. The CSA block 1130 outputs via bus 1132.

The adder normalization exception handling block 1134 receives two inputs. The first input is the output of the CSA block 1130 via bus 1132. The second input is the exception control 1128 from the exception output control signal generation 1126 block.

The outputs of the adder normalization exception handling block 1134 are the adder exception result 1139 and bus 1138, which is routed to the adder exception flag block 1140.

Operand A is a 32-bit bus that provides a BF16 input for multiplication operations and an FP32 input for addition operations. Operand B is used only for multiplication operations and always has the 16-bit BF16 input format. Operand C is used for addition operations or accumulator initialization and always has the 32-bit FP32 input format. The multiplier section has separate output flags (overflow, underflow, and invalid), and the multiplier exception result becomes a direct input to the adder. The exception condition signals for the adder are generated by the multiplier and connected to the "Exception Control Signal Generation" block, which, together with the exception condition signals from operand C and the accumulator, generates the "Exception Control" signal for the adder normalization block for adder exception handling.

Operation mode

According to one embodiment, exception handling in the carry-save accumulate unit design supports three different operation modes.

FIG. 12A illustrates a first operation mode 1200 according to the high-level block diagram architecture of FIG. 11. The multiply-accumulate operation shows operand A 1113 in BF16 format first multiplied by operand B 1114 in BF16 format, after which the product is added to operand C 1116 in FP32 format. The multiply-add is completed in a single operation. This operation generates multiplication and addition exception flags and results.

The multiplier block 1110 receives operand A 1113 and operand B 1114. The output of the multiplier block 1110 is connected to the following: (1) the multiplier exception flag 1104 block via bus 1106, and (2) the carry-save adder (CSA) block 1130 via the multiplier exception result 1115. In some embodiments, the first operation mode uses the multiplier exception flags 1104 for statistical purposes.

The carry-save adder (CSA) block 1130 processes the following two inputs: (1) the multiplier exception result 1115 and (2) operand C 1116. The CSA block 1130 has two outputs. The first output is the adder exception result 1139, which is routed to the output block 1129 in BF16 or FP32 format. The second output is bus 1138, which is routed to the adder exception flag block 1140.

FIG. 12B illustrates a second operation mode 1202 according to the high-level block diagram architecture of FIG. 11. The multiply-accumulate operation is shown as operand A 1113 in BF16 format multiplied by operand B 1114 in BF16 format in a single operation. It produces an output result at the end of the accumulation loop. During accumulation, the adder outputs and adder exceptions are disabled. This operation generates multiplication and addition exception flags and results.

The multiplier block 1110 receives operand A 1113 and operand B 1114. The output of the multiplier block 1110 is connected to the following blocks: (1) the multiplier exception flags 1104 via bus 1106, and (2) the carry-save adder (CSA) 1130 via the multiplier exception result 1115. In some embodiments, the second operation mode uses the multiplier exception flags 1104 for statistical purposes.

The carry-save adder (CSA) block 1130 processes the following two inputs: (1) the multiplier exception result 1115 and (2) the accumulator loop 1124. The CSA block 1130 has two outputs. The first output is the adder exception result 1139, which is routed to the output block 1131 in BF16 or FP32 format only at the end of the accumulator loop 1124. The second output is bus 1138, which is routed to the adder exception flag block 1140.

FIG. 12C illustrates a third operation mode 1203 according to the high-level block diagram architecture of FIG. 11. The addition operation is shown as operand A 1113 in FP32 format added to operand C 1116 in FP32 format. This operation generates only addition exception flags and results. Multiplier exception handling is disabled.

The carry-save adder (CSA) block 1130 adds two inputs. Operand A 1113 in FP32 format is added to operand C 1116 in FP32 format. The CSA block 1130 has two outputs. The first output is the adder exception result 1139, which is routed to the output block 1129 in BF16 or FP32 format. The second output is bus 1138, which is routed to the adder exception flag block 1140.

Exception handling structure

According to some embodiments, exception handling is divided into "exception flag generation" and "exception result generation", each of which can be further divided into multiplier exceptions and adder exceptions.

Multiplier and adder flag generation produces overflow, underflow, and invalid flags. These flags are shown below as a set of circuit implementations.

Multiplier and adder exception result generation includes three conditions: (1) sign generation, (2) exponent generation, and (3) fraction generation. The sign has a positive output or a negative output. When an exception occurs, the exponent can have the all "0" or all "1" condition, and when no exception occurs, the exponent has the normal output. The fraction value can have two conditions: it is "0" for all exceptions and normal for non-exception cases.
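The three parts of an exception result (sign, exponent, fraction) can be illustrated by assembling an FP32 bit pattern in Python. This is only a sketch under stated assumptions: the helper names `pack_fp32` and `exception_result` and the string labels "zero" and "infinity" are hypothetical, and the FP32 field layout is the standard one.

```python
import struct


def pack_fp32(sign: int, exponent: int, fraction: int) -> float:
    """Assemble sign, 8-bit exponent, and 23-bit fraction into an FP32 value."""
    bits = ((sign & 0x1) << 31) | ((exponent & 0xFF) << 23) | (fraction & 0x7FFFFF)
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def exception_result(sign: int, kind: str) -> float:
    """Build an exception result: exponent all '0' or all '1', fraction '0'."""
    exponent = {"zero": 0x00, "infinity": 0xFF}[kind]
    return pack_fp32(sign, exponent, 0)
```

For example, `exception_result(1, "infinity")` yields negative infinity and `exception_result(0, "zero")` yields positive zero, matching the signed infinity and signed zero results described for overflow and underflow.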

FIG. 13 illustrates the exception handling structure 1300, depicted as a high-level block diagram. The floating-point multiply-accumulate exception block 1308 outputs flags to the exception flag generation block 1304 and outputs results to the exception result generation block 1312. The flags contain status information or data for processing by dedicated exception handling blocks, as described further below.

The exception flag generation block 1304 can output flags to the multiplier exception flag generation block 1302 or the adder exception flag generation block 1306. The multiplier exception flag generation block 1302 drives the multiplier overflow flag condition block 1381, the multiplier underflow flag condition block 1382, and the multiplier invalid flag condition block 1383.

The adder exception flag generation block 1306 drives the adder overflow flag condition block 1387, the adder underflow flag condition block 1388, and the adder invalid flag condition block 1389.

The exception result generation block 1312 provides results to the multiplier exception result generation block 1310 and the adder exception result generation block 1314.

The multiplier exception result generation block 1310 outputs results to: (1) the multiplier sign generation condition block 1320A, (2) the multiplier exponent generation condition block 1322A, and (3) the multiplier fraction generation condition block 1324A.

The adder exception result generation block 1314 outputs results to: (1) the adder sign generation condition block 1326A, (2) the adder exponent generation condition block 1328A, and (3) the adder fraction generation condition block 1330A.

The multiplier sign generation condition block 1320A outputs conditions to blocks 1320B and 1320C. The multiplier exponent generation condition block 1322A outputs conditions to blocks 1322B, 1322C, and 1322D. The multiplier fraction generation condition block 1324A outputs conditions to blocks 1324B and 1324C.

The adder sign generation condition block 1326A outputs conditions to blocks 1326B and 1326C. The adder exponent generation condition block 1328A outputs conditions to blocks 1328B, 1328C, and 1328D. The adder fraction generation condition block 1330A outputs conditions to blocks 1330B and 1330C.

The figures from FIG. 14A to FIG. 21B illustrate schematic implementations of the high-level blocks shown in FIG. 13. For example, block 1381 of FIG. 13 is implemented in FIG. 14A, and block 1382 is implemented in FIG. 14B. The figure titles below correspond to the block names of FIG. 13.

Condition circuit

FIG. 14A depicts one implementation 1400 of the multiplier overflow flag condition circuit 1381. The schematic shows that the multiplier overflow flag condition 1446 is asserted on a logic-high output from the multiplier overflow AND gate 1444. AND gate 1444 has the following three inputs: (1) multiplication operation enable 1414, (2) the output of NOR gate 1445, and (3) 1442, which is the output of the product exponent AND gate 1440, also referred to as Ep (the exponent of the product) or the multiplier product exponent.

NOR gate 1445 has two inputs, eainf and ebinf. The signal exponent A infinity (eainf) is asserted if input A is infinity, which is detected when its exponent equals 0xFF (all "1"s). The signal exponent B infinity (ebinf) is asserted if input B is infinity, which is detected when its exponent equals 0xFF (all "1"s). Table 2, shown above, defines the BFloat16 terms and their numeric definitions. Eainf is the output of AND gate 1420. The input of AND gate 1420 is the exponent of operand A, shown with least significant bit (LSB) 1402 and most significant bit (MSB) 1404, comprising eight exponent bits.

Ebinf is the output of AND gate 1430. The input of AND gate 1430 is the exponent of operand B, shown with least significant bit (LSB) 1406 and most significant bit (MSB) 1408, comprising eight exponent bits.

The inputs of the product exponent AND gate 1440 are least significant bit (LSB) 1410 and most significant bit (MSB) 1412, comprising eight exponent bits.
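The gate network of FIG. 14A can be summarized as a Boolean expression. The following Python sketch models it under the assumption that each eight-input AND gate simply tests an exponent for 0xFF; the function and parameter names are hypothetical and not taken from the figure.

```python
def mul_overflow_flag(mul_enable: bool, exp_a: int, exp_b: int, exp_p: int) -> bool:
    """Model of the multiplier overflow flag condition of FIG. 14A.

    exp_a, exp_b: 8-bit exponents of operands A and B.
    exp_p: 8-bit product exponent Ep.
    """
    eainf = (exp_a & 0xFF) == 0xFF         # AND gate 1420: exponent A all ones
    ebinf = (exp_b & 0xFF) == 0xFF         # AND gate 1430: exponent B all ones
    no_inf_input = not (eainf or ebinf)    # NOR gate 1445
    ep_all_ones = (exp_p & 0xFF) == 0xFF   # AND gate 1440: Ep all ones
    return mul_enable and no_inf_input and ep_all_ones  # AND gate 1444
```

The NOR term keeps an infinite operand from being reported as an overflow: the flag is raised only when the product exponent saturates although neither input was infinite.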

FIG. 14B depicts one implementation 1400 of the multiplier underflow flag condition circuit 1382. The multiplier underflow flag condition 1482 is asserted on a logic-high output from the multiplier underflow AND gate 1480. AND gate 1480 has the following three inputs: (1) multiplication operation enable 1418, (2) the output of NOR gate 1475, and (3) 1472, which is the output of the product exponent NOR gate 1470.

NOR gate 1475 has two inputs, eaz (input A exponent is zero) and ebz (input B exponent is zero). Eaz is the output of NOR gate 1450. The input of NOR gate 1450 is the exponent of operand A, shown with least significant bit (LSB) 1422 and most significant bit (MSB) 1424, comprising eight exponent bits.

Ebz is the output of NOR gate 1460. The input of NOR gate 1460 is the exponent of operand B, shown with least significant bit (LSB) 1426 and most significant bit (MSB) 1428, comprising eight exponent bits.

The inputs of the product exponent NOR gate 1470 are least significant bit (LSB) 1432 and most significant bit (MSB) 1434, comprising eight exponent bits.

The multiplier underflow flag is asserted when the multiplication operation is enabled, Ep (the product exponent) is 0x00, and neither multiplier exponent input is zero. That is, neither the exponent of operand A nor the exponent of operand B is 0x00.
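The underflow condition of FIG. 14B can be modeled the same way. This is a sketch in Python with hypothetical names; it reads NOR gate 1475 as requiring that neither operand exponent is zero, which is an interpretation of the gate description above.

```python
def mul_underflow_flag(mul_enable: bool, exp_a: int, exp_b: int, exp_p: int) -> bool:
    """Model of the multiplier underflow flag condition of FIG. 14B."""
    eaz = (exp_a & 0xFF) == 0x00          # NOR gate 1450: exponent A all zeros
    ebz = (exp_b & 0xFF) == 0x00          # NOR gate 1460: exponent B all zeros
    no_zero_input = not (eaz or ebz)      # NOR gate 1475
    ep_zero = (exp_p & 0xFF) == 0x00      # NOR gate 1470: Ep all zeros
    return mul_enable and no_zero_input and ep_zero  # AND gate 1480
```

A product that is zero only because one operand was already zero therefore does not raise the flag.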

The multiplier invalid flag is asserted under the following two conditions: (1) the multiplication operation is enabled and (2) invalid is "1". Invalid is "1" when the exponent of operand A is infinity (0xFF) and the exponent of operand B is zero (0x00), or when the exponent of operand B is infinity (0xFF) and the exponent of operand A is zero (0x00).

FIG. 15 depicts one implementation 1500 of the multiplier invalid flag condition circuit in block 1383. The multiplier invalid flag condition 1582 is asserted on a logic-high output from the multiplier invalid AND gate 1580. AND gate 1580 has the following two inputs: (1) multiplication operation enable 1501, and (2) 1572, the output of OR gate 1570.

OR gate 1570 has two inputs. The first input is 1552, derived from AND gate 1550. The second input is 1562, derived from AND gate 1560.

AND gate 1550 has two inputs: eainf and ebz. The signal ebz is asserted when input B is zero, that is, (+) zero or (-) zero, the condition being that the exponent is 0x00 (all "0"s). Eainf is the output of AND gate 1510. The input of AND gate 1510 is the eight-bit exponent of operand A, shown with least significant bit (LSB) 1502 and most significant bit (MSB) 1504. Ebz is the output of NOR gate 1520. The input of NOR gate 1520 is the eight-bit exponent of operand B, shown with least significant bit (LSB) 1506 and most significant bit (MSB) 1508.

AND gate 1560 has two inputs: ebinf and eaz. Ebinf is the output of AND gate 1530. The input of AND gate 1530 is the eight-bit exponent of operand B, shown with least significant bit (LSB) 1506 and most significant bit (MSB) 1508. Eaz is the output of NOR gate 1540. The signal eaz is asserted when input A is zero, that is, (+) zero or (-) zero, with an exponent of 0x00 (all "0"s). The input of NOR gate 1540 is the exponent of operand A, shown with least significant bit (LSB) 1502 and most significant bit (MSB) 1504, comprising eight exponent bits.
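The invalid condition of FIG. 15 (infinity multiplied by zero, in either operand order) reduces to a small Boolean expression. The Python sketch below uses hypothetical names and assumes the exponent tests described above.

```python
def mul_invalid_flag(mul_enable: bool, exp_a: int, exp_b: int) -> bool:
    """Model of the multiplier invalid flag condition of FIG. 15."""
    eainf = (exp_a & 0xFF) == 0xFF   # AND gate 1510
    ebz = (exp_b & 0xFF) == 0x00     # NOR gate 1520
    ebinf = (exp_b & 0xFF) == 0xFF   # AND gate 1530
    eaz = (exp_a & 0xFF) == 0x00     # NOR gate 1540
    inf_times_zero = (eainf and ebz) or (ebinf and eaz)  # gates 1550, 1560, 1570
    return mul_enable and inf_times_zero                 # AND gate 1580
```

Only the exponent fields are examined, so the sign of the zero or of the infinity does not affect the flag.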

FIG. 16 depicts one implementation 1600 of the multiplier sign generation condition circuit 1320. The multiplier flag condition 1632 is generated by AND gate 1630. The multiplier invalid flag condition 682 of FIG. 15 is the first input of AND gate 1630. The second input of AND gate 1630 is 1632, derived from the output of EX-OR gate 1620. Sign A 1618 and sign B 1622 are the inputs of EX-OR gate 1620.

FIG. 17A shows one implementation 1700 of the multiplier exponent generation condition circuit 1322. The multiplier exponent 1752 is generated according to three conditions: (1) all "0" (0x00 = zero), (2) all "1" (0xFF = infinity), and (3) the normal exponent.

The first condition is an exponent of all "0"s. The first condition can occur if the exponent of operand A or the exponent of operand B is zero. The first condition can also occur if the multiplier exponent calculation results in a negative overflow, meaning that the result of the multiplier exponent calculation is less than -126.

The second condition is all "1"s. The second condition can occur if the exponent of operand A or the exponent of operand B is infinity. The second condition can also occur if the multiplier exponent calculation results in a positive overflow, that is, the result of the multiplier exponent calculation is greater than Emax (+127).

The third condition is the normal output, defined as neither the first condition nor the second condition occurring, that is, any condition other than all "0" or all "1".

The eight-bit-wide multiplexer 1750 has three 8-bit buses as inputs. These are the all "0", all "1", and normal exponent buses. Control gating using OR gate 1710, AND gate 1720, OR gate 1730, and NOR gate 1740 controls the output of multiplexer 1750.
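The three-way selection performed by multiplexer 1750 can be sketched in Python. Several points here are assumptions rather than statements from the disclosure: the exponent calculation is modeled as the biased sum `exp_a + exp_b - 127` without any adjustment from significand normalization, the priority given to an infinite operand over a zero operand is a choice made for this sketch, and the function name is hypothetical.

```python
BIAS = 127  # exponent bias shared by BF16 and FP32


def mul_result_exponent(exp_a: int, exp_b: int) -> int:
    """Select all '0', all '1', or the normal exponent for the product."""
    a_zero, b_zero = exp_a == 0x00, exp_b == 0x00
    a_inf, b_inf = exp_a == 0xFF, exp_b == 0xFF
    biased = exp_a + exp_b - BIAS   # simplified exponent calculation
    neg_overflow = biased < 1       # unbiased result below -126
    pos_overflow = biased > 254     # unbiased result above Emax (+127)

    if a_inf or b_inf:              # assumed priority: infinity wins
        return 0xFF
    if a_zero or b_zero or neg_overflow:
        return 0x00                 # first condition: all '0'
    if pos_overflow:
        return 0xFF                 # second condition: all '1'
    return biased                   # third condition: normal exponent
```

The biased thresholds 1 and 254 correspond to the unbiased limits -126 and +127 named in the text.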

FIG. 17B shows one implementation 1700 of the multiplier fraction generation condition circuit 1324. The multiplier fraction bus 1772 is 16 bits wide and is generated according to two conditions: (1) exception and (2) normal fraction. The two multiplexer inputs are: (1) the all "0" condition (0x00 = zero), and (2) the normal fraction condition.

Multiplexer 1770 has two 16-bit input buses. These are the all "0" bus and the fraction bus. OR gate 1760 controls the selection of the 16-bit-wide output of multiplexer 1770.

The signals exp_overflow and ezero are the two inputs of OR gate 1760. The signal exp_overflow is the logical OR of positive overflow and negative overflow, where overflow is as described for FIG. 17A above. In some embodiments, a status bit is provided after the exponent calculation to detect exponent overflow. The term ezero is defined as eaz (input A exponent is zero) OR ebz (input B exponent is zero). The output of OR gate 1760 is 1762, the control that selects between the two 16-bit-wide input buses of multiplexer 1770. All "0" occurs when: (1) the product exponent has a positive overflow, or (2) there is a negative overflow, or (3) the exponent of operand A is zero, or (4) the exponent of operand B is zero.
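The two-way selection of multiplexer 1770 can be sketched as follows in Python. The function name and parameter names are hypothetical; `exp_overflow` stands for the already-combined positive-or-negative overflow status described above.

```python
def mul_result_fraction(exp_overflow: bool, exp_a: int, exp_b: int,
                        normal_fraction: int) -> int:
    """Select all '0' or the normal 16-bit fraction for the product."""
    ezero = (exp_a & 0xFF) == 0x00 or (exp_b & 0xFF) == 0x00  # eaz OR ebz
    force_zero = exp_overflow or ezero                        # OR gate 1760
    return 0x0000 if force_zero else normal_fraction & 0xFFFF
```

Any of the four listed cases forces the fraction to all "0"; otherwise the normal fraction passes through unchanged.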

In some embodiments, exception handling in the carry-save accumulate unit does not support denormal or NaN operations. In this case, denormal numbers are treated as zero and NaN values are treated as infinity. When any of these exceptions occurs, the fraction output will be all "0". Otherwise, the output of multiplexer 1770 is the normalized fraction.
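The treatment of denormal and NaN inputs described above amounts to classifying an operand by its exponent field alone. A minimal Python sketch follows; the function name and the returned labels are hypothetical.

```python
def classify_operand(exponent: int, fraction: int) -> str:
    """Classify an operand when denormals and NaNs are not supported.

    The fraction is deliberately ignored for the two special exponents:
    exponent 0x00 covers zero and denormals (treated as zero), and
    exponent 0xFF covers infinity and NaN (treated as infinity).
    """
    if exponent == 0x00:
        return "zero"
    if exponent == 0xFF:
        return "infinity"
    return "normal"
```

This is why the flag circuits above test only the eight exponent bits when detecting zero and infinity inputs.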

FIG. 18A illustrates one schematic implementation 1800 of the adder overflow flag condition circuit in block 1387. The signal named adder overflow 1832 indicates the overflow flag condition and is the output of AND gate 1830. The adder overflow flag condition occurs when AND gate 1830 has the following three inputs asserted: (1) normalize_enable, (2) the output of AND gate 1810, named overflow, and (3) the output of NOR gate 1820, named non-input exponent infinity 1822. The inputs of AND gate 1810 are the eight normalized exponent bits. The inputs of NOR gate 1820 are product exponent infinity and input C exponent infinity. In summary, the adder overflow 1832 flag condition can occur during normalization if the normalized exponent equals 0xFF and no input has the exponent infinity condition.
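The adder overflow condition of FIG. 18A can be expressed as a short Python sketch. The function and parameter names are hypothetical; the two `*_exp_inf` arguments stand for the product exponent infinity and input C exponent infinity signals.

```python
def add_overflow_flag(normalize_enable: bool, norm_exp: int,
                      prod_exp_inf: bool, c_exp_inf: bool) -> bool:
    """Model of the adder overflow flag condition of FIG. 18A."""
    overflow = (norm_exp & 0xFF) == 0xFF             # AND gate 1810
    no_input_inf = not (prod_exp_inf or c_exp_inf)   # NOR gate 1820
    return normalize_enable and overflow and no_input_inf  # AND gate 1830
```

As in the multiplier case, an infinite input suppresses the flag because the infinite result is then exact rather than the product of an overflow.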

FIG. 18B shows one implementation 1800 of the adder underflow flag condition circuit in block 1388. Adder underflow 1892 is asserted on a logic-high output of AND gate 1890, which has the following three inputs: (1) normalize_enable, (2) the output of AND gate 1850, named underflow, and (3) the output of NOR gate 1880, named non-input exact zero 1882. The inputs of AND gate 1850 are the eight normalized exponent bits and fraction output zero. In summary, the adder underflow flag condition typically occurs during normalization when the normalized exponent is 0x00 (equal to 0) and no input is exactly zero. The following paragraphs describe one implementation of a circuit that can check for the non-input exact zero 1882 condition.

Further describing the circuit of FIG. 18B, the non-input exact zero signal 1882 is generated from the three inputs of NOR gate 1880. The three inputs are: (1) multiplier product exponent zero, (2) input C exponent zero, and (3) output exact zero 1872, which is the output of AND gate 1870. AND gate 1870 has two inputs. The first input is the output of EX-OR gate 1860, named sign_diff. EX-OR gate 1860 operates on two inputs, the multiplier product sign and the input C sign. The second input of AND gate 1870 is (+) zero, where (+) zero is the logical AND of "input C exponent greater than or equal to multiplier product exponent" and the fraction output zero signal.

Adder underflow is a combination of the following conditions, as shown in the circuit. The first condition (1) is that final normalization is enabled. The second condition (2) is that the final addition exponent result, that is, the normalized exponent, equals 0x00 and the final addition fraction result (the fraction output zero signal) is not exactly zero. The third condition (3) is that multiplier product exponent zero, input C exponent zero, and output exact zero are all deasserted. If any of the three conditions is not met, adder underflow does not occur.
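The three adder underflow conditions can be combined in a Python sketch. It follows the summary above, in which the underflow term requires a zero normalized exponent together with a fraction result that is not exactly zero; that reading, the function name, and all parameter names are assumptions made for illustration.

```python
def add_underflow_flag(normalize_enable: bool, norm_exp: int,
                       frac_out_zero: bool, prod_exp_zero: bool,
                       c_exp_zero: bool, prod_sign: int, c_sign: int,
                       c_exp_ge_prod_exp: bool) -> bool:
    """Model of the adder underflow flag condition of FIG. 18B."""
    # Condition (2): normalized exponent is 0x00 and fraction is not exactly zero.
    underflow = (norm_exp & 0xFF) == 0x00 and not frac_out_zero
    sign_diff = prod_sign != c_sign                    # EX-OR gate 1860
    plus_zero = c_exp_ge_prod_exp and frac_out_zero    # the "(+) zero" term
    output_exact_zero = sign_diff and plus_zero        # AND gate 1870
    # Condition (3): none of the exact-zero indications is asserted (NOR gate 1880).
    no_exact_zero = not (prod_exp_zero or c_exp_zero or output_exact_zero)
    # Condition (1) is normalize_enable; all three are combined by AND gate 1890.
    return normalize_enable and underflow and no_exact_zero
```

A zero input or an exact cancellation therefore yields a zero result without raising the underflow flag.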

FIG. 19A shows one implementation 1900 of the adder invalid flag condition circuit in block 1389. Referring to Table 8, an invalid condition occurs when (+) infinity and (-) infinity are added. Adding (+) infinity and (+) infinity, or adding (-) infinity and (-) infinity, does not cause the invalid flag condition 1389. The circuit checks for two infinite operands with opposite signs. The adder invalid 1932 flag is the output of AND gate 1930. AND gate 1930 has the following three inputs: (1) normalization enable, (2) the output of OR gate 1910, named 1912, and (3) the sign_diff output from EX-OR gate 1920.

The two inputs of the adder invalid OR gate 1910 are multiplier product exponent infinity and input C exponent infinity. EX-OR gate 1920 operates on two inputs, the multiplier product sign and the input C sign.

To summarize the circuit function, the adder invalid flag condition occurs when (1) final normalization is enabled, (2) the signs of the inputs from input A (or the multiplier product) and input C differ (positive and negative, or negative and positive), and (3) the exponents of the two inputs are infinity (0xFF).
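The adder invalid condition of FIG. 19A can be sketched in Python as follows. The sketch follows the gate-level description, in which OR gate 1910 combines the two exponent-infinity signals; the summary paragraph instead speaks of the exponents of both inputs being infinity, so the choice of OR here is an assumption. The function and parameter names are hypothetical.

```python
def add_invalid_flag(normalize_enable: bool, prod_exp_inf: bool,
                     c_exp_inf: bool, prod_sign: int, c_sign: int) -> bool:
    """Model of the adder invalid flag condition of FIG. 19A."""
    any_inf = prod_exp_inf or c_exp_inf   # OR gate 1910 (per the gate description)
    sign_diff = prod_sign != c_sign       # EX-OR gate 1920
    return normalize_enable and any_inf and sign_diff  # AND gate 1930
```

With both exponent-infinity signals asserted and opposite signs, this reproduces the (+) infinity plus (-) infinity case of Table 8, while infinities of the same sign leave the flag deasserted.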

Adder sign generation circuit

FIG. 19B shows one implementation 1900 of the adder sign positive condition circuit 810A for generating the adder sign positive output. As described above, exception result generation has three parts: sign generation, exponent generation, and fraction generation. The generated sign output can be positive or negative. One implementation of the sign output has two circuits that combine to provide the correct sign output for exception result generation. The first circuit is the adder sign positive condition circuit 810A. When the adder sign positive 1992 bit equals "1", the adder sign output has the positive (+) condition.

The sign output function forces the sign to zero, the positive condition, for the following additions: (1) (+) zero + (-) zero, (2) (+) denormal + (-) denormal (or vice versa), (3) the signs differ and one of the exponent inputs is infinity ((+) Inf + (-) Inf, or vice versa), (4) the signs differ and the multiplier product is positive with an exponent of infinity ((+) zero x (+) Inf = (+) Inf), (5) the signs differ and the operand C exponent is greater than the operand A (or multiplier product) exponent, (6) equality with a fraction output of zero, and (7) the multiplier product sign and the input C sign are both positive.

FIG. 19B shows one implementation of the first circuit, which generates a sign bit forced to "0" for the positive condition, where the positive sign is the signal adder sign positive 1992, the output of AND gate 1990. AND gate 1990 has two inputs. The first input is the output of OR gate 1980, named 1982, and the second input is sign_diff, which comes from EX-OR gate 1960. EX-OR gate 1960 operates on two inputs, the multiplier product sign and the input C sign. OR gate 1980 has four inputs. These are underflow, 1967, 1972, and plus zero and fraction output zero.

The output of AND gate 1950 is called underflow; it is the AND of output 1942 of NOR gate 1940 with fraction output zero. The input to NOR gate 1940 is the 8-bit normalized exponent vector. The two inputs to OR gate 1965 are multiplier product exponent infinity and input C exponent infinity. AND gate 1970 has multiplier exponent infinity enable as its first input; its second input is the output of inverter 1968, whose input is the multiplier product sign.
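The gate network described for FIG. 19B can be modeled in a few lines of Python. This is only an illustrative sketch, not the patented circuit. It assumes that signals 1967 and 1972 are the outputs of OR gate 1965 and AND gate 1970, and that the fourth input of OR gate 1980 is the AND of an add-zero flag with fraction output zero. The function and parameter names are hypothetical.

```python
def adder_sign_positive(norm_exp, frac_out_zero, add_zero,
                        mult_exp_inf, c_exp_inf, mult_exp_inf_en,
                        mult_sign, c_sign):
    """Sketch of adder sign positive 1992; every flag is 0 or 1."""
    # NOR gate 1940: output 1942 is 1 only when all 8 exponent bits are 0.
    nor_1942 = int((norm_exp & 0xFF) == 0)
    # AND gate 1950: underflow = zero exponent AND fraction output zero.
    underflow = nor_1942 & frac_out_zero
    # OR gate 1965 (assumed to drive 1967): either exponent is infinity.
    sig_1967 = mult_exp_inf | c_exp_inf
    # Inverter 1968 feeding AND gate 1970 (assumed to drive 1972).
    sig_1972 = mult_exp_inf_en & (mult_sign ^ 1)
    # OR gate 1980 output 1982; the grouping of the fourth input is an assumption.
    or_1982 = underflow | sig_1967 | sig_1972 | (add_zero & frac_out_zero)
    # EX-OR gate 1960: sign_diff is 1 when the two signs differ.
    sign_diff = mult_sign ^ c_sign
    # AND gate 1990.
    return or_1982 & sign_diff
```

With this model, the output can only be 1 when the product and operand C signs differ, because sign_diff gates the final AND.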

FIG. 20A depicts one implementation 2000 of the adder sign negative condition circuit block 1326C, which generates the adder sign negative (-) condition output. This is the second circuit for generating the sign output described above. The sign output comprises two circuits that combine to produce either the positive (+) condition or the negative (-) condition. When the adder sign negative condition 2042 bit equals "1", the adder sign negative condition circuit 810B generates the negative (-) condition. The second circuit is first described as a schematic implementation, and its function is summarized in the paragraphs below.

FIG. 20A shows one implementation of the circuit that generates a sign output bit forced to "1" for the negative (-) condition. OR gate 2040 outputs adder sign negative condition 2042. The two inputs to OR gate 2040 are output 2022 of AND gate 2020 and output 2032 of AND gate 2030. The first adder sign negative condition may occur when AND gate 2020 is asserted. AND gate 2020 operates on the multiplier product sign and the input C sign, and asserts its output if both signs are negative.

The second adder sign negative condition may occur when AND gate 2030 is asserted. This happens when the following inputs to AND gate 2030 are asserted: multiplier exponent infinity enable, the multiplier product sign, and 2012. Output 2012 of NAND gate 2010 is enabled by multiplier product exponent infinity and by the output of inverter 2005, whose input is the operand C sign.

To summarize the adder sign negative condition, the sign output is forced to logic "1" for the negative condition (-) in either of the following cases: (1) the signs of operand A (or the multiplier product) and operand C are both "1" (both negative); or (2) operand A (or the multiplier product) is negative infinity and operand C is not positive infinity.
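The two summarized cases can be expressed as a small Boolean function. The sketch below follows the summary rather than the individual gates of FIG. 20A. The function name and the flag names (a sign bit and an is-infinity flag per operand) are assumptions made for illustration.

```python
def adder_sign_negative(prod_sign, prod_is_inf, c_sign, c_is_inf):
    """Sketch of the adder sign negative condition; every flag is 0 or 1."""
    # Case (1): both the product (or operand A) and operand C are negative.
    both_negative = prod_sign & c_sign
    # Case (2): the product is negative infinity and C is not positive infinity.
    prod_neg_inf = prod_is_inf & prod_sign
    c_pos_inf = c_is_inf & (c_sign ^ 1)
    return both_negative | (prod_neg_inf & (c_pos_inf ^ 1))
```

For example, a product of negative infinity added to a finite positive operand C yields 1, while negative infinity added to positive infinity yields 0.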

Adder exponent circuit

The adder exponent output is generated using three conditions. The first condition is all "0"s (0x00 = zero). The second condition is all "1"s, where 0xFF corresponds to the infinity condition. The third condition is the normal exponent.

FIG. 20B illustrates one implementation 2000 of adder exponent generation in the all-"0" condition circuit block 1328B. Adder exponent all "0" select 2082 is the output of OR gate 2080, which has three inputs. Each input represents a separate condition.

The first adder exponent all-"0" condition arises when multiplier product exponent zero or input C exponent zero is asserted at OR gate 2050. This condition asserts 2052 to trigger adder exponent all "0" select 2082.

The second adder exponent all-"0" condition arises when AND gate 2060 has both (+) zero and sign_diff asserted as inputs, where (+) zero is the logical AND of input C exponent greater than or equal to multiplier product exponent with the fraction output zero signal. EX-OR gate 2070 operates on the multiplier product sign and the input C sign to generate sign_diff. When AND gate 2060 is asserted, 2062 triggers adder exponent all "0" select 2082.

The third adder exponent all-"0" condition is generated by fraction output zero, which is input to OR gate 2080 to trigger adder exponent all "0" select 2082.
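The three conditions feeding OR gate 2080 can be modeled as follows. This is a minimal sketch that follows the text literally. The function and parameter names are hypothetical, and each flag is assumed to be a single bit computed elsewhere in the pipeline.

```python
def exp_all_zero_select(mult_exp_zero, c_exp_zero, c_exp_ge_mult_exp,
                        frac_out_zero, mult_sign, c_sign):
    """Sketch of adder exponent all "0" select 2082; every flag is 0 or 1."""
    # OR gate 2050 -> 2052: either exponent-zero flag is set.
    cond_2052 = mult_exp_zero | c_exp_zero
    # (+) zero: C exponent >= product exponent, ANDed with fraction output zero.
    plus_zero = c_exp_ge_mult_exp & frac_out_zero
    # EX-OR gate 2070: the signs differ.
    sign_diff = mult_sign ^ c_sign
    # AND gate 2060 -> 2062.
    cond_2062 = plus_zero & sign_diff
    # OR gate 2080: any of the three conditions selects the all-zero exponent.
    return cond_2052 | cond_2062 | frac_out_zero
```

As written in the text, the third input (fraction output zero) already covers every case in which the second input is set, so in this model the second condition never changes the result on its own.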

FIG. 21A illustrates one schematic implementation 2100 of adder exponent generation for the all-"1" condition circuit block 1328B. An OR output function with three input conditions determines the all-"1" output condition, as described in the following paragraphs. Condition circuit 811B shows adder exponent all "1" select 2122 as the output of OR gate 2120, which has three inputs. Each input represents a separate condition.

The first all-"1" condition is generated when multiplier product exponent infinity or input C exponent infinity is asserted at an input of OR gate 2110. This condition asserts 2112.

The second all-"1" condition is generated when bit 8 of the adder normalized exponent [8] indicates an overflow result.

The third all-"1" condition is generated by multiplier exponent infinity enable, which indicates that the multiplier output exponent is infinity or has positive overflow.

In summary, the all-"1" condition is generated when the exponent of input A (or the multiplier product) or of input C is infinity, when the final adder normalized exponent overflows, or when the multiplier output exponent is infinity or has positive overflow ((+)Zero x (+)Inf = (+)Inf).
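The all-"1" select can be sketched in the same style. The model below assumes the adder normalized exponent is available as a 9-bit value whose bit 8 flags overflow, as the text suggests. The function and parameter names are hypothetical.

```python
def exp_all_ones_select(mult_exp_inf, c_exp_inf, norm_exp9, mult_exp_inf_en):
    """Sketch of adder exponent all "1" select 2122; flags are 0 or 1."""
    # OR gate 2110 -> 2112: either input exponent is infinity.
    cond_2112 = mult_exp_inf | c_exp_inf
    # Bit 8 of the 9-bit normalized exponent signals overflow.
    overflow = (norm_exp9 >> 8) & 1
    # OR gate 2120: any of the three conditions forces the exponent to 0xFF.
    return cond_2112 | overflow | mult_exp_inf_en
```

A normalized exponent of 0x100 sets bit 8 and selects the all-ones exponent, while any value up to 0xFF leaves that bit clear.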

FIG. 21B shows one schematic implementation 2100 of the adder fraction generation condition circuit 1330A. The circuit routes either the actual normalized fraction value or 23 bits of zero, according to three selector conditions that control a multiplexer. The three conditions that force the 23-bit output bus to zero are: (1) any normalized exponent overflow; (2) normalized exponent underflow; or (3) the multiplier output exponent is infinity or has positive overflow. Otherwise the case is normal, and the second bus, which carries the 23-bit normalized fraction, is routed to the adder fraction 2132 output.

The schematic of circuit 812A shows the adder fraction 2132 output as a 23-bit bus provided by multiplexer 2150. OR gate 2130 outputs selector control 2131 to select between the two 23-bit input buses of multiplexer 2150. The first input bus is a 23-bit all-zero bus. The second input bus is the 23-bit normalized fraction bus. OR gate 2130 drives control 2131 to multiplexer 2150 according to the three input conditions described in the preceding paragraph.

The adder fraction 2132 output has two conditions: all "0"s (0x00 = zero) and the normal fraction. In one embodiment, exception handling in the carry-save accumulation unit does not support denormal or NaN operations; denormal numbers are treated as zero, and NaN values are treated as infinity. When any of these exceptions occurs, the fraction output is all "0"s.

The all-"0" output appears when an overflow or underflow exception occurs, or when the multiplier output exponent is infinity or has positive overflow ((+)Zero x (+)Inf = (+)Inf).
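The fraction path amounts to a two-input multiplexer with an OR-derived select. The following sketch models that behavior. The function and parameter names are hypothetical, and the 23-bit width is taken from the description.

```python
FRACTION_MASK = 0x7FFFFF  # 23 fraction bits


def adder_fraction(norm_frac, exp_overflow, exp_underflow, mult_exp_inf_en):
    """Sketch of multiplexer 2150 driving adder fraction 2132."""
    # OR gate 2130 -> selector control 2131.
    select_zero = exp_overflow | exp_underflow | mult_exp_inf_en
    # Select the all-zero bus on any exception, else the normalized fraction.
    return 0 if select_zero else norm_frac & FRACTION_MASK
```

In the normal case the normalized fraction passes through unchanged; any of the three exception flags replaces it with 23 zero bits.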

Multiplier-accumulator operation case study

In the next paragraph, x is the multiplication operator, + is the sum operator, and = is the equality operator.

Although the present invention is disclosed by reference to the embodiments and examples detailed above, it should be understood that these examples are intended to be illustrative rather than limiting. It is expected that those skilled in the art will readily conceive of modifications and combinations that fall within the spirit of the invention and the scope of the appended claims.

210: Multiplexer

211: Multiplexer

212: Multiplexer

213: Operand A

214: Operand B

215: Base-8 converter

216: Operand C; line

217: BF16 format or FP32 format

218: Line

219: Line

220: Block

221: Line

222: Dual bus

223: Bus

224: Output bus

225: Output bus

226: Bus

227: Bus

228: Bus

229: Bus

230: Carry-save adder

240: Accumulator

251: Bus

252: Bus

270: FP32 or BF16 block

Claims

A multiplication-accumulation unit, comprising: a pipeline configured to perform multiplication-accumulation operations on a series of input floating-point operands, the pipeline comprising: a significand circuit receiving, in a pipeline cycle, a multiplier output significand in a 2's complement format and fed-back accumulator output sum and carry values, the significand circuit comprising a 2's complement carry-save adder to generate accumulator output sum and carry significand values; and an exponent circuit receiving, in the pipeline cycle, a multiplier output exponent in a base-8 format and a fed-back accumulator output exponent value in a base-8 format, to generate an accumulator output exponent value.

A multiplication-accumulation unit as claimed in claim 1, wherein: the significand circuit includes a significand shifter, which responds to an exponent comparison signal to align the multiplier output significand with the feedback sum and carry value for addition; the exponent circuit responds to the exponent comparison signal to generate the accumulator output exponent value; and the pipeline includes an exponent comparison circuit, which compares the multiplier output exponent with the feedback exponent value before the pipeline cycle to generate the exponent comparison signal.

A multiplication-accumulation unit as claimed in claim 2, wherein the pipeline comprises: an overflow detector circuit for generating an overflow signal indicating an overflow condition for at least one of the sum and carry values of the feedback; and the exponent circuit and the significand circuit also respond to the overflow signal.

A multiplication-accumulation unit as claimed in claim 2, wherein the pipeline comprises: an overflow detector circuit for generating a first condition signal indicating an overflow condition of at least one of the sum and carry value of the feedback; a leading sign bit detector circuit for generating a second condition signal indicating that at least one of the sum and carry value of the feedback has greater than or equal to 8 extended sign bits; and the exponent circuit and the significand circuit are also responsive to the first condition signal and the second condition signal.

A multiply-accumulate unit as claimed in any one of claims 1 to 4, wherein the pipeline comprises: a multiplier circuit, responsive to a first input operand and a second input operand, providing a multiplier significand value and a multiplier exponent value prior to the pipeline cycle; a base-8 conversion circuit, for converting the multiplier significand value and the multiplier exponent value to a base-8 format for the multiplier output exponent and significand; and a two's complement conversion circuit, for converting the multiplier significand value to a two's complement representation of the multiplier output significand.

A multiplication-accumulation unit as claimed in any one of claims 1 to 4, wherein the pipeline comprises: a multiplier circuit, responsive to a first input operand and a second input operand, providing a multiplier significand value and a multiplier exponent value before the pipeline cycle, wherein the multiplier circuit comprises a significand multiplier circuit and an exponent adder circuit, the significand multiplier circuit having a carry-save adder for partial products that generates carry values and sum values for high-order bits of the multiplier output significand, and a ripple-carry adder for partial products that generates low-order bits of the significand carry output and sum output.

A multiply-accumulate unit as claimed in any one of claims 1 to 4, wherein the pipeline comprises: a multiplier circuit comprising a significand multiplier circuit and an exponent adder circuit, responsive to a first input operand and a second input operand, providing a multiplier significand value and a multiplier exponent value prior to the pipeline cycle, wherein the first input operand and the second input operand are in a floating point format including a bias so that a negative exponent is represented by a positive binary number, and a circuit for correcting the bias by inverting the most significant bit of the exponent of one of the first operand and the second operand and adding 1 at a carry input of the exponent adder circuit.

A multiplication-accumulation unit as claimed in any one of claims 1 to 4, wherein the pipeline has an accumulator mode and a summing mode, and comprises: a selector for providing the accumulator output of the feedback in the accumulator mode, and for providing a third floating-point input operand to the significand circuit and the exponent circuit in the summing mode.

A multiply-accumulate unit as claimed in claim 1, wherein the pipeline comprises: a first stage comprising a floating-point multiplier having sum and carry outputs; a second stage comprising a multiplier output adder for the sum output and the carry output of the multiplier, and a circuit for converting the multiplier adder output to a base-8 format having a 2's complement significand; a third stage comprising the significand circuit and the exponent circuit; a fourth stage for converting the accumulator sign bit, the accumulator exponent, and the accumulator significand sum and carry value to a signed value significand format; a fifth stage for converting the signed value significand format from base-8 alignment to base-2 alignment and generating a normalized exponent and significand; and a sixth stage for performing rounding and converting to a standard floating point representation.

A multiplication-accumulation method for calculating the sum S(i) of terms A(i)*B(i), wherein (i) ranges from 0 to N-1, and N is the number of terms in the sum, the method comprising: receiving a series of operands A(i) and operands B(i) in floating point format, (i) ranging from 1 to N in a multiplication-accumulation unit; multiplying the operands A(i) and B(i) in a multiplier pipeline stage of the multiplication-accumulation unit to generate the terms A(i)*B(i) in a format including a multiplier output exponent in base-8 format and a multiplier output significand, and converting the multiplier output significand into a 2's complement format; using a carry-save adder in an accumulator pipeline stage of the multiplication-accumulation unit to add the multiplier output significand of the term A(i)*B(i) in the 2's complement format to the significand of the sum S(i-1), and to generate a sum and a carry value for the sum S(i); selecting, through the exponent circuit of the multiplication-accumulation unit, an exponent of the sum S(i) from the multiplier output exponent of the term A(i)*B(i) and the exponent of the sum S(i-1), to generate the exponent of the sum S(i) in the base-8 format; and converting the sum and the carry value and the exponent of the sum S(i) in the base-8 format into a normalized floating point format.

The multiplication-accumulation method of claim 10 includes comparing the exponent of the sum S(i-1) with the multiplier output exponent of the term A(i)*B(i) to generate an exponent comparison signal; and in response to the exponent comparison signal, aligning the sum and carry value of the sum S(i-1) with the significand of the term A(i)*B(i) so as to perform addition in the carry-save adder.

The multiplication-accumulation method of claim 11 includes generating an overflow signal indicating an overflow condition for at least one of the sum and carry value of the sum S(i-1); and in response to the overflow signal, aligning the sum and carry value of the sum S(i-1) and the exponent with the multiplier output exponent and the multiplier output significand of the term A(i)*B(i) so as to perform addition in the carry-save adder.

The multiplication-accumulation method of claim 11 includes: generating a first condition signal indicating an overflow condition of at least one of the sum and carry value of the sum S(i-1); generating a second condition signal indicating that at least one of the sum and carry value of the sum S(i-1) has greater than or equal to 8 extended sign bits; and in response to the first condition signal and the second condition signal, aligning the sum and carry value and the exponent of the sum S(i-1) with the multiplier output exponent and the multiplier output significand of the term A(i)*B(i) so as to perform addition in the carry-save adder.

A multiplication-accumulation method as claimed in any one of claims 10 to 13, wherein multiplying operand A(i) by operand B(i) to generate the term A(i)*B(i) includes using a carry-save adder for partial products that generates carry values and sum values for high-order bits of the multiplier output significand, and using a ripple-carry adder for partial products that generates low-order bits of the significand carry output and sum output.

A multiplication-accumulation method as claimed in any one of claims 10 to 13, wherein the operands A(i) and B(i) are in a floating point format including a bias such that a negative exponent is represented by a positive binary number, and correcting the bias includes inverting the most significant bit of the exponent of one of the operands A(i) and B(i) and adding 1 at a carry input of an exponent adder circuit.

A multiplication-accumulation unit, comprising: a pipeline configured to perform floating-point multiplication-accumulation operations on a sum S(i) of terms A(i)*B(i), where (i) ranges from 0 to N-1, and N is the number of terms in the sum, the pipeline comprising: a multiplier pipeline stage, comprising a multiplier circuit for providing a multiplier significand value and a multiplier exponent value of the term A(i)*B(i) in response to a first input operand and a second input operand, a base-8 conversion circuit for converting the multiplier significand value and the multiplier exponent value of the term A(i)*B(i) into a base-8 format, and a 2's complement conversion circuit for converting the multiplier significand value into a 2's complement representation of a multiplier output significand in the base-8 format of the term A(i)*B(i); an accumulator stage, comprising a significand circuit that adds the multiplier output significand of the term A(i)*B(i) to the fed-back sum and carry values of the sum S(i-1) and generates the sum and carry values of the sum S(i), the significand circuit comprising a 2's complement carry-save adder to generate the accumulator output sum and carry significand values of the sum S(i); and an exponent circuit, which receives the multiplier exponent value of the term A(i)*B(i) and the fed-back exponent value of the sum S(i-1), to generate the accumulator output exponent value for the sum S(i).

A multiplication-accumulation unit as claimed in claim 16, wherein: the significand circuit of the accumulator stage includes a significand shifter responsive to an exponent comparison signal; the exponent circuit is responsive to the exponent comparison signal; and the pipeline includes an exponent comparison circuit for comparing the multiplier exponent value of the term A(i)*B(i) with the feedback exponent of the sum S(i-1) to generate the exponent comparison signal for generating the sum S(i).

The multiplication-accumulation unit of claim 17, wherein the accumulator stage comprises: an overflow detector circuit for generating a first condition signal indicating an overflow condition of at least one of the sum and carry values of the feedback of the sum S(i-1); a leading sign bit detector circuit for generating a second condition signal indicating that at least one of the sum and carry values of the feedback of the sum S(i-1) has greater than or equal to 8 extended sign bits; and the exponent circuit and the significand circuit are also responsive to the first condition signal and the second condition signal.