JPH03250314A

JPH03250314A - Arithmetic unit for multiplication remainder

Info

Publication number: JPH03250314A
Application number: JP4573690A
Authority: JP
Inventors: Takayuki Hasebe; 高行長谷部; Naoya Torii; 直哉鳥居; Ryota Akiyama; 良太秋山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-02-28
Filing date: 1990-02-28
Publication date: 1991-11-08

Abstract

(57)【要約】本公報は電子出願前の出願データであるた
め要約のデータは記録されません。(57) [Summary] This bulletin contains application data before electronic filing, so abstract data is not recorded.

Description

【発明の詳細な説明】〔概　要〕ハードウェアの削減を図り、且つ高速処理が可能な乗算
剰余演算装置に関し、高速性を損なうことなくハードウェア規模の細小を図る
ことを目的とし、ＡｘＢ　　（ｍｏｄ　　ｎ）　　（ｎはＫｍビット、Ａ
。[Detailed Description of the Invention] [Summary] Regarding a modular multiplication calculation device that can reduce hardware and perform high-speed processing, the purpose of this invention is to reduce the scale of the hardware without sacrificing high speed. mod n) (n is Km bits, A
.

Ｂ＜ｎ）　　（ｍ　＝１．２．・・）の乗算剰余を行う
乗算剰余演算装置であって、ＫｍビットのＡと、Ｂをに
ビット　（１ワード）毎にｍ個に分割した、上位ワード
より順にＢとの乗算を行ってＫｍ＋Ｋビットの部分積を
ｍ回得る乗算回路と、乗算回路の出力に得られるＫ　ｍ
　−ｉ−Ｋビットのデータのうち、Ｋｍビットよりあふ
れたにビットのデータを検索ビットとし、Ｋｍ＋Ｋビッ
トのデータの下位ＫｍビットをオールＯとし上位にビッ
トを検索ビットとするものの値を剰余の法ｎで割った余
りである剰余結果を出力する剰余テーブルと、剰余テー
ブルの出力Ｋｍビットと部分積の下位Ｋｍビットとを加
算することにより、ＫｍＸＫビットの乗算剰余演算の中
間結果を得る第１の加算器と、中間結果をにビットだけ
上位桁にシフトするシフト手段と、シフト手段のシフト
結果と次のワードに対するＫｍ−１−にビットの部分積
とを加算する第２の加算器と、第１の加算器に得られる
中間結果を補正して剰余を選択する補正回路とを具備し
、第２の加算器の出力の下位Ｋｒｒ＋ビットと剰余テー
ブルの出力とを第１の加算器により加算して得た中間結
果を、シフト手段によりにビットシフトし、第２の加算
器により再び次のサイクルで乗算回路から出力される部
分積に加算する操作をｍ回繰り返すことによりＫｍｘＫ
ｍビットの乗算剰余演算の中間結果を得、最後に、補正
回路を用いて剰余の法ｎ以下に結果を収約ることにより
高速に乗算剰余演算を行うように構成する。B<n) (m = 1.2...) is a multiplication remainder arithmetic device that performs the multiplication remainder, and is a high-order multiplication unit that divides Km bits of A and B into m pieces each bit (1 word). A multiplication circuit that sequentially multiplies words with B to obtain Km+K bit partial products m times, and K m obtained at the output of the multiplication circuit.
-i- Of the K-bit data, the bit data overflowing the Km bit is used as the search bit, and the value of the Km+K-bit data where the lower Km bits are all O's and the upper bit is the search bit is determined by the remainder method. A first method that obtains the intermediate result of a KmXK-bit multiplication remainder operation by adding the remainder table that outputs the remainder result that is the remainder after dividing by n, and the output Km bits of the remainder table and the lower Km bits of the partial product. an adder, a shift means for shifting the intermediate result by bits to higher order digits, a second adder for adding the shift result of the shift means and a partial product of bits to Km-1- for the next word; and a correction circuit that corrects the intermediate result obtained by the first adder to select a remainder, and the first adder adds the lower Krr+ bits of the output of the second adder and the output of the remainder table. By repeating m times the operation of bit-shifting the intermediate result obtained by using the shift means and adding it again to the partial product output from the multiplication circuit in the next cycle by the second adder, KmxK is obtained.
The system is configured to obtain intermediate results of m-bit remainder multiplication operations, and finally use a correction circuit to converge the results to less than the modulus n of the remainder, thereby performing the multiplication remainder operations at high speed.

[Industrial application field]

本発明は、ハードウェアの削減を図り、且つ高速処理が
可能な乗算剰余演算装置に関するものである。TECHNICAL FIELD The present invention relates to a modular multiplication arithmetic device that reduces hardware and is capable of high-speed processing.

通信データの暗号化及び認証アルゴリズムとして便利な
公開鍵暗号方式が知られている。この公開鍵暗号方式で
は、通信データを暗号化する場合、送信者は相手受信者
が予め公開している鍵を用いて電文を暗号化し、−刃受
信者は公開鍵と異なる受信者のみが知っている秘密の鍵
により受信電文を複合化する。公開鍵暗号方式はこのよ
うに送信者、受信者間で同一の鍵を管理する慣用暗号方
式き異なり、受信者は受信者のみが知っている１個の鍵
のみを管理すればよく安全性が高い暗号方式である。公
開鍵暗号方式で特に優れた方式の−っにＲ３Ａ暗号方式
（Ｒ，Ｌ、Ｒｉｖｅｓｔ、　Ａ、Ｓｈａｍｉｒ　ａｎｄ
　Ｌ。Public key cryptography is known as a convenient encryption and authentication algorithm for communication data. In this public key encryption method, when encrypting communication data, the sender encrypts the message using a key that has been made public in advance by the other recipient, and the recipient only knows the public key, which is different from the public key. The received message is decrypted using a secret key. In this way, public-key cryptography differs from conventional cryptography in which the same key is managed between the sender and the receiver, and the receiver only has to manage one key that only the receiver knows and is secure. It is a high-quality encryption method. The R3A cryptosystem (R, L, Rivest, A, Shamir and
L.

Ａｄｌｅｍａｎ）があるが、この方式ではアルゴリズム
は多数桁の乗算および剰余演算を数百回繰り返す操作を
基本とする。この為、高速に剰余演算を行うことが必要
であり、剰余テーブルを用いた方式％式％第９図は上記公開鍵暗号方式の原理図である。Adleman), but in this method, the algorithm is based on operations that repeat multi-digit multiplication and remainder operations several hundred times. For this reason, it is necessary to perform remainder calculations at high speed, and a method using a remainder table is shown in FIG. 9.

同図において、暗号化では平文ＭをユーザＡが乗算剰余
演算装置９１によりＭをｅ回乗算したものをｎで割った
剰余Ｃ＝Ｍ＠（ｍａｄ　ｎ）の演算を行って暗号文（Ｃ
）とし、復号化では暗号文（Ｃ）をユーザＢが乗算剰余
演算装置９２によりＣをｄ回乗算したものをｎで割った
剰余Ｍ＝Ｃｄ（ｍｏｄ　ｎ）の演算を行って復号文（Ｍ
）としている。ここで、ユーザＢの公開鍵ｅ１ユーザＢ
の秘密ｌｉｄ、及び除数ｎの関係は、ｐ、ｑを大きな素
数とすると、ｎ＝ｐｘｑｅ　ｘ　ｄ　＝　１　（ｍｏｄ（ＬＣＭ（１１−１，ｑ
−１）））で表される関係を満たす必要がある。In the same figure, in encryption, user A multiplies plaintext M e times using a multiplication remainder calculation device 91, and divides the resulting product by n to calculate the remainder C=M@(mad n), resulting in a ciphertext (C
), and in decryption, user B multiplies the ciphertext (C) d times using the multiplication remainder arithmetic unit 92, calculates the remainder M=Cd (mod n) by dividing the product by n, and obtains the decrypted text (M
). Here, user B's public key e1 user B
The relationship between the secret lid of
It is necessary to satisfy the relationship expressed by -1))).

本発明による乗算剰余演算装置は、上記公開鍵暗号方式
に特に有効に適用される。The modular multiplication calculation device according to the present invention is particularly effectively applied to the above-mentioned public key cryptosystem.

[Conventional technology]

剰余テーブルを用いた従来例の構成の概要を第１０図に
示す。本従来例では、Ａ　Ｘ　Ｂ　（ｍｏｄ　ｎ）の計
算の場合で、Ａ、　Ｂ、　及びｎは各々５１２　ビット
の正の整数であり、３２　Ｘ　３２ビツトの乗算器を基
本単位として持つと仮定する。第１０図において、従来
の高速並列乗算剰余演算装置は、乗算処理部１０１、剰
余処理部１０２、後段処理部１０３の３つより構成され
ている。乗算処理部１０１では３２　Ｘ　３２ビツトの
乗算器を複数数個用いて５１２　Ｘ　５１２　ビットの
乗算器１０４が構成されており、Ａ　Ｘ　Ｂ　（５１２
Ｘ　５１２ｔ＝”ット）の乗算を行い、１０２４ビツト
の乗算結果がａカされる。FIG. 10 shows an outline of the configuration of a conventional example using a remainder table. In this conventional example, in the case of calculating A X B (mod n), it is assumed that A, B, and n are each 512-bit positive integers, and that the basic unit is a 32 x 32-bit multiplier. . In FIG. 10, the conventional high-speed parallel multiplication remainder calculation device is composed of three parts: a multiplication processing section 101, a remainder processing section 102, and a subsequent stage processing section 103. In the multiplication processing unit 101, a 512 x 512 bit multiplier 104 is configured using a plurality of 32 x 32 bit multipliers, and A x B (512
The multiplication is performed by x 512t="cut", and the 1024-bit multiplication result is added to a.

乗算結果の１０２４ビツトのうち下位５１２　ビットは
そのまま剰余処理部１０２内の５２０ビツト加算器１０
５　に送られ、上位５１２　ビットは３２ビツト毎に１
６個に分割されて、１６個の剰余テーブル回路１０６−
１〜１０６−１６に入力される。剰余テーブル回路１０
６−１〜１０６−１６の各々の構成は第１１図及び第１
２図に詳細に示されている。The lower 512 bits of the 1024 bits of the multiplication result are directly sent to the 520-bit adder 10 in the remainder processing section 102.
5, and the upper 512 bits are sent to 1 every 32 bits.
Divided into 6 parts, 16 remainder table circuits 106-
1 to 106-16. Remainder table circuit 10
The configuration of each of 6-1 to 106-16 is shown in FIG.
This is shown in detail in Figure 2.

各剰余テーブル回路１０６−ｉ　は、第１１図に示すよ
うに、１６個の剰余テーブルモジュール１１１−１〜１
１１−１６で構成されており、各剰余テーブルモジュー
ルは剰余結果の５１２ビツトを、３２ビツトの桁毎に分
割した内容の剰余テーブルを有している。Each remainder table circuit 106-i includes 16 remainder table modules 111-1 to 1, as shown in FIG.
Each remainder table module has a remainder table containing contents obtained by dividing the 512 bits of the remainder result into 32-bit units.

各剰余テーブルモジュール１１１−ｊでは、第１２図に
示すように、人力された３２ビツトを更に、４ビツト毎
に８つの剰余テーブル１２−１〜１２−８に分割し、４
ビツト毎の各桁が剰余テーブル１２１−１〜１２１−８
の何れかに入力される。各剰余テーブルは入力された４
ビツトの値をアドレスとして、そのアドレスに対応する
３２ビツトの値を剰余の法ｎで割った余りである剰余結
果を出力する。８つの剰余テーブル１２１−１〜１２１
−８の出力は加算器１２２〜１２８　によりトーナメン
ト加算され、計３５ビットの値として出力される。In each remainder table module 111-j, as shown in FIG.
Each digit of each bit is the remainder table 121-1 to 121-8.
is entered in either. Each remainder table is filled with 4
The bit value is used as an address, and the remainder result, which is the remainder obtained by dividing the 32-bit value corresponding to that address by the modulus n of the remainder, is output. Eight remainder tables 121-1 to 121
The outputs of -8 are tournament-added by adders 122 to 128 and output as a total of 35 bits.

このように、剰余処理部１０２では、５１２　ビットの
入力を１６個に分割した３２ビツトに対して、各剰余テ
ーブルモジュールが３５ビツトを出力し、したがって、
各剰余テーブル回路１０６−ｉ　は５６０　ビットを出
力するので、人力ビット数と出力ビツト数に４８ビット
ドの差がある。したがって、剰余テーブル回路の各出力
の間には３ビツトの重なりが生じている。又、剰余テー
ブル回路１０６−１〜１０６−１６は１６個あるので、
この出力を加算することを考慮すると、剰余テーブル回
路１０６−１〜１０６−１６の出力と乗算処理部１０１
の出力の下位の５１２　ビットとの加算には、計５２０
　ビットの加算器が必要となる。In this way, in the remainder processing unit 102, each remainder table module outputs 35 bits for 32 bits obtained by dividing the 512-bit input into 16 pieces.
Since each remainder table circuit 106-i outputs 560 bits, there is a difference of 48 bits between the number of input bits and the number of output bits. Therefore, there is an overlap of 3 bits between each output of the remainder table circuit. Also, since there are 16 remainder table circuits 106-1 to 106-16,
Considering the addition of these outputs, the outputs of the remainder table circuits 106-1 to 106-16 and the multiplication processing unit 101
A total of 520 bits are added to the lower 512 bits of the output of
A bit adder is required.

第１０図に戻り、５２０ビツトの加算器１０５にて加算
を行った結果が後段処理回路１０３に送られる。Returning to FIG. 10, the result of addition performed by the 520-bit adder 105 is sent to the subsequent processing circuit 103.

後段処理回路１０３の人力では、Ａ　Ｘ　Ｂ（＋ｎｏｄ
　ｉ）にｎ倍数を加えた形式になっているので、この値
をｎ以下の値に収める処理を行う。この５２０ビツトの
加算結果の上位８ビツトを剰余テーブル１０８に人力し
、その出力５１２　ビットと５２０　ビット加算器１０
５の出力の下位５１２　ビットとを加算器１０９で加算
することにより、５１３　ビットの結果が得られる。With the manual power of the post-processing circuit 103, A X B (+nod
Since the format is i) plus a multiple of n, processing is performed to keep this value within n. The upper 8 bits of this 520-bit addition result are entered into the remainder table 108, and the output 512 bits and 520 bits are added to the adder 10.
By adding the lower 512 bits of the output of 5 in adder 109, a 513-bit result is obtained.

この結果はＡ　Ｘ　Ｂ（ｍｏｄ　ｎ）、　Ａ　Ｘ　Ｂ（
ｍｏｄ　ｎ）　＋　ｎ、　ＡＸ　Ｂ（ｍｏｃｌ　ｎ）　
＋　２ｎの値のどれかになるので、加算器１１１．１１
２で加算器１０９の出力に−ｎ、−２ｎを加えた結果の
符号を判定回路１１２でチエツクし、負になるものの出
力を選択することにより、Ａ×Ｂ（ｍａｄ　ｎ）の値を
セレクタで選択して、５１２ビツトｘ５１２　　ビット
をｎで割った値である乗算剰余結果を得る。This result is A X B (mod n), A X B (
mod n) + n, AX B (mocl n)
+2n, so adder 111.11
In step 2, the sign of the result of adding -n and -2n to the output of the adder 109 is checked in the judgment circuit 112, and by selecting the negative output, the value of A×B (mad n) can be determined by the selector. Select to obtain the modulus result, which is 512 bits x 512 bits divided by n.

[Problem to be solved by the invention]

上記の従来技術で説明したように、従来の乗算剰余演算
回路では、乗算処理部１０１で５１２　ビットＸ５１２
ヒツトの乗算を行って１０２４ビツトの乗に結果を得て
から、剰余テーブルを参照して剰余を求めているので、
高速ではあるが、乗算器１０４、加算器１０５等の規模
が莫大となり、剰余テーブル１０８も５１２ヒツト分を
全て有する必要があるので、回路規模が大きくなり、回
路規模が小さいことを求められている場合には実現が難
しく、又、ＬＳＩ化を行った場合には、チップ分割の方
法及び制御が複雑になるという欠点を持っていた。As explained in the above prior art, in the conventional multiplication remainder calculation circuit, the multiplication processing unit 101 performs 512 bits x 512
After multiplying by 1024 bits and obtaining the result, the remainder is calculated by referring to the remainder table.
Although it is fast, the scale of the multiplier 104, adder 105, etc. is enormous, and the remainder table 108 must also have all 512 hits, so the circuit scale is large and the circuit scale is required to be small. In some cases, it is difficult to realize this, and when LSI is implemented, the chip division method and control become complicated.

本発明の目的は、剰余テーブルを用いて部分積の中間乗
算剰余を得た後に最終的な乗算剰余結果を得るという構
想に基づき、乗算剰余演算装置において、ハードウェア
規模の縮小を図り、且つ高速処理の必要性に応じて簡単
に拡張できる構成のものを提供することにある。The purpose of the present invention is to reduce the hardware scale and achieve high speed in a multiplication remainder arithmetic device based on the concept of obtaining the final multiplication remainder result after obtaining intermediate multiplication remainders of partial products using a remainder table. The purpose is to provide a structure that can be easily expanded according to processing needs.

[Means to solve the problem]

第１図は本発明の原理ブロック図である。同図におイテ
、Ａ×Ｂ　（ｍｏｄ　　ｎ）（Ａ、Ｂ、ｎはＫｍビット
））　　（ｍ＝１．２．・・）の乗算剰余を行う乗算剰
余演算装置が示されており、１はＫｍビットのＡと、Ｂ
の分割されたにビットとの乗算を行ってＫｒｎ＋Ｋビッ
トの部分積を得る乗算回路、２は乗算回路１の出力に得
られるＫｍ十にビットのデータのうち、Ｋｎｎビットよ
りあふれたにビットのデータを検索ビットとし、Ｋｍ＋
Ｋビットのデータの下位Ｋｍビットをオール０とし上位
にビットを該検索ビットとするものの値を剰余の法ｎで
割った余りである剰余結果を出力する剰余テーブル、３
は剰余テーブル２の出力Ｋｍビットと部分積の下位Ｋｍ
ビットとを加算することにより、ＫｍＸＫビットの乗算
剰余演算の中間結果を得る第１の加算器、４は中間結果
をにビット上位桁にシフトするシフト手段、５はシフト
手段４のシフト結果と次のワードに対するＫｍ＋Ｋビッ
トの部分積とを加算する第２の加算器、６は第１の加算
器３に得られる中間結果を補正して剰余を選択する補正
回路である。第２の加算器５の出力と剰余テーブル２の
出力とを第１の加算器３により加算して得た中間結果を
、シフト手段４によりにビットシフトし、第２の加算器
５により再び次のサイクルで乗算加算器１から出力され
る部分積に加算する操作をｍ回繰り返すことによりＫｍ
×Ｋｍビットの乗算剰余演算の中間結果を得、最後に、
補正回路３を用いて除数ｎ以下に結果を収めることによ
り高速に乗算剰余演算を行う。FIG. 1 is a block diagram of the principle of the present invention. The figure shows a multiplication remainder arithmetic device that performs the multiplication remainder of A×B (mod n) (A, B, n are Km bits) (m=1.2...). is Km bits A and B
A multiplication circuit that multiplies the divided bits of Krn+K bits to obtain a partial product of Krn+K bits, 2 is the Krn+K bit data obtained at the output of the multiplication circuit 1, and the Knn bits of data overflowing from the Knn bits. As the search bit, Km+
A remainder table that outputs a remainder result that is the remainder obtained by dividing the value of K bits of data in which the lower Km bits are all 0 and the upper bits are the search bits by the modulus n of the remainder, 3
is the output Km bits of remainder table 2 and the lower Km of the partial product.
A first adder which obtains an intermediate result of a KmXK-bit multiplication remainder operation by adding bits; 4 is a shift means for shifting the intermediate result to the upper bit digit; A second adder 6 which adds Km+K-bit partial products for the word , is a correction circuit which corrects the intermediate result obtained by the first adder 3 and selects a remainder. The intermediate result obtained by adding the output of the second adder 5 and the output of the remainder table 2 by the first adder 3 is bit-shifted by the shifting means 4, and then the intermediate result obtained by adding the output of the second adder 5 and the output of the remainder table 2 is bit-shifted by the second adder 5. By repeating the operation of adding to the partial product output from multiplier adder 1 m times in the cycle of Km
Obtain the intermediate result of the multiplication remainder operation of ×Km bits, and finally,
By using the correction circuit 3 to keep the result below the divisor n, the multiplication remainder calculation is performed at high speed.

乗算加算器１ににビットの桁上がり保存のにビット）加
算器を追加して、ＫｍｘＫ　Ｉ　　（Ｉ＝１．２，４゜
８　・・ｍ）ビットの部分積を求める構成とし、Ｋｍ×
ＫＩビットの部分積の上位ＫＩビットのデータをにビッ
トずつアドレスとして順次人力する１個の拡張剰余テー
ブル４を剰余テーブル２に設け、拡張剰余テーブルの出
力を上位の拡張剰余テーブルの出力と加算する為の加算
器を追加した構成とすルコとにより、ＫｍｘＫＩ　　（
１＝１．２，４．８　　・・ｍ）の乗算剰余をｍ回の計
算で行えるようにして拡張可能な乗算剰余演算装置が得
られる。A bit adder is added to the multiplier adder 1 to store the bit carry, and the configuration is configured to obtain the partial product of Km x K I (I = 1.2, 4゜8...m) bits,
One extended remainder table 4 is provided in the remainder table 2, in which the data of the upper KI bits of the partial product of KI bits are manually entered bit by bit as addresses, and the output of the extended remainder table is added to the output of the upper extended remainder table. KmxKI (
1=1.2, 4.8, . . . m) by performing m calculations, an expandable multiplication remainder arithmetic device is obtained.

[For production]

乗算を行う場合に全ての乗算を行わずに、ｍ回に分割し
た部分積を上位から求めていき、部分積で処理単位のビ
ット幅よりあふれたビットで剰余テーブルを参照して、
部分積の剰余を計算し、この結果を次段の部分積と加算
し、加算結果のあふれたビットで部分積の加算の剰余を
求めていく。When performing multiplication, instead of performing all multiplications, find the partial product divided into m times from the top, and refer to the remainder table with the bits that exceed the bit width of the processing unit in the partial product.
The remainder of the partial product is calculated, this result is added to the partial product of the next stage, and the remainder of the addition of the partial products is determined using the overflowing bits of the addition result.

又、部分積計算回路に上段の部分積計算回路からの部分
積と加算を行える様に、加算器を加えることにより、部
分積の幅を拡張出来る。Further, by adding an adder to the partial product calculation circuit so that the partial product can be added to the partial product from the upper partial product calculation circuit, the width of the partial product can be expanded.

乗算を一度に行わずに部分積を求めて剰余テーブルを参
照する方式にすることにより、あふれるビット数に対応
した大きさの剰余テーブルを持つだけでよくなり、ｎｍ
ｘｎの部分積の場合には、従来方式に比べ剰余テーブル
の大きさを１／ｍに削減出来る。又、この場合乗算器の
個数も１．／ｍですむ。高速性が比較的求められない場
合や、小さいハードウェアが求められている場合には計
算の回数を増やす代わりに、乗算回路及び剰余テーブル
の規模を小さくすることができる。又、部分積の幅を拡
張出来る構成にすることにより、高速で用いたい場合に
は、乗算及び剰余テーブルの参照を一度に行える様にな
り、各種速度に応じた拡張が可能となる。By using a method that calculates partial products and refers to the remainder table without performing multiplication all at once, it is only necessary to have a remainder table of a size corresponding to the number of overflowing bits, and nm
In the case of partial products of xn, the size of the remainder table can be reduced to 1/m compared to the conventional method. Also, in this case, the number of multipliers is also 1. /m is enough. When relatively high speed is not required or when small hardware is required, the scale of the multiplication circuit and remainder table can be reduced instead of increasing the number of calculations. Furthermore, by configuring the width of the partial product to be expandable, when high-speed use is desired, multiplication and reference to the remainder table can be performed at the same time, allowing expansion to suit various speeds.

〔Example〕

第２図は本発明の実施例による乗算剰余演算装置の構成
を示すブロック図である。同図に示すように、本発明の
実施例による装置は、乗算回路２００、剰余制御回路２
１０、剰余テーブル２２０及び後段処理部２３０からな
っている。FIG. 2 is a block diagram showing the configuration of a modular multiplication calculation device according to an embodiment of the present invention. As shown in the figure, the apparatus according to the embodiment of the present invention includes a multiplication circuit 200, a remainder control circuit 2
10, a remainder table 220, and a subsequent processing section 230.

ここで、基本処理単位は３２ビツト（１ワード−３２ビ
ツト）とすると、乗数Ｂは、次のように表すことができ
る。Here, assuming that the basic processing unit is 32 bits (1 word - 32 bits), the multiplier B can be expressed as follows.

Ｂ＝Ｂ；ｓｘ　２”’　＋ＢＩ４Ｘ　２””十・・・・
・・十Ｂ、　Ｘ　２３２＋　Ｂｏ　　（１）したがって
、Ａ×Ｂ　（ｍｏｄｎ）は、部分積Ｍ１＝　Ａ　Ｘ　Ｂ
　＋　を用いて次のように表される。B=B;sx 2"' +BI4X 2""10...
...10B, X 232+ Bo (1) Therefore, A×B (modn) is the partial product M1= A X B
It is expressed as follows using +.

Ａ×Ｂ　（ｍｏｄｎ）＝　（・・　（（ＭＩｓ（ｍｏｄ
ｎ）Ｘ２”）＋Ｍ、４）　　（ｍｏｄｎ）Ｘ２”−・＋
Ｍｏ）　　（ｍｏｄｎ）　　（２）本装置は、上記の式（２）に従い、部分積の剰余を剰余
テーブルを用いて求める。但し、剰余を行う場合には、
途中計算では剰余結果をｎ以下には収めずに、後段処理
部２３０における最終処理段階でｎ以下に収めている。A×B (modn) = (... ((MIs(mod
n)X2”)+M, 4) (modn)X2”-・+
Mo) (modn) (2) This device calculates the remainder of the partial product using the remainder table according to the above equation (2). However, when doing the remainder,
In the intermediate calculation, the residual result is not kept below n, but is kept below n in the final processing stage in the post-processing section 230.

乗算回路２００は、５１２　ビットの被乗数へを格納す
るレジスタ　（ＲＦ−Ａ）２０１　と、５１２　ビット
の乗数Ｂを格納するレジスタ（ＲＦ−Ｂ）２０２と、被
乗数へを３２ビット単位（＾０．　Ａ１．　、、、、　
Ａ１５）に分割した一つと乗数Ｂを３２ビット単位（Ｂ
Ｏ，Ｂｌ、　、、、、　Ｂ１５）に分割した一つとを順
次乗算する３２ピツ）　Ｘ３２ビツトの乗算器２０３と
、３２ビツト加算器２０４及び３２ビツトレジスタ２０
５　と、３２ビツト加算器２０６及び３２ビツトレジス
タ２０７　と、キャリー用１ビツトレジスタ２０８　と
から構成される。The multiplication circuit 200 includes a register (RF-A) 201 that stores a 512-bit multiplicand, a register (RF-B) 202 that stores a 512-bit multiplier B, and a register (RF-B) 202 that stores a 512-bit multiplicand, and a register (RF-B) 202 that stores a 512-bit multiplicand. ．、、、、
A15) and the multiplier B in 32-bit units (B
32-bit multiplier 203, 32-bit adder 204, and 32-bit register 20.
5, a 32-bit adder 206, a 32-bit register 207, and a 1-bit carry register 208.

先ず、乗算器の一方の入力には、Ｂ１５が入力され、も
う一方の入力には、Ａ１５．　Ａ１４．　、、、、　Ａ
Ｏと順次人力されていき、Ｂ１５の値はＡＯまで入力さ
れるまで不変である。First, B15 is input to one input of the multiplier, and A15. A14. ,,,, A
0 is input manually, and the value of B15 remains unchanged until AO is input.

乗算器２０３からの３２ビツトＸ３２　ビットの乗算出
力の下位３２ビツトが加算器２０４に入力され、上位３
２ビツトが加算器２０６　に入力される。The lower 32 bits of the 32-bit x 32-bit multiplication output from the multiplier 203 are input to the adder 204, and the upper 3
Two bits are input to adder 206.

下位３２ビツトの加算器２０４では、今回の乗算出力の
下位３２ビツトと、レジスタ２０７に格納されている前
回の乗算出力の上位３２ビツトとを加算し、その結果を
レジスタ２０５に格納する。第１回目の乗算では、レジ
スタ２０７は空なので、加算器２０４では下位３２ビツ
トを単に通過させるだけである。The lower 32 bit adder 204 adds the lower 32 bits of the current multiplication output and the upper 32 bits of the previous multiplication output stored in the register 207, and stores the result in the register 205. In the first multiplication, register 207 is empty, so adder 204 simply passes the lower 32 bits.

上位３２ビツトの加算器２０６では、今回の乗算出力の
上記３２ビツトと、レジスタ２０８　に格納されている
前回の加算で生じたキヤＩＪ−Ｃと、下位ビット加算器
２０４で生じたキャリー口とを加算し、その結果をレジ
スタ２０７に格納する。このとき、加算器２０６で生じ
たキャリーはレジスタ２０８に格納される。これにより
、八Ｘ　Ｂ、５の部分積が計算される。The upper 32 bit adder 206 combines the 32 bits of the current multiplication output, the carry IJ-C generated in the previous addition stored in the register 208, and the carry port generated in the lower bit adder 204. The result is stored in the register 207. At this time, the carry generated by adder 206 is stored in register 208. As a result, the partial product of 8×B,5 is calculated.

乗算回路２００の出力としては、レジスタ２０５から５
１２ビツト×３２ビツトの部分積Ａ　Ｘ　Ｂ、５が先ず
出力され、次にＡ　Ｘ　Ｌｔが出力され、以下、Ａ×Ｂ
ｏまで順次出力される。As the output of the multiplication circuit 200, the registers 205 to 5
The 12-bit x 32-bit partial product A x B, 5 is first output, then A x Lt is output, and hereafter A x B
It is output sequentially up to o.

２１３及び２１４と、検索ビット幅力テ３２ビットの剰
余テーブル２１５　と、内部剰余テーブル２１６　と、
３２ビツトシフト器２１７とで構成されている。213 and 214, a remainder table 215 with a search bit width of 32 bits, an internal remainder table 216,
It is composed of a 32-bit shifter 217.

乗算回路２００から最初に部分積Ａ　Ｘ　Ｂ、５が剰余
制御回路２１０の加算器２１１に転送される。剰余制御
回路では、部分積の下位５１２　ビットをレジスタファ
イル（ＲＦＩ）２１２に格納し、５１２　ビットを越え
る３２ビツトを剰余テーブル２１５に転送する。First, the partial product A x B, 5 is transferred from the multiplication circuit 200 to the adder 211 of the remainder control circuit 210 . The remainder control circuit stores the lower 512 bits of the partial product in a register file (RFI) 212 and transfers the 32 bits exceeding the 512 bits to the remainder table 215.

剰余テーブル２１５の詳細な構成を第３図に示す。The detailed structure of the remainder table 215 is shown in FIG.

第３図において、剰余テーブル２１５は４つの剰余テー
ブル３１〜３４とトーナメント加算のだめの加算器３５
〜３７で構成されており、送られてきた３２ビツトを、
本実施例では、分割の単位を８ビツトとして４つに分割
し、剰余テーブル３１〜３４にアドレスとして入力する
。４つの剰余テーブル３１〜３４のそれぞれには各桁の
８ビツトのアドレスを上位ビットとし下位ビット　（５
１２ビット、５２０ビツト、５２８ビツト及び５３６ビ
ツト）をオール０とした値に対してｎを法として剰余を
とった値５１２ビットが格納されている。各剰余テーブ
ルにおける格納データの例及び格納形式の詳細は、第６
図及び第７図を参照して具体例で後述するように、一つ
の８ビツトの各上位アドレスに対して５１２　ビットの
剰余が３２ビット単位で１６個に分割されて格納されて
いる。In FIG. 3, the remainder table 215 includes four remainder tables 31 to 34 and an adder 35 for tournament addition.
~37 bits, and the sent 32 bits are
In this embodiment, the unit of division is 8 bits, and the data is divided into four parts and input into the remainder tables 31 to 34 as addresses. Each of the four remainder tables 31 to 34 has the 8-bit address of each digit as the upper bit and the lower bit (5
12 bits, 520 bits, 528 bits, and 536 bits) are all 0, and a value of 512 bits is stored, which is the remainder modulo n. Examples of data stored in each remainder table and details of the storage format can be found in Chapter 6.
As will be described later in a specific example with reference to FIG. 7 and FIG. 7, the 512-bit remainder for each 8-bit upper address is divided into 16 32-bit units and stored.

剰余テーブルに送られてきた３２ビツトは８ビット毎に
４つに分割され各剰余テーブル３１〜３４に上位アドレ
スとして人力される。上位アドレスが設定されると、剰
余の結果の桁指定の４ビツトの下位アドレスを順次イン
クリメントし、下位ワードから出力する。The 32 bits sent to the remainder table are divided into four parts of 8 bits each and input into each of the remainder tables 31-34 as upper addresses. When the upper address is set, the 4-bit lower address specifying the digit of the remainder result is sequentially incremented and output starting from the lower word.

４つの剰余テーブルから出力された剰余出力はキャリー
セーブされた加算器３５．３６．　　及び３７でトーナ
メント加算され、最大５１４ビツトの出力として下位ワ
ードから３２ビット単位に順次出力される。The remainder outputs from the four remainder tables are carried into carry-save adders 35, 36. and 37, and are sequentially output in units of 32 bits starting from the lower word as an output of a maximum of 514 bits.

キャリーセーブされた加算器とは、下位ビットの加算値
のオーバフローによるキャリーを上位ビットの加算値に
加える形式の加算器をいう。例えば、８５Ｅ７　＋　Ｂ
ＢＢ３　＝　１７１９Ａを計算する場合、先ず下位８ビ
ット同士を加算してＥＴ　＋　８３　＝　１９Ａを得て
下位８ビツトの９Ａを第１ワードとして出力し、次に上
位８ビット同士の加算で８５　＋　ＢＢを求めるが、こ
の場合光に計算した下位８ビット同士の加算結果のオー
バフローした上位４ビツトの１をも加算する。そして、
得られる値１７１の下位８ビツトの７１を第２ワードと
して出力する。上記値１７１の上位４ビツトは第３ワー
ドの出力となる。こうして加算結果である１７１９Ａが
得られる。A carry-saved adder is an adder that adds a carry due to an overflow of the added value of the lower bits to the added value of the upper bits. For example, 85E7 + B
When calculating BB3 = 1719A, first add the lower 8 bits to get ET + 83 = 19A, output 9A of the lower 8 bits as the first word, then add the upper 8 bits to get 85 + BB In this case, the overflowed 1 of the upper 4 bits of the calculated addition result of the lower 8 bits is also added to the light. and,
The lower 8 bits 71 of the obtained value 171 are output as the second word. The upper 4 bits of the value 171 become the output of the third word. In this way, the addition result 1719A is obtained.

上記の実施例では乗算回路２００の出力のオーバフロー
分である入力３２ビツトを８ビツト毎に４分割して剰余
テーブル３１〜３４に入力したが、３２ビツトのあふれ
ビットを単一の剰余テーブルに人力するように構成して
もよい。ただし、この場合は剰余テーブルのメモリ容量
が莫大なものとなる。また、あふれビットの３２ビツト
を２ビツト、４ビツト等必要に応じて任意の数に分割し
てもよい。あふれビットをいくつに分割して剰余テーブ
ルに人力するかは、剰余テーブルのメモリ容量と処理速
度との関係で定まる。In the above embodiment, the 32 input bits, which are the overflow portion of the output of the multiplication circuit 200, are divided into 4 parts every 8 bits and input into the remainder tables 31 to 34. It may be configured to do so. However, in this case, the memory capacity of the remainder table becomes enormous. Further, the 32 overflow bits may be divided into any number of bits, such as 2 bits, 4 bits, etc., as necessary. How many overflow bits are divided and manually stored in the remainder table is determined by the relationship between the memory capacity of the remainder table and the processing speed.

第２図に戻り、剰余制御回路２１０では、剰余テーブル
２１５からの最大５１４ビツトの出力と加算器２１３か
らの部分積の下位５１２ビツトとを加算器２１４で加算
して、部分積の中間剰余結果を得る。ここで、中間剰余
結果と称するのは、結果が剰余の法ｎ以下に収まってお
らずに、剰余結果にｎの正数倍を加算した結果になって
いる為である。剰余テーブル２１５を参照して中間剰余
結果を得ている間に、乗算回路２００では次の桁の部分
積Ａ×Ｂ＋４を求めている。次に、Ａ　ｘ　Ｂ　Ｉａの
部分積とＡ　Ｘ　Ｂ　＋　５の中間剰余結果を３２ビツ
トシフトしたもの（実際には、加算のタイミングを１サ
イクル遅らせる）とを加算器２１１で加える。加算した
結果は、最大５４７ビツトとなり３５ビツトのあぶれを
生じる。この為、前記の３２ビツトの剰余テーブルの他
に、３ビツトの内部剰余テーブル２１６が設けられてい
る。Returning to FIG. 2, in the remainder control circuit 210, the adder 214 adds the maximum 514 bits of output from the remainder table 215 and the lower 512 bits of the partial product from the adder 213 to obtain the intermediate remainder result of the partial product. get. Here, the reason why the result is called an intermediate remainder result is that the result is not within the modulus n of the remainder, but is the result of adding a positive multiple of n to the remainder result. While the intermediate remainder result is obtained by referring to the remainder table 215, the multiplication circuit 200 obtains the partial product A×B+4 of the next digit. Next, the adder 211 adds the partial product of A x B Ia and the intermediate remainder result of A x B + 5 shifted by 32 bits (actually, the timing of addition is delayed by one cycle). The result of addition is a maximum of 547 bits, resulting in a 35-bit error. For this reason, in addition to the 32-bit remainder table described above, a 3-bit internal remainder table 216 is provided.

３ビツトの内部剰余テーブル２１６の出力は、剰余テー
ブル２１５で剰余テーブル出力をトーナメント加算して
いる間に、加算器２１３で部分積の下位５１２ビツトと
加算され、５１３ビツトの結果を得る。この結果と剰余
テーブル２１５の出力とを加算器２１４で加算してＡＸ
　（ＢＩＳＸ２３２＋Ｂ１４）　の剰余を得る。この操
作をＡ　ｘ　Ｂ　ｏの部分積まで１６回繰り返すと、Ａ
×Ｂの中間剰余としての５１５　ビットの値が加算器２
１４の出力に得られる（前述の式（２）を参照）。The output of the 3-bit internal remainder table 216 is added to the lower 512 bits of the partial product in the adder 213 while the remainder table 215 performs tournament addition of the remainder table output to obtain a 513-bit result. This result and the output of the remainder table 215 are added by an adder 214, and AX
Obtain the remainder of (BISX232+B14). If this operation is repeated 16 times until the partial product of A x Bo
The 515-bit value as the intermediate remainder of ×B is sent to the adder 2.
14 (see equation (2) above).

後段処理部２３０は、加算器２１４の出力の下位５１１
ビツトを格納するレジスタ２３１　と、４ビツトの人力
をアドレスとする倍数テーブル２３２と、倍数テーブル
の出力の５１２ビツトとレジスタ２３１からの下位５１
１　ビットとを加算する加算器２３３　と、加算器２３
３の出力に得られる乗算剰余結果又は乗算剰余結果十ｎ
の値を格納するレジスタ２３４と、加算器２３３の出力
に−ｎを加算する加算器２３５と、加算器２３５の出力
を格納するレジスタ２３６と、レジスタ２３４の出力ま
たはレジスタ２３６の出力を符号チエツクにより選択し
て出力するセレクタ２３７とで構成されている。The post-processing unit 230 processes the lower 511 output of the adder 214.
A register 231 that stores bits, a multiple table 232 that uses 4-bit input as an address, and 512 bits of output from the multiple table and the lower 51 bits from register 231.
an adder 233 that adds 1 bit;
The multiplication remainder result obtained from the output of 3 or the multiplication remainder result 1n
A register 234 stores the value of , an adder 235 adds -n to the output of the adder 233, a register 236 stores the output of the adder 235, and the output of the register 234 or the output of the register 236 is checked by a sign check. It is composed of a selector 237 that selects and outputs.

倍数テーブル２３２は、剰余テーブルと同一の構成で、
加算器２１４の出力に得られた５１５ビツトのうち、あ
ふれた３ビツトと５１２　ビットの最上位ビットの合計
４ビツトで倍数テーブル２３２を参照する。加算器２３
３で倍数テーブル出力の５１２ビツトと加算器２１４の
出力の下位５１１ビツトを加算すると、乗算剰余結果又
は乗算剰余結果＋ｎの値が得られ、レジスタ２３４に格
納される。格納するのと平行して、加算器２３５ではこ
の結果に−ｎを加算しレジスタ２３６に格納する。そし
て、レジスタ２３４゜２３６の中に格納された値の符号
をチエツクし、レジスタ２３４．２３６より引き出すこ
とにより乗算剰余結果が得られる。The multiple table 232 has the same configuration as the remainder table,
Of the 515 bits obtained as the output of the adder 214, the multiple number table 232 is referred to using a total of 4 bits, including the overflowing 3 bits and the most significant bit of the 512 bits. Adder 23
By adding the 512 bits of the multiple table output and the lower 511 bits of the output of the adder 214 in step 3, the value of the multiplication remainder result or the multiplication remainder result +n is obtained and stored in the register 234. In parallel with the storage, the adder 235 adds -n to this result and stores it in the register 236. Then, by checking the sign of the value stored in the registers 234 and 236 and drawing it from the registers 234 and 236, the multiplication remainder result is obtained.

上記では、最後に倍数テーブルとしてあふれビットに１
゛ビット加えた４ビツトで倍数テーブルをえる回数が２
回となるだけであり、同じ様に乗算剰余の結果が得られ
る。In the above, at the end, 1 is added to the overflow bit as a multiple table.
The number of times the multiple table is retrieved by adding 4 bits is 2.
, and the result of the multiplication remainder is obtained in the same way.

以上の本発明の説明から明らかなように、本発明の実施
例によれば、乗算回路では部分積を演算し、各部分積に
対して剰余テーブルを参照することを繰り返すことによ
り中間剰余を求め、後段処理部で最終的な剰余を求釣る
と、うにしたので、乗算回路での乗算器、加算器等の回
路規模は従来に比べて大幅に減少する。As is clear from the above description of the present invention, according to the embodiment of the present invention, the multiplication circuit calculates partial products and calculates intermediate remainders by repeatedly referring to the remainder table for each partial product. , the final remainder is calculated in the subsequent processing section, so the circuit scale of the multipliers, adders, etc. in the multiplication circuit is significantly reduced compared to the conventional method.

第２図に示した実施例における処理速度を更に速くした
い場合は、第４図に示すように、第２図に示した乗算回
路及び剰余制御回路を並列に配置して拡張すればよい。If it is desired to further increase the processing speed of the embodiment shown in FIG. 2, the multiplication circuit and remainder control circuit shown in FIG. 2 may be arranged in parallel and expanded as shown in FIG.

高速化のために回路を並列に拡張する場合には、並列数
に応じて基本処理単位が３２ビツトの倍数である６４ビ
ツト、１２８　ビット。When expanding circuits in parallel to increase speed, the basic processing unit can be 64 bits or 128 bits, which are multiples of 32 bits, depending on the number of parallel circuits.

２５６ビツト、５１２　ビット・・・等となる。256 bits, 512 bits, etc.

第４図に示した実施例では、複数の乗算回路２００−１
．２００−２．２００−３．２００−４が並列に配置さ
れており、部分積を計算する各乗算回路の構成は、各乗
算回路の出力側にキャリーセーブの加算器２０９を追加
したことを除き第２図と同一である。この回路２０９に
おいて、上位の部分積乗算回路の出力と加算することに
より、ｎｍｘｎｋ　（ｋ＝１．２゜４・・・１６）の部
分積が計算できる構成となっている。本実施例では４並
列の場合について説明を行う、最上位の拡張乗算回路２
００−４ではＡ×Ｂ＋ｓの部分積が計算され、次の拡張
乗算回路２００−３ではＡ×Ｂ１４．２００−２　テは
Ａ　ｘ　Ｂ　！３，２００１ではＡ×ＢＩ２の部分積が
計算され、各々の部分積を加算することによりＡ　Ｘ　（Ｂ＋ｓＸ　２９６＋ＢＩ４Ｘ　２６４＋ＢＩ
３Ｘ　２３２＋Ｂ、。）という部分積の計算が行われる。この場合には、部分積
は５１２＋　１２８ビツトとなり、この部分積の上位１
２８　ビットがあふれる。In the embodiment shown in FIG. 4, a plurality of multiplication circuits 200-1
．． 200-2.200-3.200-4 are arranged in parallel, and the configuration of each multiplier circuit for calculating partial products is as follows, except that a carry-save adder 209 is added to the output side of each multiplier circuit. Same as Figure 2. This circuit 209 is configured to be able to calculate nmxnk (k=1.2°4...16) partial products by adding it to the output of the upper partial product multiplier circuit. In this embodiment, the explanation will be made for the case of 4 parallel circuits.
In 00-4, the partial product of A×B+s is calculated, and in the next extended multiplication circuit 200-3, A×B14.200-2 te is A x B! 3, 2001, the partial products of A×BI2 are calculated, and by adding each partial product, A X (B+sX 296+BI4X 264+BI
3X 232+B,. ) is calculated as the partial product. In this case, the partial product is 512+128 bits, and the top 1 of this partial product is
28 bits overflow.

あふれた１２８ビツトで第２図で説明した第１の実施例
と同様に剰余テーブル２２０及び拡張用剰余テーブル２
２０−１を参照しくこの場合には剰余テーブルも拡張さ
れて１２８ビツトの拡張剰余テーブル２１５−１〜２１
５−４及び加算器２１６−１〜２１６−４が必要となる
）、各拡張剰余テーブルの出力を上位の拡張剰余テーブ
ル出力と加算した結果と、加算器２０９の出力の下位５
１２ビツトとの加算を行う。拡張された場合にも、剰余
テーブル出力の数が増加する為に２回目から引くあふれ
ビット用の剰余テーブルの幅及び倍数テーブルの幅が広
くなる以外は、第１の実施例と同様である。With the overflowing 128 bits, the remainder table 220 and expansion remainder table 2 are
20-1, in this case, the remainder table is also expanded and becomes a 128-bit extended remainder table 215-1 to 215-1.
5-4 and adders 216-1 to 216-4), the result of adding the output of each extended remainder table to the output of the upper extended remainder table, and the lower 5 of the output of the adder 209.
Performs addition with 12 bits. Even in the case of expansion, the second embodiment is the same as the first embodiment except that the width of the remainder table for overflow bits and the width of the multiple table that are drawn from the second time onward become wider because the number of remainder table outputs increases.

より詳細には、加算器２０９からの出力は３２ビット単
位で順次出力されるが、下位５１２　ビットは加算器２
１１を介してレジスタ２１２に格納され、その次のあふ
れビットである３２ビツトは拡張剰余テーブル２１５−
４に送られ、さらに次のあふれビットの３２ビツトは拡
張剰余テーブル２１５−３に送られ、次の３２ビツトは
拡張剰余テーブル２１５−２に送られ、次の３２ビツト
が剰余テーブル２１５−１に送られる。More specifically, the output from adder 209 is sequentially output in units of 32 bits, and the lower 512 bits are output from adder 2.
11 to the register 212, and the next overflow bit, 32 bits, is stored in the extended remainder table 215-
The next 32 overflow bits are sent to the extended remainder table 215-3, the next 32 bits are sent to the extended remainder table 215-2, and the next 32 bits are sent to the remainder table 215-1. Sent.

各加算器２１６−４〜２１６−．１で１サイクル分遅れ
て剰余結果を上位の剰余結果と加算する。この結果、加
算器２１６−１の出力には剰余テーブルの同じ桁をすべ
て足し合わせたものが得られ、これを加算器２１４にて
、部分積の下位５１２　ビットと加算する。Each adder 216-4 to 216-. 1, the remainder result is added to the upper remainder result with a delay of one cycle. As a result, the output of the adder 216-1 is obtained by adding up all the same digits of the remainder table, and the adder 214 adds this to the lower 512 bits of the partial product.

以下、前述の第１の実施例と同様の動作を行ってセレク
タ２３７の出力にＡ×Ｂ　（ｍｏｄｎ）を得る。Thereafter, the same operation as in the first embodiment described above is performed to obtain A×B (modn) as the output of the selector 237.

以上述べたように、この第２の実施例により乗算回路を
拡張した場合も、剰余テーブルの幅及び倍数テーブルの
幅が広くなることを除き、第１の実施例と同様である。As described above, even when the multiplication circuit is expanded according to the second embodiment, the operation is the same as the first embodiment except that the width of the remainder table and the width of the multiple table become wider.

尚、第２図の実施例では部分積を求めるのに１６サイク
ル必要であったものが、本実施例では、４サイクル分の
処理時間で済、スループットが更に向上する。In addition, in the embodiment shown in FIG. 2, 16 cycles were required to obtain the partial products, but in this embodiment, the processing time is only 4 cycles, and the throughput is further improved.

第５図は実際の数値例での本発明の実施例による乗算剰
余演算を説明するための基本構成を示すブロック図であ
って、処理単位が第１の実施例におけるように３２ビッ
ト単位ではなくて、８ビット単位となっていることを除
き、第２図及び第３図の構成と同一であり、構成要素の
参照符号には番号にａを付して第２図及び第３図におけ
る構成要素の参照番号と対応づけである。尚、第５図に
おいては、図面の簡単化のために、レジスタ２３１．２
３４、及び２３６　は省略しである。FIG. 5 is a block diagram showing the basic configuration for explaining the modular multiplication operation according to the embodiment of the present invention using an actual numerical example, and the processing unit is not 32 bits as in the first embodiment. The structure is the same as that in FIGS. 2 and 3, except that it is in 8-bit units, and the reference numerals of the components are indicated by adding a to the numbers. This is the reference number and correspondence of the element. In addition, in FIG. 5, for the sake of simplicity, the register 231.2 is
34 and 236 are omitted.

第６Ａ図〜第６Ｄ図は第５図に示した剰余テーブル（Ｔ
ｌ＞　３１ａ、　　（Ｔ２）　３２ａ、　　（Ｔ３）　
３３ａ、　及び（Ｔ４）３４ａの内容を示す図、第６Ｉ
３図は内部テーブル２１６ａの内容を示す図、第６Ｆ図
は倍数テーブル２３２ａの内容を示す図である。Figures 6A to 6D are the remainder table (T
l> 31a, (T2) 32a, (T3)
33a, and (T4) Diagram showing the contents of 34a, No. 6I
3 is a diagram showing the contents of the internal table 216a, and FIG. 6F is a diagram showing the contents of the multiple table 232a.

第６Ａ図に示すテーブルＴ１において、人力アドレスが
００のときの剰余データは、００を上位の２ビツトとし
、下位８ビツトをオールＯとした１０ビツトのデータを
ｎで割った剰余００００である。In table T1 shown in FIG. 6A, the remainder data when the manual address is 00 is the remainder 0000 obtained by dividing 10-bit data with 00 as the upper 2 bits and lower 8 bits as all O's by n.

入力アドレスが０１のときの剰余データは、０１を上位
の２ビツトとし、下位８ビツトをオール０とした１０ビ
ツトのデータをｎで割った剰余３２Ｅ９である。The remainder data when the input address is 01 is the remainder 32E9 obtained by dividing 10 bit data by n, with 01 as the upper two bits and the lower eight bits as all 0s.

人力アドレスが１０のときの剰余データは、１０を上位
の２ビツトとし、下位８ビツトをオールＯとした１０ビ
ツトのデータをｎで割った剰余６５Ｄ２である。The remainder data when the manual address is 10 is the remainder 65D2 obtained by dividing the 10-bit data with 10 as the upper 2 bits and the lower 8 bits as all O's by n.

入力アドレスが１１のときの剰余データは、１１を上位
の２ビツトとし、下位８ビツトをオールＯとした１０ビ
ツトのデータをｎで割った剰余９８８Ｂである。The remainder data when the input address is 11 is the remainder 988B obtained by dividing 10 bits of data by n, with 11 as the upper 2 bits and the lower 8 bits as all O's.

第６Ｂ図に示すテーブルＴ２においては、人力アドレス
はあふれビットである８ビツトの第３及び第４ビツトの
２ビツトがアドレスとなるので、被除数は上位２ビツト
が入力アドレスで下位１０ビツトがオール０の値である
。In table T2 shown in FIG. 6B, the manual address uses the overflow bits, the third and fourth bits of the 8 bits, as the address, so the dividend is such that the upper 2 bits are the input address and the lower 10 bits are all 0s. is the value of

以下、第６Ｃ図及び第６Ｄ図においても、被除数の値は
それぞれ下位１２ビツト及び下位１４ビツトがオール０
で上位２ビツトが人力アドレスとなっている。Hereinafter, in FIGS. 6C and 6D, the value of the dividend is such that the lower 12 bits and lower 14 bits are all 0, respectively.
The top two bits are the human address.

第６６Ｅ１の内部テーブルでは、人力アドレスに等しい
上位３ビツトと下位１６ビツトのオールＯとの計１９ビ
ットをｎで割った剰余が、図示のように格納されている
。In the 66th E1 internal table, the remainder obtained by dividing a total of 19 bits, the upper 3 bits equal to the manual address and the lower 16 bits, all O's, by n is stored as shown.

第６Ｆ図の倍数テーブルでは、入力アドレスに等しい上
位４ビツトと下位７ビツトのオールＯとの計１１ビット
・トをｎで割った剰余が、図示のように格納されている
。In the multiple table of FIG. 6F, the remainder obtained by dividing a total of 11 bits, the upper 4 bits equal to the input address and the lower 7 bits, all O's, by n is stored as shown.

第７図はテーブルＴ１についての実際のテーブル格納形
式を説明する図である。同図において、テーブルＴ１で
は２ビツトのアドレス入力に対して剰余データの下位８
ビツトと上位８ビツトが別々に格納されている。そして
剰余データの下位及び上位を区別するために、２ビツト
のアドレス入力の更に下位に１ビツトのワード指定用ビ
ットを設け、このワード指定ビットにより剰余データを
ワード単位（８ビット単位）に読み出す。例えば、アド
レス０１に対応する剰余データは３２Ｅ９であるが、第
７図の格納形式では、アドレス入力が０１のときにワー
ド指定用ビットを０とすることにより先ず下位８ビツト
の日が出力され、次にワード指定用ビットをインクリメ
ントすることにより上位８ビツトの３２が読み出される
。FIG. 7 is a diagram illustrating the actual table storage format for table T1. In the same figure, in table T1, the lower 8 of the residual data for a 2-bit address input are
The bit and the upper 8 bits are stored separately. In order to distinguish between the lower and upper parts of the residual data, a 1-bit word designation bit is provided at the lower end of the 2-bit address input, and the surplus data is read out in word units (8-bit units) using this word designation bit. For example, the remainder data corresponding to address 01 is 32E9, but in the storage format shown in FIG. 7, by setting the word designation bit to 0 when the address input is 01, the day of the lower 8 bits is first output, Next, by incrementing the word designation bit, 32 of the upper eight bits are read out.

第８図は第５図の乗算剰余演算装置において、第６Ａ図
〜６Ｆ図に示したテーブルを用いた場合の実際の乗算剰
余演算の例を示す図である。同図において、数値は特に
断らない限り１６進表示である。FIG. 8 is a diagram showing an example of actual multiplication remainder calculation when the tables shown in FIGS. 6A to 6F are used in the multiplication remainder calculation device of FIG. 5. In the figure, numerical values are expressed in hexadecimal unless otherwise specified.

乗数ＡをＡ３７３、被乗数ＢをＢ２Ｈ２、除数ｎをＣＤ
１７として、Ａ　Ｘ　Ｂ　（ｍｏｄ　ｎ）の計算を説明
する乗数ＡをＡ３７３．　　被乗数ＢをＢ２巳６．除数
ｎをＣＤ１７として、八Ｘ　Ｂ　（ｍｏｄ　ｎ）の計算
を行う場合を説明する。Multiplier A is A373, multiplicand B is B2H2, divisor n is CD
17, the multiplier A to explain the calculation of A X B (mod n) is A373. Multiplicand B is B2 6. A case will be described in which 8X B (mod n) is calculated with the divisor n being CD17.

被乗数Ｂの下位８ビツトを８１、上位８ビツトを８２と
すると、本例ではＢ２＝８２．　　Ｂ１＝８６となる。Assuming that the lower 8 bits of multiplicand B are 81 and the upper 8 bits are 82, in this example B2=82. B1=86.

先ず、第１回目の部分積の計算では、乗算回路２００ａ
　（Ｄ出力トシテ、Ａ　Ｘ　Ｂ２＝（Ａ８７３）　Ｘ　
８２　＝　７５１ＦＦ６（１６ビツト＋８ビツト）が得
られる。この出力の上位８ビツトの７５（２進数表示で
０１１１０１０１）はあふれビットであり、４つの剰余
テーブル３１ａ〜３４ａに２ピツトスつ４つに分割して
人力される。この結果、剰余テーブル３１ａには下位２
ビツトの０１がアドレスとして入力され、第６Ａ図のテ
ーブルよりデータ３２Ｅ９が出力されることがわかる。First, in the first partial product calculation, the multiplication circuit 200a
(D output output, A X B2 = (A873)
82 = 751FF6 (16 bits + 8 bits) is obtained. The upper 8 bits of this output, 75 (01110101 in binary notation), are overflow bits, and are manually input by dividing into four 2-pit bits into the four remainder tables 31a to 34a. As a result, the remainder table 31a contains the lower two
It can be seen from the table of FIG. 6A that bit 01 is input as the address and data 32E9 is output.

同様に、剰余テーブル３２ａにはアドレス０１が人力、
データＣＢＡ４が出力され、剰余テーブル３３ａ　には
アドレス１１が入力、データ８８８３が出力され、剰余
テーブル３４ａにはアドレス０１が入力、データＢ５Ｅ
７が出力される。これらの出力がトーナメント加算され
ると、（Ｂ５８７＋ＢＢＢ３）　＋　（ＣＢＡ４＋３２
Ｅ９）　＝２７０７２が第５図の加算器３７ａの出力に
得られ、これと乗算回路２００ａの出力の下位１６ビツ
）　（ＩＦＦ６）とを加算器２１４ａにより加算すると
、第１回目の部分積として２９０１０が得られる。Similarly, in the remainder table 32a, address 01 is manually input,
Data CBA4 is output, address 11 is input to the remainder table 33a, data 8883 is output, address 01 is input to the remainder table 34a, and data B5E
7 is output. When these outputs are tournament-added, (B587+BBB3) + (CBA4+32
E9) = 27072 is obtained as the output of the adder 37a in Fig. 5, and when this and the lower 16 bits (IFF6) of the output of the multiplication circuit 200a are added by the adder 214a, 29010 is obtained as the first partial product. is obtained.

第２回目の部分積の計算では、Ａ　Ｘ　Ｂｌ・（Ａ８７
３）Ｘ　Ｂ６　＝　９７５７５２　　（１６ビツト＋８
ビツト）が乗算加算器２００ａ　　の出力に得られ、第
１回目の部分積を８ビツトだけシフトして得られる２９
０１０００　と加算すると加算器２１１ａの出力に３２
７７４５２が得られ、その上位１２ビツトの３２７　（
２進数表示で０１１００１．００１１１）はあふれビッ
トであり、４つの剰余テーブル３１ａ〜３４ａに下位ビ
ットから２ビツトずつ４つに分割して入力される。あふ
れビット中の上位３ビツトの旧１は内部テーブル２１６
ａに人力される。この結果、剰余テーブル３１ａから９
８８Ｂ、剰余テーブル３２ａからＣＢＡ４、剰余テーブ
ル３３ａからＣ１７Ｆ、内部テーブル２１６ａから８３
ＥＥがそれぞれ出力される。内部テーブル２１６ａの出
力は加算器２１１ａの出力の下位１６ビツトである７４
５２と加算されて、加算器２１３ａの出力にＦ９４０が
得られる。剰余テーブル３１１〜３４ａの出力はトーナ
メント加算されて、加算器３７ａの出力に２２５ＤＢが
得られる。加算器２１３ａの出力と加算器３７ａの出力
は、加算器２１４ａにより加算されて、３１８１Ｅとな
る。その上位７ビツト　（２進数表示で０１１０００１
）の更に上位４ビツト（０１１０）で倍数テーブル２３
２ａを引き９８８Ｂを得る。この値と加算器２１４ａの
出力の下位１６ビツトＩＥ１εとを加算して、加算器２
３３ａの出力にＢ２Ｏ２を得る。この値からｎを引くと
−Ｅ９Ｃ２と負の数になるので、−ｎを加算しない値Ｂ
６Ｄ９を剰余結果とする。In the second partial product calculation, A
3) X B6 = 975752 (16 bits + 8
bit) is obtained at the output of the multiplier adder 200a, and 29 bits are obtained by shifting the first partial product by 8 bits.
01000, the output of the adder 211a becomes 32
77452 is obtained, and its upper 12 bits are 327 (
011001.00111 (in binary notation) is an overflow bit, and is input to the four remainder tables 31a to 34a by dividing it into four parts each consisting of two bits starting from the lower bit. The old 1s in the upper 3 bits of the overflow bits are stored in the internal table 216.
It is manually operated by a. As a result, 9 from the remainder table 31a
88B, remainder table 32a to CBA4, remainder table 33a to C17F, internal table 216a to 83
EE is output respectively. The output of the internal table 216a is 74 which is the lower 16 bits of the output of the adder 211a.
52 to obtain F940 at the output of the adder 213a. The outputs of the remainder tables 311-34a are tournament-added to obtain 225 DB as the output of the adder 37a. The output of the adder 213a and the output of the adder 37a are added by the adder 214a, resulting in 3181E. The upper 7 bits (0110001 in binary notation)
) and the upper 4 bits (0110) of the multiple table 23.
Subtract 2a to get 988B. This value is added to the lower 16 bits IE1ε of the output of the adder 214a, and the adder 2
B2O2 is obtained at the output of 33a. If you subtract n from this value, it becomes -E9C2, a negative number, so the value B without adding -n
Let 6D9 be the remainder result.

〔Effect of the invention〕

以上の説明から明らかなように、本発明による乗算剰余
演算装置は、基本構成において回路規模が従来に比べて
大幅に減少し、且つ、処理速度を速くしたい場合は基本
の構成から拡張していくことにより、乗算剰余の速度を
変化させることが可能となる。すなわち、高速処理が必
要な場合には乗算回路ユニットと剰余テーブルを追加す
ることにより、速度を上げていくことが可能となり、必
要な速度及び回路規模によって構成を変化させることが
可能な乗算剰余演算装置を提供可能となる。As is clear from the above description, in the modular multiplication arithmetic device according to the present invention, the circuit scale in the basic configuration is significantly reduced compared to the conventional one, and the basic configuration can be expanded if the processing speed is desired to be increased. This makes it possible to change the speed of the multiplication remainder. In other words, when high-speed processing is required, it is possible to increase the speed by adding a multiplication circuit unit and a remainder table, and the configuration of the multiplication remainder operation can be changed depending on the required speed and circuit size. Equipment can be provided.

本発明による乗算剰余演算装置の適用分野は、前述した
公開鍵暗号方式に限定されるものではない。The field of application of the modular multiplication calculation device according to the present invention is not limited to the public key cryptosystem described above.

[Brief explanation of drawings]

第１図は本発明の原理ブロック図、第２図は本発明の一実施例による乗算剰余演算装置の構
成を示すブロック図、第３図は第２図に示した剰余テーブル回路の構成を示丈
ブロック図、第４図は本発明の他の実施例による乗算剰余演算装置の
構成を示すブロック図、第５図は本発明の実施例による実際の数値での演算を説
明するための乗算剰余演算装置を示すブロック図、第６Ａ図〜６Ｆ図は第５図における各テーブルの内容を
示す図、第７図は実際のテーブル格納形式を示す図、第８図は実
際の数値演算を説明する図、第９図は本発明の背景であ
る公開鍵暗号の原理図、第１０図は従来の乗算剰余演算装置の構成例を示すブロ
ック図、第１１図は第１θ図に示した従来の剰余テーブル回路の
構成例を示すブロック図、第１２図は第１１図に示した従来の剰余テーブルモジュ
ールの構成例を示すブロック図である。図において、１・・・乗算回路、２・・・剰余テーブル、３・・・第１の加算器、４・・・シフト手段、５・・・第２の加算器である。FIG. 1 is a block diagram of the principle of the present invention. FIG. 2 is a block diagram showing the configuration of a multiplication remainder arithmetic device according to an embodiment of the present invention. FIG. 3 is a block diagram showing the configuration of the remainder table circuit shown in FIG. FIG. 4 is a block diagram showing the configuration of a multiplication remainder calculation device according to another embodiment of the present invention. FIG. A block diagram showing the arithmetic unit; Figures 6A to 6F are diagrams showing the contents of each table in Figure 5; Figure 7 is a diagram showing the actual table storage format; Figure 8 explains actual numerical calculations. 9 is a principle diagram of public key cryptography which is the background of the present invention. FIG. 10 is a block diagram showing a configuration example of a conventional multiplication remainder calculation device. FIG. Block diagram showing an example of the configuration of a table circuit. FIG. 12 is a block diagram showing an example of the configuration of the conventional remainder table module shown in FIG. In the figure, 1... Multiplication circuit, 2... Remainder table, 3... First adder, 4... Shifting means, 5... Second adder.

Claims

[Claims] 1, A×B (mod n) (n is Km bits, A, B<
n) (m=1, 2,...) multiplication remainder arithmetic device, which divides Km bits A and B into m pieces every K bits (1 word), from the upper word. Multiply with B in order to obtain K
A multiplication circuit (1) that obtains a partial product of m+K bits m times, and of the Km+K bit data obtained at the output of the multiplication circuit (1), the K bit data that exceeds the Km bit is used as a search bit, and the Km+K bit is searched. Lower Km of data
A remainder table (2) that outputs a remainder result that is the remainder obtained by dividing the value of the bit with all 0 bits and the upper K bits as the search bit by the modulus n of the remainder; and Km bits output from the remainder table (2). A first adder (3) that obtains an intermediate result of a Km×K bit multiplication remainder operation by adding the lower Km bits of the partial product; and a shifter that shifts the intermediate result by K bits to the upper digits. means (4); and a second adder (5) for adding the shifted result of said shifting means (4) and the partial product of Km+K bits for the next word.
), and a correction circuit (6) that corrects the intermediate result obtained by the first adder (3) and selects a remainder, and the lower Km bits of the output of the second adder (5). and the remainder table (2
) by the first adder (3), the intermediate result obtained by adding the output of By repeating the operation of adding m times to the partial product output from the multiplication circuit (1) in a cycle, an intermediate result of Km x Km bit multiplication remainder operation is obtained, and finally, using the correction circuit (3), A multiplication remainder calculation device that performs multiplication remainder calculations at high speed by keeping the result within the modulus n of the remainder. 2. The multiplier circuit (1) stores the K bit carry.
By adding a bit adder, Km×KI (I=1,2,4
, 8...m) bit partial products are obtained, and the remainder table (2 ), and by adding an adder for adding the output of the extended remainder table to the output of the upper extended remainder table,
An expandable multiplication remainder arithmetic device capable of performing the multiplication remainder of Km×KI (I=1, 2, 4, 8, . . . m) in I calculations.