JP2018139017A

JP2018139017A - Memory type processor, apparatus including memory type processor, and method of using the same.

Info

Publication number: JP2018139017A
Application number: JP2015116086A
Authority: JP
Inventors: Katsumi Inoue (井上 克己)
Original assignee: Individual
Current assignee: Individual
Priority date: 2015-06-08
Filing date: 2015-06-08
Publication date: 2018-09-06
Also published as: WO2016199808A1

Abstract

[Problem] To provide a memory-type processor, an apparatus including the memory-type processor, and a method of using the same that reduce the circuit space required for parallel operation and reduce power consumption. [Solution] A 1-bit arithmetic unit 105 provided below the data table is configured so that, on the data of each bit cell of the storage cells 102 at the address 104 designated by selection 110, operations can be executed in parallel across all records under the designated operation condition 111, using logical storage 116, logical product (AND) 112, logical sum (OR) 113, logical negation (NOT) 114, exclusive OR 115, full addition 211, other optional functions, and combinations of these. An operation result output 106 function, such as a priority output circuit (priority address encoder output circuit) 214, is provided to output the operation results of the 1-bit arithmetic unit 105. [Selected drawing] FIG. 2

Description

The present invention relates to a memory-type processor, an apparatus including the memory-type processor, and a method of using the same.

In current computers, sequential processors such as CPUs and GPUs handle all information processing, so the CPU has no choice but to take on both the processing it is good at and the processing it is poor at.
To think seriously about making use of big data, it is necessary to restate two major problems of the current von Neumann computer.

The first problem is the von Neumann bus bottleneck.
To a CPU or GPU, data in memory is like playing cards lying face down: the only way to find information is to turn over (access) the cards one at a time, address by address.
When a CPU or GPU sequentially searches memory to find specific information, the amount of processing becomes extremely large and the waiting time grows.
Because information search must access memory over and over, it suffers more from the bus bottleneck than other kinds of processing.
This is the fate of the von Neumann computer: the information-search bus bottleneck.

Consequently, when specific information must be found, there is no alternative but to devise and use various techniques (software algorithms) that reduce the burden on the CPU or GPU and the amount of processing.
Countless algorithms are used to find information; typical ones include hash tables, indexes, tree structures, binary search, clustering, and combinations of these.
Each of these algorithms requires its own metadata (structured data).
These techniques (software algorithms and metadata) are nothing more than means of reducing the CPU's burden and the number of processing steps, that is, ways of making the best of CPUs and GPUs that were born with this fate.

In other words, all of these algorithms organize in advance what information is located where in memory: they build headings and routes so that the CPU can find information easily, arrange data in ascending order, and so on.
Such algorithms relieve the CPU or GPU at search time, but they force complicated pre-processing and post-processing. For example, every time data is modified, inserted, or deleted, the metadata created by the algorithm in use must be re-sorted or reordered.

Suitable software algorithms and metadata must be selected according to the type and scale of the database and combined into an optimized system. As a result, information processing that involves finding information, such as search, matching, and authentication, has the major problem that only specialists with knowledge and experience can handle it.

The second problem is the arithmetic unit of the CPU or GPU.
The ALU (Arithmetic and Logic Unit), the heart of a CPU or GPU, has functions for both four-rule (arithmetic) operations and logical (Boolean) operations based on Boolean algebra. Because an ALU must operate in parallel on a fixed data width (for example, 32 bits or 64 bits), the circuit scale of a single unit is inevitably large.
Raising the degree of parallelism therefore increases chip size and makes power consumption enormous.
No solution to this problem has been found even now, when the limits of miniaturization technology are close at hand.

The object of the present invention is to realize an information-processing memory based on a new concept, namely a memory-type processor. By incorporating only a very small (space-saving) amount of circuitry into an ordinary memory, it performs at high speed, inside the memory, the search for target information and a variety of operations such as comparison, counting, and the four arithmetic operations, using parallel arithmetic elements whose per-unit circuit scale is extremely small.

Japanese Patent No. 4588114 by the present inventor, "Memory having an information narrowing detection function," describes a memory that excels at logical product (AND) operations such as pattern matching.
PCT/JP2013/059260, "Memory having a set operation function," extends the concept of that memory so that logical product, logical sum, logical negation, and other operations can be performed freely.

Japanese Patent Application No. Hei 10-232531, "Memory with arithmetic function," aims to improve chip efficiency by providing an arithmetic circuit for each block, as shown in its figures.

Japanese Patent Application No. Sho 63-128849, "Learning-type character search device and control method for the device," implements text data search. Its configuration resembles the information processing of the present application, but its processing is limited to text index search (present / absent); unlike the present invention, it does not aim at advanced, multipurpose information processing involving various operations on data values.
Moreover, as shown in FIG. 2 of that document, its arithmetic unit uses ALUs. A configuration that simply lines up ALUs in this way cannot raise the degree of parallelism, and it can be inferred that the result is, as stated in that document's description of effects, only "to the extent that a considerable effect can be expected in personal use."
What matters is overcoming two conflicting challenges: raising the degree of parallelism and making the operations more sophisticated.

Japanese Patent Application No. 2013-264763 by the present inventor, "Memory having an information search function, method of using the same, apparatus, and information processing method," did not show a concrete arithmetic circuit, and its operations were limited to index search and to match and range searches on data values.
The present invention provides a memory-type processor using the arithmetic circuit best suited to performing a wide variety of data operations in a massively parallel manner, without being confined to information-finding processing such as search.

Japanese Patent No. 4588114
PCT/JP2013/059260
Japanese Patent Application No. Hei 10-232531
Japanese Patent Application No. 2013-264763
Japanese Patent Application No. Sho 63-128849

This invention fundamentally resolves the problems of information search and matching, where the von Neumann bus bottleneck matters most, and proposes an arithmetic unit suited to parallel operation together with ways of using it. It thereby aims to fundamentally resolve many of the problems of information processing built around sequential processors such as CPUs and GPUs.
In other words, rather than leaving operations on large amounts of data to the CPU or GPU alone, operations that can be done in memory are done by the memory. The aim is to realize a memory element with an information operation function, based on an entirely new concept of information processing suited to the big-data society, that simplifies information processing that is burdensome for CPUs and GPUs, complicated, and beyond the reach of non-specialists, while also reducing power consumption.

Specifically, a 1-bit arithmetic unit (a logical arithmetic unit and a four-rule arithmetic unit), which can perform a variety of information processing in combination with the memory cells, reduces the circuit space required for parallel operation and reduces power consumption.

Claim 1 is characterized by comprising:
one record consisting of multi-bit memory cells capable of storing data input from an external input function, and a 1-bit operation function that reads the data of the memory cells one bit cell at a time, executes an operation, and can write the operation result back to the memory cells one bit at a time;
an operation condition designation function in which many such records are arranged in parallel, the memory cells of the many bits of all records are made address-selectable in parallel, and operation conditions are designated in parallel; and
an external output function that outputs the operation results of the records to the outside.

Claim 2 is characterized in that the 1-bit operation function executes any of the following operations:
(1) logical product (AND) operation
(2) logical sum (OR) operation
(3) logical negation (NOT) operation
(4) exclusive OR operation
(5) half-addition operation
(6) full-addition operation
(7) operation storage
(8) a combination of (1) to (7) above.

Claim 3 is characterized in that the external input function has a function of performing a matrix (row/column) conversion on the externally supplied data to be operated on and writing that data.

Claim 4 is characterized in that the external output function has any of the following output functions:
(1) outputting the addresses of the records in priority order
(2) dividing the records into several groups and outputting each group in priority order
(3) outputting whether any of the records has an operation result
(4) outputting all the records in parallel
(5) a combination of (1) to (4) above.

Claim 5 is characterized by having a function of transferring the data to be operated on to another of the records.

Claim 6 is characterized in that the memory-type processor is incorporated into circuits of a CPU and other functions.

Claim 7 is characterized in that the memory-type processor is implemented in an FPGA.

Claim 8 is an apparatus including the memory-type processor according to claim 1.

Claim 9 is a method of using the memory-type processor according to claim 1, characterized in that both the data to be operated on and work area data for temporarily saving operation results are allocated to the addresses of this memory, and the 1-bit operation function is repeated using both kinds of data to perform, on data of any data width, any of the following operations:
(1) all-record parallel index search of the data to be operated on
(2) all-record parallel comparison (match, magnitude, range, maximum/minimum) of the data to be operated on
(3) all-record parallel counting (up, down) of the data to be operated on
(4) all-record parallel addition and subtraction of the data to be operated on
(5) all-record parallel multiplication and division of the data to be operated on
(6) all-record parallel encryption of plaintext and decryption of ciphertext for the data to be operated on
(7) all-record parallel matrix (row/column) conversion of the data to be operated on
(8) all-record parallel data creation for the data to be operated on
(9) a combination of the above.

Claim 10 is a method of using the memory-type processor according to claim 1, characterized by determining whether any surviving record remains in the operation result and supplying an operation condition expression based on that determination.

Claim 11 is a method of using the memory-type processor according to claim 1, characterized in that the memory cell data of a plurality of records among the many arranged records are allocated and used as one set of data.

Claim 12 is characterized in that the memory-type processors are used in either or both of the following connections:
(1) series, parallel, or series-parallel connection
(2) hierarchical connection.

FIG. 1 is a configuration example of a general memory.
FIG. 2 is a configuration example of the memory-type processor.
FIG. 3 is a circuit configuration (example) of the 1-bit logic (Boolean) arithmetic unit of the memory-type processor. (Example 1)
FIG. 4 is an explanatory diagram of an index search operation method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 2)
FIG. 5 is an explanatory diagram of data comparison expressions (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit.
FIG. 6 is an explanatory diagram of a data match comparison (match search) method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 3)
FIG. 7 is an explanatory diagram of a data magnitude comparison (greater-than-or-equal / less-than search) method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 4)
FIG. 8 is an explanatory diagram of a maximum/minimum comparison (maximum/minimum search) method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 5)
FIG. 9 is an explanatory diagram of address allocation (example) for the data count operation method of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit.
FIG. 10 is an explanatory diagram of an addition count operation method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 6)
FIG. 11 is an explanatory diagram of a subtraction count operation method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 7)
FIG. 12 is an explanatory diagram of a full addition operation method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 8)
FIG. 13 is an explanatory diagram of a multiplication operation method (example) of the memory-type processor using the 1-bit logic (Boolean) arithmetic unit. (Example 9)
FIG. 14 is a circuit configuration (example) of the 1-bit four-rule arithmetic unit of the memory-type processor. (Example 10)
FIG. 15 is an explanatory diagram of a multi-bit parallel four-rule operation method (example) using the memory-type processor. (Example 11)
FIG. 16 is a circuit configuration (example) of an arithmetic unit of the memory-type processor with a data shift function. (Example 12)
FIG. 17 is explanatory diagram (1) of a data matrix conversion method (example) using the memory-type processor. (Example 13)
FIG. 18 is explanatory diagram (2) of a data matrix conversion method (example) using the memory-type processor.
FIG. 19 is an example of series-parallel connection of memory-type processors.
FIG. 20 is an example of hierarchical connection of memory-type processors.
FIG. 21 is example A of a feature database using the memory-type processor. (Example 14)
FIG. 22 is example B of a feature database using the memory-type processor. (Example 15)
FIG. 23 is an explanatory diagram of a parallel data creation method (example) using the memory-type processor. (Example 16)

FIG. 1 is a configuration diagram of a general memory.
In the memory 100 of FIG. 1, functional circuits such as the address decoder and data bus are omitted. Information data can be freely written to and read from this memory. It consists of storage cells 102 made up of N × n bit cells, with one word having a width 103 of n bits and with N addresses 104; in general, an address from 1 to N can be selected from outside by means such as an address decoder.
In current CPU-based information processing, the data width 103 of the memory 100 is fixed at, for example, 8, 16, or 32 bits. When searching for information data, the CPU accesses the addresses of the given memory address space, such as 1M or 1G addresses, one after another, reading the data and processing it sequentially.
The information processing of this invention rests on reversing the roles of data width and address found in the general memory structure and database table structure above, and is based on parallel operation in 1-bit units.

FIG. 2 is a configuration example of the memory-type processor. One record consists of multi-bit memory cells capable of storing data input from an external input function and a 1-bit operation function that reads the memory cell data one bit cell at a time, executes an operation, and can write the result back to the memory cells one bit at a time. Many such records are arranged in parallel. The processor comprises an operation condition designation function, which makes the memory cells of all records address-selectable in parallel and designates operation conditions in parallel, and an external output function, which outputs the operation results of the records to the outside. Details are described below.

As in FIG. 1, functional circuits such as the address decoder and data bus are omitted from FIG. 2; information data can be freely written to and read from the storage cells 102 of the memory-type processor 101.

In the memory-type processor 101, the n-bit width 103 of one word of an ordinary memory corresponds to the number of database records (n), and the data of one record is arranged in a column. The structure is easiest to understand by regarding N, the number of addresses 104, as the field data length of one record.
That is, the memory section (database section) of the memory-type processor 101 is a data table of n records, each with a field data length of N bits.
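The reversed layout described above can be sketched in a few lines of Python. This is only an illustration, not the circuit of the invention: the function names `to_bit_planes` and `from_bit_planes` are hypothetical, each address row is modeled as a Python integer whose bit `rec` belongs to record `rec`, and the choice to place the most significant bit at the first address is an assumption.

```python
def to_bit_planes(values, field_bits):
    """Lay out records column-wise: row k holds bit k of every record.

    Assumption: address 0 holds the most significant bit of each field.
    """
    planes = []
    for k in range(field_bits):
        shift = field_bits - 1 - k
        row = 0
        for rec, value in enumerate(values):
            # Bit position `rec` inside the row stands for record `rec`.
            row |= ((value >> shift) & 1) << rec
        planes.append(row)
    return planes


def from_bit_planes(planes, n_records):
    """Rebuild the per-record values from the address rows."""
    values = []
    for rec in range(n_records):
        value = 0
        for row in planes:
            value = (value << 1) | ((row >> rec) & 1)
        values.append(value)
    return values


records = [5, 3, 7, 0]              # n = 4 records
planes = to_bit_planes(records, 3)  # N = 3 addresses (field length in bits)
restored = from_bit_planes(planes, len(records))
```

Selecting one address in this layout yields one bit from every record at once, which is what lets a single 1-bit unit per record work on all records in parallel.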

The 1-bit arithmetic unit 105 provided below the data table is configured so that, on the data of each bit cell of the storage cells 102 at the address 104 designated by selection 110, operations can be executed in parallel across all records under the designated operation condition 111, using logical storage 116, logical product (AND) 112, logical sum (OR) 113, logical negation (NOT) 114, exclusive OR 115, full addition 211, other optional functions, and combinations of these.
Details are described later.

An operation result output 106 function, such as a priority output circuit (priority address encoder output circuit) 214, is also provided to output the operation results of the 1-bit arithmetic unit 105.
As described later, most of this memory is the memory cells themselves, and only a small part is the 1-bit arithmetic unit 105 and the operation result output 106 function. These functions can therefore be built into a general memory with little extra space, yielding a large-capacity memory well suited to databases.

As described in detail later, the 1-bit operation function 105 falls into two broad types: the 1-bit logic (Boolean) operation function 123 and the 1-bit four-rule (numerical) operation function 124.
A logical operation needs only 1 bit for its result (true or false, 0 or 1), whereas a four-rule operation needs 2 bits: the operation result and the carry.
The 1-bit logic (Boolean) operation function 123 is described first.
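Before turning to the Boolean unit, the difference between a 1-bit logical result and a 2-bit arithmetic result can be illustrated with a small Python sketch. The function names are hypothetical, and processing from the least significant bit upward is an assumption of this sketch rather than something stated at this point in the text.

```python
def full_add_step(a_bit, b_bit, carry_in):
    """One full-addition step: returns two bits, the sum and the carry."""
    sum_bit = a_bit ^ b_bit ^ carry_in
    carry_out = (a_bit & b_bit) | (carry_in & (a_bit ^ b_bit))
    return sum_bit, carry_out


def add_bit_serial(a, b, width):
    """Add two width-bit values by repeating the 1-bit step, LSB first."""
    carry = 0
    result = 0
    for k in range(width):
        sum_bit, carry = full_add_step((a >> k) & 1, (b >> k) & 1, carry)
        result |= sum_bit << k
    return result, carry  # final carry signals overflow past `width` bits


total, overflow = add_bit_serial(5, 6, 4)
```

A logical step such as `a_bit & b_bit` produces a single bit, while `full_add_step` has to keep a carry between steps, which is why the arithmetic case needs the extra bit.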

FIG. 3 is an example of the circuit configuration of the 1-bit logic (Boolean) operation function of the memory-type processor. The circuit of the 1-bit logic (Boolean) arithmetic unit 123, which operates on the data of the 1-bit memory storage cell 102 selected by the address 104 of the memory-type processor 101, and the operations it performs are described below.

In this example, setting the switch 201 to position 1 enables the 1-bit logic (Boolean) operation 123.
As shown in the figure, the circuit is extremely simple, consisting of the logical product 112, logical sum 113, exclusive OR 115, logical negation 114, a flip-flop (FF) 202, and selection circuits 203.
The exclusive OR 115 can be computed from a combination of the logical product 112, logical sum 113, and logical negation 114, so it is not strictly necessary, but it is worth including when it is used frequently, as in the four-rule operations described later.

The exclusive OR 115 can also be used for encryption.
It is well known that plaintext data is encrypted by XORing it with encryption data (a key), and that the resulting ciphertext is restored to plaintext by XORing it with the same encryption data.
The exclusive OR 115 of the 1-bit Boolean arithmetic unit 105 can therefore encrypt and decrypt large amounts of plaintext data at very high speed.
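The reversibility that this relies on can be checked with a minimal Python sketch. The function name `xor_records` and the sample values are hypothetical; applying one key value to every record is an assumption made here for brevity, and the sketch shows only the XOR round trip, not a complete cipher.

```python
def xor_records(records, key):
    """XOR every record with the same key value.

    Applying the function twice with the same key returns the input,
    because (r ^ key) ^ key == r for any integers r and key.
    """
    return [record ^ key for record in records]


plaintext = [0x41, 0x42, 0x7F]
key = 0x5A                               # illustrative key value
ciphertext = xor_records(plaintext, key)
recovered = xor_records(ciphertext, key)
```

In the processor itself the same operation would be applied one bit position at a time to all records, so the cost would depend on the field length rather than on the number of records.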

In this circuit, the selection circuit 203 can select either positive logic or negative logic (logical negation 114) for the 1-bit data from the memory storage cell 102 selected by the address 104.
Likewise, either positive logic or negative logic (logical negation 114) can be selected for the result output 205 of the flip-flop 202.

The available operations are as follows.
First, positive-logic or negative-logic data read from the memory can be assigned (stored) directly into the flip-flop 202 as memory data 204.
Second, the positive-logic or negative-logic result output 205 of the flip-flop 202, which holds the operation result, can be assigned (stored) directly into the flip-flop 202 again.

Third, any one of the logical product 112, logical sum 113, or exclusive OR 115 can be applied to the positive-logic or negative-logic memory data 204 and the positive-logic or negative-logic result output 205 of the flip-flop 202 (the operation result 107), and the result can be assigned (stored) into the flip-flop 202.
The operation condition designation function 111 allows these operations to be designated in parallel for all records.
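The three kinds of operation just listed can be modeled in software as follows. This is a behavioral sketch under stated assumptions, not the circuit of FIG. 3: the class name `BooleanUnit`, the method `step`, and the operation names `LOAD`, `HOLD`, `AND`, `OR`, and `XOR` are all hypothetical, and the flip-flops of all records are packed into one Python integer so that a single call acts on every record at once.

```python
class BooleanUnit:
    """Software model of one flip-flop per record, updated in parallel."""

    def __init__(self, n_records):
        self.mask = (1 << n_records) - 1  # one bit per record
        self.ff = 0                       # flip-flop contents of all records

    def step(self, mem_row, op, invert_mem=False, invert_ff=False):
        # Select positive or negative logic for each input.
        m = (~mem_row & self.mask) if invert_mem else (mem_row & self.mask)
        f = (~self.ff & self.mask) if invert_ff else self.ff
        if op == "LOAD":      # first case: store memory data directly
            result = m
        elif op == "HOLD":    # second case: store the FF output again
            result = f
        elif op == "AND":     # third case: combine memory data and FF output
            result = m & f
        elif op == "OR":
            result = m | f
        elif op == "XOR":
            result = m ^ f
        else:
            raise ValueError(f"unknown operation: {op}")
        self.ff = result
        return result


unit = BooleanUnit(4)
unit.step(0b1100, "LOAD")
survivors = unit.step(0b1010, "AND")  # records set in both rows
```

Each call corresponds to selecting one address and one operation condition for all records; a longer condition is built by chaining calls.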

By setting the switch 201 to position 3, the result of the 1-bit logic (Boolean) operation can be stored in the storage cell 102 selected by the address 104.

The result of 1-bit logical operations is normally a set of surviving records, with the number of records greatly narrowed down. If the record addresses are output to the outside one record at a time from the operation result output 106 function, such as a priority address encoder output circuit, the number of output pins on the memory-type processor 101 chip can be kept to a minimum.
Details are described later.

As described in the Background Art, the ALU 210 must operate on, for example, two sets of 32-bit data at once, so the circuit of a single unit is inevitably large. By contrast, the 1-bit logic (Boolean) computing unit 105 of this configuration can be realized with about 150 transistors per unit, or a few hundred transistors including the operation result output 106, which makes it well suited as an arithmetic unit for parallel processing.
The 1-bit four-arithmetic (numerical) operation function 124, described later, is likewise an extremely compact arithmetic circuit.

The following sections show in turn that even this compact arithmetic function can perform a wide variety of operations at very high speed.

FIG. 4 is an example of an index search operation method for the memory-type processor using the 1-bit logic (Boolean) computing unit.
Index search operations have many uses, such as Internet search, patent search, and full-text search.
Text information is normally searched in one of two ways: by keywords alone, or by specifying both keywords and an operation expression as the condition.

This example covers an index search, like a patent literature search, in which key indexes such as "information processing", "information retrieval", and "CPU" can be combined with logical product 112, logical sum 113, and logical negation 114. Given both the key vocabulary and the operation expression, the records are narrowed down to determine whether any document (record) matches the keywords and the logical condition and, if so, which documents (records) match.

An example of using the memory-type processor 101 for an index search operation such as document search is described below.

In this example, vocabulary items such as "information processing", "information retrieval", "patent", and "CPU" are assigned as indexes to addresses 1 to N, and one record corresponds to one document.
That is, if a vocabulary item such as "information processing", "information retrieval", "patent", or "CPU" appears at least once in a document, "1" is written to the corresponding memory cell (field), and this is registered for each document.
("0" entries are omitted from the figure; the same applies hereafter.)

Thus, in this example, N vocabulary items (indexes) and n documents (n records) are registered as the database.

In this example, index vocabulary (key vocabulary) is assigned so that address 18 is "information processing", address 5 is "information retrieval", address 24 is "patent", and address 10 is "CPU". The explanation uses the operation condition (documents containing "information processing" + "information retrieval") * (documents not containing "patent") * (documents containing "CPU").

The lower part of FIG. 4 shows how this keyword search is computed.
The logical sum (OR) of "information processing" at address 18 and "information retrieval" at address 5 yields document records 3, 4, 5, 13, 14, 16, 19, 21, and 25, which contain either term.
Next, logical negation shows that the documents not containing "patent" at address 24 are records 4, 8, 11, 16, 22, and 25.
Taking the logical product (AND) of the previous result (records 3, 4, 5, 13, 14, 16, 19, 21, and 25) with this logical negation (NOT) result leaves documents 4, 16, and 25 as survivors.

Finally, the logical product (AND) of the document records containing "CPU" at address 10 (records 3, 7, 9, 12, 15, 16, and 22) with the preceding survivors leaves record 16 as the final surviving document 107.

When a 1-bit computing unit built from a single register performs various operations and an intermediate result, such as the result of a parenthesized subexpression, must be saved temporarily, this can be done by using, for example, address N as a temporary buffer 207.

In other words, document 16 is (a document containing either "information processing" or "information retrieval") * (a document not containing "patent") * (a document containing "CPU").
These results can then be read out sequentially from the operation result output 106, such as a priority address encoder output circuit.
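The search described for FIG. 4 can be reproduced with a short Python sketch. It is an illustration under stated assumptions rather than the actual device: each index address is modeled as an integer with one bit per record, the database is assumed to hold records 1 to 25, and the split of the OR result between the two terms is invented for the example, since the text gives only their union. The helper names `column` and `records_of` are hypothetical.

```python
def column(records):
    """Pack a set of record numbers into an integer, one bit per record."""
    bits = 0
    for r in records:
        bits |= 1 << r
    return bits


def records_of(bits):
    """List the record numbers whose bit is set, lowest first."""
    return [r for r in range(bits.bit_length()) if bits >> r & 1]


N_RECORDS = 25                                   # assumed: records 1..25
ALL = column(range(1, N_RECORDS + 1))

# Assumed split between the two terms; only their union is stated in the text.
info_processing = column([3, 4, 13, 16, 21])     # address 18
info_retrieval = column([4, 5, 14, 16, 19, 25])  # address 5
# "patent" column derived as the complement of the stated NOT result.
patent = ALL & ~column([4, 8, 11, 16, 22, 25])   # address 24
cpu = column([3, 7, 9, 12, 15, 16, 22])          # address 10

ff = info_processing           # load address 18 into the flip-flops
ff |= info_retrieval           # OR with address 5
after_or = ff
ff &= ALL & ~patent            # AND with NOT(address 24)
after_not = ff
ff &= cpu                      # AND with address 10

print(records_of(after_or))    # records containing either term
print(records_of(after_not))   # survivors after excluding "patent"
print(records_of(ff))          # final survivors
```

Reading the survivors lowest-first in `records_of` loosely mirrors how a priority address encoder would emit one record address at a time.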

The CPU or GPU only has to issue the address selection designation 110 and the calculation condition designation 111 to the memory-type processor 101 in order to detect the target information, without searching through the entire memory space at all.

The description above is an example of a document search operation, but if the records are replaced with URLs, the same approach can be used as a database for Internet search.

In the document search above, all data was 1-bit index data indicating presence or absence. Next, information processing on data stored (registered) as values is described.

FIG. 5 is an explanatory diagram of example data comparison expressions for the memory-type processor using the 1-bit logic (Boolean) computing unit.
The symbol "/" denotes the logical negation 114 operation, "*" the logical product 112 operation, and "+" the logical sum 113 operation.
This example shows the expressions used for comparison operations on 4-bit binary data, that is, match, magnitude, and range operations.
As illustrated, decimal numbers are converted to binary, and in every case the addresses 104 of the bits "8", "4", "2", and "1", assigned from the MSB "8" down to the LSB "1", are selected to suit the comparison condition and combined by logical negation 114, logical product 112, and logical sum 113, so that data equal to, greater than or equal to, or less than a given value can be detected.
With this table as a reference, conditions such as "less than or equal to" or a range, and condition expressions for comparing data of many bits, can also be written easily.

As explained earlier, when an intermediate result must be saved temporarily while carrying out these operations, address N, for example, is used as a temporary operation buffer 207.
This applies to all of the operations described below.

FIG. 6 is an example of a data match comparison method using the 1-bit logic (Boolean) computing unit of the memory-type processor.
It shows an 8-bit binary data match operation performed by a memory with a 1-bit logic (Boolean) operation function.

Consider, for example, 8-bit data assigned to a field with address 10 as the most significant bit (MSB) "128" and address 17 as the least significant bit (LSB) "1".
Because the data is 8 bits wide, 256 distinct values can be stored, and by appropriately selecting the eight addresses from address 10 to address 17, an exactly matching value can be detected among the 256 possible values and the address of its record output.

For example, to search for an exact match of the decimal value "10" = binary "00001010", eight operations are performed, from address 10 (MSB, "128") through address 17 (LSB, "1"), to detect data equal to "00001010".

As shown in the lower part of the figure, in this example the operations are performed in the order "128", "64", "32", "16", "8", "4", "2", "1", starting from the MSB at address 10.
For each "0" digit of binary "00001010" the bit is taken with logical negation, and for each "1" digit with positive logic; after eight repeated logical product (knockout) operations, the two surviving records, 13 and 25, hold the decimal value "10".
Repeating 1-bit operations in this way can detect any data value.
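The eight-step knockout match can be sketched as follows. This is a minimal Python illustration, assuming each bit address is modeled as an integer "bit plane" with one bit per record; the record contents and the function names `bit_planes` and `match_equal` are hypothetical and do not reproduce the data of FIG. 6.

```python
def bit_planes(values, width):
    """planes[i] has bit r set when record r has a 1 at bit position i."""
    planes = [0] * width
    for r, v in enumerate(values):
        for i in range(width):
            if v >> i & 1:
                planes[i] |= 1 << r
    return planes


def match_equal(planes, target, num_records):
    """Return a bitmask of the records whose value equals `target`."""
    all_records = (1 << num_records) - 1
    ff = all_records                         # every record starts as a survivor
    for i in reversed(range(len(planes))):   # MSB first, as in the figure
        if target >> i & 1:
            ff &= planes[i]                  # "1" digit: positive logic
        else:
            ff &= ~planes[i] & all_records   # "0" digit: logical negation
    return ff


values = [200, 10, 11, 2, 10, 138]           # hypothetical record contents
planes = bit_planes(values, width=8)
hits = match_equal(planes, target=10, num_records=len(values))
print([r for r in range(len(values)) if hits >> r & 1])  # records holding 10
```

The loop runs once per bit regardless of how many records exist, which is why the step count depends on the data width rather than on the record count.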

FIG. 7 is an example of a data magnitude comparison method for the memory-type processor using the 1-bit logic (Boolean) computing unit.
The description so far sought an exact match of the decimal value "10". To find values of "10" or more, as shown in the figure, taking the logical sum of the four addresses from address 10 (MSB) through address 13 detects, all at once, the records whose decimal value is "16" or more.

Further, among the lower 4 bits, taking the logical sum of addresses 15 and 16 and then the logical product with address 14 yields the records whose decimal value is "10" or more and less than "16"; taking the logical sum of this with the earlier records of "16" or more detects the records whose decimal value is "10" or more.
Further, as shown in the figure, negating the records with a value of "10" or more detects the records below "10", that is, "9" or less.
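The "10 or more" condition can be written out as a Boolean expression over bit planes. The sketch below is an assumption-laden illustration in Python: bit planes are integers with one bit per record, the variable names are hypothetical, and the plane for weight "8" stands for address 14, "4" for address 15, and "2" for address 16 under the address allocation described for FIG. 6.

```python
def bit_planes(values, width):
    """planes[i] has bit r set when record r has a 1 at bit position i."""
    planes = [0] * width
    for r, v in enumerate(values):
        for i in range(width):
            if v >> i & 1:
                planes[i] |= 1 << r
    return planes


def at_least_10(planes, num_records):
    """Return (mask of records >= 10, mask of records < 10) for 8-bit data."""
    b1, b2, b4, b8, b16, b32, b64, b128 = planes   # LSB first
    ge16 = b128 | b64 | b32 | b16     # any upper bit set: value is 16 or more
    low_10_to_15 = b8 & (b4 | b2)     # lower 4 bits hold 10..15
    ge10 = ge16 | low_10_to_15
    lt10 = ~ge10 & ((1 << num_records) - 1)   # negation: 9 or less
    return ge10, lt10


values = [9, 10, 15, 16, 0, 255, 8, 12]       # hypothetical record contents
ge10, lt10 = at_least_10(bit_planes(values, 8), len(values))
print([v for r, v in enumerate(values) if ge10 >> r & 1])  # values >= 10
print([v for r, v in enumerate(values) if lt10 >> r & 1])  # values <= 9
```

The plane for weight "1" is unpacked but never used, since the lowest bit cannot change whether a value reaches 10.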

Other data values and range searches can be handled by repeating the same kind of 1-bit operations.
These operations process all records in parallel in a number of steps no greater than the number of bits; 16-bit data merely takes twice as many steps and 32-bit data four times as many, covering everything from exact match to range search.
Increasing the data width from 8 bits to 9 or 10 bits is also very simple; the addresses need not be contiguous, and widening the data by one bit to 17 or 33 bits is equally straightforward.

In other words, this memory-type processor can assign data of any width, from 1-bit present/absent data up to, for example, 256 bits, as field data within a record.
For personal information, for instance, fields such as "name", "address", "workplace", "date of birth", "height", "weight", and "gender" have varying data widths; one feature is that the minimum necessary width can be assigned to each, only the necessary data bits are operated on, and no operations are spent on unneeded bits.
Details will be described later.

FIG. 8 is an example of a maximum/minimum comparison method using the 1-bit logic (Boolean) computing unit of the memory-type processor.
8-bit data is allocated with the MSB "128" at address X and the LSB "1" at address X+7, and the records contain the data to be compared, the decimal values "69, 109, 21, 14, 5, 105, 5, ..., 34".

To find the maximum among such data, the result of testing whether any record has data (a surviving bit) is used, and the operation condition expression is set according to that result.
The procedure is as follows.

First, the address of the MSB "128" is selected and the result is tested.
No record has data, so "/128" is computed. (This step can be omitted.)
Next, "/128" * "64" is computed; some record has data, so "/128" * "64" * "32" is computed.
Some record has data, so "/128" * "64" * "32" * "16" is computed.
No record has data, so "/128" * "64" * "32" * "/16" is computed.
Some record has data, so "/128" * "64" * "32" * "/16" * "8" is computed.
Some record has data, so "/128" * "64" * "32" * "/16" * "8" * "4" is computed.
Some record has data, so "/128" * "64" * "32" * "/16" * "8" * "4" * "2" is computed.
No record has data, so "/128" * "64" * "32" * "/16" * "8" * "4" * "/2" is computed.
Some record has data, so "/128" * "64" * "32" * "/16" * "8" * "4" * "/2" * "1" is computed.
The record that survives holds the maximum value, which is 64 + 32 + 8 + 4 + 1 = 109.
This 8-bit maximum detection takes 48 steps in total.

Conversely, to find the minimum, as shown in the figure, the operations are repeated so that the complement of each data bit survives, the opposite of the maximum case.
In this example, the minimum value 5 is detected in two records.
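The maximum and minimum searches can be sketched as a bit-serial knockout. The Python below is a minimal illustration, assuming bit planes modeled as integers with one bit per record; it uses only the eight values listed in the text (the elided records are left out), and the function names are hypothetical. The `if trial:` test plays the role of checking whether any record still has data.

```python
def bit_planes(values, width):
    """planes[i] has bit r set when record r has a 1 at bit position i."""
    planes = [0] * width
    for r, v in enumerate(values):
        for i in range(width):
            if v >> i & 1:
                planes[i] |= 1 << r
    return planes


def extreme(planes, num_records, find_max=True):
    """Return (value, survivor mask) for the maximum or minimum record value."""
    all_records = (1 << num_records) - 1
    ff = all_records                          # all records start as survivors
    value = 0
    for i in reversed(range(len(planes))):    # MSB first
        # For the maximum prefer bit = 1; for the minimum prefer bit = 0.
        preferred = planes[i] if find_max else ~planes[i] & all_records
        trial = ff & preferred
        if trial:                             # some record survives
            ff = trial
            bit = 1 if find_max else 0
        else:                                 # no survivor: keep the old set
            bit = 0 if find_max else 1
        value |= bit << i
    return value, ff


values = [69, 109, 21, 14, 5, 105, 5, 34]     # values listed in the text
planes = bit_planes(values, width=8)
print(extreme(planes, len(values), find_max=True))   # maximum and its record mask
print(extreme(planes, len(values), find_max=False))  # minimum and its record mask
```

When no record survives a trial, keeping the previous survivor set has the same effect as ANDing with the negated bit, which matches the "/128" and "/16" steps in the procedure.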

As described above, the maximum/minimum operation tests the surviving result at each step and sets the next condition according to whether any survivor exists.

Operations of this kind can be applied in many ways, much like the condition-testing operations of a CPU.
Accordingly, condition-testing operations become efficient if the operation result output not only outputs the address of each record but also provides an output that indicates only whether any record has a surviving bit.

So far, the description has mainly covered operations for searching data; other operation examples are introduced below.

FIG. 9 is an example of address allocation for the data count operation method using the 1-bit logic (Boolean) computing unit of the memory-type processor.
This example shows the address allocation for performing up/down counting on 4-bit binary data.
Addresses X to X+3 are the storage areas for the bits "8", "4", "2", and "1" of the count data 206; address X+4 is the temporary buffer 207; and addresses X+5 to X+8 are allocated to the bits of the carry data 208 (or borrow data 209) required for the count operation on bits "8", "4", "2", and "1".

FIG. 10 is an example of an addition count operation method using the 1-bit logic (Boolean) computing unit of the memory-type processor.
This example shows an addition count on record data, together with the address and record allocation used to perform the addition count on 4-bit binary data.

As shown in the initial state at the top of the figure, the 4-bit count data "8", "4", "2", "1" is allocated from address X to address X+3, and in this example the count data "0, 8, 1, 15, 7, 3, 5, ..., 10" is written as initial values starting from the leftmost record.
The operation result (survivors) of the 1-bit logic computing unit is shown as "0, 1, 0, 1, 1, 1, 0, ..., 1" from the leftmost record. The following explains how the result (survivors) of the 1-bit computing unit (FF) is added to the current count data to obtain "0, 9, 1, 0, 8, 4, 5, ..., 11".

When addition is performed by repeated 1-bit logic (Boolean) operations, a counter function can be realized by using as a work area the carry addresses for the addition and a buffer address that temporarily stores the result of the 1-bit computing unit.
In this example, address X+4 is the 1-bit computing unit buffer 207, and addresses X+5 to X+8 hold the carry data 208 of each bit.
In the initial state these addresses are cleared to all "0".
The address allocation is not limited to this.

An example of performing an addition count using the counter data 206 and the work addresses allocated above is shown below.

From top to bottom, the figure shows the initial state, the "1" bit operation, the "2" bit operation, the "4" bit operation, the "8" bit operation, and the operation result.

In step 1, the contents of the 1-bit logic computing unit are saved to the 1-bit computing unit buffer 207 at address X+4.
The contents of the 1-bit computing unit (FF) do not change.
In step 2, the "1" bit of the count data at address X+3 is read and ANDed (logical product 112) with the result held in the 1-bit computing unit (FF); the result is the carry data 208 of the "1" bit.
In step 3, this carry data 208 is stored at address X+8.
In step 4, the initial-state data of the 1-bit computing unit (the addend) is loaded from address X+4, the 1-bit logic computing unit buffer 207.
In step 5, an exclusive OR is taken between the loaded addend and the "1" digit of the count data 206 at address X+3; the result is the new value of the "1" digit.
In step 6, this result is written to address X+3 as the new "1" data, completing the "1" bit operation.

Steps 7 through 24 likewise repeat the "2" bit operation, the "4" bit operation, and the "8" bit operation, rewriting the data of the "2", "4", and "8" digits.

The bottom of the figure shows the counter data updated to "0, 9, 1, 0, 8, 4, 5, ..., 11", confirming that the addition count is correctly performed by 1-bit logic (Boolean) operations.
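The addition count can be sketched with one AND and one exclusive OR per digit. The following Python is a simplified illustration, not the six-step sequence itself: it assumes bit planes modeled as integers with one bit per record, keeps the carry in a local variable instead of the work addresses X+4 to X+8, and uses only the eight records listed in the text. The function names are hypothetical.

```python
def bit_planes(values, width):
    """planes[i] has bit r set when record r has a 1 at bit position i."""
    planes = [0] * width
    for r, v in enumerate(values):
        for i in range(width):
            if v >> i & 1:
                planes[i] |= 1 << r
    return planes


def to_values(planes, num_records):
    """Rebuild the per-record integers from the bit planes."""
    return [sum((p >> r & 1) << i for i, p in enumerate(planes))
            for r in range(num_records)]


def count_up(planes, flags):
    """Add 1 to every record whose flag bit is set; wraps at 2**width."""
    addend = flags                    # saved flip-flop contents (buffer 207)
    for i in range(len(planes)):      # LSB first: "1", "2", "4", "8"
        carry = planes[i] & addend    # carry out of this digit (carry data 208)
        planes[i] ^= addend           # exclusive OR gives the new digit
        addend = carry                # the carry feeds the next digit
    return planes


counts = [0, 8, 1, 15, 7, 3, 5, 10]           # initial count data from the text
flag_bits = [0, 1, 0, 1, 1, 1, 0, 1]          # survivors in the flip-flops
flags = sum(bit << r for r, bit in enumerate(flag_bits))

planes = count_up(bit_planes(counts, width=4), flags)
print(to_values(planes, len(counts)))         # updated counters
```

The record holding 15 wraps to 0 because the carry out of the "8" digit is discarded in this 4-bit sketch.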

4-bit data takes the 24 steps above; 8-bit data takes twice that, 48 steps, and 16-bit data takes 96 steps.

A 1-bit operation may seem inefficient at first glance, but when it can run massively in parallel over thousands, tens of thousands, or a million records, its computational efficiency is extremely high and it is fast.

The count operation above is highly efficient when used immediately after some other operation.
For example, as in a search ranking that counts Internet searches, if the search ranking counter of each hit record is incremented by one once the record search operation completes, the counters are updated automatically within the memory, which is very convenient.
Needless to say, the values of these counters can themselves be subjected to magnitude comparison, range comparison, and maximum/minimum operations so that the target records are read out at high speed.
Beyond search ranking, many uses are possible, such as majority operations on various data (for example, majority decisions over the identification results of multiple features in face recognition or vein authentication) and class determination operations.

FIG. 11 is an example of a subtraction count operation using the 1-bit logic (Boolean) computing unit of the memory-type processor.
The difference between the addition described above and subtraction is that the carry is replaced by a borrow and a complement operation is used.
The carry data 208 is replaced with the borrow data 209 in the address allocation.
It is well known that borrow data can be obtained, just as in addition, by a complement operation that applies logical negation 114 to the target data.
Subtraction is therefore realized by using the logically negated data of "1", "2", "4", and "8" in the logical product operations of steps 2, 8, 14, and 20.
The result is "0, 7, 1, 14, 8, 2, 5, ..., 9", showing that the intended subtraction count is correctly realized by 1-bit logic (Boolean) operations.
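The change from carry to borrow can be sketched by negating the digit in the AND step. The Python below is a minimal illustration under the same assumptions as a bit-plane model (one integer per bit address, one bit per record); the data is hypothetical and deliberately does not reproduce FIG. 11, and the function names are not from the source.

```python
def bit_planes(values, width):
    """planes[i] has bit r set when record r has a 1 at bit position i."""
    planes = [0] * width
    for r, v in enumerate(values):
        for i in range(width):
            if v >> i & 1:
                planes[i] |= 1 << r
    return planes


def to_values(planes, num_records):
    """Rebuild the per-record integers from the bit planes."""
    return [sum((p >> r & 1) << i for i, p in enumerate(planes))
            for r in range(num_records)]


def count_down(planes, flags):
    """Subtract 1 from every record whose flag bit is set; wraps below 0."""
    subtrahend = flags
    for i in range(len(planes)):            # LSB first
        borrow = ~planes[i] & subtrahend    # borrow where the digit is 0
        planes[i] ^= subtrahend             # exclusive OR gives the new digit
        subtrahend = borrow                 # the borrow feeds the next digit
    return planes


counts = [3, 0, 12, 7]                      # hypothetical counter values
flags = 0b1011                              # records 0, 1, and 3 are decremented
planes = count_down(bit_planes(counts, width=4), flags)
print(to_values(planes, len(counts)))       # decremented counters
```

Compared with the addition sketch, the only change is the negation of the digit in the AND step, which corresponds to using the logically negated data in steps 2, 8, 14, and 20.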

The above is a 4-bit example; for more digits, the operation is simply repeated for the required number of digits.

FIG. 12 is an example of a full addition operation using the 1-bit logic (Boolean) computing unit of the memory-type processor.
The upper part of the figure is the truth table for full addition of 4-digit binary numbers.
For each digit, a full addition is performed on three inputs, input A, input B, and input Ci (carry in), and the table gives the sum output S and the carry output Co.

For the first digit, a full addition of the three inputs A "1", B "1", and Ci "1" is performed, producing the outputs S "1" and Co "1"; normally no Ci is input to the first digit.
For the second digit, a full addition of the three inputs A "2", B "2", and Ci "2" is performed, producing the outputs S "2" and Co "2"; this input Ci "2" is the first digit's output Co "1".
The second digit therefore has eight input combinations.
The third and fourth digits are computed in the same way as the second.

Therefore, to perform full addition with the 1-bit logic (Boolean) computing unit, a decision operation determines which of the eight combinations of "0" and "1" the three inputs are in (four combinations suffice for the first digit), the results are stored temporarily, and the outputs S and Co are derived from them according to the truth table.
The lower part of the figure shows, on this basis, the address allocation for full addition of two 4-bit inputs, A and B.
Addresses X to X+3 store A "8" to A "1", and addresses X+4 to X+7 store B "8" to B "1".
Addresses X+8 to X+11 are the area that stores the sum output S.
Address X+12 is the storage area for the carry out of the MSB "8" digit.
Addresses X+13 to X+20 are the area that temporarily stores the result of determining which of the eight states the three inputs A, B, and Ci are in.
As shown in the figure, for ABCi decisions 1 to 8, all eight combinations of the three values are computed and each result is stored at its assigned address: address X+13 holds "/A" * "/B" * "/Ci" (all three "0"), and address X+20 holds "A" * "B" * "Ci" (all three "1").

Of these eight ABCi decision results, exactly one address among X+13 to X+20 is "1" for each record.
Therefore, following the full-addition truth table above, a logical sum 113 operation sets the output S to "1" if any of ABCi decisions 2, 3, 5, or 8 is "1" and to "0" otherwise; the result is temporarily stored at address X+21, and the per-digit result of the output S is stored at the corresponding address among X+8 to X+11.

Next, a logical sum 113 operation sets the output Co to "1" if any of ABCi decisions 4, 6, 7, or 8 is "1" and to "0" otherwise, and the result is stored at address X+22.
In other words, 8 × 3 = 24 decision steps to determine which of the eight combinations applies, plus 8 decision steps to determine the results S and C (carry), 32 steps in total, are executed.
That is the operation for one digit; for 4 bits it is repeated four times, 128 steps in all.
For 8 bits, 256 steps are required.
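The decision-based full addition can be sketched as follows. This Python is an illustration under assumptions: bit planes are integers with one bit per record, the intermediate ordering of ABCi decisions 2 to 7 is assumed to follow the binary order of (A, B, Ci), which is consistent with the stated decisions 1 and 8 and with the S and Co groupings, and all function names are hypothetical.

```python
def bit_planes(values, width):
    """planes[i] has bit r set when record r has a 1 at bit position i."""
    planes = [0] * width
    for r, v in enumerate(values):
        for i in range(width):
            if v >> i & 1:
                planes[i] |= 1 << r
    return planes


def to_values(planes, num_records):
    """Rebuild the per-record integers from the bit planes."""
    return [sum((p >> r & 1) << i for i, p in enumerate(planes))
            for r in range(num_records)]


def full_add(a_planes, b_planes, num_records):
    """Add A and B per record; return (sum planes, final carry plane)."""
    mask = (1 << num_records) - 1
    s_planes, ci = [], 0
    for a, b in zip(a_planes, b_planes):        # LSB first
        na, nb, nci = ~a & mask, ~b & mask, ~ci & mask
        # ABCi decisions 1..8 (assumed order: A, B, Ci as a 3-bit number).
        d = [na & nb & nci, na & nb & ci, na & b & nci, na & b & ci,
             a & nb & nci, a & nb & ci, a & b & nci, a & b & ci]
        s = d[1] | d[2] | d[4] | d[7]           # decisions 2, 3, 5, 8 -> S
        co = d[3] | d[5] | d[6] | d[7]          # decisions 4, 6, 7, 8 -> Co
        s_planes.append(s)
        ci = co                                 # carry into the next digit
    return s_planes, ci


a_vals, b_vals = [11, 3, 15], [14, 4, 1]        # hypothetical 4-bit inputs
s_planes, carry = full_add(bit_planes(a_vals, 4), bit_planes(b_vals, 4), 3)
print(to_values(s_planes + [carry], 3))         # per-record sums with carry
```

Exactly one of the eight decision masks is set for each record at each digit, so the two OR steps simply look up the truth table.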

The above is an addition example; it is well known that subtraction can be realized by converting the carry data to borrow data and performing a complement operation, so it is omitted here.

FIG. 13 is an explanatory diagram of an example multiplication method for the memory-type processor using the 1-bit logic (Boolean) computing unit.
It is known that multiplication can be realized by multiplying the multiplicand by each digit of the multiplier, shifting each partial result by one bit per digit, and adding the shifted results.

For example, decimal multiplicand "11" × multiplier "14" = "154".
Multiplying binary multiplicand "1011" by multiplier "1110" proceeds as follows:
Multiplier digit 1: "1011" × "0" = "0000"; result with shift 0: "00000000"
Multiplier digit 2: "1011" × "1" = "1011"; result with shift 1: "00010110"
Multiplier digit 3: "1011" × "1" = "1011"; result with shift 2: "00101100"
Multiplier digit 4: "1011" × "1" = "1011"; result with shift 3: "01011000"
Sum of the four results above = "10011010"
FIG. 13 shows the procedure for carrying out this calculation with the 1-bit logic (Boolean) arithmetic unit of the memory type processor 101, up to the second-digit result S3.
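The worked example can be checked with a short Python sketch. It forms one partial product per multiplier digit, shifts it by that digit's position, and sums the rows. `partial_products` is a hypothetical helper name, and ordinary integers stand in for the 1-bit memory rows.

```python
def partial_products(multiplicand, multiplier, width=4):
    """Return the shifted partial products, one per multiplier digit (LSB first)."""
    rows = []
    for digit in range(width):
        bit = (multiplier >> digit) & 1
        # Multiplying by a single binary digit is an AND of every
        # multiplicand bit with that digit.
        product = multiplicand if bit else 0
        rows.append(product << digit)
    return rows

rows = partial_products(0b1011, 0b1110)
total = sum(rows)
for row in rows:
    print(format(row, "08b"))
print(format(total, "08b"), total)
```

The four printed rows match the shifted results listed in the example, and the final line gives the binary and decimal product.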

Addresses X to X+3 store the 4-bit multiplicand data A, and addresses X+4 to X+7 store the 4-bit multiplier data B.
Addresses X+8 to X+15 are an area that temporarily stores, digit by digit, the result for the first digit of the multiplier.
Addresses X+16 to X+39 are areas that temporarily store, digit by digit, the results for the second to fourth digits of the multiplier.
Each digit of multiplicand A is operated on with each digit of multiplier B, and the results are stored with the data of each area offset by 1 bit.
Because no carry data 208 arises, this is a simple process of 16 steps in total.

The four digit-wise results above are then summed with the full-addition operation shown earlier, and the final result is stored at addresses X+40 to X+47.
This step is slightly more complex because carry data 208 arises, but it can be performed with the full-addition method shown earlier.
A 4-digit multiplication can be realized in 1056 steps in total.

For division, it is known that the operation can be repeated from the upper digits of dividend A down to the digit divisible by divisor B. Multiplication and division can thus be performed with this technique using the 1-bit logic (Boolean) operation function 123.
The description so far shows that arithmetic of many kinds, including the four arithmetic operations, can be executed with Boolean elements alone, without numerical arithmetic units such as half adders and full adders.
The methods described so far might ordinarily seem pointless because the repeated operations take time and effort. They are nevertheless highly significant, considering that massively parallel operation becomes possible and that the computation can be completed inside the memory without the help of a CPU or GPU.

The following introduces a way to reduce the processing burden of carry data 208 and borrow data 209, which are peculiar to numerical operations, for information processing that makes heavy use of the four arithmetic (numerical) operations.

FIG. 14 shows a circuit devised so that the memory type processor can efficiently perform the four arithmetic (numerical) operations 124.
In this example, switching the switch 201 to position 2 enables the 1-bit four arithmetic (numerical) operation 124.
The full adder 211 consists of two half adders 210 and a logical sum (OR) 113. Three flip-flops 202 store, in sequence, the data for its three inputs A, B, and Ci. Each 1-bit datum from the memory is routed to one of these flip-flops by the logical product (AND) 112 gates that select among them, and is then applied to the full adder 211.

The resulting S output and C output of the full adder 211 are each held in one of two flip-flops 202, and the output selected by the OUT S selection input or the OUT C selection input is stored in a memory cell 102 of the memory 100 through the switch 201.
This configuration completes an arithmetic unit that can perform a one-digit arithmetic operation in five steps.
The configuration is described as an adder, but as explained above it can also be used as a subtractor and for multiplication and division.
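The structure of the full adder (two half adders and an OR) can be sketched in Python as follows. The function names are hypothetical. The breakdown of the five steps per digit (three loads for A, B, and Ci, then two stores for S and Co) is an assumption read from the description of the flip-flops, not something the text states explicitly.

```python
def half_adder(a, b):
    """Return (sum, carry) for two 1-bit inputs."""
    return a ^ b, a & b

def full_adder(a, b, ci):
    """Full adder built from two half adders and one OR gate."""
    s1, c1 = half_adder(a, b)
    s, c2 = half_adder(s1, ci)
    return s, c1 | c2

def add_bit_serial(a, b, width):
    """Add two integers one digit at a time; return (result, step count)."""
    carry = 0
    result = 0
    steps = 0
    for d in range(width):
        s, carry = full_adder((a >> d) & 1, (b >> d) & 1, carry)
        result |= s << d
        steps += 5  # assumed: load A, load B, load Ci, store S, store Co
    return result | (carry << width), steps
```

Under this assumption an 8-bit addition takes 8 × 5 = 40 steps, compared with 256 steps for the Boolean-only method.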

The 1-bit four arithmetic (numerical) operation 124 described above is well suited to information processing that relies heavily on arithmetic.
Next, a multi-bit data calculation method and a data shift circuit that draw out the full performance of this arithmetic unit are introduced.

FIG. 15 shows an example of a multi-bit data parallel arithmetic method for the memory type processor.
All the calculation methods so far were completed within a single record, whereas this method treats several records as one record group.
That is, a record group forms one data item, and the record width (data length) is arbitrary.
In this example, the nine records Y to Y+8 are allocated as one record group forming one data item: records Y+8 (LSB) to Y+1 (MSB) hold 8-bit data, and record Y receives the carry produced by the arithmetic operation.
Address X is allocated to the operation input data Ai, address X+1 to the operation input data Bi, address X+2 to the operation output data Co, and address X+3 to the operation output data So.

In the arithmetic section in the lower part of the figure, temporary storage units that hold data read from and written to the memory, and arithmetic units that perform the four arithmetic operations, are each arranged in parallel.

The 1-bit four arithmetic (numerical) arithmetic unit 124 described so far completed its operation within one record. The multi-bit arithmetic unit 103 adds a function that passes data sideways to other records: for each bit from LSB to MSB, the carry Co output, which is the carry data 208, is connected to the carry Ci input of the next higher bit.
This connection is needed across the width of the data, and it can also be made configurable per record, specifying which records are connected to which and which are left unconnected.

With this configuration, the input data Ai and Bi, read in parallel from memory addresses X and X+1, are held in the temporary storage units, whose outputs are connected to the inputs of the arithmetic units. The arithmetic units of the LSB to MSB records therefore operate in parallel and output the resulting carry Co and sum So to the temporary storage units, from which the results can be stored in the memory at addresses X+2 and X+3.

As described in detail later, this method is far more efficient and faster than any of the methods so far, and even faster operation is possible by giving the arithmetic units optional functions such as a multiplier or carry look-ahead.
A carry look-ahead circuit grows with the data width. Limiting the data width of one operation to about 8 or 16 bits, and handling wider data by repeating the operation with the carry included, yields a compact, fast arithmetic circuit.

With the earlier methods, an 8-bit addition took 256 steps with the 1-bit logic (Boolean) arithmetic unit and about 40 steps with the 1-bit four arithmetic unit. With this method an arithmetic operation takes about 5 steps in total regardless of the data width, which is far more efficient.
Needless to say, this carry-propagation method applies to all of the counter, addition, subtraction, multiplication, and division operations described so far.
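A minimal Python sketch of the record-group addition follows, assuming one bit of A and one bit of B per record and a carry chained from each record to the next higher one. The function names are hypothetical. The loop models the carry chain in software; in the circuit described, the carry would propagate through the sideways connections within a single operation.

```python
def bits(value, width):
    """Bits of value, MSB record first and LSB record last, as in FIG. 15."""
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]

def group_add(ai, bi):
    """Add one record group.

    ai, bi: one bit per record, index 0 = MSB record, last index = LSB record.
    Returns (carry_out, co_row, so_row), where carry_out is the bit that
    would land in the extra record above the MSB.
    """
    n = len(ai)
    co = [0] * n
    so = [0] * n
    carry = 0
    for r in range(n - 1, -1, -1):  # carry runs from the LSB record upward
        total = ai[r] + bi[r] + carry
        so[r] = total & 1
        carry = total >> 1
        co[r] = carry
    return carry, co, so

carry_out, co_row, so_row = group_add(bits(200, 8), bits(100, 8))
```

Here 200 + 100 = 300, so the eight data records hold 44 (binary 00101100) and the carry record holds 1.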

This method uses the sideways data-transfer function to move carry data 208 and borrow data 209, which are indispensable in arithmetic, between records. It is of course not limited to a single record group and can be used with many record groups.
The benefit grows with the degree of parallelism, so the method can also be applied across all records.

With this scheme the input data (A, B) can carry a sign. The result area should be allocated with a data width (record width) that allows for the largest possible result, for example in multiplication, as well as the carry, borrow, and sign.

As described above, this 1-bit operation function (logical and arithmetic operations) can operate in both the row and column directions, a major feature that ordinary ALUs lack. By deciding appropriately which data in a database to process row-wise and which column-wise, the most suitable calculation method can be chosen.
The following describes the data shift function, which is indispensable for converting data between the row and column directions and for many other operations.

FIG. 16 is a circuit configuration example of a memory type processor with a data shift function.
As shown in the figure, record data 215 is stored in the memory cells 102 at each address.
A 1-bit arithmetic unit 105 is connected to each record.
In the description so far, the register of the 1-bit arithmetic unit 105 was a flip-flop 202. In that case data within one record can be moved and operated on freely, 1 bit at a time, but data cannot be moved and operated on across records.

As shown in the figure, the flip-flop 202 described so far is replaced by a shift register 212 driven by an external clock signal 216. Connecting the flip-flop result output 205 (here the shift register output) to the input of the adjacent record's shift register, together with the clock signal 216, enables data movement (shift transfer) between records.

Once shift transfer of data between records is possible, the operations described so far can be made more advanced and faster.

An ordinary shift register shifts data by only one record per clock, but a shift register that advances 8 or 16 records per clock is also possible.
The multi-bit arithmetic unit 103 and the shift register 212 described above are each a way of transferring operand data to other records, and this capability greatly extends the performance of the memory type processor.

FIGS. 17 and 18 show an example of matrix conversion (exchanging rows and columns) of data by the memory type processor.
As stated earlier, a memory type processor with the 1-bit operation function can operate in both the row and column directions.
To make good use of this, a technique for matrix-converting operand data with the shift function described above is introduced.

FIG. 17 shows the preprocessing steps for the matrix conversion.
The data to be converted is written at addresses 1 to 4.
This data has a 4-bit data width: records 5 to 8, records 9 to 12, and records 13 to 16 each form one 4-bit set.

Addresses 5 to 8 of the same records hold mask data, which is auxiliary data for the operation.

Address X is assigned the logical product (AND) of address 1 and address 5.
Address X+1 is assigned the logical product (AND) of address 1 and address 6.
Address X+2 is assigned the logical product (AND) of address 1 and address 7.
Address X+3 is assigned the logical product (AND) of address 1 and address 8.

The AND result at address X is not shifted.
The AND result at address X+1 is shifted by -1 (one position to the left).
The AND result at address X+2 is shifted by -2 (two positions to the left).
The AND result at address X+3 is shifted by -3 (three positions to the left).
This completes the preprocessing for the data at address 1.

In exactly the same way:
The operations on the data at address 2 are performed at addresses X+4 to X+7.
The operations on the data at address 3 are performed at addresses X+8 to X+11.
The operations on the data at address 4 are performed at addresses X+12 to X+15.
The above is the preprocessing for the matrix conversion.

FIG. 18 shows the steps that matrix-convert the preprocessing results.
Addresses X to X+3, the preprocessing results for the data at address 1, are not shifted.
Addresses X+4 to X+7, the preprocessing results for the data at address 2, are all shifted by +1 (one position to the right).
Addresses X+8 to X+11, the preprocessing results for the data at address 3, are all shifted by +2 (two positions to the right).
Addresses X+12 to X+15, the preprocessing results for the data at address 4, are all shifted by +3 (three positions to the right).

Address X+16 is assigned the logical sum (OR) of addresses X, X+4, X+8, and X+12.
Address X+17 is assigned the logical sum (OR) of addresses X+1, X+5, X+9, and X+13.
Address X+18 is assigned the logical sum (OR) of addresses X+2, X+6, X+10, and X+14.
Address X+19 is assigned the logical sum (OR) of addresses X+3, X+7, X+11, and X+15.

In the resulting data at addresses X+16 to X+19, the 4-bit-wide data of records 5 to 8, records 9 to 12, and records 13 to 16 is the matrix-converted form of the data at addresses 1 to 4.
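The mask, shift, and OR steps of FIGS. 17 and 18 can be sketched in Python as below. Two points are assumptions, since the figures are not reproduced here: mask j is taken to hold a 1 at position j of every 4-record block, and "left" is taken to mean toward the lower record number. Function names are hypothetical, and each list models one address row with one bit per record.

```python
GROUP = 4  # 4-bit data width, as in the example

def shift(row, k):
    """Shift a row of per-record bits by k records (k < 0 is left); zero fill."""
    out = [0] * len(row)
    for i, bit in enumerate(row):
        j = i + k
        if 0 <= j < len(row):
            out[j] = bit
    return out

def transpose_blocks(rows):
    """Exchange rows and columns inside every GROUP x GROUP block.

    rows: GROUP address rows, each with a multiple of GROUP record bits.
    """
    n = len(rows[0])
    # Assumed mask layout: mask j selects position j of each block.
    masks = [[1 if i % GROUP == j else 0 for i in range(n)]
             for j in range(GROUP)]
    temp = {}
    for r, row in enumerate(rows):
        for j, mask in enumerate(masks):
            picked = [a & m for a, m in zip(row, mask)]  # AND with mask j
            picked = shift(picked, -j)                   # preprocessing shift
            temp[(r, j)] = shift(picked, r)              # conversion shift
    out = []
    for j in range(GROUP):
        acc = [0] * n
        for r in range(GROUP):                           # OR over source rows
            acc = [a | b for a, b in zip(acc, temp[(r, j)])]
        out.append(acc)
    return out

rows = [
    [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1],
]
converted = transpose_blocks(rows)
```

All three 4 × 4 blocks are converted by the same sequence of row operations, so the number of steps does not depend on how many blocks sit side by side.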

This example used 4-bit data, but 8-bit, 16-bit, or wider data is equally possible.
The number of repeated operations increases, but as explained so far, the conversion remains effective because it runs in parallel.

In this example, the preprocessing needed for the matrix conversion used the auxiliary data at addresses 5 to 8, and the conversion itself was carried out with the shift register function.
Making use of auxiliary data of this kind greatly broadens the ways this memory type processor can be applied.

Next, the data input function of the memory type processor 101 is described.
The main design question for this technology is how record data is stored.
As a relatively simple approach, data can be written to and read from the memory type processor 101 from outside through the flip-flops 202 of the 1-bit arithmetic units.

In this case writing is in principle 1 bit at a time, and the memory type processor 101 holds data in an arrangement transposed relative to an ordinary database. Array data such as 64 bit × 64 bit record data held in an ordinary memory is therefore transferred in through the input of the memory 101 and matrix-converted inside the memory type processor 101. The circuit is arranged so that only the 64 flip-flops 202 of the designated destination records are enabled (leaving the memory cells of other records unaffected), and the external data is stored by writing 64 times, once per destination address.
That is, 64-bit-wide data is written sequentially 64 times, once per address.
In this case even a change to a single record's data overwrites a span of 64 records.

Next, the output function of the memory type processor 101 is described.
If the record width is relatively small, the calculation results can be output in parallel.
However, once the number of records reaches several thousand or more, bringing that many output pins out of the memory chip is not practical.
When the number of records is large, the priority order output circuit (priority encoder) 124 should therefore output the addresses of the result records one record at a time.
The priority order output circuit (priority encoder) 124 is well suited to cases where the results have been narrowed down.
To allow for searches that return many records, the calculation result output 106, such as the priority address encoder output circuit, can be divided into several blocks that are read out block by block, which speeds up record address output in proportion to the number of blocks.
As mentioned earlier, an output that indicates whether any surviving records remain makes processing with many conditional decisions efficient.
Needless to say, the results of the main records may also be output in parallel, as far as output pins can be brought out of the chip.
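The behaviour of a priority-ordered address output can be sketched in Python as follows. This is an illustrative model, not the circuit itself: it assumes the lowest record address has the highest priority, and the function names are hypothetical.

```python
def priority_addresses(result_bits):
    """Yield the addresses of records whose result bit is 1, one at a time."""
    pending = list(result_bits)
    while any(pending):
        addr = pending.index(1)   # assumed priority: lowest address first
        yield addr
        pending[addr] = 0         # clear the reported record, then repeat

def blocked_addresses(result_bits, block_size):
    """Model one encoder per block; each block reports its own hits."""
    return [
        [base + a
         for a in priority_addresses(result_bits[base:base + block_size])]
        for base in range(0, len(result_bits), block_size)
    ]

def any_hit(result_bits):
    """Single output that tells whether any record survived the search."""
    return any(result_bits)
```

With the hits split into blocks, each block can be read out independently, which is the source of the speed-up described in the text.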

Various circuit configurations have been introduced so far. One unit (one record) of the 1-bit arithmetic unit can be realized with a few hundred transistors if the circuit is limited to a single function, and with about 1000 (1K) transistors even in a standard configuration.

On this basis, consider implementing the present invention in a semiconductor.
Current semiconductor miniaturization technology can put about 10 billion transistors on one chip, and the capacity of a single DRAM chip is about 8 Gbit.
In the future this is expected to grow by about one order of magnitude, to 100 billion transistors.

However, miniaturization under Moore's Law is nearing its limit, and further gains in integration are said to depend on a shift to other techniques such as three-dimensional packaging.
The parallelism of multi-core and many-core designs built on ALUs has so far risen in step with Moore's Law, but once that limit is reached no further rise can be expected.
The arithmetic unit of the present invention, by contrast, is an extremely simple circuit, as explained above: about 1000 (1K) transistors with the standard functions.

Accordingly, 10,000 units (records) need about 10M transistors, 100,000 units about 100M (100 million), and 1 million units (1M records) about 1 billion.
Even 1M records therefore occupy only about 10% of the 10 billion transistors that fit on one chip today.
At the limit of miniaturization they would occupy only about 1%, a notable saving in space.
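The transistor budget above is simple arithmetic, reproduced here as a small Python check. The constant and function names are hypothetical; the figures (about 1000 transistors per unit, 10 billion transistors per chip today, 100 billion at the limit) are the ones quoted in the text.

```python
TRANSISTORS_PER_UNIT = 1_000          # standard 1-bit arithmetic unit
CHIP_TODAY = 10_000_000_000           # about 10 billion transistors
CHIP_AT_LIMIT = 100_000_000_000       # about 100 billion transistors

def unit_transistors(units):
    """Transistors needed for the given number of units (records)."""
    return units * TRANSISTORS_PER_UNIT

def chip_share(units, chip_transistors):
    """Fraction of the chip's transistors taken by the arithmetic units."""
    return unit_transistors(units) / chip_transistors

for units in (10_000, 100_000, 1_000_000):
    print(units, unit_transistors(units), chip_share(units, CHIP_TODAY))
```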

When accessing a wide address such as 1M records, the records can be distributed across several banks whose access times are staggered by a small delay. This keeps the inrush current to a minimum and yields a chip that computes with low power.
Power can also be saved by excluding unused records from the operation.

For FLASH memory, where stacking technology is advancing, the current capacity per chip is about 1 Tbit.
A nonvolatile memory type processor 101 with the 1-bit operation function, 1M (one million) bits on each side, can therefore be realized.
SRAM does not reach the integration density of DRAM or FLASH memory, but it promises high-speed operation and can be developed at relatively low cost.

Beyond DRAM, FLASH, and SRAM, magnetic memory cells, which promise nonvolatility and low power, have recently been studied actively, and the invention applies equally to such memories. Because this memory only adds a logic operation function to a very small part of the chip, it enables simple information processing at very large capacity and very high speed.

The memory type processor is an in-memory database that is, moreover, given a self-contained operation function that does not rely on a CPU for computation.
The memory type processor is not necessarily aimed at large data; it also suits information processing that must operate repeatedly on small data, so its algorithms can readily be implemented in an FPGA.
Components that come in many variants, such as the multi-bit arithmetic unit 122, can be used flexibly on an FPGA.
The invention is also effective in a device that integrates the memory type processor with a CPU, and as CPU cache memory.

FIG. 19 shows an example of series-parallel connection of memory type processors.
As a fully independent memory, the memory 101 can be extended both vertically (address direction) and horizontally (data width direction), so extending a system is very simple and the system can be kept in service over the long term.
A full-text search may involve several hundred thousand addresses, whereas for personal information several thousand to several tens of thousands of addresses per person are enough.

Ordinarily, when a single CPU looks for particular information in a memory that has no array definitions or indexes, merely accessing and comparing the memory at an average of 10 ns per address takes about 10 ms for 1M addresses, 10 seconds for 1G, and 10,000 seconds (about 3 hours) for 1T.
Using CPUs in parallel for distributed processing reduces the processing time roughly in proportion to the number of CPUs.
Even so, searching or data-mining an in-memory database larger than 1 TB in real time (for example, within 1 second) is considered difficult.

With the memory type processor 101, however the devices are connected in series and parallel, and even for 10 TB of data, all the memory can be processed in parallel. It is enough to repeat address selection 110 and operation condition specification 111 a few times to several tens or hundreds of times.

Access speed varies with the storage element, but if one logical operation takes, for example, 10 ns, then 100,000 operations can be performed in 1 millisecond. Because the processing is fully parallel, the target record can be found in roughly a few hundred nanoseconds to microseconds, or about 1 millisecond, regardless of how large the big data is.
In other words, the greatest feature of this technology is that the larger the data, the more pronounced the effect.
The idea of this invention, which reverses the row/column relationship between the memory structure and the data, clearly demonstrates that the number of information processing steps, and hence the processing time, can be greatly reduced.
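The following is a minimal software sketch of why the step count depends on the field width rather than on the number of records. It assumes a transposed layout in which each address holds one bit of every record; a Python integer stands in for the row of 1-bit computing units, and the names `store`, `match_equal`, and `hit_addresses` are hypothetical. Real hardware would evaluate every record bit simultaneously, which this model only imitates.

```python
def store(records, width):
    """Transpose `records` (ints of `width` bits) into one bit plane per address.

    Bit r of planes[a] is bit a of record r.
    """
    planes = [0] * width
    for r, value in enumerate(records):
        for a in range(width):
            if (value >> a) & 1:
                planes[a] |= 1 << r
    return planes


def match_equal(planes, key, n_records):
    """Return (hit mask, steps): one AND / NOT-AND step per address, for all records at once."""
    all_ones = (1 << n_records) - 1
    hits = all_ones
    steps = 0
    for a, plane in enumerate(planes):
        if (key >> a) & 1:
            hits &= plane                 # keep records whose bit is 1
        else:
            hits &= ~plane & all_ones     # keep records whose bit is 0
        steps += 1
    return hits, steps


def hit_addresses(hits):
    """List matching record numbers, lowest first (a stand-in for a priority output)."""
    out, r = [], 0
    while hits:
        if hits & 1:
            out.append(r)
        hits >>= 1
        r += 1
    return out


records = [23, 42, 7, 42, 99]
planes = store(records, width=8)
hits, steps = match_equal(planes, key=42, n_records=len(records))
print(hit_addresses(hits), steps)  # [1, 3] 8
```

In this model, adding more records widens the integers but leaves `steps` at 8, which mirrors the claim that the number of address selections is independent of the record count.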

This is extremely effective for data mining of big data, which requires repeated searches under various hypotheses, and for data matching that requires brute-force computation. Details are given later.

FIG. 20 shows an example of hierarchical connection of memory type processors.
In this example, the memory type processor 101 shown at the top of the figure serves as the master, and each of its records is associated with a sub memory type processor 101 that stores more detailed data for that record and can be searched. For big data in particular, such a hierarchical database makes it possible to handle a database of any scale.

FIG. 21 shows an example of feature database A using the memory type processor.
Matching operations that require brute-force computation over all records and all field data, or a large number of combinatorial operations, make the most effective use of this memory's characteristics.
This example is a very small data table intended only to illustrate the configuration: five kinds of 4-bit feature data, features A to E, are stored as record data 215 at addresses 1 to 20.

The five kinds of data, features A to E, of the collation data 213 are matched against this database. Addresses are allocated so that the results of various operations, such as the number of matching features, the number of approximately matching features, or the accumulated difference between the database and the collation data, are stored at address 21 onward and then evaluated.
The data width of each feature can be increased or decreased individually, and the number of features can be greatly increased.

FIG. 22 shows an example of feature database B using the memory type processor.
This database is similar to the one described above, but it is a feature database of 8-bit data that uses the multi-bit arithmetic method shown earlier.

In this example, 8-bit feature data from feature data A to feature data T are written up to address 20 of the memory type processor, one set for each of 12 record groups.
For each item of feature collation data supplied from outside, the database is matched in parallel (across 12 record groups in this example).
This method allows much faster computation than feature database A described above.

In this example, the collation data is written to address 21 and matched repeatedly against the target features. The 8-bit data at address 21 must be written as identical copies, one per parallel group (12 sets in this example).
As the number of record groups (the degree of parallelism) grows, the transfer time of the collation data becomes a problem.
In such a case, the following method yields the collation data extremely quickly.

FIG. 23 shows an example of parallel data creation by the memory type processor.
As shown in the figure, the following auxiliary data are written in advance:
Address X to address X + 3: auxiliary data of 4-bit width
Address X + 4 to address X + 11: auxiliary data of 8-bit width
Address X + 12 to address X + 27: auxiliary data of 16-bit width

When such auxiliary data are prepared in advance, then, for example:
Address X + 28 is the logical OR of addresses X and X + 2 of the 4-bit data.
Address X + 29 is the logical OR of addresses X + 6, X + 9, and X + 11 of the 8-bit data.
Address X + 30 is the logical OR of addresses X + 12 through X + 27 of the 16-bit data.
Furthermore, address X + 31 is the logical OR of addresses X + 28, X + 29, and X + 30.
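The OR combinations listed above can be tried out in a small model. The sketch assumes one plausible reading of FIG. 23 that the text does not spell out: each auxiliary address holds a row in which exactly one bit position of every 4-, 8-, or 16-record group is set, so ORing selected rows builds the same multi-bit value in every group at once. The record count of 48, the offsets relative to X, and the helper names are all hypothetical.

```python
N_RECORDS = 48  # assumed; chosen to be divisible by 4, 8 and 16


def one_hot_rows(width):
    """Rows in which only bit k of every `width`-record group is set, for k = 0..width-1."""
    return [
        sum(1 << (g * width + k) for g in range(N_RECORDS // width))
        for k in range(width)
    ]


def split_groups(row, width):
    """Read a row back as a list of `width`-bit values, one per record group."""
    mask = (1 << width) - 1
    return [(row >> (g * width)) & mask for g in range(N_RECORDS // width)]


# Offsets relative to address X: 0..3 (4-bit), 4..11 (8-bit), 12..27 (16-bit).
mem = dict(enumerate(one_hot_rows(4) + one_hot_rows(8) + one_hot_rows(16)))

mem[28] = mem[0] | mem[2]             # X + 28
mem[29] = mem[6] | mem[9] | mem[11]   # X + 29
mem[30] = 0
for offset in range(12, 28):          # X + 30: OR of X + 12 .. X + 27
    mem[30] |= mem[offset]
mem[31] = mem[28] | mem[29] | mem[30]  # X + 31

print(split_groups(mem[28], 4)[:3])   # the same 4-bit value in every group
print(split_groups(mem[29], 8)[:3])   # the same 8-bit value in every group
```

Under this reading, writing an identical 8-bit collation value into every record group costs at most eight OR steps, however many groups exist, instead of one external transfer per group.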

Auxiliary data are also useful for setting the memory cell at a specific address of a specific record to “0” or “1”, and for overlaying data on other data.
By preparing auxiliary data like the above in advance, arbitrary data of arbitrary width can be obtained in parallel (at high speed).
The above is only one example of the use of auxiliary data; auxiliary data greatly extend the capabilities of the memory type processor.

As one example, the data needed to match a face is said to be about 1 KB per person.
Data for 1,000 people amounts to 1 MByte, so real-time processing is not particularly difficult even on an ordinary personal computer. Data for 1 million people, however, amounts to 1 GByte, which is difficult for an ordinary personal computer and calls for a dedicated system using many CPUs and GPUs.
For the whole of Japan the figure is 100 times larger, 100 GBytes, and processing the faces of everyone in the world in real time takes another 100 times more, 10 TBytes; the system becomes so enormous that CPU and GPU processing is no longer realistic.
With this invention, all that is needed is to provide a sufficient number of memory type processors 101.
With 80 of the 1 Tbit FLASH memories mentioned earlier, a face-matching database for all of humanity is complete.
With the 10 Tbit FLASH memories expected in the near future, only 8 are needed.
Although it depends on the operation, in many cases the matching decision can be computed in about several tens of microseconds, and, as stated earlier, the greatest feature is that the time is the same whether there are 1 million records or 10 billion records.
Little heat is generated and complex peripheral circuitry is unnecessary, so the system can be made much smaller and more power-efficient.
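The capacity figures above can be checked with a few lines of arithmetic. This sketch assumes decimal prefixes (1 KB = 1,000 bytes, 1 Tbit = 10^12 bits) and the rounded population steps implied by the "100 times" factors in the text; the function names are illustrative.

```python
BYTES_PER_FACE = 1_000  # about 1 KB of matching data per person (from the text)


def db_bytes(people: int) -> int:
    """Total size of the face database in bytes."""
    return people * BYTES_PER_FACE


def chips_needed(total_bytes: int, chip_bits: int) -> int:
    """Number of memory chips of `chip_bits` capacity, rounded up."""
    return -(-total_bytes * 8 // chip_bits)


for label, people in [("1,000 people", 10**3), ("1 million", 10**6),
                      ("Japan (x100)", 10**8), ("world (x100)", 10**10)]:
    print(f"{label}: {db_bytes(people):,} bytes")

world = db_bytes(10**10)                # 10 TBytes
print(chips_needed(world, 10**12))      # 1 Tbit chips  -> 80
print(chips_needed(world, 10**13))      # 10 Tbit chips -> 8
```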

The above concerned matching of feature data, but if the feature data are mapped to data categories, data classes, data areas, and so on, the applications are unlimited.
For personal information, data of various widths such as “name”, “address”, “workplace”, “date of birth”, “height”, “weight”, and “sex” can simply be allocated as data categories.
Such personal information needs no indexes prepared for searching, and computation can begin as soon as the data are registered.
However much the performance of CPUs and GPUs improves, they cannot provide an information processing environment in which such indexes are unnecessary.
This is a major feature of this technology; details are given later.

Accordingly, the more records there are, the more pronounced the effect. The technology is well suited not only to matching the face data of all 7 billion people mentioned above, but also to every kind of use such as fingerprint matching, voiceprint authentication, and feature matching of character recognition data.

Incorporated as part of an artificial intelligence system, which involves a very large amount of combinatorial information processing, this technology greatly improves its performance.
The learning functions used in artificial intelligence must read in many samples and train until the expected answer is produced, and this takes an extremely long time as the scale and number of classes of the information to be learned grow.
With this technology, complex information processing is eliminated even for large-scale knowledge, so the learning time can be shortened and the recognition capability greatly improved.
It is also well suited to analysis and data mining of data such as weather information, in which many conditions are intricately intertwined.

The various effects obtained with this technology are described below.
To build a database with the memory type processor 101, it is only necessary to allocate addresses to the records and field data; after that, the database can be used simply by issuing the operation condition specification 111.
Therefore, by providing an application interface for the memory type processor 101, it can be incorporated into conventional, widely used search mechanisms, for example SQL databases.

When a CPU is used to search for information, various techniques exist to reduce the load on the CPU.
Binary search is a typical example.
This algorithm is a staple of information processing because it greatly reduces the number of lookups, but it requires advance preparation: when data values are written to a data table in memory, they must be arranged, for example, in ascending order, and the data in memory must be re-sorted (data maintenance) every time data are added or removed.

In other words, the algorithm reduces the CPU's load when searching for a particular data value, but the load of the preprocessing and data maintenance that precede the search is by no means small.
The above is the case of binary search, but the same is true of other approaches such as hash tables and B-tree structures (indexes).
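The maintenance cost described for binary search can be seen in a short sketch using Python's standard `bisect` module. The sorted table answers lookups in a logarithmic number of steps, but every insertion must keep the order, which shifts later elements. The unsorted table needs no maintenance but must be scanned; the memory type processor is described as performing that scan over all records in parallel, which this sequential code does not model. All names here are illustrative.

```python
import bisect

sorted_table = []    # kept in ascending order for binary search
unsorted_table = []  # data stored wherever there is room


def insert_sorted(value):
    """Insert while preserving order; later elements are shifted (O(n) per insert)."""
    bisect.insort(sorted_table, value)


def contains_sorted(value):
    """Binary search: O(log n) comparisons, valid only while the table stays sorted."""
    i = bisect.bisect_left(sorted_table, value)
    return i < len(sorted_table) and sorted_table[i] == value


def insert_any(value):
    """No ordering to maintain: just append."""
    unsorted_table.append(value)


def contains_scan(value):
    """Compare against every entry; sequential here, all-record parallel in the device."""
    return any(v == value for v in unsorted_table)


for v in (5, 1, 3):
    insert_sorted(v)
    insert_any(v)

print(sorted_table, unsorted_table)
print(contains_sorted(3), contains_scan(3))
```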

With the present invention there is no need for such algorithms, so advance preparation and maintenance are entirely unnecessary. Data are registered or deleted simply by specifying some record and some address; troublesome data maintenance such as changing arrays or re-sorting data is not required at all.

This shows that the structure of information processing becomes far simpler and easier than with conventional processing that relies on CPUs and GPUs alone.
Since the kinds of processing introduced so far are central to information processing, the result is a great reduction in the burden on the many users (engineers) engaged in it.

In addition, the CPU that controls the memory type processor 101 and the overall information processing no longer needs to be fast, so the power consumed by information processing can be greatly reduced.
The burden on the users engaged in information processing and the burden on the CPU, GPU, and peripheral circuits are thus greatly reduced at the same time.

In current information processing, the memory 100 has a fixed data width such as 32 bits, 64 bits, or 128 bits, and the CPU accesses addresses one after another, reads the data, and processes them sequentially.
The wider the data width (bus width), the more efficient the processing, but there are limits to widening the data bus: the number of input/output pins of the device increases, and the wiring burden on the printed circuit board that carries the device grows.

Moreover, in a personal database, for example, age needs only 7 bits (maximum 127) and sex only 1 bit, yet even such narrow data are processed by a wide arithmetic unit, so many bits are processed to no purpose.
The memory type processor 101 of this invention can operate in parallel on data of any width of 1 bit or more, in the row direction and the column direction, so information processing with no wasted bits is possible.
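As a rough illustration of the wasted bits, the sketch below assumes that each narrow field is held in, and processed as, one full machine word. That is a simplification (real software may pack fields), and the names are hypothetical.

```python
# Field widths taken from the text: age fits in 7 bits, sex in 1 bit.
FIELD_BITS = {"age": 7, "sex": 1}


def wasted_bits(word_bits: int) -> int:
    """Bits carried through a fixed-width unit that hold no information,
    assuming one field per word."""
    return sum(word_bits - bits for bits in FIELD_BITS.values())


for word in (32, 64):
    used = sum(FIELD_BITS.values())
    print(f"{word}-bit words: {used} useful bits, {wasted_bits(word)} unused bits")
```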

The present inventor has studied various memory type devices. The memory with an information narrowing detection function of Japanese Patent No. 4588114 is a memory that excels at logical AND operations such as pattern matching.
The memory with a set operation function of PCT/JP2013/059260 extends that concept so that logical AND, logical OR, logical negation, and other operations can be performed freely.
A major characteristic of memory type devices is that the processing time is always constant whether the amount of information is large or small, so the larger the amount of information, the greater the benefit.

It has been verified that even a demonstration machine using an FPGA, whose information processing capacity is relatively small, has tens of thousands of times or more the capability of conventional information processing, and that an ASIC chip with a large information processing capacity would achieve a speedup of several million times.

Furthermore, a document search system was developed on an FPGA based on the present inventor's Japanese Patent Application No. 2013-264763, a memory with an information search function. As a result, it was demonstrated that a search taking about 76 milliseconds in ordinary software processing can be performed in 207 nanoseconds, about 370,000 times faster.
These results have been widely disclosed through academic conferences, exhibitions, and the media as “a technology that makes data search 1 million times faster”, and a commercial product is under development.
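The quoted speedup is simply the ratio of the two search times. A quick check, using only the two figures given in the text:

```python
software_s = 76e-3   # about 76 milliseconds in ordinary software processing
fpga_s = 207e-9      # 207 nanoseconds on the FPGA system

speedup = software_s / fpga_s
print(f"speedup: about {speedup:,.0f} times")  # on the order of 370,000
```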

Building on these results, this invention provides arithmetic circuits and various methods of use that bring out the characteristics of memory type devices to the fullest, and shows that a wide range of information processing such as the following is possible.
1. Index operations: document search, database search
2. Data comparison (match, greater/less, range, maximum/minimum): database search
3. Operation result determination: conditional operations
4. Addition and subtraction counters: accumulation of operation results
5. Adders and subtractors: addition and subtraction between data
6. Cryptographic processing: encryption of plaintext, decryption of ciphertext
7. Matrix transformation
8. Data creation
9. Combinations of the above operations

This description has shown concretely that, by using abundant memory resources as data areas and work areas, a wide variety of operations is possible with nothing more than a 1-bit logic (Boolean) computing unit built from Boolean (logical) operation elements of extremely simple circuit configuration.
It has also shown concretely that, when numerical operations are used frequently, efficient numerical processing can be achieved by additionally incorporating a 1-bit arithmetic (four arithmetic operations) element.
It has further shown concretely that providing a data transfer function between records enables still more effective and faster operations, such as treating several records together as a record group for data operations, and matrix transformation.
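To make the role of a 1-bit arithmetic element concrete, here is a minimal sketch of bit-serial addition performed on all records at once. It assumes the same transposed layout as before (one bit plane per address, least significant bit first) and uses a textbook full adder built from XOR, AND, and OR; the carry plane plays the part of work-area data. The function names are hypothetical and the code is a software model, not the circuit of the invention.

```python
def to_planes(values, width):
    """Bit r of planes[i] is bit i of values[r] (LSB-first bit planes)."""
    return [
        sum(((v >> i) & 1) << r for r, v in enumerate(values))
        for i in range(width)
    ]


def from_planes(planes, n_records):
    """Inverse of to_planes: rebuild one integer per record."""
    return [
        sum(((plane >> r) & 1) << i for i, plane in enumerate(planes))
        for r in range(n_records)
    ]


def add_planes(a, b):
    """Add two operands for every record in parallel, one full-adder step per bit."""
    carry = 0
    out = []
    for pa, pb in zip(a, b):
        out.append(pa ^ pb ^ carry)                 # sum bit
        carry = (pa & pb) | (carry & (pa ^ pb))     # carry out
    out.append(carry)                               # final carry becomes the top bit
    return out


a_vals, b_vals = [3, 100, 255], [5, 28, 1]
total = add_planes(to_planes(a_vals, 8), to_planes(b_vals, 8))
print(from_planes(total, n_records=3))  # [8, 128, 256]
```

The loop runs 8 times for 8-bit operands whether there are three records or three billion, which is the trade the text describes: more steps per operation, but every record handled in the same step.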

It is well known that operation elements such as logical AND, logical OR, logical negation (NOT), and exclusive OR (XOR), together with adders that combine them, are the foundation of information processing (computers), and their uses and applications are widely described in the literature and on the Internet.
The typical such element is the ALU; CPUs and GPUs using the ALU 217, a multi-bit computing unit 122 usually of 8 bits or more, are used in every corner of our lives and industry.
Naturally, because performing every operation with a 1-bit computing unit requires repeated operations as described above, it has not been studied, described, or used.

However, the key points of the memory type processor operation using the 1-bit computing unit described so far can be summarized as follows.
1. Operation with a 1-bit computing unit seems inefficient at first glance, but with massively parallel operation it becomes extremely fast; operation in both row and column directions is also possible.
2. The operation time is always constant whether the amount of data (number of records) is large or small, so it is well suited to big data.
3. The load on CPUs and GPUs is reduced, which lowers the power consumption of the information processing apparatus.
4. There is no need to devise complex information processing algorithms and their metadata, which reduces the burden on developers.
5. Metadata such as indexes become unnecessary, so advance preparation and maintenance of metadata are not required.
6. Operations can begin as soon as the source data are stored in the memory.

The operation methods introduced in this application could only have been produced by someone who knows the true capability of the 1-bit computing unit, which through massively parallel operation has the various advantages described above.
The operation methods were independently devised and systematized by the present inventor, and bringing the row/column-inverted approach to information processing from a nearly blank slate to this application took a great deal of patience and time.

The operation methods introduced here are representative ones frequently used in information processing, shown together with the computing units best suited to them. Various optional functions may be added, or conversely the functions may be limited to the bare minimum.
As with CPUs, the operation methods and their applications are countless.

By providing these operation methods as a standard library, users will be able to incorporate them into ordinary software without being aware of the hardware.

This invention is a memory type processor that, in pursuit of what computers should be in a big data society, solves many of the problems of computer technology based on CPUs and GPUs alone and compensates for the weaknesses of CPUs and GPUs.

The memory type processor according to this invention can be used widely: in general databases, very large databases, very large-scale parallel processing, and apparatus for various kinds of authentication and matching, as well as part of the functions of artificial intelligence apparatus.
This technology also reduces the burden on engineers engaged in information processing development and can greatly reduce the power consumed by information processing, so it is of great significance for resolving the environmental problems of IT equipment.
In the future, further development can be expected toward memory type processors with more advanced functions, such as multi-bit, space-saving massively parallel operation elements and elements capable of XY two-axis bidirectional address access and two-axis bidirectional record-parallel operation.

100 Memory
101 Memory type processor
102 Memory cell
103 Word width (number of records)
104 Address
105 1-bit computing unit (calculation function)
106 Operation result output
107 Operation result
110 Address selection
111 Operation condition designation function
112 Logical product (AND)
113 Logical sum (OR)
114 Logical negation (NOT)
115 Exclusive OR (XOR)
116 Logical storage
122 Multi-bit computing unit (calculation function)
123 1-bit logic (Boolean) computing unit (calculation function)
124 1-bit arithmetic (four arithmetic operations) computing unit (calculation function)
201 Switch
202 Flip-flop (FF)
203 Selection circuit
204 Memory data
205 Flip-flop result output
206 Count data
207 1-bit computing unit buffer
208 Carry data
209 Borrow data
210 Half adder
211 Full adder
212 Shift register
213 Collation data
214 Priority order output circuit (priority encoder output)
215 Record data
216 Clock signal
217 ALU

Claims

A memory type processor comprising:
a plurality of records arranged in parallel, each record comprising multi-bit memory cells capable of storing data input via an external input function, and a calculation function that reads the data of the memory cells 1 bit at a time, performs an operation, and writes the operation result to the memory cells 1 bit at a time;
an operation condition designation function that enables address selection of the multi-bit memory cells of all the records in parallel and designates operation conditions in parallel; and
an external output function for outputting the calculation results of the records to the outside.

The memory type processor according to claim 1, wherein the calculation function executes any of:
(1) a logical product (AND) operation;
(2) a logical sum (OR) operation;
(3) a logical negation (NOT) operation;
(4) an exclusive OR operation;
(5) a half addition operation;
(6) a full addition operation;
(7) operation storage;
(8) a combination of (1) to (7) above.

The memory type processor according to claim 1, wherein the external input function comprises a function of performing matrix transformation on data to be calculated that is supplied from outside and writing the data to be calculated.

The memory type processor according to claim 1, wherein the external output function comprises an output function that performs any one of, or a combination of, the following: outputting the addresses of the records in order of priority; dividing the records into several parts and outputting the divided records in order of priority; and outputting whether any of the records has a calculation result.

The memory type processor according to claim 1, further comprising a function of transferring the calculation target data to the other record.

The memory type processor according to claim 1, wherein the memory type processor is incorporated in a circuit of a CPU and other functions.

The memory type processor according to claim 1, wherein the memory type processor is implemented by an FPGA.

An apparatus comprising the memory-type processor according to claim 1.

A method of using the memory type processor according to claim 1, wherein both the data to be calculated and work area data for temporarily saving calculation results are allocated to the addresses, and the 1-bit calculation function is applied repeatedly using both kinds of data, thereby performing any of the following operations:
(1) an all-record-parallel index search operation on the data to be calculated;
(2) an all-record-parallel comparison (match, greater/less, range, maximum/minimum) operation on the data to be calculated;
(3) an all-record-parallel count (up/down) operation on the data to be calculated;
(4) an all-record-parallel addition/subtraction operation on the data to be calculated;
(5) an all-record-parallel multiplication/division operation on the data to be calculated;
(6) an all-record-parallel operation of encrypting plaintext and decrypting ciphertext on the data to be calculated;
(7) an all-record-parallel matrix conversion operation on the data to be calculated;
(8) an all-record-parallel data creation operation on the data to be calculated;
(9) a combination of the operations (1) to (8) above.

A method of using the memory type processor according to claim 1, wherein it is determined whether any record remains as a surviving hit in the operation result, and an operation condition is given based on the result of that determination.

A method of using the memory type processor according to claim 1, wherein the memory cells of a plurality of the records are allocated to, and used as, one set of data.

A method of using a memory-type processor, characterized in that the memory-type processor is used in (1) serial, parallel, or series-parallel connection, or (2) any one or more of hierarchical connections.