
TWI911805B - System with output circuit for a vector-by-matrix multiplication array and operating method for output circuit - Google Patents

System with output circuit for a vector-by-matrix multiplication array and operating method for output circuit

Info

Publication number
TWI911805B
Authority
TW
Taiwan
Prior art keywords
voltage
current
converter
array
output
Prior art date
Application number
TW113126826A
Other languages
Chinese (zh)
Other versions
TW202509816A (en)
Inventor
曉萬 陳
崔安德魯庫尼爾
華 武
Original Assignee
美商超捷公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US18/386,901 (US20250068900A1)
Priority claimed from PCT/US2023/081167 (WO2025048866A1)
Application filed by 美商超捷公司
Publication of TW202509816A
Application granted
Publication of TWI911805B


Abstract

In one example, a system comprises a vector-by-matrix multiplication array comprising nonvolatile memory cells arranged into rows and columns, a first set of columns storing W+ weights and a second set of columns storing W- weights; and an output circuit to receive a first current from a respective column in the first set of columns and a second current from a respective column in the second set of columns and to generate a first voltage and a second voltage, the output circuit comprising a first current-to-voltage converter comprising a first integration capacitor to provide the first voltage equal to an initial voltage minus a first discharge value due to the first current, and a second current-to-voltage converter comprising a second integration capacitor to provide the second voltage equal to the initial voltage minus a second discharge value due to the second current.

Description

System Having an Output Circuit for a Vector-by-Matrix Multiplication Array and Method for Operating the Output Circuit

This application claims priority to U.S. Provisional Patent Application No. 63/534,755, filed on August 25, 2023, and titled "Output Circuit for Neural Network Array," and to U.S. Patent Application No. 18/386,901, filed on November 3, 2023, and titled "Output Circuit for a Vector-By-Matrix Multiplication Array."

Numerous examples of an output circuit for a vector-by-matrix multiplication array, and of associated methods, are disclosed.

Artificial neural networks mimic biological neural networks (the central nervous systems of animals, in particular the brain) and are used to estimate or approximate functions that can depend on a large number of inputs and are generally unknown. Artificial neural networks generally include layers of interconnected "neurons" that exchange messages with each other.

Figure 1 illustrates an artificial neural network, where the circles represent the inputs or layers of neurons. The connections (called synapses) are represented by arrows and have numeric weights that can be tuned based on experience. This makes the neural network adaptive to inputs and capable of learning. Typically, a neural network includes a layer of multiple inputs. There are typically one or more intermediate layers of neurons, and an output layer of neurons that provides the output of the neural network. The neurons at each level individually or collectively make a decision based on the data received from the synapses.

One of the major challenges in the development of artificial neural networks for high-performance information processing is a lack of adequate hardware technology. Indeed, practical neural networks rely on a very large number of synapses to provide high connectivity between neurons, i.e., very high computational parallelism. In principle, such complexity can be achieved with digital supercomputers or specialized graphics processing unit clusters. However, in addition to high cost, these approaches also suffer from mediocre energy efficiency compared to biological networks, which consume much less energy primarily because they perform low-precision analog computation. CMOS analog circuits have been used for artificial neural networks, but most CMOS-implemented synapses have been too bulky given the large number of neurons and synapses.

Applicant previously disclosed an artificial (analog) neural network that utilizes one or more non-volatile memory arrays as the synapses in U.S. Patent Application Publication 2017/0337466A1, which is incorporated by reference. The non-volatile memory arrays operate as an analog neural memory and comprise non-volatile memory cells arranged in rows and columns. The neural network includes a first plurality of synapses configured to receive a first plurality of inputs and to generate therefrom a first plurality of outputs, and a first plurality of neurons configured to receive the first plurality of outputs. The first plurality of synapses includes a plurality of memory cells, wherein each of the memory cells includes: spaced apart source and drain regions formed in a semiconductor substrate, with a channel region extending between them; a floating gate disposed over and insulated from a first portion of the channel region; and a non-floating gate disposed over and insulated from a second portion of the channel region. Each of the plurality of memory cells stores a weight value corresponding to a number of electrons on the floating gate. The plurality of memory cells multiply the first plurality of inputs by the stored weight values to generate the first plurality of outputs.

Non-Volatile Memory Cells

Non-volatile memories are well known. For example, U.S. Patent 5,029,130 ("the '130 patent"), which is incorporated herein by reference, discloses an array of split gate non-volatile memory cells, which are a type of flash memory cell. Such a memory cell 210 is shown in Figure 2. Each memory cell 210 includes source region 14 and drain region 16 formed in semiconductor substrate 12, with channel region 18 between them. Floating gate 20 is formed over and insulated from (and controls the conductivity of) a first portion of channel region 18, and over a portion of source region 14. Word line terminal 22 (which is typically coupled to a word line) has a first portion that is disposed over and insulated from (and controls the conductivity of) a second portion of channel region 18, and a second portion that extends up and over floating gate 20. Floating gate 20 and word line terminal 22 are insulated from substrate 12 by a gate oxide. Bit line 24 is coupled to drain region 16.

Memory cell 210 is erased (where electrons are removed from the floating gate) by placing a high positive voltage on word line terminal 22, which causes electrons on floating gate 20 to tunnel through the intermediate insulation from floating gate 20 to word line terminal 22 via Fowler-Nordheim (FN) tunneling.

Memory cell 210 is programmed (where electrons are placed on the floating gate) by source side injection (SSI) with hot electrons, by placing a positive voltage on word line terminal 22 and a positive voltage on source region 14. Electron current will flow from drain region 16 towards source region 14. The electrons will accelerate and become heated when they reach the gap between word line terminal 22 and floating gate 20. Some of the heated electrons will be injected through the gate oxide onto floating gate 20 due to the attractive electrostatic force from floating gate 20.

Memory cell 210 is read by placing positive read voltages on drain region 16 and word line terminal 22 (which turns on the portion of channel region 18 under the word line terminal). If floating gate 20 is positively charged (i.e., erased of electrons), then the portion of channel region 18 under floating gate 20 is turned on as well, and current will flow across channel region 18, which is sensed as the erased or "1" state. If floating gate 20 is negatively charged (i.e., programmed with electrons), then the portion of the channel region under floating gate 20 is mostly or entirely turned off, and current will not flow (or there will be little flow) across channel region 18, which is sensed as the programmed or "0" state.

Table 1 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 210 for performing read, erase, and program operations:

Table 1: Operation of Flash Memory Cell 210 of Figure 2

Operation   WL        BL          SL
Read        2-3V      0.6-2V      0V
Erase       ~11-13V   0V          0V
Program     1-2V      10.5-3μA    9-10V

Other split gate memory cell configurations, which are other types of flash memory cells, are known. For example, Figure 3 depicts a four-gate memory cell 310 comprising source region 14, drain region 16, floating gate 20 over a first portion of channel region 18, select gate 22 (typically coupled to a word line, WL) over a second portion of channel region 18, control gate 28 over floating gate 20, and erase gate 30 over source region 14. This configuration is described in U.S. Patent 6,747,310, which is incorporated herein by reference for all purposes. Here, all gates are non-floating gates except floating gate 20, meaning that they are electrically connected or connectable to a voltage source. Programming is performed by heated electrons from channel region 18 injecting themselves onto floating gate 20. Erasing is performed by electrons tunneling from floating gate 20 to erase gate 30.

Table 2 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 310 for performing read, erase, and program operations:

Table 2: Operation of Flash Memory Cell 310 of Figure 3

Operation   WL/SG       BL         CG        EG        SL
Read        1.0-2V      0.6-2V     0-2.6V    0-2.6V    0V
Erase       -0.5V/0V    0V         0V/-8V    8-12V     0V
Program     1V          0.1-1μA    8-11V     4.5-9V    4.5-5V

Figure 4 depicts a three-gate memory cell 410, which is another type of flash memory cell. Memory cell 410 is identical to memory cell 310 of Figure 3 except that memory cell 410 does not have a separate control gate. The erase operation (whereby erasing occurs through use of the erase gate) and the read operation are similar to those of Figure 3, except that no control gate bias is applied. The programming operation is also done without the control gate bias, so a higher voltage is applied on the source line during a program operation to compensate for the lack of control gate bias.

Table 3 depicts typical voltage and current ranges that can be applied to the terminals of memory cell 410 for performing read, erase, and program operations:

Table 3: Operation of Flash Memory Cell 410 of Figure 4

Operation   WL/SG       BL         EG        SL
Read        0.7-2.2V    0.6-2V     0-2.6V    0V
Erase       -0.5V/0V    0V         11.5V     0V
Program     1V          0.2-3μA    4.5V      7-9V

Figure 5 depicts stacked gate memory cell 510, which is another type of flash memory cell. Memory cell 510 is similar to memory cell 210 of Figure 2, except that floating gate 20 extends over the entire channel region 18, and control gate 22 (which here will be coupled to a word line) extends over floating gate 20, separated from it by an insulating layer (not shown). Erasing is done by FN tunneling of electrons from the floating gate (FG) to the substrate. Programming is done by channel hot electron (CHE) injection at the region between channel region 18 and drain region 16, with electrons flowing from source region 14 towards drain region 16. The read operation is similar to that of memory cell 210, but with a higher control gate voltage.

Table 4 depicts typical voltage ranges that can be applied to the terminals of memory cell 510 and to substrate 12 for performing read, erase, and program operations:

Table 4: Operation of Flash Memory Cell 510 of Figure 5

Operation   CG                BL        SL     Substrate
Read        2-5V              0.6-2V    0V     0V
Erase       -8 to -10V/0V     FLT       FLT    8-10V / 15-20V
Program     8-12V             3-5V      0V     0V

The methods and means described herein may apply to other non-volatile memory technologies such as FINFET split gate flash or stacked gate flash memory, NAND flash, SONOS (silicon-oxide-nitride-oxide-silicon, charge trap in nitride), MONOS (metal-oxide-nitride-oxide-silicon, metal charge trap in nitride), ReRAM (resistive RAM), PCM (phase change memory), MRAM (magnetic RAM), FeRAM (ferroelectric RAM), CT (charge trap) memory, CN (carbon-tube) memory, OTP (bi-level or multi-level one time programmable), and CeRAM (correlated electron RAM).

To utilize memory arrays comprising one of the types of non-volatile memory cells described above in an artificial neural network, two modifications are made. First, the lines are configured so that each memory cell can be individually programmed, erased, and read without adversely affecting the memory state of other memory cells in the array, as further explained below. Second, continuous (analog) programming of the memory cells is provided.

Specifically, the memory state (i.e., the charge on the floating gate) of each memory cell in the array can be continuously changed from a fully erased state to a fully programmed state, and vice versa, independently and with minimal disturbance of other memory cells. This means the cell storage is effectively analog, or at the very least can store one of many discrete values (such as 16 or 64 different values), which allows for very precise and individual tuning of all the memory cells in the memory array, and which makes the memory array ideal for storing and making fine-tuning adjustments to the synaptic weights of the neural network.

Neural Networks Employing Non-Volatile Memory Cell Arrays

Figure 6 conceptually illustrates a non-limiting example of a neural network utilizing a non-volatile memory array of the present examples. This example uses the non-volatile memory array neural network for a facial recognition application, but any other appropriate application could be implemented using a non-volatile memory array based neural network.

S0 is the input layer, which for this example is a 32×32 pixel RGB image with 5-bit precision (i.e., three 32×32 pixel arrays, one for each color R, G, and B, each pixel having 5-bit precision). The synapses CB1 going from input layer S0 to layer C1 apply different sets of weights in some instances and shared weights in other instances, and scan the input image with 3×3 pixel overlapping filters (kernels), shifting the filter by 1 pixel (or more than 1 pixel, as dictated by the model). Specifically, the values of the 9 pixels in a 3×3 portion of the image (referred to as a filter or kernel) are provided to the synapses CB1, where these 9 input values are multiplied by the appropriate weights and, after the outputs of that multiplication are summed, a single output value is determined and provided by a first synapse of CB1 for generating a pixel of one of the feature maps of layer C1. The 3×3 filter is then shifted one pixel to the right within input layer S0 (i.e., adding a column of three pixels on the right and dropping a column of three pixels on the left), whereby the 9 pixel values in this newly positioned filter are provided to the synapses CB1, where they are multiplied by the same weights and a second single output value is determined by the associated synapse. This process is continued until the 3×3 filter has scanned across the entire 32×32 pixel image of input layer S0, for all three colors and for all bits (precision values). The process is then repeated using different sets of weights to generate a different feature map of layer C1, until all the feature maps of layer C1 have been calculated.
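The filter scan described above can be sketched in a few lines of plain Python. This is only an illustrative software model, not the hardware described in this document: the function name `scan_filter`, the square single-channel image, and the all-ones kernel are assumptions chosen for the example.

```python
def scan_filter(image, kernel, stride=1):
    """Slide a k x k kernel over a square 2-D image (list of lists).

    Each output pixel is the sum of the element-wise products of the
    kernel weights and the image patch currently under the kernel.
    """
    k = len(kernel)
    out = (len(image) - k) // stride + 1  # output pixels per side
    fmap = []
    for r in range(out):
        row = []
        for c in range(out):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(acc)
        fmap.append(row)
    return fmap


# Hypothetical 32x32 single-color input and a 3x3 kernel of all ones.
image = [[float(r * 32 + c) for c in range(32)] for r in range(32)]
kernel = [[1.0] * 3 for _ in range(3)]
feature_map = scan_filter(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 30 30
```

Repeating the call with a different kernel corresponds to generating a different feature map with a different set of weights.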

In layer C1, in the present example, there are 16 feature maps, with 30×30 pixels each. Each pixel is a new feature pixel extracted from multiplying the inputs by the kernel, so each feature map is a two-dimensional array, and in this example layer C1 constitutes 16 layers of two-dimensional arrays (keeping in mind that the layers and arrays referenced herein are logical relationships, not necessarily physical relationships; that is, the arrays are not necessarily oriented in physical two-dimensional arrays). Each of the 16 feature maps in layer C1 is generated by one of sixteen different sets of synapse weights applied to the filter scans. The C1 feature maps could all be directed to different aspects of the same image feature, such as boundary identification. For example, the first map (generated using a first weight set, shared for all scans used to generate this first map) could identify circular edges, the second map (generated using a second weight set different from the first weight set) could identify rectangular edges, or the aspect ratio of certain features, and so on.

An activation function P1 (pooling) is applied before going from layer C1 to layer S1, which pools values from consecutive, non-overlapping 2×2 regions in each feature map. The purpose of the pooling function P1 is to average out the nearby locations (or a max function can also be used), for example to reduce the dependence on edge location and to reduce the data size before going to the next stage. At layer S1, there are 16 15×15 feature maps (i.e., sixteen different arrays of 15×15 pixels each). The synapses CB2 going from layer S1 to layer C2 scan the maps in layer S1 with 4×4 filters, with a filter shift of 1 pixel. At layer C2, there are 22 12×12 feature maps. An activation function P2 (pooling) is applied before going from layer C2 to layer S2, which pools values from consecutive, non-overlapping 2×2 regions in each feature map. At layer S2, there are 22 6×6 feature maps. An activation function (pooling) is applied at the synapses CB3 going from layer S2 to layer C3, where every neuron in layer C3 connects to every map in layer S2 via a respective synapse of CB3. At layer C3, there are 64 neurons. The synapses CB4 going from layer C3 to output layer S3 fully connect C3 to S3, i.e., every neuron in layer C3 is connected to every neuron in layer S3. The output at S3 includes 10 neurons, where the highest output neuron determines the class. This output could, for example, indicate an identification or classification of the contents of the original image.
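The feature map sizes quoted for layers C1, S1, C2, and S2 follow from simple arithmetic on the filter and pooling sizes. A minimal sketch, assuming no padding and the helper names `conv_size` and `pool_size`, which are introduced here only for illustration:

```python
def conv_size(n, k, stride=1):
    """Side length after scanning an n x n map with a k x k filter (no padding)."""
    return (n - k) // stride + 1


def pool_size(n, p):
    """Side length after pooling non-overlapping p x p regions."""
    return n // p


s0 = 32                 # input layer S0: 32x32 pixels
c1 = conv_size(s0, 3)   # 3x3 filter, shift of 1 pixel
s1 = pool_size(c1, 2)   # 2x2 pooling (P1)
c2 = conv_size(s1, 4)   # 4x4 filter, shift of 1 pixel
s2 = pool_size(c2, 2)   # 2x2 pooling (P2)
print(c1, s1, c2, s2)   # 30 15 12 6
```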

Each layer of the synapses is implemented using an array, or a portion of an array, of non-volatile memory cells.

Figure 7 is a block diagram of an array that can be used for that purpose. Vector-by-matrix multiplication (VMM) array 32 includes non-volatile memory cells and is utilized as the synapses (such as CB1, CB2, CB3, and CB4 in Figure 6) between one layer and the next layer. Specifically, VMM array 32 includes array of non-volatile memory cells 33, erase gate and word line gate decoder 34, control gate decoder 35, bit line decoder 36, and source line decoder 37, which decode the respective inputs for the non-volatile memory cell array 33. Input to VMM array 32 can come from the erase gate and word line gate decoder 34 or from the control gate decoder 35. Source line decoder 37 in this example also decodes the output of the non-volatile memory cell array 33. Alternatively, bit line decoder 36 can decode the output of the non-volatile memory cell array 33.

Non-volatile memory cell array 33 serves two purposes. First, it stores the weights that will be used by VMM array 32. Second, non-volatile memory cell array 33 effectively multiplies the inputs by the weights stored in it and adds the results per output line (source line or bit line) to produce the output, which will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, non-volatile memory cell array 33 removes the need for separate multiplication and addition logic circuits, and it is also power efficient because the computation is performed in place, in the memory.
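The multiply-and-add behavior of the array can be modeled numerically as a vector-by-matrix product. The sketch below is a software analogy only; the function name `vmm`, the row-wise inputs, and the example values are assumptions, and real cell currents are analog quantities rather than Python floats.

```python
def vmm(inputs, weights):
    """Vector-by-matrix multiply.

    inputs[r]     : value applied to row r of the array
    weights[r][c] : weight stored in the cell at row r, output line c
    Returns one summed value per output line (source line or bit line).
    """
    n_lines = len(weights[0])
    return [
        sum(inputs[r] * weights[r][c] for r in range(len(inputs)))
        for c in range(n_lines)
    ]


# Hypothetical 3-input, 2-output-line example.
inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, 1.0],
    [0.0, 2.0],
    [1.0, 0.5],
]
outputs = vmm(inputs, weights)
print(outputs)  # [3.5, 6.5]
```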

The output of non-volatile memory cell array 33 is supplied to a differential summer 38 (such as a summing op-amp or a summing current mirror), which sums up the outputs of the non-volatile memory cell array 33 to create a single value for that convolution. The differential summer 38 is arranged to perform summation of positive weights and negative weights.
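One common way to realize signed weights, consistent with the W+ and W- columns mentioned in the abstract, is to subtract the summed output of a W- column from that of its paired W+ column. The sketch below assumes that convention (effective weight W = W+ minus W-), which this passage does not state explicitly; the function name and the example currents are hypothetical.

```python
def differential_sum(i_plus, i_minus):
    """Subtract each W- column output from its paired W+ column output.

    i_plus[c]  : summed output of the c-th W+ column
    i_minus[c] : summed output of the c-th W- column
    """
    if len(i_plus) != len(i_minus):
        raise ValueError("W+ and W- column lists must have the same length")
    return [p - m for p, m in zip(i_plus, i_minus)]


# Hypothetical column outputs in arbitrary units.
signed_outputs = differential_sum([3.0, 1.0], [1.0, 2.5])
print(signed_outputs)  # [2.0, -1.5]
```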

The summed output values of differential summer 38 are then supplied to an activation function block 39, which rectifies the output. The activation function block 39 may provide a sigmoid, tanh, or ReLU function. The rectified output values of activation function block 39 become an element of a feature map of the next layer (e.g., C1 in Figure 6), and are then applied to the next synapse to produce the next feature map layer or final layer. Therefore, in this example, non-volatile memory cell array 33 constitutes a plurality of synapses (which receive their inputs from the prior layer of neurons or from an input layer such as an image database), and summing op-amp 38 and activation function block 39 constitute a plurality of neurons.
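For reference, the three functions named for activation function block 39 have the following standard mathematical definitions. This is a plain software sketch of those definitions, not a description of how the block is implemented in circuitry.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: maps any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent: maps any real input into the range (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectified linear unit: passes positive inputs, clamps negatives to 0."""
    return max(0.0, x)


for value in (-2.0, 0.0, 2.0):
    print(value, sigmoid(value), tanh(value), relu(value))
```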

The inputs to VMM array 32 in Figure 7 (WLx, EGx, CGx, and optionally BLx and SLx) can be analog levels, binary levels, or digital bits (in which case a DAC is provided to convert the digital bits to the appropriate input analog levels), and the outputs can be analog levels, binary levels, or digital bits (in which case an output ADC is provided to convert the output analog levels into digital bits).

Figure 8 is a block diagram depicting the use of numerous layers of VMM arrays 32, here labeled as VMM arrays 32a, 32b, 32c, 32d, and 32e. As shown in Figure 8, the input, denoted Inputx, is converted from digital to analog by digital-to-analog converter 31 and provided to input VMM array 32a. The converted analog inputs could be voltages or currents. The input D/A conversion for the first layer could be done by using a function or a look-up table (LUT) that maps the inputs Inputx to appropriate analog levels for the matrix multiplier of input VMM array 32a. The input conversion could also be done by an analog-to-analog (A/A) converter that converts an external analog input to a mapped analog input to the input VMM array 32a.

The output generated by input VMM array 32a is provided as an input to the next VMM array (hidden level 1) 32b, which in turn generates an output that is provided as an input to the next VMM array (hidden level 2) 32c, and so on. The various layers of VMM array 32 function as different layers of synapses and neurons of a convolutional neural network (CNN). Each VMM array 32a, 32b, 32c, 32d, and 32e can be a stand-alone, physical non-volatile memory array, or multiple VMM arrays could utilize different portions of the same physical non-volatile memory array, or multiple VMM arrays could utilize overlapping portions of the same physical non-volatile memory array. The example shown in Figure 8 contains five layers (32a, 32b, 32c, 32d, 32e): one input layer (32a), two hidden layers (32b, 32c), and two fully connected layers (32d, 32e). One of ordinary skill in the art will appreciate that this is merely an example and that a system instead could comprise more than two hidden layers and more than two fully connected layers.

Vector-by-Matrix Multiplication (VMM) Arrays

Figure 9 depicts neuron VMM array 900, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 900 comprises memory array 901 of non-volatile memory cells and reference array 902 (at the top of the array) of non-volatile reference memory cells. Alternatively, another reference array can be placed at the bottom.

In VMM array 900, control gate lines, such as control gate line 903, run in a vertical direction (hence reference array 902, in the row direction, is orthogonal to control gate line 903), and erase gate lines, such as erase gate line 904, run in a horizontal direction. Here, the inputs to VMM array 900 are provided on the control gate lines (CG0, CG1, CG2, CG3), and the output of VMM array 900 emerges on the source lines (SL0, SL1). In one example, only even rows are used, and in another example, only odd rows are used. The current placed on each source line (SL0, SL1, respectively) performs a summing function of all the currents from the memory cells connected to that particular source line.

As described herein for neural networks, the non-volatile memory cells of VMM array 900, i.e., the memory cells 310 of VMM array 900, can be configured to operate in the sub-threshold region.

The non-volatile reference memory cells and the non-volatile memory cells described herein are biased in weak inversion (the sub-threshold region):

$$I_{ds} = I_o \, e^{(V_g - V_{th})/(n V_t)} = w \, I_o \, e^{V_g/(n V_t)}, \qquad w = e^{-V_{th}/(n V_t)}$$

where Ids is the drain-to-source current; Vg is the gate voltage on the memory cell; Vth is the threshold voltage of the memory cell; Vt is the thermal voltage = k*T/q, with k being the Boltzmann constant, T the temperature in Kelvin, and q the electronic charge; n is a slope factor = 1 + (Cdep/Cox), with Cdep being the capacitance of the depletion layer and Cox the capacitance of the gate oxide layer; and Io is the memory cell current at a gate voltage equal to the threshold voltage. Io is proportional to $(W_t/L) \, u \, C_{ox} \, (n-1) \, V_t^2$, where u is the carrier mobility and Wt and L are the width and length, respectively, of the memory cell.
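The weak-inversion relation above can be checked numerically. The following is a minimal Python sketch and not part of the described system; the function names and any example values for Io, n, and Vth are illustrative assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23     # Boltzmann constant k, in J/K
Q_ELECTRON = 1.602176634e-19   # electronic charge q, in C


def thermal_voltage(temp_kelvin):
    """Vt = k*T/q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def subthreshold_current(vg, vth, io, n, temp_kelvin=300.0):
    """Ids = Io * exp((Vg - Vth) / (n * Vt)) for a cell biased in weak inversion."""
    vt = thermal_voltage(temp_kelvin)
    return io * math.exp((vg - vth) / (n * vt))


def weight_factor(vth, n, temp_kelvin=300.0):
    """w = exp(-Vth / (n * Vt)), the Vth-dependent factor of Ids."""
    vt = thermal_voltage(temp_kelvin)
    return math.exp(-vth / (n * vt))
```

At Vg = Vth the function returns Io, which matches the definition of Io given in the text.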

For an I-to-V log converter that uses a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor to convert an input current into an input voltage:

$$V_g = n \, V_t \, \log\!\left[\frac{I_{ds}}{w_p \, I_o}\right]$$

where wp is the w of the reference or peripheral memory cell.

For a memory array used as a vector-by-matrix multiplier (VMM) array with a current input, the output current is:

$$I_{out} = w_a \, I_o \, e^{V_g/(n V_t)}$$

namely

$$I_{out} = \frac{w_a}{w_p} \, I_{in} = W \, I_{in}, \qquad W = e^{(V_{thp} - V_{tha})/(n V_t)}$$

Here, wa is the w of each memory cell in the memory array, Vthp is the effective threshold voltage of the peripheral memory cell, and Vtha is the effective threshold voltage of the main (data) memory cell. Note that the threshold voltage of a transistor is a function of the substrate body bias voltage, and the substrate body bias voltage, denoted Vsb, can be modulated to compensate for various conditions, such as temperature. The threshold voltage Vth can be expressed as:

$$V_{th} = V_{th0} + \gamma \left( \sqrt{\left| V_{sb} - 2\varphi_F \right|} - \sqrt{\left| 2\varphi_F \right|} \right)$$

where Vth0 is the threshold voltage with zero substrate bias, φF is the surface potential, and γ is the body effect parameter.
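The relation Iout = W * Iin follows from feeding the log converter's output voltage into an array cell. A minimal Python sketch of that chain is given below. It assumes that "log" in the converter expression denotes the natural logarithm (so that it inverts the exponential), and all function names and numeric values are hypothetical.

```python
import math


def log_converter_vg(i_in, wp, io, n, vt):
    """I-to-V log converter: Vg = n*Vt*ln(Iin / (wp*Io)). Natural log is assumed."""
    return n * vt * math.log(i_in / (wp * io))


def array_cell_current(vg, wa, io, n, vt):
    """Array cell output: Iout = wa * Io * exp(Vg / (n*Vt))."""
    return wa * io * math.exp(vg / (n * vt))


def effective_weight(vth_p, vth_a, n, vt):
    """W = exp((Vthp - Vtha) / (n*Vt))."""
    return math.exp((vth_p - vth_a) / (n * vt))


def threshold_with_body_bias(vth0, gamma, vsb, phi_f):
    """Vth = Vth0 + gamma*(sqrt|Vsb - 2*phiF| - sqrt|2*phiF|), as written in the text."""
    return vth0 + gamma * (math.sqrt(abs(vsb - 2 * phi_f)) - math.sqrt(abs(2 * phi_f)))
```

With wp = exp(-Vthp/(n*Vt)) and wa = exp(-Vtha/(n*Vt)), the Io terms cancel and the output depends only on the threshold-voltage difference between the peripheral cell and the data cell.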

A word line or a control gate can be used as the input of the memory cell for the input voltage.

Alternatively, the flash memory cells of the VMM arrays described herein can be configured to operate in the linear region:

$$I_{ds} = \beta \, (V_{gs} - V_{th}) \, V_{ds}, \qquad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})$$

meaning that the weight W in the linear region is proportional to (Vgs - Vth).

A word line, control gate, bit line, or source line can be used as the input for a memory cell operated in the linear region. The bit line or source line can be used as the output for the memory cell.

For an I-to-V linear converter, a memory cell (such as a reference memory cell or a peripheral memory cell) or a transistor operating in the linear region can be used to linearly convert an input/output current into an input/output voltage.

Alternatively, the memory cells of the VMM arrays described herein can be configured to operate in the saturation region:

$$I_{ds} = \tfrac{1}{2} \, \beta \, (V_{gs} - V_{th})^2, \qquad \beta = u \, C_{ox} \, W_t / L$$

$$W \propto (V_{gs} - V_{th})^2$$

meaning that the weight W is proportional to (Vgs - Vth)^2.

A word line, control gate, or erase gate can be used as the input for a memory cell operated in the saturation region. The bit line or source line can be used as the output for the output neuron.
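The linear-region and saturation-region expressions can be compared side by side. The sketch below is illustrative only: the function names are hypothetical, and the factor of 1/2 in the saturation expression is the standard square-law coefficient, stated here as an assumption.

```python
def beta(u, cox, wt, length):
    """beta = u * Cox * Wt / L."""
    return u * cox * wt / length


def ids_linear(vgs, vth, vds, b):
    """Linear region: Ids = beta * (Vgs - Vth) * Vds."""
    return b * (vgs - vth) * vds


def ids_saturation(vgs, vth, b):
    """Saturation region: Ids = 1/2 * beta * (Vgs - Vth)^2 (square-law assumption)."""
    return 0.5 * b * (vgs - vth) ** 2
```

Doubling the overdrive (Vgs - Vth) doubles the linear-region current and quadruples the saturation-region current, which is the sense in which the weight is proportional to (Vgs - Vth) or to (Vgs - Vth)^2.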

Alternatively, the memory cells of the VMM arrays described herein can be used in all regions or a combination thereof (sub-threshold, linear, or saturation) for each layer or multiple layers of a neural network.

Other examples of VMM array 32 of Figure 7 are described in U.S. Patent No. 10,748,630, which is incorporated by reference herein. As described in that application, a source line or a bit line can be used as the neuron output (current summation output).

Figure 10 depicts neuron VMM array 1000, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses between an input layer and the next layer. VMM array 1000 comprises memory array 1003 of non-volatile memory cells, reference array 1001 of first non-volatile reference memory cells, and reference array 1002 of second non-volatile reference memory cells. Reference arrays 1001 and 1002, arranged in the column direction of the array, serve to convert current inputs flowing into terminals BLR0, BLR1, BLR2, and BLR3 into voltage inputs WL0, WL1, WL2, and WL3. In effect, the first and second non-volatile reference memory cells are diode-connected through multiplexors 1014 (only partially depicted), with the current inputs flowing into them. The reference cells are tuned (e.g., programmed) to target reference levels. The target reference levels are provided by a reference mini-array matrix (not shown).

Memory array 1003 serves two purposes. First, it stores the weights that will be used by VMM array 1000 on its respective memory cells. Second, memory array 1003 effectively multiplies the inputs (i.e., the current inputs provided in terminals BLR0, BLR1, BLR2, and BLR3, which reference arrays 1001 and 1002 convert into input voltages to supply to word lines WL0, WL1, WL2, and WL3) by the weights stored in memory array 1003 and then adds all the results (memory cell currents) to produce the output on the respective bit lines (BL0 to BLN), which will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, memory array 1003 negates the need for separate multiplication and addition logic circuits and is also power efficient. Here, the voltage inputs are provided on word lines WL0, WL1, WL2, and WL3, and the output emerges on the respective bit lines BL0 to BLN during a read (inference) operation. The current placed on each of the bit lines BL0 to BLN performs a summing function of the currents from all non-volatile memory cells connected to that particular bit line.
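The multiply-and-sum behavior described above amounts to a vector-by-matrix product in which each bit line accumulates the contributions of the cells attached to it. The following Python sketch models only that arithmetic, with each cell idealized as "input times weight"; it does not model device physics, and the function name and data layout are assumptions.

```python
def vmm_bitline_currents(inputs, weights):
    """Return one summed value per bit line.

    inputs[r]     : input applied to word line r
    weights[r][c] : weight stored in the cell at word line r, bit line c
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    if len(inputs) != n_rows:
        raise ValueError("one input per word line is required")
    # Each bit line sums the currents of every cell connected to it.
    return [
        sum(inputs[r] * weights[r][c] for r in range(n_rows))
        for c in range(n_cols)
    ]
```

For example, four word-line inputs and a 4x3 weight matrix produce three bit-line outputs, each of which could serve as an input to the next layer.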

Table 5 depicts operating voltages and currents for VMM array 1000. The columns in the table indicate the voltages placed on word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 5: Operation of VMM Array 1000 of Figure 10

| Operation | WL | WL-unselected | BL | BL-unselected | SL | SL-unselected |
|---|---|---|---|---|---|---|
| Read | 1-3.5V | -0.5V/0V | 0.6-2V (Ineuron) | 0.6V-2V/0V | 0V | 0V |
| Erase | ~5-13V | 0V | 0V | 0V | 0V | 0V |
| Program | 1-2V | -0.5V/0V | 0.1-3 uA | Vinh ~2.5V | 4-10V | 0-1V/FLT |

Figure 11 depicts neuron VMM array 1100, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1100 comprises memory array 1103 of non-volatile memory cells, reference array 1101 of first non-volatile reference memory cells, and reference array 1102 of second non-volatile reference memory cells. Reference arrays 1101 and 1102 run in the row direction of VMM array 1100. VMM array 1100 is similar to VMM array 1000, except that in VMM array 1100 the word lines run in the vertical direction. Here, the inputs are provided on the word lines (WLA0, WLB0, WLA1, WLB1, WLA2, WLB2, WLA3, WLB3), and the output emerges on the source lines (SL0, SL1) during a read operation. The current placed on each source line performs a summing function of all the currents from the memory cells connected to that particular source line.

Table 6 depicts operating voltages and currents for VMM array 1100. The columns in the table indicate the voltages placed on word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 6: Operation of VMM Array 1100 of Figure 11

| Operation | WL | WL-unselected | BL | BL-unselected | SL | SL-unselected |
|---|---|---|---|---|---|---|
| Read | 1-3.5V | -0.5V/0V | 0.6-2V | 0.6V-2V/0V | ~0.3-1V (Ineuron) | 0V |
| Erase | ~5-13V | 0V | 0V | 0V | 0V | SL-inhibit (~4-8V) |
| Program | 1-2V | -0.5V/0V | 0.1-3 uA | Vinh ~2.5V | 4-10V | 0-1V/FLT |

Figure 12 depicts neuron VMM array 1200, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1200 comprises memory array 1203 of non-volatile memory cells, reference array 1201 of first non-volatile reference memory cells, and reference array 1202 of second non-volatile reference memory cells. Reference arrays 1201 and 1202 serve to convert current inputs flowing into terminals BLR0, BLR1, BLR2, and BLR3 into voltage inputs CG0, CG1, CG2, and CG3. In effect, the first and second non-volatile reference memory cells are diode-connected through multiplexors 1212 (only partially shown), with the current inputs flowing into them through BLR0, BLR1, BLR2, and BLR3. Multiplexors 1212 each include a respective multiplexor 1205 and a cascoding transistor 1204 to ensure a constant voltage on the bit line (such as BLR0) of each of the first and second non-volatile reference memory cells during a read operation. The reference cells are tuned to target reference levels.

Memory array 1203 serves two purposes. First, it stores the weights that will be used by VMM array 1200. Second, memory array 1203 effectively multiplies the inputs (the current inputs provided to terminals BLR0, BLR1, BLR2, and BLR3, which reference arrays 1201 and 1202 convert into input voltages to supply to the control gates (CG0, CG1, CG2, and CG3)) by the weights stored in the memory array and then adds all the results (cell currents) to produce the output, which appears on BL0 to BLN and will be the input to the next layer or the input to the final layer. By performing the multiplication and addition functions, the memory array negates the need for separate multiplication and addition logic circuits and is also power efficient. Here, the inputs are provided on the control gate lines (CG0, CG1, CG2, and CG3), and the output emerges on the bit lines (BL0 to BLN) during a read operation. The current placed on each bit line performs a summing function of all the currents from the memory cells connected to that particular bit line.

VMM array 1200 implements uni-directional tuning for the non-volatile memory cells in memory array 1203. That is, each non-volatile memory cell is erased and then partially programmed until the desired charge on the floating gate is reached. If too much charge is placed on the floating gate (such that the wrong value is stored in the cell), the cell is erased and the sequence of partial programming operations starts over. As shown, two rows sharing the same erase gate (such as EG0 or EG1) are erased together (which is known as a page erase), and thereafter each cell is partially programmed until the desired charge on the floating gate is reached.
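The erase-then-partially-program sequence can be sketched as a simple control loop. The Python example below is a toy model under stated assumptions: `SimulatedCell` is a hypothetical stand-in for a floating-gate cell with charge in arbitrary units, and halving the pulse size after each restart is an illustrative choice that the text does not specify.

```python
class SimulatedCell:
    """Hypothetical floating-gate cell; charge is in arbitrary units."""

    def __init__(self):
        self.charge = 0.0

    def erase(self):
        self.charge = 0.0          # full erase removes all stored charge

    def partial_program(self, step):
        self.charge += step        # each pulse can only add charge


def tune_unidirectional(cell, target, tolerance, step, max_restarts=10):
    """Erase, then pulse upward until the target window is reached.

    On overshoot the cell is erased and the sequence starts over.
    Returns the number of restarts that were needed.
    """
    for restart in range(max_restarts):
        cell.erase()
        pulse = step / (2 ** restart)   # assumption: finer pulses after each overshoot
        while cell.charge < target - tolerance:
            cell.partial_program(pulse)
        if cell.charge <= target + tolerance:
            return restart
    raise RuntimeError("target not reached within max_restarts")
```

With a target of 1.0, a tolerance of 0.0625, and an initial pulse of 0.375, the first pass overshoots to 1.125, the cell is erased, and the second pass stops at 0.9375.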

Table 7 depicts operating voltages and currents for VMM array 1200. The columns in the table indicate the voltages placed on word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, control gates for selected cells, control gates for unselected cells in the same sector as the selected cells, control gates for unselected cells in a different sector than the selected cells, erase gates for selected cells, erase gates for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 7: Operation of VMM Array 1200 of Figure 12

| Operation | WL | WL-unselected | BL | BL-unselected | CG | CG-unselected, same sector | CG-unselected | EG | EG-unselected | SL | SL-unselected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read | 1.0-2V | -0.5V/0V | 0.6-2V (Ineuron) | 0V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0V | 0V |
| Erase | 0V | 0V | 0V | 0V | 0V | 0-2.6V | 0-2.6V | 5-12V | 0-2.6V | 0V | 0V |
| Program | 0.7-1V | -0.5V/0V | 0.1-1uA | Vinh (1-2V) | 4-11V | 0-2.6V | 0-2.6V | 4.5-5V | 0-2.6V | 4.5-5V | 0-1V |

Figure 13 depicts neuron VMM array 1300, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. VMM array 1300 comprises memory array 1303 of non-volatile memory cells, reference array 1301 of first non-volatile reference memory cells, and reference array 1302 of second non-volatile reference memory cells. EG lines EGR0, EG0, EG1, and EGR1 run vertically, while CG lines CG0, CG1, CG2, and CG3 and WL lines WL0, WL1, WL2, and WL3 run horizontally. VMM array 1300 is similar to VMM array 1200, except that VMM array 1300 implements bi-directional tuning, in which each individual cell can be completely erased, partially programmed, and partially erased as needed to reach the desired amount of charge on the floating gate, owing to the use of separate EG lines. As shown, reference arrays 1301 and 1302 convert the input currents in terminals BLR0, BLR1, BLR2, and BLR3 into control gate voltages CG0, CG1, CG2, and CG3 (through the action of diode-connected reference cells through multiplexors 1314) to be applied to the memory cells in the row direction. The current output (neuron) is in bit lines BL0 to BLN, where each bit line sums all currents from the non-volatile memory cells connected to that particular bit line.
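Bi-directional tuning differs from the uni-directional scheme in that an overshoot can be corrected by a partial erase instead of a full erase and restart. The Python sketch below is a toy model only: `TunableCell` and its methods are hypothetical, charge is in arbitrary units, and halving the pulse size whenever the direction reverses is an illustrative convergence strategy that the text does not describe.

```python
class TunableCell:
    """Hypothetical cell supporting full erase, partial program, and partial erase."""

    def __init__(self):
        self.charge = 0.0

    def erase(self):
        self.charge = 0.0

    def partial_program(self, step):
        self.charge += step

    def partial_erase(self, step):
        self.charge = max(0.0, self.charge - step)


def tune_bidirectional(cell, target, tolerance, step, max_pulses=100):
    """Pulse toward the target from either side; returns the number of pulses used."""
    cell.erase()
    pulses = 0
    last_direction = 0
    while abs(cell.charge - target) > tolerance:
        if pulses == max_pulses:
            raise RuntimeError("target not reached within max_pulses")
        direction = 1 if cell.charge < target else -1
        if last_direction and direction != last_direction:
            step /= 2              # assumption: refine the pulse after each reversal
        if direction > 0:
            cell.partial_program(step)
        else:
            cell.partial_erase(step)
        last_direction = direction
        pulses += 1
    return pulses
```

For a target of 1.0, a tolerance of 0.0625, and an initial pulse of 0.375, three program pulses overshoot to 1.125 and a single partial erase of 0.1875 brings the cell to 0.9375, so no full erase is needed after the initial one.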

Table 8 depicts operating voltages and currents for VMM array 1300. The columns in the table indicate the voltages placed on word lines for selected cells, word lines for unselected cells, bit lines for selected cells, bit lines for unselected cells, control gates for selected cells, control gates for unselected cells in the same sector as the selected cells, control gates for unselected cells in a different sector than the selected cells, erase gates for selected cells, erase gates for unselected cells, source lines for selected cells, and source lines for unselected cells. The rows indicate the operations of read, erase, and program.

Table 8: Operation of VMM Array 1300 of Figure 13

| Operation | WL | WL-unselected | BL | BL-unselected | CG | CG-unselected, same sector | CG-unselected | EG | EG-unselected | SL | SL-unselected |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Read | 1.0-2V | -0.5V/0V | 0.6-2V (Ineuron) | 0V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0-2.6V | 0V | 0V |
| Erase | 0V | 0V | 0V | 0V | 0V | 4-9V | 0-2.6V | 5-12V | 0-2.6V | 0V | 0V |
| Program | 0.7-1V | -0.5V/0V | 0.1-1uA | Vinh (1-2V) | 4-11V | 0-2.6V | 0-2.6V | 4.5-5V | 0-2.6V | 4.5-5V | 0-1V |

Figure 22 depicts neuron VMM array 2200, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In VMM array 2200, the inputs INPUT0, ..., INPUTN are received on bit lines BL0, ..., BLN, respectively, and the outputs OUTPUT1, OUTPUT2, OUTPUT3, and OUTPUT4 are generated on source lines SL0, SL1, SL2, and SL3, respectively.

Figure 23 depicts neuron VMM array 2300, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, INPUT1, INPUT2, and INPUT3 are received on source lines SL0, SL1, SL2, and SL3, respectively, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

Figure 24 depicts neuron VMM array 2400, which is particularly suited for memory cells 210 as shown in Figure 2, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, respectively, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

Figure 25 depicts neuron VMM array 2500, which is particularly suited for memory cells 310 as shown in Figure 3, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, respectively, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN.

Figure 26 depicts neuron VMM array 2600, which is particularly suited for memory cells 410 as shown in Figure 4, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTN are received on vertical control gate lines CG0, ..., CGN, respectively, and the outputs OUTPUT1 and OUTPUT2 are generated on source lines SL0 and SL1.

Figure 27 depicts neuron VMM array 2700, which is particularly suited for memory cells 410 as shown in Figure 4, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTN are received on the gates of bit line control gates 2701-1, 2701-2, ..., 2701-(N-1), and 2701-N, respectively, which are coupled to bit lines BL0, ..., BLN, respectively. Example outputs OUTPUT1 and OUTPUT2 are generated on source lines SL0 and SL1.

Figure 28 depicts neuron VMM array 2800, which is particularly suited for memory cells 310 as shown in Figure 3, memory cells 510 as shown in Figure 5, and memory cells 710 as shown in Figure 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on word lines WL0, ..., WLM, and the outputs OUTPUT0, ..., OUTPUTN are generated on bit lines BL0, ..., BLN, respectively.

Figure 29 depicts neuron VMM array 2900, which is particularly suited for memory cells 310 as shown in Figure 3, memory cells 510 as shown in Figure 5, and memory cells 710 as shown in Figure 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on control gate lines CG0, ..., CGM. The outputs OUTPUT0, ..., OUTPUTN are generated on vertical source lines SL0, ..., SLN, respectively, where each source line SLi is coupled to the source lines of all memory cells in column i.

Figure 30 depicts neuron VMM array 3000, which is particularly suited for memory cells 310 as shown in Figure 3, memory cells 510 as shown in Figure 5, and memory cells 710 as shown in Figure 7, and is utilized as the synapses and parts of neurons between an input layer and the next layer. In this example, the inputs INPUT0, ..., INPUTM are received on control gate lines CG0, ..., CGM. The outputs OUTPUT0, ..., OUTPUTN are generated on vertical bit lines BL0, ..., BLN, respectively, where each bit line BLi is coupled to the bit lines of all memory cells in column i.

Long Short-Term Memory

The prior art includes a concept known as long short-term memory (LSTM). LSTM units are often used in neural networks. LSTM allows a neural network to remember information over predetermined arbitrary time intervals and to use that information in subsequent operations. A conventional LSTM unit comprises a cell, an input gate, an output gate, and a forget gate. The three gates regulate the flow of information into and out of the cell and the time interval over which the information is remembered in the LSTM. VMMs are particularly useful in LSTM units.

Figure 14 depicts an example LSTM 1400. LSTM 1400 in this example comprises cells 1401, 1402, 1403, and 1404. Cell 1401 receives input vector x0 and generates output vector h0 and cell state vector c0. Cell 1402 receives input vector x1, the output vector (hidden state) h0 from cell 1401, and the cell state c0 from cell 1401, and generates output vector h1 and cell state vector c1. Cell 1403 receives input vector x2, the output vector (hidden state) h1 from cell 1402, and the cell state c1 from cell 1402, and generates output vector h2 and cell state vector c2. Cell 1404 receives input vector x3, the output vector (hidden state) h2 from cell 1403, and the cell state c2 from cell 1403, and generates output vector h3. Additional cells can be used, and an LSTM with four cells is merely an example.

Figure 15 depicts an example implementation of LSTM cell 1500, which can be used for cells 1401, 1402, 1403, and 1404 in Figure 14. LSTM cell 1500 receives input vector x(t), cell state vector c(t-1) from a preceding cell, and output vector h(t-1) from a preceding cell, and generates cell state vector c(t) and output vector h(t).

LSTM cell 1500 comprises sigmoid function devices 1501, 1502, and 1503, each of which applies a number between 0 and 1 to control how much of each component in the input vector is allowed through to the output vector. LSTM cell 1500 also comprises tanh devices 1504 and 1505 to apply a hyperbolic tangent function to an input vector, multiplier devices 1506, 1507, and 1508 to multiply two vectors together, and addition device 1509 to add two vectors together. The output vector h(t) can be provided to the next LSTM cell in the system, or it can be accessed for other purposes.
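The data flow through such a cell can be sketched in a few lines of Python. This is the conventional LSTM formulation and is offered as an assumption about Figure 15, not a transcription of it: the gate names f, i, u, and o, the concatenation of h(t-1) with x(t), the omission of bias terms, and all function names are illustrative. Each `matvec` call is the operation a VMM array would perform.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Vector-by-matrix product: one dot product per output row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lstm_cell(x_t, h_prev, c_prev, w_f, w_i, w_u, w_o):
    """One LSTM step; returns (h_t, c_t). Bias terms are omitted for brevity."""
    z = list(h_prev) + list(x_t)                      # combined input to every gate
    f_t = [sigmoid(a) for a in matvec(w_f, z)]        # sigmoid gate (forget)
    i_t = [sigmoid(a) for a in matvec(w_i, z)]        # sigmoid gate (input)
    u_t = [math.tanh(a) for a in matvec(w_u, z)]      # tanh candidate values
    o_t = [sigmoid(a) for a in matvec(w_o, z)]        # sigmoid gate (output)
    # Two multipliers and the adder form the new cell state.
    c_t = [f * c + i * u for f, c, i, u in zip(f_t, c_prev, i_t, u_t)]
    # The second tanh and the third multiplier form the output vector.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

Chaining four calls, each passing its h and c to the next, reproduces the structure of a four-cell LSTM. With all-zero weights every sigmoid gate outputs 0.5 and the candidate is 0, so the new cell state is half the previous one.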

Figure 16 depicts LSTM cell 1600, which is an example of an implementation of LSTM cell 1500. For the reader's convenience, the same numbering from LSTM cell 1500 is used in LSTM cell 1600. Sigmoid function devices 1501, 1502, and 1503 and tanh device 1504 each comprise multiple VMM arrays 1601 and activation function blocks 1602. Thus, it can be seen that VMM arrays are particularly useful in LSTM cells used in certain neural network systems. Multiplier devices 1506, 1507, and 1508 and addition device 1509 are implemented in a digital manner or in an analog manner. Activation function blocks 1602 can be implemented in a digital manner or in an analog manner.

An alternative to LSTM cell 1600 (and another example of an implementation of LSTM cell 1500) is shown in Figure 17. In Figure 17, sigmoid function devices 1501, 1502, and 1503 and tanh device 1504 share the same physical hardware (VMM arrays 1701 and activation function block 1702) in a time-multiplexed fashion. LSTM cell 1700 also comprises: multiplier device 1703 to multiply two vectors together; addition device 1708 to add two vectors together; tanh device 1505 (which comprises activation function block 1702); register 1707 to store the value i(t) when i(t) is output from sigmoid function block 1702; register 1704 to store the value f(t) * c(t-1) when that value is output from multiplier device 1703 through multiplexor 1710; register 1705 to store the value i(t) * u(t) when that value is output from multiplier device 1703 through multiplexor 1710; and register 1706 to store the value o(t) * c~(t) when that value is output from multiplier device 1703 through multiplexor 1710 and multiplexor 1709.

Whereas LSTM cell 1600 contains multiple sets of VMM arrays 1601 and respective activation function blocks 1602, LSTM cell 1700 contains only one set of VMM arrays 1701 and activation function block 1702, which are used to represent multiple layers in the example of LSTM cell 1700. LSTM cell 1700 will require less space than LSTM cell 1600, as LSTM cell 1700 will require 1/4 as much space for VMMs and activation function blocks as LSTM cell 1600.

It can be further appreciated that LSTM cells will typically comprise multiple VMM arrays, each of which uses functionality provided by certain circuit blocks outside of the VMM arrays, such as summer and activation function blocks and high-voltage generation blocks. Providing separate circuit blocks for each VMM array would require a significant amount of space within the semiconductor device and would be somewhat inefficient. The examples described below therefore reduce the circuitry used outside of the VMM arrays themselves.

Gated Recurrent Units

An analog VMM implementation can be used in a gated recurrent unit (GRU) system. GRUs are a gating mechanism in recurrent neural networks. GRUs are similar to LSTMs, except that GRU cells generally contain fewer components than an LSTM cell.

Figure 18 depicts an example GRU 1800. GRU 1800 in this example comprises cells 1801, 1802, 1803, and 1804. Cell 1801 receives input vector x0 and generates output vector h0. Cell 1802 receives input vector x1 and output vector h0 from cell 1801, and generates output vector h1. Cell 1803 receives input vector x2 and output vector (hidden state) h1 from cell 1802, and generates output vector h2. Cell 1804 receives input vector x3 and output vector (hidden state) h2 from cell 1803, and generates output vector h3. Additional cells can be used, and a GRU with four cells is merely an example.

Figure 19 depicts an example implementation of GRU cell 1900, which can be used for cells 1801, 1802, 1803, and 1804 of Figure 18. GRU cell 1900 receives input vector x(t) and output vector h(t-1) from the preceding GRU cell, and generates output vector h(t). GRU cell 1900 comprises sigmoid function components 1901 and 1902, each of which applies a number between 0 and 1 to components from output vector h(t-1) and input vector x(t). GRU cell 1900 also comprises hyperbolic tangent component 1903 to apply a hyperbolic tangent function to an input vector, a plurality of multiplier components 1904, 1905, and 1906 to multiply two vectors together, addition component 1907 to add two vectors together, and complementary component 1908 to subtract an input from 1 to generate an output.

Figure 20 depicts GRU cell 2000, which is an example of an implementation of GRU cell 1900. For the reader's convenience, the same numbering from GRU cell 1900 is used in GRU cell 2000. As can be seen in Figure 20, sigmoid function components 1901 and 1902 and hyperbolic tangent component 1903 each comprise multiple VMM arrays 2001 and activation function blocks 2002. Thus, it can be seen that VMM arrays are of particular use in GRU cells used in certain neural network systems. Multiplier components 1904, 1905, and 1906, addition component 1907, and complementary component 1908 are implemented in a digital manner or in an analog manner. Activation function blocks 2002 can be implemented in a digital manner or in an analog manner.

An alternative to GRU cell 2000 (and another example of an implementation of GRU cell 1900) is shown in Figure 21. In Figure 21, GRU cell 2100 utilizes VMM array 2101 and activation function block 2102, which, when configured as a sigmoid function, applies a number between 0 and 1 to control how much of each component in the input vector is allowed through to the output vector. In Figure 21, sigmoid function components 1901 and 1902 and hyperbolic tangent component 1903 share the same physical hardware (VMM array 2101 and activation function block 2102) in a time-multiplexed manner. GRU cell 2100 also comprises: multiplier component 2103 to multiply two vectors together; addition component 2105 to add two vectors together; complementary component 2109 to subtract an input from 1 to generate an output; multiplexer 2104; register 2106 to hold the value h(t-1) * r(t) when that value is output from multiplier component 2103 through multiplexer 2104; register 2107 to hold the value h(t-1) * z(t) when that value is output from multiplier component 2103 through multiplexer 2104; and register 2108 to hold the value h^(t) * (1-z(t)) when that value is output from multiplier component 2103 through multiplexer 2104.
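The time-multiplexed arrangement can be illustrated with a short software sketch. It is not the patented circuit: the single `vmm` routine merely stands in for the one shared VMM array, and the three intermediate lists mirror the products that the text says are held in the registers. The gate wiring (concatenating h(t-1) with x(t), and feeding h(t-1) * r(t) to the hyperbolic tangent pass) follows the conventional GRU formulation and is an assumption, as are all function and parameter names.

```python
import math

def vmm(weights, vector):
    """Vector-by-matrix multiply; stands in for the one shared VMM array."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def tanh(values):
    return [math.tanh(v) for v in values]

def gru_step(x, h_prev, w_r, w_z, w_h):
    """One GRU step; each gate reuses vmm() in its own pass (time multiplexing)."""
    hx = h_prev + x                                        # concatenate h(t-1) and x(t)
    r = sigmoid(vmm(w_r, hx))                              # pass 1: r(t)
    z = sigmoid(vmm(w_z, hx))                              # pass 2: z(t)
    reg_hr = [h * ri for h, ri in zip(h_prev, r)]          # holds h(t-1) * r(t)
    reg_hz = [h * zi for h, zi in zip(h_prev, z)]          # holds h(t-1) * z(t)
    h_cand = tanh(vmm(w_h, reg_hr + x))                    # pass 3: h^(t)
    reg_hc = [c * (1.0 - zi) for c, zi in zip(h_cand, z)]  # holds h^(t) * (1 - z(t))
    return [a + b for a, b in zip(reg_hz, reg_hc)]         # h(t)
```

The three passes run one after another through the same routine, which is the software analogue of reusing one array and one activation block instead of three.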

GRU cell 2000 contains multiple sets of VMM arrays 2001 and activation function blocks 2002, whereas GRU cell 2100 contains one set of VMM array 2101 and activation function block 2102, which is used to represent the multiple layers in the example of GRU cell 2100. GRU cell 2100 will require less space than GRU cell 2000, because GRU cell 2100 requires 1/3 as much space for VMMs and activation function blocks as GRU cell 2000.

It can be further appreciated that GRU systems will typically comprise multiple VMM arrays, each of which uses functionality provided by certain circuit blocks outside of the VMM arrays, such as summer and activation function blocks and high-voltage generation blocks. Providing separate circuit blocks for each VMM array would require a significant amount of space within the semiconductor device and would be somewhat inefficient. The examples described below therefore reduce the circuitry used outside of the VMM arrays themselves.

The input to the VMM arrays can be an analog level, a binary level, a pulse, a time-modulated pulse, or digital bits (in which case a DAC is used to convert the digital bits to the appropriate input analog level), and the output can be an analog level, a binary level, a timing pulse, pulses, or digital bits (in which case an output ADC is used to convert the output analog level into digital bits).

In general, for each memory cell in a VMM array, each weight W can be implemented by a single memory cell, by a differential cell, or by two blended memory cells (the average of 2 cells). In the differential cell case, two memory cells are used to implement a weight W as a differential weight (W = W+ - W-). In the two blended memory cells case, two memory cells are used to implement a weight W as the average of the two cells.

Figure 31 depicts VMM system 3100. In some examples, the weights W stored in a VMM array are stored as differential pairs W+ (positive weight) and W- (negative weight), where W = (W+) - (W-). In VMM system 3100, half of the bit lines are designated as W+ lines, that is, bit lines connected to memory cells that will store positive weights W+, and the other half of the bit lines are designated as W- lines, that is, bit lines connected to memory cells implementing negative weights W-. The W- lines are interspersed among the W+ lines in an alternating fashion. The subtraction operation is performed by a summation circuit that receives current from a W+ line and a W- line, such as summation circuits 3101 and 3102. The output of a W+ line and the output of a W- line are combined to give effectively W = W+ - W- for each pair of (W+, W-) cells across all pairs of (W+, W-) lines. Although the above describes W- lines interspersed among the W+ lines in an alternating fashion, in other examples the W+ lines and W- lines can be located arbitrarily anywhere in the array.
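The differential scheme can be sketched numerically. The sketch below is an illustration under stated assumptions, not the circuit itself: a signed weight is split into a non-negative (W+, W-) pair, each input contributes a current to the W+ line and to the W- line, and the final subtraction plays the role of the summation circuit. The function names and the particular splitting rule (placing the whole magnitude on one side) are hypothetical; the text only requires W = W+ - W-.

```python
def split_weight(w):
    """Map a signed weight to a non-negative (W+, W-) pair with W = W+ - W-."""
    return (w, 0.0) if w >= 0 else (0.0, -w)

def differential_vmm(inputs, weights):
    """weights[row][col] holds signed weights; returns one output per logical column."""
    outputs = []
    for col in range(len(weights[0])):
        i_plus = 0.0   # current accumulated on the W+ bit line
        i_minus = 0.0  # current accumulated on the W- bit line
        for row, x in enumerate(inputs):
            w_plus, w_minus = split_weight(weights[row][col])
            i_plus += x * w_plus
            i_minus += x * w_minus
        outputs.append(i_plus - i_minus)  # subtraction done by the summation circuit
    return outputs
```

Because both lines carry only non-negative contributions, negative weights are represented without needing a negative cell current.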

Figure 32 depicts another example. In VMM system 3210, positive weights W+ are implemented in first array 3211 and negative weights W- are implemented in second array 3212, which is separate from the first array, and the resulting weights are appropriately combined by summation circuits 3213.

Figure 33 depicts VMM system 3300. The weights W stored in a VMM array are stored as differential pairs W+ (positive weight) and W- (negative weight), where W = (W+) - (W-). VMM system 3300 comprises array 3301 and array 3302. Half of the bit lines in each of arrays 3301 and 3302 are designated as W+ lines, that is, bit lines connected to memory cells that will store positive weights W+, and the other half of the bit lines in each of arrays 3301 and 3302 are designated as W- lines, that is, bit lines connected to memory cells implementing negative weights W-. The W- lines are interspersed among the W+ lines in an alternating fashion. The subtraction operation is performed by a summation circuit that receives current from a W+ line and a W- line, such as summation circuits 3303, 3304, 3305, and 3306. The output of a W+ line and the output of a W- line from each array 3301, 3302 are respectively combined to give effectively W = W+ - W- for each pair of (W+, W-) cells across all pairs of (W+, W-) lines. In addition, the W values from each array 3301 and 3302 can be further combined through summation circuits 3307 and 3308, such that each W value is the result of the W value from array 3301 minus the W value from array 3302, meaning that the end result from summation circuits 3307 and 3308 is a differential value of two differential values.

Each non-volatile memory cell used in the analog neural memory system is to be erased and programmed to hold a very specific and precise amount of charge, that is, number of electrons, in the floating gate. For example, each floating gate can hold one of N different values, where N is the number of different weights that can be indicated by each cell. Examples of N include 16, 32, 64, 128, and 256.

Prior art systems require a considerable amount of area and involve considerable latency at the output stage. For example, multiple clock cycles are used to convert the analog current received from the VMM array into digital output data.

What is needed is reduced latency at the output, so as to increase the overall operating speed of the system, which represents some or all of an artificial neural network.

Numerous examples of output circuits and associated methods for neural network arrays are disclosed.

VMM System Architecture

Figure 34 depicts a block diagram of VMM system 3400. VMM system 3400 comprises VMM array 3401, row decoder 3402, high-voltage decoder 3403, column decoder 3404, bit line drivers 3405 (such as bit line control circuitry for programming), input circuit 3406, output circuit 3407, control logic 3408, and bias generator 3409. VMM system 3400 further comprises high-voltage generation block 3410, which comprises charge pump 3411, charge pump regulator 3412, and high-voltage level generator 3413. VMM system 3400 further comprises (program/erase, or weight tuning) algorithm controller 3414, analog circuitry 3415, control engine 3416 (which may include, without limitation, functions such as arithmetic functions and activation functions, and embedded microcontroller logic), test control logic 3417, and static random access memory (SRAM) block 3418, which stores intermediate data such as data for the input circuit (for example, activation data) or for the output circuit (neuron output data, partial-sum output neuron data), or data in for programming (such as data in for a whole row or for multiple rows).

Input circuit 3406 may include circuits such as a DAC (digital-to-analog converter), DPC (digital-to-pulse converter, digital-to-time-modulated-pulse converter), AAC (analog-to-analog converter, such as a current-to-voltage converter or logarithmic converter), PAC (pulse-to-analog-level converter), or any other type of converter. Input circuit 3406 may implement one or more of normalization, linear or non-linear up/down scaling functions, or arithmetic functions. Input circuit 3406 may implement a temperature compensation function for input levels. Input circuit 3406 may implement an activation function such as ReLU or sigmoid. Input circuit 3406 may store digital activation data to be applied as, or combined with, an input signal during a program or read operation. The digital activation data can be stored in registers. Input circuit 3406 may comprise circuits to drive array terminals such as the CG, WL, EG, and SL lines, which may include sample-and-hold circuits and buffers. A DAC can be used to convert digital activation data into an analog input voltage to be applied to the array.

Output circuit 3407 may include circuits such as an ITV (current-to-voltage circuit), ADC (analog-to-digital converter, to convert neuron analog output into digital bits), AAC (analog-to-analog converter, such as a current-to-voltage converter or logarithmic converter), APC (analog-to-pulse converter, analog-to-time-modulated-pulse converter), or any other type of converter. Output circuit 3407 may convert array outputs into activation data. Output circuit 3407 may implement an activation function such as a rectified linear activation function (ReLU) or sigmoid. Output circuit 3407 may implement one or more of statistical normalization, regularization, up/down scaling/gain functions, statistical rounding, or arithmetic functions (for example, add, subtract, divide, multiply, shift, log) for neuron outputs. Output circuit 3407 may implement a temperature compensation function for neuron outputs or array outputs (such as bit line outputs), so as to keep the power consumption of the array approximately constant over temperature or to improve the precision of the array (neuron) outputs, for example by keeping the I-V slope approximately the same over temperature. Output circuit 3407 may comprise registers for storing output data.

Figure 35 depicts output circuit 3500. Output circuit 3500 is an example implementation of output circuit 3407 in Figure 34. Output circuit 3500 comprises reference circuit 3501 and column circuits 3502. A plurality of instantiations of column circuit 3502 are used for a plurality of columns in VMM array 3401, that is, a respective instantiation of column circuit 3502 for respective columns in VMM array 3401. Eight instantiations of column circuit 3502 are shown in Figure 35, but it is to be understood that more instantiations and associated columns can exist in a VMM system.

Reference circuit 3501 receives a reference current, such as a bit line current, from one or more reference columns of non-volatile memory cells. The reference columns can be part of VMM array 3401 (as shown) or can be located in a separate array. Reference circuit 3501 comprises column multiplexer 3503, current-to-voltage converter 3504, and reference generator 3505. If more than one reference column is connected to reference circuit 3501, column multiplexer 3503 selects a column and provides the current from that column to current-to-voltage converter 3504, which converts the received current into a voltage. Reference generator 3505 generates the voltage reference VADCREF (such as VADCREFH and VADCREFL) used by analog-to-digital converter (ADC) 3508, where the voltage reference VADCREF is based on the voltage output from current-to-voltage converter 3504. For example, if ADC 3508 is a successive approximation register (SAR) ADC, reference generator 3505 generates voltages VREFP, VREFN, and VCIM (for example, for an ADC full-scale input voltage of 0.3 V, VREFP = 0.3 V, VREFN = 0 V, and VCIM = 0.15 V). Because the reference voltages of ADC 3508 are generated from the voltage output of current-to-voltage converter 3504, which is similar to current-to-voltage converter 3507, the resolution of ADC 3508 is maintained: the ADC reference voltages automatically track any variation caused by changes in the operating conditions (such as temperature) of current-to-voltage converter 3504 and current-to-voltage converter 3507.
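The benefit of deriving the ADC references from a matching current-to-voltage converter can be shown with an idealized model. This is a sketch under assumptions rather than a description of reference generator 3505: it takes VCIM as the midpoint of VREFP and VREFN (which agrees with the 0.3 V / 0 V / 0.15 V example), models the ADC as an ideal quantizer, and represents a change in operating conditions as a single hypothetical gain factor applied to both converters. All function names are illustrative.

```python
def reference_levels(v_full_scale):
    """Return (VREFP, VREFN, VCIM), taking VCIM as the midpoint (assumption)."""
    vrefp = v_full_scale
    vrefn = 0.0
    vcim = (vrefp + vrefn) / 2.0
    return vrefp, vrefn, vcim

def ideal_adc_code(vin, vrefp, vrefn, n_bits):
    """Ideal quantizer: map vin within [VREFN, VREFP) to an n_bits code."""
    code = int((vin - vrefn) / (vrefp - vrefn) * (2 ** n_bits))
    return max(0, min(2 ** n_bits - 1, code))

def code_with_drift(vin, v_full_scale, drift, n_bits, refs_track):
    """Apply a gain drift to the signal, and to the references only if they track."""
    ref_scale = drift if refs_track else 1.0
    vrefp, vrefn, _ = reference_levels(v_full_scale * ref_scale)
    return ideal_adc_code(vin * drift, vrefp, vrefn, n_bits)
```

With tracking references the drift cancels in the ratio, so the output code is unchanged; with fixed references the same drift shifts the code.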

Each column circuit 3502 receives current, such as bit line current, from one or more columns in VMM array 3401. Each column circuit 3502 comprises column multiplexer 3506, current-to-voltage converter 3507, and ADC 3508. If more than one column is connected to column circuit 3502, column multiplexer 3506 selects a column and provides the current from that column to current-to-voltage converter 3507, which converts the received current into a voltage that ADC 3508 converts into a digital output. For example, column circuit 3502 outputs DOUT0x [n:0]. If a single column is connected to column circuit 3502, column multiplexer 3506 is optional.

Figures 36A and 36B depict clock generators for generating a fast clock signal for the circuits described herein, where a fast clock signal means a clock signal with a higher frequency than the system clock received from a source outside the VMM system.

Figure 36A depicts clock generator 3600, which receives system clock CLKIN (for example, a system clock received from a source outside the VMM system), enable signal EN, and configuration bits CFGx, and uses delay-locked loop (DLL) 3601 and clock generator block 3602 to generate a faster clock CLKOUT whose frequency depends on configuration bits CFGx. Figure 36B depicts clock generator 3650, which receives system clock CLKIN and uses phase-locked loop (PLL) 3651 and clock generator block 3652 to generate a faster clock CLKOUT. DLL 3601 and PLL 3651, respectively, are used to generate a precise clock that is synchronized with input clock CLKIN and faster than system clock CLKIN, for example 2 to 100 times faster than input system clock CLKIN. Clock generator blocks 3602 and 3652 use CLK_INT to generate different clock frequencies in response to configuration bits CFGx.

Figure 37 depicts output circuit 3700, which can be used for an instantiation of column circuit 3502 in Figure 35. Output circuit 3700 is a differential circuit, meaning that the circuit output is a function of two inputs. Output circuit 3700 comprises current-to-voltage converter (ITV) 3704 (a first current-to-voltage converter); ITV 3705 (a second current-to-voltage converter); differential-input successive approximation register analog-to-digital converter (SAR ADC) 3701; transistors 3713, 3715, 3724, and 3726 (which form part of a column multiplexer that couples one column of a first set of columns to ITV 3704 and one column of a second set of columns to ITV 3705); and current sources 3702 and 3703. Current source 3702 represents the current drawn by the column in VMM array 3401 selected by the column multiplexer, where that column stores W+ values. Current source 3703 represents the current drawn by the column in VMM array 3401 selected by the column multiplexer, where that column stores W- values. Output circuit 3700 receives currents IW+ (a first current) and IW- (a second current) from the two differential columns W+ and W-, respectively, and outputs a digital output DOUT indicative of those currents, the digital output being equal to W = W+ - W-.

In one alternative, output circuit 3700 can be implemented as a single-ended circuit, meaning that one ITV (3704 or 3705) and a single-input ADC are used. In another alternative, the differential output can be achieved by using two sets of an ITV and a single-input ADC and combining the two results. In another alternative, the differential output can be achieved with one ITV and one single-input ADC by time multiplexing and combining the two results in time.

ITV 3704 comprises switches 3706, 3707, and 3708; integrating capacitors 3710 and 3711; NMOS cascode transistor 3712; and operational amplifier 3714. Before a read operation, switches 3706, 3707, and 3708 are closed, causing the top and bottom plates of integrating capacitors 3710 and 3711 to be charged to Vsup and VIN+ to equal Vsup. During the integration period, switch 3706 is open. Current source 3702 draws current, causing the voltage of VIN+ (the first voltage) to be pulled down in proportion to the current drawn by current source 3702. That is, VIN+ will equal the initial value of VIN+ before the read operation minus a first discharge value caused by the first current IW+. After the integration period, the voltage on capacitors 3710 and 3711 is sampled into SAR ADC 3701. After this sampling period, the ADC begins converting the sampled voltage into digital output bits. In one example, the voltage on one or more of capacitor 3710 and capacitor 3711 is buffered before entering SAR ADC 3701. In another example, capacitor 3711 is a capacitor in the binary capacitor array of SAR ADC 3701 (that is, ITV 3704 and SAR ADC 3701 share the capacitor to save die space). In that case, after the integration period, switch 3707 and switch 3708 are opened, and SAR ADC 3701 begins converting the voltage on capacitor 3711 into digital output bits.

Similarly, ITV 3705 comprises switches 3717, 3718, and 3719; capacitors 3721 and 3722; NMOS cascode transistor 3723; and operational amplifier 3725. Before a read operation, switches 3717, 3718, and 3719 are closed, causing the top and bottom plates of integrating capacitors 3721 and 3722 to be charged to Vsup and VIN- to equal Vsup. During the integration period, switch 3717 is open. Current source 3703 draws current, causing the voltage of VIN- (the second voltage) to be pulled down in proportion to the current drawn by current source 3703. That is, VIN- will equal the initial value of VIN- before the read operation minus a second discharge value caused by the second current IW-. Capacitors 3721 and 3722 are similar to capacitors 3710 and 3711 in ITV 3704. Switches 3718 and 3719 are similar to switches 3707 and 3708 in ITV 3704. For current IW-, the operation of ITV 3705 is similar to the operation of ITV 3704.
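The relation "voltage equals the initial voltage minus a discharge value" can be made concrete with an idealized model. The sketch below assumes a constant bit line current, an ideal capacitor, and a discharge value of I·T/C, where T is the integration period and C is the total integrating capacitance seen by the node; none of these simplifications or the numeric values come from the source, and the function names are illustrative.

```python
def integrate_voltage(v_init, current, t_int, capacitance):
    """Voltage after an ideal capacitor is discharged at constant current.

    V = v_init - I * t_int / C  (idealized; ignores leakage and nonlinearity)
    """
    discharge = current * t_int / capacitance
    return v_init - discharge

def differential_readout(i_w_plus, i_w_minus, v_init, t_int, capacitance):
    """Return (VIN+, VIN-) produced by the W+ and W- converters."""
    vin_plus = integrate_voltage(v_init, i_w_plus, t_int, capacitance)
    vin_minus = integrate_voltage(v_init, i_w_minus, t_int, capacitance)
    return vin_plus, vin_minus

# Illustrative values only: 1.0 V precharge, 10 ns integration period, 1 pF.
vin_plus, vin_minus = differential_readout(20e-6, 5e-6, 1.0, 10e-9, 1e-12)
print(vin_plus, vin_minus)
```

Under these assumptions the difference VIN- minus VIN+ equals (IW+ - IW-)·T/C, so the common initial voltage cancels and the differential input to the ADC depends only on the difference between the two currents.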

The bit line regulation circuit of ITV 3704 comprises operational amplifier 3714, transistor 3712, and transistors 3713 and 3715, all of which are turned on when a read operation is needed for bit line IW+. This circuit imposes a fixed bias on the bit line during a read operation. Specifically, during a read operation it imposes on the bit line the voltage VREF that is applied to the positive input terminal of operational amplifier 3714, regardless of the magnitude of the current drawn by bit line IW+.

Similarly, the bit line regulation circuit of ITV 3705 comprises operational amplifier 3725, transistor 3723, and force and sense transistors 3724 and 3726, all of which are turned on when a read operation is needed for bit line IW-. During a read operation, this circuit imposes on the bit line the voltage VREF that is applied to the positive input terminal of operational amplifier 3725, regardless of the magnitude of the current drawn by bit line IW-.

SAR ADC 3701 receives differential voltages VIN+ and VIN-, as well as reference voltages VADCREFH and VADCREFL, and generates digital output DOUT[n:0] based on the difference between VIN+ and VIN-.
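A successive approximation conversion of a differential input can be sketched as a bitwise binary search. This is a generic behavioral model, not the internal design of SAR ADC 3701: it assumes the differential input VIN+ - VIN- spans plus or minus (VADCREFH - VADCREFL), uses offset-binary coding, and resolves one bit per comparison starting from the most significant bit. The parameter `n_bits` is the total number of output bits (n + 1 for an output written as DOUT[n:0]), and all names are illustrative.

```python
def sar_adc(vin_p, vin_n, vref_h, vref_l, n_bits):
    """Behavioral SAR conversion of (vin_p - vin_n) to an offset-binary code."""
    v_fs = vref_h - vref_l              # assumed differential full scale: +/- v_fs
    v_diff = vin_p - vin_n
    lsb = 2.0 * v_fs / (2 ** n_bits)    # width of one code step
    code = 0
    for bit in reversed(range(n_bits)):           # MSB first, one bit per cycle
        trial = code | (1 << bit)                 # tentatively set this bit
        if v_diff >= -v_fs + trial * lsb:         # comparator decision
            code = trial                          # keep the bit
    return code
```

In this model a conversion takes one comparison per output bit, which is where the per-conversion latency of a SAR ADC comes from.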

Notably, integrating capacitors 3710 and 3721 will require considerable die space because they will be relatively large. Optionally, integrating capacitors 3711 and 3722 are reused from the capacitor array of SAR ADC 3701 to save area within the die. When SAR ADC 3701 begins conversion, switches 3707, 3708, 3718, and 3719 are opened.

Figure 38A depicts output circuit 3800, which can be used for two instantiations of column circuit 3502 in Figure 35. Output circuit 3800 comprises circuit 3801, circuit 3802, and shared capacitor network 3813. Circuits 3801 and 3802 are each the same as output circuit 3700 in Figure 37, except that integrating capacitors 3710 and 3721 in output circuit 3700 have been replaced by shared integrating capacitors 3803 (a first integrating capacitor) and 3804 (a second integrating capacitor), respectively, in shared capacitor network 3813; switches 3805, 3807, 3809, and 3811 are used in shared capacitor network 3813 for circuit 3801; and switches 3806, 3808, 3810, and 3812 are used in capacitor network 3813 for circuit 3802. Circuit 3801 comprises ITV 3823 (a first current-to-voltage converter) coupled to IW1+ (a first current) and ITV 3824 (a second current-to-voltage converter) coupled to IW1- (a second current). Circuit 3802 comprises ITV 3825 (a third current-to-voltage converter) coupled to IW2+ (a third current) and ITV 3826 (a fourth current-to-voltage converter) coupled to IW2- (a fourth current). Compared with using two instantiations of output circuit 3700 in Figure 37, this design uses two fewer capacitors.

ITV 3823 generates a first voltage and ITV 3824 generates a second voltage, and both are coupled to SAR ADC 3827 (a first SAR ADC), which comprises CDAC 3829. ITV 3825 generates a third voltage and ITV 3826 generates a fourth voltage, and both are coupled to SAR ADC 3828 (a second SAR ADC), which comprises CDAC 3830. During a read operation, the first voltage will equal an initial voltage (the voltage on the same node before the read operation) minus a first discharge value caused by IW1+; the second voltage will equal the initial voltage minus a second discharge value caused by IW1-; the third voltage will equal the initial voltage minus a third discharge value caused by IW2+; and the fourth voltage will equal the initial voltage minus a fourth discharge value caused by IW2-. In each case, the initial voltage is the voltage on the respective node before the read operation.
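The relationship between a bit line current and the resulting voltage can be illustrated with a minimal sketch. It assumes an ideal integration capacitor discharged by a constant current, so that the discharge value is I·t/C. The function name and all numeric values (initial voltage, integration time, capacitance, currents) are hypothetical and chosen only for illustration; they are not taken from the specification.

```python
def integrated_voltage(v_init, current, t_int, capacitance):
    """Voltage on an ideal integration capacitor after a constant
    current has discharged it for t_int seconds (assumed model)."""
    discharge = current * t_int / capacitance  # discharge value, in volts
    return v_init - discharge

# Hypothetical operating point, for illustration only.
V_INIT = 1.0      # initial voltage on the node, volts
T_INT = 100e-9    # integration time, seconds
C_INT = 1e-12     # integration capacitance, farads

v_first = integrated_voltage(V_INIT, 2.0e-6, T_INT, C_INT)   # due to IW1+
v_second = integrated_voltage(V_INIT, 0.5e-6, T_INT, C_INT)  # due to IW1-
differential = v_second - v_first
print(v_first, v_second, differential)
```

Under this model, a larger W+ current produces a lower first voltage, and the difference between the two voltages is proportional to the difference between the two currents.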

Thus, shared integration capacitor 3803 (the first integration capacitor) is shared by ITV 3823 and ITV 3825 in a time-multiplexed manner, and shared integration capacitor 3804 (the second integration capacitor) is shared by ITV 3824 and ITV 3826 in a time-multiplexed manner.

Figure 38B depicts output circuit 3850, which can be used for four instances of column circuit 3502 in Figure 35. Output circuit 3850 has the same design as output circuit 3800, but with 8 ITVs instead of 4 and 4 SAR ADCs instead of 2. It uses shared capacitor network 3870, which comprises shared capacitors 3871 and 3872 and a series of switches for selectively coupling shared capacitors 3871 and 3872 to ITVs 3851, 3852, 3853, 3854, 3855, 3856, 3857, and 3858. ITV 3851 is coupled to bit line current IW1+ (a first current), ITV 3852 is coupled to bit line current IW1- (a second current), ITV 3853 is coupled to bit line current IW2+ (a third current), ITV 3854 is coupled to bit line current IW2- (a fourth current), ITV 3855 is coupled to bit line current IW3+ (a fifth current), ITV 3856 is coupled to bit line current IW3- (a sixth current), ITV 3857 is coupled to bit line current IW4+ (a seventh current), and ITV 3858 is coupled to bit line current IW4- (an eighth current). ITVs 3851 and 3852 are coupled to SAR ADC 3861, ITVs 3853 and 3854 are coupled to SAR ADC 3862, ITVs 3855 and 3856 are coupled to SAR ADC 3863, and ITVs 3857 and 3858 are coupled to SAR ADC 3864. SAR ADC 3861 comprises binary capacitor array (CDAC) 3865, SAR ADC 3862 comprises CDAC 3866, SAR ADC 3863 comprises CDAC 3867, and SAR ADC 3864 comprises CDAC 3868. The design of the CDACs is described in more detail below with reference to Figure 42.

Figure 38C depicts output circuit 3880, which can be used for four instances of column circuit 3502 in Figure 35. Output circuit 3880 is identical to output circuit 3850 in Figure 38B, except that it uses two SAR ADCs (SAR ADCs 3883 and 3884, containing CDACs 3885 and 3886, respectively) instead of four. This is achieved by adding multiplexers 3881 and 3882. Multiplexer 3881 allows the signals from ITVs 3851 and 3852 to be time-multiplexed with the signals from ITVs 3853 and 3854 so as to share SAR ADC 3883, and multiplexer 3882 allows the signals from ITVs 3855 and 3856 to be time-multiplexed with the signals from ITVs 3857 and 3858 so as to share SAR ADC 3884.

Figure 38D depicts output circuit 3890, which is identical to output circuit 3800 in Figure 38A, except that each ITV can be selectively coupled to one of two different bit lines by a switch or multiplexer (not shown). ITV 3823 can be coupled to IW1+ (the first current) or IW3+ (the fifth current), ITV 3824 can be coupled to IW1- (the second current) or IW3- (the sixth current), ITV 3825 can be coupled to IW2+ (the third current) or IW4+ (the seventh current), and ITV 3826 can be coupled to IW2- (the fourth current) or IW4- (the eighth current).

Figure 38E depicts output circuit 3895, which is similar to output circuit 3890, except that the two CDACs 3829 and 3830 are used within a single SAR ADC 3896.

Figure 39A depicts timing diagram 3900 for operating output circuit 3850, which contains eight ITVs and four SAR ADCs. There are two main periods.

The first period is the integration or sampling period (comprising sub-periods t1, t2, t3, and t4), during which the bit line currents are integrated by capacitors.

During sub-period t1 of the first period, shared capacitors 3871 and 3872 are coupled to ITVs 3851 and 3852, and bit line currents IW1+ and IW1- are sampled and held in CDAC 3865.

During sub-period t2 of the first period, shared capacitors 3871 and 3872 are coupled to ITVs 3853 and 3854, and bit line currents IW2+ and IW2- are sampled and held in CDAC 3866.

During sub-period t3 of the first period, shared capacitors 3871 and 3872 are coupled to ITVs 3855 and 3856, and bit line currents IW3+ and IW3- are sampled and held in CDAC 3867.

During sub-period t4 of the first period, shared capacitors 3871 and 3872 are coupled to ITVs 3857 and 3858, and bit line currents IW4+ and IW4- are sampled and held in CDAC 3868.

The second period (comprising sub-period t5) is the conversion period, during which SAR ADCs 3861, 3862, 3863, and 3864 perform conversion operations using the voltages stored in CDACs 3865, 3866, 3867, and 3868, respectively.
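The two-period sequence of timing diagram 3900 can be written out as a simple event list. The sketch below only restates the reference numerals and sub-periods given above; the tuple layout and the names `SAMPLING_STEPS`, `ADC_FOR_CDAC`, and `timing_3900` are hypothetical and introduced purely for illustration.

```python
# (sub-period, ITV pair, bit line currents, CDAC holding the sample)
SAMPLING_STEPS = [
    ("t1", (3851, 3852), ("IW1+", "IW1-"), 3865),
    ("t2", (3853, 3854), ("IW2+", "IW2-"), 3866),
    ("t3", (3855, 3856), ("IW3+", "IW3-"), 3867),
    ("t4", (3857, 3858), ("IW4+", "IW4-"), 3868),
]

# Each CDAC belongs to one SAR ADC in output circuit 3850.
ADC_FOR_CDAC = {3865: 3861, 3866: 3862, 3867: 3863, 3868: 3864}

def timing_3900():
    """Return the events of timing diagram 3900 in order."""
    events = []
    # First period: the shared capacitors serve one ITV pair per sub-period.
    for sub_period, itvs, currents, cdac in SAMPLING_STEPS:
        events.append((sub_period, "sample", itvs, currents, cdac))
    # Second period: all four SAR ADCs convert during t5.
    for cdac, adc in ADC_FOR_CDAC.items():
        events.append(("t5", "convert", adc, cdac))
    return events

events = timing_3900()
for event in events:
    print(event)
```

Because every sample has its own CDAC in this arrangement, all four conversions can wait until the shared capacitors have finished serving the four ITV pairs.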

Figure 39B depicts timing diagram 3920 for operating output circuit 3880 in Figure 38C, which contains eight ITVs and two SAR ADCs. As can be seen, there is a ping-pong action between the sampling by the ITVs and the conversion by the SAR ADCs, in which sample-and-hold occurs on one CDAC while conversion is performed on the other CDAC.

During sub-period t1, shared capacitors 3871 and 3872 are coupled to ITVs 3851 and 3852, and bit line currents IW1+ and IW1- are sampled and held in CDAC 3885.

During sub-period t2, shared capacitors 3871 and 3872 are coupled to ITVs 3853 and 3854, and bit line currents IW2+ and IW2- are sampled and held in CDAC 3886. SAR ADC 3883 performs a conversion operation on the value stored in CDAC 3885 during sub-period t1.

During sub-period t3, shared capacitors 3871 and 3872 are coupled to ITVs 3855 and 3856, and bit line currents IW3+ and IW3- are sampled and held in CDAC 3885. SAR ADC 3884 performs a conversion operation on the value stored in CDAC 3886 during sub-period t2.

During sub-period t4, shared capacitors 3871 and 3872 are coupled to ITVs 3857 and 3858, and bit line currents IW4+ and IW4- are sampled and held in CDAC 3886. SAR ADC 3883 performs a conversion operation on the value stored in CDAC 3885 during sub-period t3.

During sub-period t5, SAR ADC 3884 performs a conversion operation on the value stored in CDAC 3886 during sub-period t4.
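The ping-pong action of timing diagram 3920 can be sketched as a schedule in which each sub-period samples into one CDAC while the value sampled in the previous sub-period is converted. This is an illustrative model only; the function name `ping_pong_schedule` and the dictionary layout are hypothetical, while the reference numerals follow the sub-period descriptions above.

```python
def ping_pong_schedule(sample_cdacs, adc_for_cdac):
    """Build a schedule where sub-period k samples into one CDAC while
    the CDAC sampled in sub-period k-1 is converted by its SAR ADC."""
    schedule = []
    previous = None
    # One extra sub-period is needed to convert the last sample.
    for index, current in enumerate(list(sample_cdacs) + [None], start=1):
        convert = None
        if previous is not None:
            convert = (adc_for_cdac[previous], previous)  # (SAR ADC, CDAC)
        schedule.append({"sub_period": f"t{index}",
                         "sample_into": current,
                         "convert": convert})
        previous = current
    return schedule

# CDAC 3885 is converted by SAR ADC 3883, CDAC 3886 by SAR ADC 3884.
ADC_FOR_CDAC = {3885: 3883, 3886: 3884}
schedule = ping_pong_schedule([3885, 3886, 3885, 3886], ADC_FOR_CDAC)
for step in schedule:
    print(step)
```

With four samples, the schedule spans five sub-periods: t1 only samples, t2 through t4 sample and convert concurrently, and t5 only converts.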

Figure 39C depicts timing diagram 3940 for operating output circuit 3800 in Figure 38A, which contains four ITVs and two SAR ADCs, in which sample-and-hold occurs on one CDAC while conversion is performed on the other CDAC.

During sub-period t1, shared capacitors 3803 and 3804 are coupled to ITVs 3823 and 3824, and bit line currents IW1+ and IW1- are sampled and held in CDAC 3829.

During sub-period t2, shared capacitors 3803 and 3804 are coupled to ITVs 3825 and 3826, and bit line currents IW2+ and IW2- are sampled and held in CDAC 3830. SAR ADC 3827 performs a conversion operation on the value stored in CDAC 3829 during sub-period t1.

During sub-period t3, shared capacitors 3803 and 3804 are coupled to ITVs 3823 and 3824, and bit line currents IW3+ and IW3- are sampled and held in CDAC 3829. SAR ADC 3828 performs a conversion operation on the value stored in CDAC 3830 during sub-period t2.

During sub-period t4, shared capacitors 3803 and 3804 are coupled to ITVs 3825 and 3826, and bit line currents IW4+ and IW4- are sampled and held in CDAC 3830. SAR ADC 3827 performs a conversion operation on the value stored in CDAC 3829 during sub-period t3.

During sub-period t5, SAR ADC 3828 performs a conversion operation on the value stored in CDAC 3830 during sub-period t4.

Figure 39D depicts timing diagram 3950 for operating output circuit 3890 in Figure 38D, which contains four ITVs selectively coupled to 8 different bit lines and two SAR ADCs. Timing diagram 3950 uses a bit line current overlap technique to reduce settling time. For example, after the output conversion for bit line currents IW1+ and IW1-, bit line currents IW2+ and IW2- are enabled while IW1+ and IW1- remain enabled; only afterward are IW1+ and IW1- disabled and the conversion process for IW2+ and IW2- begun. This ensures that there is an effective current load into the ITV between read operations across multiple bit lines, which is useful when an ITV is shared by multiple bit lines.

During sub-period t1, bit line currents IW1+ and IW1- are enabled, shared capacitors 3803 and 3804 are coupled to ITVs 3823 and 3824, bit line currents IW1+ and IW1- are sampled and held in CDAC 3829, and SAR ADC 3827 performs a conversion operation on the value stored in CDAC 3829. After the conversion operation, IW2+ and IW2- are enabled while IW1+ and IW1- remain enabled.

During sub-period t2, bit line currents IW2+ and IW2- are enabled, shared capacitors 3803 and 3804 are coupled to ITVs 3825 and 3826, bit line currents IW2+ and IW2- are sampled and held in CDAC 3830, and SAR ADC 3828 performs a conversion operation on the value stored in CDAC 3830. After the conversion operation, IW3+ and IW3- are enabled while IW2+ and IW2- remain enabled.

During sub-period t3, bit line currents IW3+ and IW3- are enabled, shared capacitors 3803 and 3804 are coupled to ITVs 3823 and 3824, bit line currents IW3+ and IW3- are sampled and held in CDAC 3829, and SAR ADC 3827 performs a conversion operation on the value stored in CDAC 3829. After the conversion operation, IW4+ and IW4- are enabled while IW3+ and IW3- remain enabled.

During sub-period t4, bit line currents IW4+ and IW4- are enabled, shared capacitors 3803 and 3804 are coupled to ITVs 3825 and 3826, bit line currents IW4+ and IW4- are sampled and held in CDAC 3830, and SAR ADC 3828 performs a conversion operation on the value stored in CDAC 3830.

In another example, a current steering technique is used, in which an effective current load (for example, from a fixed supply) is provided to the ITV circuit while the ITV switches between output operations (current-to-voltage conversions), such as across multiple bit lines.

Alternatively, a precharge current can be enabled on each bit line before the sampling period in all of the schemes described herein. This helps ensure that the bit line ramps from a low voltage to a high voltage during the settling period.

Figure 40 depicts output circuit 4000, which can be used for i instances of column circuit 3502 in Figure 35. Output circuit 4000 comprises SAR ADC 3701 (or any type of ADC architecture), ITV 3704 for column W+, and ITV 3705 for column W-, as in output circuit 3700 described previously with reference to Figure 37. Unlike output circuit 3700, output circuit 4000 can be selectively connected to one of i different W+ current sources (current sources 4002-1, ..., 4002-i) by portions 4001-1, ..., 4001-i of a column multiplexer, and to one of i different W- current sources (current sources 4004-1, ..., 4004-i) by portions 4003-1, ..., 4003-i of the column multiplexer. At any given time, one pair of columns (one W+ column and one W- column) is connected to ITV 3704 for column W+ and ITV 3705 for column W-, and all other columns are disconnected from ITV 3704 and ITV 3705 by the multiplexer. Thus, i different W+/W- column pairs share ITVs 3704 and 3705 and SAR ADC 3701. Bit line current switching can be performed using one or more of the current steering technique and the bit line current overlap technique described above with reference to Figures 39C and 39D.

Figure 41 depicts reference voltage generator 4100 for generating VADCREFL and VADCREFH, which are used to generate voltages VREFP, VCIM, and VREFN, the reference voltages used by SAR ADC 3701 shown in the previous figures. Reference voltage generator 4100 is a current-to-voltage converter similar to that shown in Figure 37 and converts reference current IBLREF into voltages VADCREFH and VADCREFL. Reference voltage generator 4100 comprises switches 4101, 4102, 4103, 4104, and 4120; capacitors 4105 and 4106; transistor 4107; operational amplifier 4110; transistors 4108 and 4109; and reference current source 4111.

First, switches 4120, 4101, 4102, 4103, and 4104 are closed, causing the top and bottom plates of capacitors 4105 and 4106 to be charged to VDDA.

Switch 4120 is then opened. Reference current source 4111 draws reference current IBLREF, causing VADCREFL to be pulled down in proportion to the current drawn by reference current source 4111.

During a read operation, operational amplifier 4110 and transistors 4107, 4108, and 4109 impose VREF, which is applied to the positive input terminal of operational amplifier 3714, on the bit line, regardless of the magnitude of the current drawn by bit line IW+.

Figure 42 depicts SAR ADC 4200, which is an example of a SAR ADC that can be used for: SAR ADC 3701 in Figure 37; SAR ADCs 3827 and 3828 in Figure 38A; SAR ADCs 3861, 3862, 3863, and 3864 in Figure 38B; SAR ADCs 3883 and 3884 in Figure 38C; SAR ADCs 3827 and 3828 in Figure 38D; and SAR ADC 3701 in Figure 40. SAR ADC 4200 comprises CDAC 4201 (which corresponds to CDACs 3829, 3830, 3865, 3866, 3867, 3868, 3885, 3886, 3829, and 3830 in Figures 37, 38A, 38B, 38C, 38D, and 40), comparator 4202, SAR logic 4203, switch 4204, and switch 4205. When no operation is being performed in SAR ADC 3701, part or all of CDAC 4201 can optionally be used as capacitors 3803 and 3804 in Figure 38A, capacitors 3871 and 3872 in Figures 38B and 38C, and capacitors 3803 and 3804 in Figure 38D.

SAR ADC 4200 operates by first sampling input voltages VIN+ and VIN- into capacitor arrays (CDACs) 4201P and 4201N, respectively. SAR logic 4203 then converts the voltages into digital bits successively, starting with the most significant bit and ending with the least significant bit. For example, for an 8-bit ADC, B7 is converted first and B0 is converted last, so an 8-bit conversion takes 8 conversion clocks. For each conversion, VIN+ is compared with VIN-, and the comparison decision is used to switch the capacitor associated with that bit for the next bit comparison.
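The MSB-first search performed by the SAR logic can be illustrated with a behavioral sketch. This is not a model of CDAC 4201 itself: it assumes an idealized, unipolar converter in which the sampled difference VIN+ minus VIN- lies between 0 and a full-scale reference, and it replaces capacitor switching with a direct threshold comparison. The function name `sar_convert` and the parameter `v_full_scale` are hypothetical.

```python
def sar_convert(vin_p, vin_n, v_full_scale, bits=8):
    """Idealized successive approximation: one comparison per bit,
    starting at the most significant bit (assumed behavioral model)."""
    diff = vin_p - vin_n          # value sampled before conversion starts
    code = 0
    for bit in range(bits - 1, -1, -1):   # B7 first, B0 last for 8 bits
        trial = code | (1 << bit)          # tentatively set this bit
        threshold = trial * v_full_scale / (1 << bits)
        if diff >= threshold:              # comparator decision
            code = trial                   # keep the bit, else leave it clear
    return code

code_half = sar_convert(0.5, 0.0, 1.0)
code_other = sar_convert(0.3, 0.0, 1.0)
print(code_half, code_other)
```

Each pass of the loop corresponds to one conversion clock, so the 8-bit case takes eight comparisons regardless of the input value.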

Figure 43 depicts voltage generator 4300, which can be used to generate VREFP, VREFN, and VCIM for use by SAR ADC 4200 in Figure 42. Voltage generator 4300 comprises switches 4301, 4302, 4303, and 4304; capacitors 4305 and 4306; switch 4307; variable capacitor 4308; and comparator 4309. In this example, the capacitance of capacitor 4305 equals the capacitance of capacitor 4306. By switching switches 4301 to 4304 appropriately, VOUT = (VIN2 - VIN1) * (capacitance of capacitor 4305) / (capacitance of capacitor 4306). For example, to generate VREFP (in which case VOUT is the value used for VREFP), VADCREFH from Figure 41 is used for VIN1 and VADCREFL from Figure 41 is used for VIN2. Variable capacitor 4308 can be adjusted to generate VREFN and VCIM.
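The VOUT expression above can be evaluated directly. The short sketch below simply codes that formula; the function name `vout_4300` and the input voltages and capacitances are hypothetical values chosen for illustration, and no claim is made about the actual levels of VADCREFH or VADCREFL.

```python
def vout_4300(vin1, vin2, c_4305, c_4306):
    """VOUT = (VIN2 - VIN1) * C4305 / C4306, as stated for voltage generator 4300."""
    return (vin2 - vin1) * c_4305 / c_4306

# Equal capacitances, as in the described example: the gain is 1.
v_equal = vout_4300(vin1=0.25, vin2=1.0, c_4305=1e-12, c_4306=1e-12)

# A hypothetical 2:1 capacitance ratio doubles the output.
v_ratio = vout_4300(vin1=0.25, vin2=1.0, c_4305=2e-12, c_4306=1e-12)
print(v_equal, v_ratio)
```

With equal capacitances the output reduces to the difference between the two inputs, which is the case described for capacitors 4305 and 4306.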

Figure 44 depicts voltage generator 4400, which can be used to generate VREFP, VREFN, and VCM for use by SAR ADC 4200 in Figure 42. Voltage generator 4400 comprises clamp 4401 and resistor ladder 4402. Resistor ladder 4402 comprises n+1 resistors coupled in series. Each node has a different voltage, ranging from VREF at the top node of resistor ladder 4402 to ground at the bottom node of resistor ladder 4402. Appropriate nodes are selected to provide voltages VREFP, VREFN, and VCM for SAR ADC 4200.
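The node voltages of a series resistor ladder follow from ordinary voltage division, which the sketch below computes. It assumes an unloaded ladder driven by an ideal VREF; the function name `ladder_node_voltages`, the four equal 1 kΩ resistors, and the choice of taps are hypothetical and are not taken from the specification.

```python
def ladder_node_voltages(vref, resistors):
    """Node voltages of an unloaded series ladder, from the ground node
    up to the VREF node. `resistors` is ordered from ground side to top."""
    total = sum(resistors)
    voltages = [0.0]            # bottom node is ground
    running = 0.0
    for r in resistors:
        running += r
        voltages.append(vref * running / total)
    return voltages

# Hypothetical ladder with n + 1 = 4 equal resistors and VREF = 1.0 V.
nodes = ladder_node_voltages(1.0, [1000.0, 1000.0, 1000.0, 1000.0])

# Hypothetical tap selection for the three reference voltages.
vrefn, vcm, vrefp = nodes[1], nodes[2], nodes[3]
print(nodes, vrefn, vcm, vrefp)
```

Selecting different nodes, or using unequal resistors, changes the three voltages without altering the structure of the generator.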

As used herein, the terms "over" and "on" both inclusively include "directly on" (with no intermediate material, element, or space disposed therebetween) and "indirectly on" (with intermediate material, element, or space disposed therebetween). Likewise, the term "adjacent" includes "directly adjacent" (with no intermediate material, element, or space disposed therebetween) and "indirectly adjacent" (with intermediate material, element, or space disposed therebetween); "mounted to" includes "directly mounted to" (with no intermediate material, element, or space disposed therebetween) and "indirectly mounted to" (with intermediate material, element, or space disposed therebetween); and "electrically coupled" includes "directly electrically coupled to" (with no intermediate material or element therebetween that electrically connects the elements together) and "indirectly electrically coupled to" (with intermediate material or element therebetween that electrically connects the elements together). For example, forming an element "over a substrate" can include forming the element directly on the substrate with no intermediate material or element therebetween, as well as forming the element indirectly on the substrate with one or more intermediate materials or elements therebetween.

12: semiconductor substrate
14: source region
16: drain region
18: channel region
20: floating gate
22: word line terminal
24: bit line
28: control gate
30: erase gate
31: digital-to-analog converter
32, 900, 1000, 1100, 1200, 1300, 1601, 1701, 2001, 2101, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3401: VMM array
32a, 32b, 32c, 32d, 32e: VMM array (layer)
33: non-volatile memory cell array
34: erase gate and word line gate decoder
35: control gate decoder
36: bit line decoder
37: source line decoder
38: differential summer/summing operational amplifier
39, 1602, 1702, 2002, 2102: activation function block
210, 310, 410, 510, 710: memory cell
901, 1003, 1103, 1203, 1303: memory array
902, 1001, 1002, 1101, 1102, 1201, 1202, 1301, 1302: reference array
903: control gate line
904: erase gate line
1014, 1212: diode-connected through multiplexer
1204: cascode transistor
1205, 1709, 1710, 2104: multiplexer
1314: diode-connected reference cell through multiplexer
1400: LSTM
1401, 1402, 1403, 1404, 1801, 1802, 1803, 1804: cell
1500, 1600, 1700: LSTM cell
1501, 1502, 1503, 1901, 1902: sigmoid function device
1504, 1505, 1903: hyperbolic tangent device
1506, 1507, 1508, 1703, 1904, 1905, 1906, 2103: multiplier device
1509, 1708, 1907, 2105: addition device
1704, 1705, 1706, 1707, 2106, 2107, 2108: register
1800: GRU
1900, 2000, 2100: GRU cell
1908, 2109: complementary device
2701-1, 2701-2, ..., 2701-(N-1), 2701-N: bit line control gate
3100, 3210, 3300, 3400: VMM system
3101, 3102, 3213, 3303, 3304, 3305, 3306, 3307, 3308: summation circuit
3211: first array
3212: second array
3301, 3302: array
3402: row decoder
3403: high voltage decoder
3404: column decoder
3405: bit line driver
3406: input circuit
3407, 3500, 3700, 3800, 3850, 3880, 3890, 3895, 4000: output circuit
3408: control logic
3409: bias generator
3410: high voltage generation block
3411: charge pump
3412: charge pump regulator
3413: high voltage level generator
3414: algorithm controller
3415: analog circuitry
3416: control engine
3417: test control logic
3418: SRAM block
3501: reference circuit
3502: column circuit
3503, 3506: column multiplexer
3504, 3507, 3704, 3705, 3823, 3824, 3825, 3826, 3851, 3852, 3853, 3854, 3855, 3856, 3857, 3858: ITV
3505: reference generator
3508: analog-to-digital converter
3600, 3650: clock generator
3601: delay-locked loop
3602, 3652: clock generator block
3651: phase-locked loop
3701, 3827, 3828, 3861, 3862, 3863, 3864, 3883, 3884, 3896, 4200: SAR ADC
3702, 3703, 4002-1, ..., 4002-i, 4004-1, ..., 4004-i: current source
3706, 3707, 3708, 3717, 3718, 3719, 3805, 3806, 3807, 3808, 3809, 3810, 3811, 3812, 4101, 4102, 4103, 4104, 4120, 4204, 4205, 4301, 4302, 4303, 4304, 4307: switch
3710, 3711, 3721, 3722: integration capacitor; capacitor
3712, 3723: NMOS cascode transistor; transistor
3713, 3715, 3724, 3726, 4107, 4108, 4109: transistor
3714, 3725, 4110: operational amplifier
3801, 3802: circuit
3803, 3804, 3871, 3872: shared integration capacitor; shared capacitor; capacitor
3813, 3870: shared capacitor network; capacitor network
3829, 3830, 3865, 3866, 3867, 3868, 3885, 3886, 4201, 4201P, 4201N: CDAC
3881, 3882: multiplexer
3900, 3920, 3940, 3950: timing diagram
4001-1, ..., 4001-i, 4003-1, ..., 4003-i: portion
4100: reference voltage generator
4105, 4106, 4305, 4306: capacitor
4111: reference current source
4202, 4309: comparator
4203: SAR logic
4300, 4400: voltage generator
4308: variable capacitor
4401: clamp
4402: resistor ladder
BL0, BL1, BL2, BL3, ..., BLN: bit line
BLR0, BLR1, BLR2, BLR3: terminal
c0, c1, c2, c3, c(t-1), c(t): cell state vector
C1, C2, C3, S1, S2, S3: layer
CB1, CB2, CB3, CB4: synapse
CFGx: configuration bit
CG0, CG1, CG2, CG3, ..., CGM-1, CGM: control gate line/control gate voltage
CLKIN: system clock
CLKOUT: faster clock
DOUT, DOUT[n:0]: digital output
EG0, EG1: erase gate/EG line
EGR0, EGR1: EG line
EN: enable signal
h0, h1, h2, h3, h(t-1), h(t): output vector
IBLREF: reference current
Inputx, INPUT0, INPUT1, INPUT2, INPUT3, INPUT4, ..., INPUTN-1, INPUTN, INPUTM-1, INPUTM: input
IW+, IW-: current
IW1+, IW1-, IW2+, IW2-, IW3+, IW3-, IW4+, IW4-: bit line current
OUTPUT0, OUTPUT1, OUTPUT2, OUTPUT3, OUTPUT4, OUTPUTN-1, OUTPUTN: output
P1: activation function/pooling function
P2: activation function
S0: input layer
SL0, SL1, SL2, SL3: source line
t1, t2, t3, t4, t5: sub-period
VADCREFH, VADCREFL: reference voltage
VIN+, VIN-: differential voltage/input voltage
VREFP, VREFN, VCIM: voltage
WL0, WL1, WL2, WL3, WL4, WL5, WL6, WL7, WLM-1, WLM, WLA0, WLB0, WLA1, WLB1, WLA2, WLB2, WLA3, WLB3: word line
x0, x1, x2, x3, x(t): input vector

Figure 1 is a diagram illustrating an artificial neural network.

Figure 2 depicts a prior art split-gate flash memory cell.

Figure 3 depicts another prior art split-gate flash memory cell.

Figure 4 depicts another prior art split-gate flash memory cell.

Figure 5 depicts another prior art split-gate flash memory cell.

Figure 6 is a diagram illustrating the different levels of an artificial neural network utilizing one or more non-volatile memory arrays.

Figure 7 is a block diagram illustrating a VMM system.

Figure 8 is a block diagram illustrating an example artificial neural network utilizing one or more VMM systems.

Figure 9 depicts another example of a VMM system.

Figure 10 depicts another example of a VMM system.

Figure 11 depicts another example of a VMM system.

Figure 12 depicts another example of a VMM system.

Figure 13 depicts another example of a VMM system.

Figure 14 depicts a prior art long short-term memory system.

Figure 15 depicts an example cell for use in a long short-term memory system.

Figure 16 depicts an example implementation of the cell of Figure 15.

Figure 17 depicts another example implementation of the cell of Figure 15.

Figure 18 depicts a prior art gated recurrent unit system.

Figure 19 depicts an example cell for use in a gated recurrent unit system.

Figure 20 depicts an example implementation of the cell of Figure 19.

Figure 21 depicts another example implementation of the cell of Figure 19.

Figure 22 depicts another example of a VMM system.

Figure 23 depicts another example of a VMM system.

Figure 24 depicts another example of a VMM system.

Figure 25 depicts another example of a VMM system.

Figure 26 depicts another example of a VMM system.

Figure 27 depicts another example of a VMM system.

Figure 28 depicts another example of a VMM system.

Figure 29 depicts another example of a VMM system.

Figure 30 depicts another example of a VMM system.

Figure 31 depicts another example of a VMM system.

Figure 32 depicts another example of a VMM system.

Figure 33 depicts another example of a VMM system.

Figure 34 depicts another example of a VMM system.

Figure 35 depicts an output circuit for a VMM system.

Figure 36A depicts a clock generator.

Figure 36B depicts a clock generator.

Figure 37 depicts an output circuit for a VMM system.

Figure 38A depicts an output circuit for a VMM system.

Figure 38B depicts another output circuit for a VMM system.

Figure 38C depicts another output circuit for a VMM system.

Figure 38D depicts another output circuit for a VMM system.

Figure 38E depicts another output circuit for a VMM system.

Figure 39A depicts a timing diagram for an output circuit.

Figure 39B depicts a timing diagram for another output circuit.

Figure 39C depicts a timing diagram for another output circuit.

Figure 39D depicts a timing diagram for another output circuit.

Figure 40 depicts an output circuit for a VMM system.

Figure 41 depicts a reference voltage generator.

Figure 42 depicts a successive approximation register analog-to-digital converter.

Figure 43 depicts a voltage generator.

Figure 44 depicts a voltage generator.

C1: Layer
C2: Layer
C3: Layer
CB1: Synapse
CB2: Synapse
CB3: Synapse
CB4: Synapse
P1: Activation function/pooling function
P2: Activation function
S0: Input layer
S1: Layer
S2: Layer
S3: Output layer

Claims (7)

1. A system having an output circuit for a vector-by-matrix multiplication array, comprising:
a vector-by-matrix multiplication array comprising non-volatile memory cells arranged into rows and columns, a first set of columns storing W+ weights and a second set of columns storing W- weights; and
an output circuit to receive a first current from a first column in the first set of columns and a second current from a second column in the second set of columns and to generate a first voltage representing the first current and a second voltage representing the second current, and to receive a third current from a third column in the first set of columns and a fourth current from a fourth column in the second set of columns and to generate a third voltage representing the third current and a fourth voltage representing the fourth current, the output circuit comprising:
a first current-to-voltage converter selectively coupled to a first integration capacitor to provide the first voltage, the first voltage equal to an initial voltage minus a first discharge value due to the first current;
a second current-to-voltage converter selectively coupled to a second integration capacitor to provide the second voltage, the second voltage equal to the initial voltage minus a second discharge value due to the second current;
a third current-to-voltage converter selectively coupled to the first integration capacitor to provide the third voltage, the third voltage equal to the initial voltage minus a third discharge value due to the third current, wherein the first current-to-voltage converter and the third current-to-voltage converter share the first integration capacitor in a time-multiplexed manner; and
a fourth current-to-voltage converter selectively coupled to the second integration capacitor to provide the fourth voltage, the fourth voltage equal to the initial voltage minus a fourth discharge value due to the fourth current, wherein the second current-to-voltage converter and the fourth current-to-voltage converter share the second integration capacitor in a time-multiplexed manner.

2. The system of claim 1, comprising:
a first analog-to-digital converter to convert a difference between the first voltage and the second voltage into a first digital output.

3. The system of claim 2, wherein the first analog-to-digital converter is a successive approximation register analog-to-digital converter.

4. The system of claim 2, comprising:
a second analog-to-digital converter to convert a difference between the third voltage and the fourth voltage into a second digital output.

5. The system of claim 4, wherein the second analog-to-digital converter is a successive approximation register analog-to-digital converter.
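The relationship recited in claims 1 to 3 can be illustrated numerically. What follows is a minimal sketch rather than the claimed circuit: it assumes a constant column current over a fixed integration time, so that each discharge value equals I·t/C, and it models the successive approximation register converter as an ideal binary search. The component values, the names `integrate` and `sar_convert`, the reference range, and the sign convention chosen for the difference are all illustrative assumptions that do not come from the claims.

```python
def integrate(v_init, current, t_int, cap):
    """Voltage on an integration capacitor that was precharged to v_init
    and then discharged by a constant current for t_int seconds."""
    discharge = current * t_int / cap  # assumed discharge value: I * t / C
    return v_init - discharge


def sar_convert(v_diff, v_ref, bits):
    """Ideal successive approximation of v_diff in [-v_ref, +v_ref)
    to an offset-binary code, deciding one bit per step from the MSB down."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        # Ideal DAC level for the trial code, mapped onto [-v_ref, +v_ref).
        level = -v_ref + 2.0 * v_ref * trial / (1 << bits)
        if v_diff >= level:  # comparator decision: keep the bit
            code = trial
    return code


# Hypothetical operating point (not taken from the patent).
V_INIT = 1.0    # initial voltage, volts
C_INT = 1e-12   # integration capacitance, farads
T_INT = 100e-9  # integration time, seconds

v_first = integrate(V_INIT, 2e-6, T_INT, C_INT)   # current from a W+ column
v_second = integrate(V_INIT, 1e-6, T_INT, C_INT)  # current from a W- column

# Assumed sign convention: (second - first) is proportional to (I_W+ - I_W-).
d_out = sar_convert(v_second - v_first, v_ref=0.5, bits=8)
print(v_first, v_second, d_out)
```

Because both capacitors start from the same initial voltage, that voltage cancels in the difference, and the digital output depends only on the difference between the W+ and W- column currents.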
6. A method of operating an output circuit, comprising:
during a first time period:
coupling a first shared capacitor and a second shared capacitor to a first current-to-voltage converter and a second current-to-voltage converter to generate a first voltage and a second voltage, the first current-to-voltage converter coupled to a first bit line of an array of non-volatile memory cells and the second current-to-voltage converter coupled to a second bit line of the array; and
storing the first voltage and the second voltage in a first analog-to-digital converter;
during a second time period after the first time period:
coupling the first shared capacitor and the second shared capacitor to a third current-to-voltage converter and a fourth current-to-voltage converter to generate a third voltage and a fourth voltage, the third current-to-voltage converter coupled to a third bit line of the array and the fourth current-to-voltage converter coupled to a fourth bit line of the array, wherein the third current-to-voltage converter and the fourth current-to-voltage converter are different from the first current-to-voltage converter and the second current-to-voltage converter; and
storing the third voltage and the fourth voltage in a second analog-to-digital converter; and
during a third time period after the first time period and the second time period:
converting, by the first analog-to-digital converter, the first voltage and the second voltage into a first digital output; and
converting, by the second analog-to-digital converter, the third voltage and the fourth voltage into a second digital output.

7. A method of operating an output circuit, comprising:
receiving, by a first current-to-voltage converter, a second current-to-voltage converter, a third current-to-voltage converter, and a fourth current-to-voltage converter, currents from respective bit lines of an array of non-volatile memory cells;
during a first time period:
connecting a plurality of shared capacitors to the first current-to-voltage converter and the second current-to-voltage converter;
sampling and holding, by a first analog-to-digital converter, a first set of voltages received from the first current-to-voltage converter and the second current-to-voltage converter; and
converting, by the first analog-to-digital converter, the first set of voltages into a digital output; and
during a second time period:
connecting the shared capacitors to the third current-to-voltage converter and the fourth current-to-voltage converter;
sampling and holding, by a second analog-to-digital converter, a second set of voltages received from the third current-to-voltage converter and the fourth current-to-voltage converter; and
converting, by the second analog-to-digital converter, the second set of voltages into a digital output.
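The two method claims differ mainly in when conversion happens relative to sampling. The sketch below contrasts the two orderings under simplifying assumptions: the shared capacitors are modeled as ideal and fully precharged before each period, "storing" a voltage pair in a converter is modeled as a sample-and-hold, and conversion is replaced by a stand-in that reports the voltage difference in millivolts. The class `SampleHoldAdc`, the function names, and all numeric values are hypothetical and serve only to illustrate the sequencing.

```python
V_INIT = 1.0       # precharge voltage of the shared capacitors, volts (assumed)
C_SHARED = 1e-12   # capacitance of each shared capacitor, farads (assumed)
T_INT = 100e-9     # integration time per period, seconds (assumed)


def integrate_pair(i_a, i_b):
    """Precharge both shared capacitors, then discharge each with one bit-line current."""
    return (V_INIT - i_a * T_INT / C_SHARED,
            V_INIT - i_b * T_INT / C_SHARED)


class SampleHoldAdc:
    """Stand-in converter: holds a voltage pair and reports its difference in mV."""

    def __init__(self):
        self.held = None

    def sample(self, pair):
        self.held = pair

    def convert(self):
        v_a, v_b = self.held
        return round((v_b - v_a) * 1000)


def deferred_conversion(currents_1, currents_2):
    """Ordering as in claim 6: sample in periods 1 and 2, convert both in period 3."""
    adc1, adc2 = SampleHoldAdc(), SampleHoldAdc()
    adc1.sample(integrate_pair(*currents_1))  # period 1: first and second converters
    adc2.sample(integrate_pair(*currents_2))  # period 2: third and fourth converters
    return adc1.convert(), adc2.convert()     # period 3: both conversions


def per_period_conversion(currents_1, currents_2):
    """Ordering as in claim 7: sample, hold, and convert within each period."""
    adc1, adc2 = SampleHoldAdc(), SampleHoldAdc()
    adc1.sample(integrate_pair(*currents_1))  # period 1
    out1 = adc1.convert()
    adc2.sample(integrate_pair(*currents_2))  # period 2
    out2 = adc2.convert()
    return out1, out2


pair_1 = (2e-6, 1e-6)  # hypothetical bit-line currents, amperes
pair_2 = (1e-6, 3e-6)
print(deferred_conversion(pair_1, pair_2))
print(per_period_conversion(pair_1, pair_2))
```

In this idealized model both orderings return the same codes. The point of the sketch is only that the same pair of shared capacitors serves two pairs of current-to-voltage converters in successive periods, with each pair handing its result to its own converter.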
TW113126826A 2023-08-25 2024-07-18 System with output circuit for a vector-by-matrix multiplication array and operating method for output circuit TWI911805B (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US202363534755P 2023-08-25 2023-08-25
US63/534,755 2023-08-25
US18/386,901 US20250068900A1 (en) 2023-08-25 2023-11-03 Output circuit for a vector-by-matrix multiplication array
US18/386,901 2023-11-03
WOPCT/US23/81167 2023-11-27
PCT/US2023/081167 WO2025048866A1 (en) 2023-08-25 2023-11-27 Output circuit for a vector-by-matrix multiplication array

Publications (2)

Publication Number Publication Date
TW202509816A TW202509816A (en) 2025-03-01
TWI911805B true TWI911805B (en) 2026-01-11

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230268004A1 (en) 2020-05-13 2023-08-24 Silicon Storage Technology, Inc. Setting levels for a programming operation in a neural network array
