Cross-coupled floating-gate integrated memory-computing unit and memory-computing array
Technical Field
The invention relates to a cross-coupled floating-gate integrated memory-computing unit and a memory-computing array, and belongs to the technical field of semiconductors.
Background
The von Neumann architecture has contributed greatly to the development of computer hardware. However, because it separates memory from computation, memory access costs more time and energy than the data processing itself when a computer performs computation on massive data. This is especially true in scenarios such as convolutional neural networks and large language models, which require large numbers of matrix-vector multiplications. In the classical architecture, tens of thousands of transistors must be arranged in the processor as a computing core, and moving the data to be computed from the memory to the processor costs considerable energy and delay, which greatly reduces the energy efficiency of running such algorithms.
To overcome the power wall and memory wall bottlenecks, integrated memory-computing devices and schemes have been proposed. In such a scheme, the weights to be computed are stored in a memory-computing array built from memory devices, and a small number of memory-computing units operating by purely analog accumulation can complete matrix-vector multiplication locally and at higher speed. This in-situ computing approach greatly reduces the power waste and delay caused by data movement and effectively addresses the memory wall and power wall problems.
Integrated memory-computing schemes can be divided, according to the storage medium, into nonvolatile and volatile schemes. The main differences between the two are whether the weight information in the device can be erased and rewritten quickly and whether the weights are lost on power-down, so the two suit different usage scenarios. Nonvolatile memory-computing mainly handles weights in a network that do not need to change, whereas volatile memory-computing handles computations whose operands change rapidly. For example, the self-attention computation in a large language model requires multiplying two continuously changing matrices, which makes it better suited to volatile memory-computing.
The main existing volatile storage media are SRAM and eDRAM. SRAM suits purely digital schemes, but a single device occupies more than 100 F², which restricts parallelism and scale. eDRAM devices can be used for analog-domain computation, but their area still reaches 40 F² and their weight retention is only on the order of milliseconds, so refreshing the device weights becomes a problem when the array is too large. Neither SRAM nor eDRAM devices are therefore suitable for scenarios, such as large language model algorithms, that require deploying ultra-large-scale memory-computing arrays. A volatile integrated memory-computing unit that offers high precision and high storage density for purely analog computation is thus needed at the present stage.
Disclosure of Invention
To solve the above problems, the invention provides a cross-coupled floating-gate integrated memory-computing unit that serves as an eDRAM-type volatile integrated memory-computing unit. The cross-coupled structure allows the structures within the unit to be reused to the greatest extent, yielding a volatile memory-computing medium with high weight retention and small size.
The first object of the invention is to provide a cross-coupled floating-gate integrated memory-computing unit comprising a first coupling read-write subunit, a second coupling read-write subunit and a write switch transistor formed on the same substrate, wherein the first coupling read-write subunit and the second coupling read-write subunit are arranged conjugately on the two sides of the write switch transistor.
Optionally, the first coupling read-write subunit and the second coupling read-write subunit have the same structure;
The first coupling read-write subunit comprises a first write drain, a first write source and a first gate structure. A first shallow trench isolation layer is provided in the substrate region corresponding to the first gate structure and divides the first coupling read-write subunit into a first charge-coupled transistor and a first read-write transistor. The first charge-coupled transistor and the first read-write transistor share the first gate structure, and the first charge-coupled transistor has no source or drain. The first gate structure comprises, from bottom to top, the substrate, a bottom dielectric layer, a first floating gate, a first top dielectric layer and a first control gate;
Correspondingly, the second coupling read-write subunit comprises a second write drain, a second write source and a second gate structure. A second shallow trench isolation layer is provided in the substrate region corresponding to the second gate structure and divides the second coupling read-write subunit into a second charge-coupled transistor and a second read-write transistor. The second charge-coupled transistor and the second read-write transistor share the second gate structure, and the second charge-coupled transistor has no source or drain. The second gate structure comprises, from bottom to top, the substrate, the bottom dielectric layer, a second floating gate, a second top dielectric layer and a second control gate.
Optionally, the write switch transistor comprises, from bottom to top, the substrate, the bottom dielectric layer, a third floating gate, a third top dielectric layer and a third control gate. A third shallow trench isolation layer is provided in the substrate region corresponding to the write switch transistor and isolates the two paths formed by the two conjugately arranged coupling read-write subunits, so as to avoid a short circuit during writing.
Optionally, the first read-write transistor is used to read out the information on the number of electrons collected in the substrate of the first charge-coupled transistor and, under the control of the write switch transistor, to write electron-quantity information into the substrate of the second charge-coupled transistor through the first write drain. The second read-write transistor is used to read out the information on the number of electrons collected by the second charge-coupled transistor and, under the control of the write switch transistor, to write electron-quantity information into the first charge-coupled transistor through the second write drain.
Optionally, the first read-write transistor must be on the same side as the second charge-coupled transistor, and the second read-write transistor on the same side as the first charge-coupled transistor, so that once the write switch transistor is turned on, the first write drain is connected to the second charge-coupled transistor through the write switch transistor and the second write drain is connected to the first charge-coupled transistor through the write switch transistor.
Optionally, the substrate is P-type and the first write drain, first write source, second write drain and second write source are N-type; or the substrate is N-type and the first write drain, first write source, second write drain and second write source are P-type.
Optionally, the first floating gate, the second floating gate, the third floating gate, the first control gate, the second control gate and the third control gate are N-type polysilicon, P-type polysilicon or metal, and the bottom dielectric layer, the first top dielectric layer, the second top dielectric layer, the third top dielectric layer, the first shallow trench isolation layer, the second shallow trench isolation layer and the third shallow trench isolation layer are silicon dioxide, silicon nitride, or a combination thereof.
The second object of the present invention is to provide a memory-computing array formed by M×N of the cross-coupled floating-gate integrated memory-computing units. The first write drains and the second write drains of the units in the same column are respectively connected to form a first bit line and a second bit line; the first write sources and the second write sources of the units in the same column are respectively connected to form a first source line and a second source line; the first control gates and the second control gates of the units in the same row are respectively connected to form a first word line and a second word line; and the third control gates of the units in the same row are connected to form a write word line.
The third object of the present invention is to provide a method for reading and writing the cross-coupled floating-gate integrated memory-computing unit. During writing, a positive bias voltage is applied between the first control gate and the substrate and between the second control gate and the substrate, so that depletion regions are generated below the first charge-coupled transistor and the second charge-coupled transistor. The write switch transistor is turned on, and weight voltages are applied to the first write drain and the second write drain respectively; the magnitudes of these voltages represent the weight information to be written into the second charge-coupled transistor and the first charge-coupled transistor. The depletion regions below the first charge-coupled transistor and the second charge-coupled transistor then store corresponding numbers of electrons, and different numbers of electrons affect the potentials of the first floating gate and the second floating gate to different extents. The write switch transistor is then turned off, which completes the write operation;
During reading, a positive voltage is applied to the first control gate and/or the second control gate of the cross-coupled floating-gate integrated memory-computing unit while bias voltages are applied between the first write drain and the first write source and between the second write drain and the second write source. The current read by the first read-write transistor reflects the number of electrons stored in the depletion region of the first charge-coupled transistor, and the current read by the second read-write transistor reflects the number of electrons stored in the depletion region of the second charge-coupled transistor, which completes the read operation.
The fourth object of the present invention is to provide a calibration method for the above cross-coupled floating-gate integrated memory-computing unit, the method comprising:
Applying a strong electric field between the first write drain/second write drain and the first write source/second write source, and applying a positive voltage to the first control gate/second control gate, so that a transverse electric field along the channel direction and a longitudinal electric field pointing from the corresponding floating gate to the substrate are generated in the first read-write transistor/second read-write transistor; electrons gain energy from the electric field and enter the corresponding floating gate, which increases the initial threshold voltage of the first read-write transistor/second read-write transistor;
Applying a positive voltage to the first control gate/second control gate while holding the first write drain/second write drain and the first write source/second write source at a very low potential, so that a strong electric field pointing from the corresponding floating gate to the substrate is generated; electrons enter the corresponding floating gate by tunneling, which increases the initial threshold voltage of the first read-write transistor/second read-write transistor;
Applying a negative voltage to the first control gate/second control gate, so that an electric field pointing from the substrate to the corresponding floating gate is generated; electrons in the corresponding floating gate leave it by tunneling, which reduces the initial threshold voltage of the first read-write transistor/second read-write transistor.
The invention has the beneficial effects that:
Whereas a conventional eDRAM device stores charge on a gate, the cross-coupled floating-gate integrated memory-computing unit stores electrical information in the depletion region of the silicon substrate. A structure that stores weights in the depletion region has neither gate-capacitor leakage nor PN-junction leakage, and its weight retention is more than one order of magnitude better than that of a conventional 2T1C eDRAM.
An existing memory array is usually written by row or by column, with the weights of one row or one column of units written at a time. With the memory-computing array formed by the cross-coupled integrated memory-computing units provided by the invention, two rows of devices can be written at the same time, doubling the write speed compared with the conventional scheme.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a top view of a cross-coupled floating-gate integrated memory-computing unit provided by one embodiment of the present invention;
FIG. 2 is a cross-sectional view of the integrated memory-computing unit along the X direction provided by one embodiment of the present invention;
FIG. 3 is a cross-sectional view of the integrated memory-computing unit along the Y direction provided by one embodiment of the present invention;
FIG. 4 is a cross-sectional view of the integrated memory-computing unit along the Z direction provided by one embodiment of the present invention;
FIG. 5 is a schematic diagram of a local layout that can be repeated when the integrated memory-computing units are formed into a large-scale array according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a method for writing 2×N weights at a time in an M×N array of integrated memory-computing units according to an embodiment of the present invention;
Reference numerals: 100-first coupling read-write subunit, 200-second coupling read-write subunit, 300-write switch transistor, 101-first write drain, 102-first write source, 110-first gate structure, 201-second write drain, 202-second write source, 210-second gate structure;
111-first floating gate, 112-first top dielectric layer, 113-first control gate, 203-depletion region of the second charge-coupled transistor for collecting electrons, 211-second floating gate, 212-second top dielectric layer, 213-second control gate, 301-third floating gate, 302-third top dielectric layer, 303-third control gate, 400-substrate, 500-bottom dielectric layer;
204-second shallow trench isolation layer, 304-third shallow trench isolation layer;
1000-first dashed box, 2000-second dashed box, 3000-third dashed box.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
Example 1
This embodiment provides a cross-coupled floating-gate integrated memory-computing unit and the corresponding read and write methods. As shown in Fig. 1, the unit comprises a first coupling read-write subunit 100, a second coupling read-write subunit 200 and a write switch transistor 300 formed on the same substrate 400, with the first coupling read-write subunit 100 and the second coupling read-write subunit 200 arranged conjugately on the two sides of the write switch transistor 300.
Fig. 1 is a top view of the integrated memory-computing unit. A first shallow trench isolation layer is provided in the region of the substrate 400 corresponding to the first gate structure 110 and divides the first coupling read-write subunit 100 into a first charge-coupled transistor and a first read-write transistor. The first charge-coupled transistor and the first read-write transistor share the first gate structure 110, and the first charge-coupled transistor has no source or drain. Likewise, a second shallow trench isolation layer is provided in the region of the substrate 400 corresponding to the second gate structure 210 and divides the second coupling read-write subunit 200 into a second charge-coupled transistor and a second read-write transistor. The second charge-coupled transistor and the second read-write transistor share the second gate structure 210, and the second charge-coupled transistor has no source or drain.
As described above, the integrated memory-computing unit consists of the first charge-coupled transistor, the first read-write transistor, the second charge-coupled transistor, the second read-write transistor, the write switch transistor, and the substrate, which cannot be seen directly in the top view.
Fig. 2 is a cross-sectional view of the integrated memory-computing unit along the line running through the first read-write transistor, the write switch transistor and the second charge-coupled transistor (the X-direction section in Fig. 1). That is, Fig. 2 shows the first read-write transistor in the first coupling read-write subunit 100, the write switch transistor 300, and the second charge-coupled transistor in the second coupling read-write subunit 200. For ease of understanding, these parts are marked in Fig. 2 with a first dashed box 1000, a third dashed box 3000 and a second dashed box 2000, respectively.
In this section, the first read-write transistor is shown in the first dashed box 1000, the second charge-coupled transistor in the second dashed box 2000, and the write switch transistor in the third dashed box 3000.
Fig. 3 is a cross-sectional view of the integrated memory-computing unit along the Y direction in Fig. 1, in which the second shallow trench isolation layer 204 separates the second charge-coupled transistor from the second read-write transistor. It can be seen that the charge-coupled transistor shares its floating gate and control gate with the read-write transistor, so the number of electrons collected in the depletion region of the charge-coupled transistor changes the potential of the floating gate and thereby the threshold voltage of the read-write transistor.
Fig. 4 is a cross-sectional view of the integrated memory-computing unit along the Z direction in Fig. 1. The third shallow trench isolation layer 304 beneath the write switch transistor isolates the path from the first write drain through the write switch transistor to the depletion region of the second charge-coupled transistor from the path from the second write drain through the write switch transistor to the depletion region of the first charge-coupled transistor, which avoids a short circuit during writing. In practice it may be joined with the shallow trench isolation layers that separate the charge-coupled transistors from the read-write transistors to form a single insulating layer.
The method for reading and writing the cross-coupled floating gate type integrated memory cell comprises the following steps:
During writing, a positive bias voltage is applied between the first control gate 113 and the substrate 400 and between the second control gate 213 and the substrate 400, so that depletion regions are generated below the first charge-coupled transistor and the second charge-coupled transistor. The write switch transistor 300 is turned on, and weight voltages are applied to the first write drain 101 and the second write drain 201 respectively; the magnitudes of these voltages represent the weight information to be written into the second charge-coupled transistor and the first charge-coupled transistor. The depletion regions below the first charge-coupled transistor and the second charge-coupled transistor then store corresponding numbers of electrons, and different numbers of electrons affect the potentials of the first floating gate 111 and the second floating gate 211 to different extents. The write switch transistor 300 is then turned off, which completes the write operation.
During reading, a positive voltage is applied to the first control gate 113 and/or the second control gate 213 of the cross-coupled floating-gate integrated memory-computing unit while bias voltages are applied between the first write drain 101 and the first write source 102 and between the second write drain 201 and the second write source 202. The current read by the first read-write transistor reflects the number of electrons stored in the depletion region of the first charge-coupled transistor, and the current read by the second read-write transistor reflects the number of electrons stored in the depletion region of the second charge-coupled transistor, which completes the read operation.
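The cross-coupled routing described above can be illustrated with a minimal behavioural sketch. It is not a device model: the class name, the method names and the simplification that the stored charge level equals the applied weight voltage are all assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CrossCoupledCell:
    """Toy model of one unit; charge levels are abstract numbers (assumption)."""
    cc1_charge: float = 0.0        # held under the first charge-coupled transistor
    cc2_charge: float = 0.0        # held under the second charge-coupled transistor
    write_switch_on: bool = False  # state of the write switch transistor

    def write(self, v_drain1: float, v_drain2: float) -> None:
        # Turn the write switch on, route each write drain to the
        # charge-coupled transistor of the opposite subunit, then turn it off.
        self.write_switch_on = True
        self.cc2_charge = v_drain1  # first write drain -> second charge-coupled transistor
        self.cc1_charge = v_drain2  # second write drain -> first charge-coupled transistor
        self.write_switch_on = False

    def read(self) -> Tuple[float, float]:
        # Each read-write transistor senses the charge-coupled transistor
        # that shares its floating gate, i.e. the one in its own subunit.
        return self.cc1_charge, self.cc2_charge


cell = CrossCoupledCell()
cell.write(v_drain1=0.3, v_drain2=0.7)
print(cell.read())  # (0.7, 0.3)
```

The swapped order in the printed result reflects the cross coupling: the value driven on the first write drain is read back through the second read-write transistor, and vice versa.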
Example 2
In this embodiment, referring to Fig. 5, Fig. 5 shows a local layout that can be repeated when the integrated memory-computing units are integrated into a large-scale array, where the arrows indicate the direction in which weights are written. As can be seen, the gate structures of the write switch transistors of the units in the same row are shared, and the first source/second source structures of adjacent units in the same row can also be shared, which further reduces the area of each unit when integrated into an array.
The first write drains and the second write drains of the units in the same column are respectively connected to form a first bit line (BL1) and a second bit line (BL2); the first write sources and the second write sources of the units in the same column are respectively connected to form a first source line (SL1) and a second source line (SL2); the first control gates and the second control gates of the units in the same row are respectively connected to form a first word line (WL1) and a second word line (WL2); and the third control gates of the units in the same row are connected to form a write word line (WWL).
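As a rough illustration of this wiring, the sketch below maps a unit's position in the array to the lines attached to its terminals. The function name, the dictionary keys and the `<index>` string format are illustrative assumptions; only the line labels themselves come from the text.

```python
def lines_for_cell(row: int, col: int) -> dict:
    """Return the names of the array lines attached to the unit at (row, col)."""
    return {
        # Column-wise lines: write drains and write sources.
        "first_write_drain": f"BL1<{col}>",
        "second_write_drain": f"BL2<{col}>",
        "first_write_source": f"SL1<{col}>",
        "second_write_source": f"SL2<{col}>",
        # Row-wise lines: control gates of both subunits and of the write switch.
        "first_control_gate": f"WL1<{row}>",
        "second_control_gate": f"WL2<{row}>",
        "third_control_gate": f"WWL<{row}>",
    }


a = lines_for_cell(0, 0)
b = lines_for_cell(0, 3)  # same row, different column
print(a["third_control_gate"] == b["third_control_gate"])  # True
print(a["first_write_drain"] == b["first_write_drain"])    # False
```

Units in the same row share a write word line, while their bit lines differ by column, which is what allows a whole row to be opened at once and each unit to receive its own pair of weight voltages.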
When a programming operation is performed, 0 V is applied to the write word lines WWL of the whole chip, 5 V is applied to the word line of the read-write transistor to be programmed and 0 V to the unselected word lines, and 3 V is applied to the bit line of the read-write transistor to be programmed and 0 V to the source lines and the unselected bit lines. Channel hot electron programming then occurs in the selected read-write transistor, raising its threshold voltage, while no other read-write transistor is programmed.
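The programming bias conditions can be collected into a small lookup, sketched below. The voltages are the ones stated in the text; the function name, its arguments and the line-name strings in the usage example are hypothetical.

```python
def program_bias(word_lines, bit_lines, source_lines, write_word_lines,
                 selected_wl, selected_bl):
    """Return {line name: voltage in volts} for programming one read-write transistor."""
    bias = {}
    for wwl in write_word_lines:
        bias[wwl] = 0.0                               # all write switches stay off
    for wl in word_lines:
        bias[wl] = 5.0 if wl == selected_wl else 0.0  # gate voltage on the selected word line
    for bl in bit_lines:
        bias[bl] = 3.0 if bl == selected_bl else 0.0  # drain voltage on the selected bit line
    for sl in source_lines:
        bias[sl] = 0.0                                # all source lines at 0 V
    return bias


bias = program_bias(
    word_lines=["WL1<0>", "WL2<0>", "WL1<1>", "WL2<1>"],
    bit_lines=["BL1<0>", "BL2<0>"],
    source_lines=["SL1<0>", "SL2<0>"],
    write_word_lines=["WWL<0>", "WWL<1>"],
    selected_wl="WL1<1>",
    selected_bl="BL1<0>",
)
print(bias["WL1<1>"], bias["BL1<0>"], bias["WL2<1>"], bias["WWL<1>"])  # 5.0 3.0 0.0 0.0
```

Only the transistor at the crossing of the selected word line and the selected bit line sees both the 5 V gate voltage and the 3 V drain voltage at once.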
When an array erase operation is performed, 0 V is applied to the write word lines WWL of the whole chip, 10 V is applied to all word lines, and the bit lines, the source lines and the substrate are all held at 0 V. FN tunneling then occurs in the read-write transistors of the entire array, electrons leave the floating gates by tunneling, and the threshold voltage of the read-write transistors is reduced.
When the array is written, 3 V is applied to the write word line WWL of the row to be written, which turns on all write switch transistors in that row. Voltages are then applied to all bit lines BL1 and BL2; their values are the weights to be written into the corresponding first and second charge-coupled transistors. Once the voltages have been applied, the write word line WWL of the row is set to 0 V and the write of that row is complete. The write word lines WWL of the other rows are then turned on in turn and the operation is repeated until the whole array is written. A conventional memory array is written by row or by column at the array level, with the weights of one row or one column of units written at a time, whereas with the cross-coupled floating-gate integrated memory-computing unit array provided by the invention two rows of devices can be written at the same time, doubling the write speed compared with the conventional scheme.
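The row-by-row write procedure can be sketched as follows, assuming (for illustration only) that each stored level simply equals the voltage driven on the bit line. The function and variable names are hypothetical. The sketch counts write word line pulses to show that an M×N array receives 2×M×N weights in M steps.

```python
def write_array(bl1_weights, bl2_weights):
    """Write two M x N weight matrices, one row of units per WWL pulse.

    bl1_weights[r][c] is driven on BL1<c> and bl2_weights[r][c] on BL2<c>
    while WWL<r> is high.
    """
    m, n = len(bl1_weights), len(bl1_weights[0])
    cc1 = [[None] * n for _ in range(m)]  # first charge-coupled transistors
    cc2 = [[None] * n for _ in range(m)]  # second charge-coupled transistors
    pulses = 0
    for r in range(m):
        # WWL<r> goes to 3 V: every write switch transistor in row r conducts.
        for c in range(n):
            cc2[r][c] = bl1_weights[r][c]  # BL1 reaches the second charge-coupled transistor
            cc1[r][c] = bl2_weights[r][c]  # BL2 reaches the first charge-coupled transistor
        # WWL<r> returns to 0 V: row r is written.
        pulses += 1
    return cc1, cc2, pulses


w1 = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
w2 = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.4]]
cc1, cc2, pulses = write_array(w1, w2)
print(pulses)  # 3
```

In this example, 12 weights are stored with 3 write word line pulses, two weights per unit per pulse.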
Referring to Fig. 6, Fig. 6 shows a large-scale array in which the write word line WWL<0> is turned on and the write voltages of the corresponding units are applied simultaneously to the first bit lines BL1<0:n> and the second bit lines BL2<0:n>, so that the first row of integrated memory-computing units is written. When reading or computing, a read voltage of 3 V is applied to the word line of each read-write transistor to be read or used in the computation, and a read source-drain voltage difference of 0.2 V is applied across the corresponding source and drain. The current flowing out of a read-write transistor represents the weight information stored by its charge-coupled transistor, and because the read-write transistors in the same column are connected in parallel, their currents accumulate in the analog domain, which completes the computation.
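The analog accumulation along a column can be sketched numerically. This is a minimal illustration under a strong simplifying assumption that is not stated in the text: each selected read-write transistor contributes a current proportional to its stored weight, with a hypothetical scale factor `i_unit`. Unselected rows are assumed to contribute nothing.

```python
def column_currents(weights, active_rows, i_unit=1.0):
    """Sum the per-unit read currents of each column for the selected rows.

    weights[r][c] is the stored weight at row r, column c; active_rows[r]
    is 1 when the word line of row r receives the read voltage, else 0.
    """
    totals = [0.0] * len(weights[0])
    for r, row in enumerate(weights):
        if not active_rows[r]:
            continue  # word line not selected: no current from this row
        for c, w in enumerate(row):
            totals[c] += i_unit * w  # parallel transistors add their currents
    return totals


weights = [[1, 2],
           [3, 4],
           [5, 6]]
print(column_currents(weights, active_rows=[1, 0, 1]))  # [6.0, 8.0]
```

Under these assumptions, each column total equals the dot product of the binary row-selection vector with that column of weights, so reading all columns at once performs one matrix-vector multiplication.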
Some steps in the embodiments of the present invention may be implemented by using software, and the corresponding software program may be stored in a readable storage medium, such as an optical disc or a hard disk.
The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention; any modifications, equivalents and improvements made within the spirit and principles of the invention are intended to be included within its scope of protection.