
HK1186270A - Semiconductor memory device with plural memory die and controller die - Google Patents


Info

Publication number
HK1186270A
Authority
HK
Hong Kong
Prior art keywords
die
memory
data
read data
controller
Prior art date
Application number
HK13113677.8A
Other languages
Chinese (zh)
Inventor
P. Gillingham
Original Assignee
Conversant Intellectual Property Management Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Conversant Intellectual Property Management Inc.


Description

Semiconductor memory device having multiple memory dies and controller die
The present application is a divisional application of application No. 201180010822.2, application date 2/7/2011, entitled "semiconductor memory device having a plurality of memory dies and a controller die".
Technical Field
The present invention relates generally to semiconductor memory devices, and more particularly to semiconductor memory devices having multiple memory dies and a controller die.
Background
Multi-chip packages (MCPs) that integrate multiple stacked semiconductor chips, such as DRAM devices, in a single package achieve higher densities than individual chips packaged in dedicated packages.
U.S. patent 7,515,453 to Rajan describes an interface chip packaged with two or more DRAM die in a single package. The interface chip is capable of communicating with multiple DRAM dies over a shared data bus so that only a single die can be accessed at a given time. Alternatively, each DRAM die may have a dedicated data bus to the interface die so that multiple interfaces can be operated in parallel to provide higher bandwidth.
U.S. patent 7,386,656 to Rajan et al shows various configurations of stacked DRAM die located in the same package with a buffer chip. The external command buses (address, control and clock) may be buffered by the interface chip and provided to all DRAM dies on a common internal bus, or it may be provided to each DRAM die on separate internal buses, or it may be provided to several DRAM dies on each separate internal bus. The external data bus may be bi-directionally buffered by the interface chip and provided to all DRAM dies on a common internal bus, or it may be provided to each DRAM die on separate internal buses, or it may be provided to several DRAM dies on each separate internal bus.
However, these and other existing MCP implementations have various drawbacks, including high power consumption. This is problematic, especially for mobile devices where battery power is a limited resource. Therefore, it is desirable in the industry to be able to design MCPs with reduced power consumption.
Disclosure of Invention
According to a broad aspect, the present invention is directed to a semiconductor memory device, comprising: a plurality of memory dies; a controller die connected to an internal control bus, the controller die configured to provide internal read commands to selected ones of the memory dies in response to external read commands; wherein the selected memory die is configured to return read data to the controller die in response to the internal read command; wherein when at least two of the memory dies are selected as the selected memory die, a delay between the controller die receiving the external read command and the controller die receiving read data from the selected memory die is different for the at least two memory dies.
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.
Drawings
Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of a multi-chip package (MCP) employing a controller die and a plurality of memory dies, in accordance with certain non-limiting embodiments of the present invention;
FIGS. 2A and 2B are block diagrams illustrating different possible configurations of MCPs in terms of bus and pin capacity;
FIG. 3 is a signal flow diagram illustrating possible interactions between various system components during a read operation;
FIG. 4 is a timing diagram showing the controller die resynchronizing read data received from the memory die without the memory die clocking their read outputs relative to a global clock signal;
FIG. 5 shows an example of a physical configuration for stacking memory dies and controller dies to make an MCP; and
FIG. 6 is a diagram of a multi-rank MCP RDIMM, according to a specific non-limiting embodiment of the present invention.
Detailed Description
FIG. 1 shows a block diagram of a semiconductor memory device 100, which includes a plurality of memory dies 110A, 110B, 110C, and 110D and a controller die 120. The semiconductor memory device 100 may be referred to as a multi-chip package (MCP). Each of the memory dies 110A, 110B, 110C, and 110D and the controller die 120 may be referred to as a "Known Good Die" (KGD) to indicate that it has been adequately tested in wafer form prior to packaging into the MCP 100.
Memory dies (KGD) 110A, 110B, 110C, and 110D may be Dynamic Random Access Memory (DRAM) devices (including synchronous DRAM, or SDRAM) or other types of memory devices, particularly devices that are expected to have low read and write latencies. In this particular example, the number of memory dies is 4, but this should not be considered limiting. In certain non-limiting example embodiments, one or more of the memory dies 110A, 110B, 110C, and 110D may be DRAM devices that conform to JEDEC DDR3 standard JESD79-3C (which is incorporated herein by reference). In some embodiments, different subsets of the memory dies 110A, 110B, 110C, and 110D may conform to different standards, which may or may not include the aforementioned JEDEC JESD79-3C standard.
Controller die 120 may be referred to as a "bridge chip" because it provides memory dies 110A, 110B, 110C, and 110D with an interface to the outside world. In particular, the external control bus 130 and the external data bus 140 connect the memory controller 150 to the MCP 100 by connecting to the controller die 120. The memory controller 150 and the MCP 100 may both be connected via a motherboard 160. The connection between the memory controller 150 and the MCP 100 may be direct, or via registers and/or via one or more other MCPs. The controller die 120 may be configured to interact with the external control bus 130 and the external data bus 140 according to a given standard (such as JEDEC DDR3) such that the MCP 100 is considered a standard-compliant device from the perspective of the memory controller 150.
External control bus 130 carries command/address signals and a global clock signal from memory controller 150. The external data bus 140 includes external data lines that carry valid data when active and data strobe lines that carry data strobe signals. The data strobe signal is a clock signal that is used to indicate when the external data lines are active and therefore carry valid data. Since data may originate from either the memory controller 150 or the MCP 100, the data strobe line is driven by either the memory controller 150 or the MCP 100, depending on whether write data is being transferred from the memory controller 150 to the MCP 100 or read data is being transferred from the MCP 100 to the memory controller 150.
The external control bus 130, which provides command/address signals and a global clock signal, is buffered and provided to each memory die along an internal control bus. The command/address signals and the global clock signal can be delivered to the internal control bus with a latency of as little as one clock cycle. In the illustrated embodiment, separate dedicated internal control buses 190A, 190B, 190C, and 190D are provided, one for each memory die 110A, 110B, 110C, and 110D, respectively. Thus, when a command from memory controller 150 is addressed to a particular one of memory dies 110A, 110B, 110C, and 110D, but not to the other memory dies, controller die 120 determines the destination memory die for the command and activates only the internal control bus to that destination memory die, which saves power. Alternatively, a single internal control bus can be provided that is shared in parallel by all of the memory dies 110A, 110B, 110C, and 110D. This reduces the number of pads on the controller die 120 and the number of interconnects in the MCP 100, but at the cost of increased power consumption.
The controller die 120 is also connected to the memory dies 110A, 110B, 110C, and 110D through respective internal data buses 170A, 170B, 170C, and 170D. The internal data bus connecting controller die 120 to a particular one of memory dies 110A, 110B, 110C, and 110D includes internal data lines that carry valid data when active and data strobe lines that carry data strobe signals. The data strobe signal is a clock signal that is used to indicate when the internal data lines are active and therefore carry valid data. Since data may originate from controller die 120 or from a particular one of memory dies 110A, 110B, 110C, and 110D, the data strobe line is driven by controller die 120 or by the particular memory die, depending on whether write data is being transferred from controller die 120 to the particular memory die or read data is being transferred from the particular memory die to controller die 120.
To improve performance, particularly at high frequencies, controller die 120 can be configured to provide on-die termination (ODT) on its external interfaces (i.e., the external data bus 140 and the external control bus 130). To this end, the controller die 120 can implement various ODT options as described by, for example, the JEDEC DDR3 standard. One such option is split resistive termination to the supply voltages VDDQ and VSSQ. Alternatively, to save power, a single resistive termination to a termination voltage regulated to an intermediate voltage between VDDQ and VSSQ may be used, such as VTT = 1/2 (VDDQ + VSSQ). An example of the latter technique is described in U.S. patent application publication No. 2010/0201397 entitled "Termination Circuit for On-Die Termination," assigned to the assignee of the present application, which is incorporated herein by reference. For this purpose, a linear VTT regulator may be used for low cost and ease of integration on controller die 120, or an inductive regulator may be used to provide higher power efficiency. In this case, the VTT regulator may be integrated in the MCP 100. Alternatively, VTT may be provided to the MCP 100 through the motherboard 160 and a dedicated VTT pin or pins on the MCP.
It should be appreciated that the interconnect distance from the controller die 120 to each of the memory dies 110A, 110B, 110C, and 110D remains relatively short even for the memory die furthest from the controller die 120, since all are located within the same MCP 100. As a result, the internal data buses 170A, 170B, 170C, and 170D and the internal control bus (or buses 190A, 190B, 190C, and 190D) do not require on-die termination. It is therefore contemplated that the memory dies 110A, 110B, 110C, and 110D are implemented either without on-die termination at all (which saves chip real estate) or with on-die termination that can be turned off (e.g., programmed by an extended mode register and/or by connecting ODT pads to a supply voltage as provided by the JEDEC DDR3 standard). In both cases, the absence (or disabling) of ODT results in lower power consumption than if ODT were activated.
To know which of the internal data buses 170A, 170B, 170C, and 170D will be active for a read or write operation, the controller die 120 must identify the selected memory die based on command/address signals received via the external control bus 130. Various implementations allow the selected memory die to be identified by the controller die 120. To illustrate some of these implementations, assume for simplicity that each of the four (4) memory dies 110A, 110B, 110C, and 110D is the same size, equal to 2^N words, so that an addressable word can be identified by N address bits. The capacity of the MCP is therefore effectively 2^(N+2) words, addressable with N+2 bits.
In one possible implementation, as shown in FIG. 1, the memory controller 150 interacts with the MCP 100 as if it were a 4-rank DRAM device, requiring rank selection (actually memory die selection) in addition to identifying the desired address within the selected memory die. To this end, memory controller 150 identifies the selected memory die by using four (4) Chip Enable (CE) lines 180 provided directly to controller die 120. Command/address signals received by controller die 120 along external control bus 130 encode the N bits needed to identify an address within the address space of the selected memory die.
In another possible implementation, the memory controller interacts with the MCP as if it were a 2-rank DRAM device, requiring selection of a rank in addition to identifying the desired address in the address space of the rank. To this end, the memory controller identifies the selected rank by using two (2) Chip Enable (CE) lines provided directly to the controller die, while command/address signals received by the controller die along the external control bus include one (1) extra bit to identify the selected memory die within that rank. The remaining N address bits identify an address in the address space of the selected memory die.
In yet another possible implementation, the memory controller interacts with the MCP as if it were a single DRAM device with four (4) times the number of banks, rows, or columns. To this end, the memory controller implicitly identifies the selected memory die by using two (2) additional address bits that form part of the address encoded by the command/address signals on the external control bus. The remaining N address bits identify an address in the address space of the implicitly selected memory die.
Those skilled in the art will appreciate that the internal and external data buses need not have the same width (pin count), total speed, or speed per pin. In particular, the bandwidth requirements of the external data bus can be met by a variety of different configurations (only some of which are illustrated).
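The three die-selection schemes described above can be sketched as follows. This is a minimal illustration, not the patented circuit: the value of N, the function names, the mapping of ranks and extra bits to die indices 0 to 3, and the placement of the extra bits just above the N per-die address bits are all assumptions made for the example.

```python
from typing import Tuple

N = 10  # hypothetical: each die holds 2**N words
DIE_MASK = (1 << N) - 1  # selects the N per-die address bits


def select_by_four_ce(ce_lines: Tuple[int, int, int, int], address: int) -> Tuple[int, int]:
    """4-rank view: one of four chip-enable lines picks the die directly."""
    if sum(ce_lines) != 1:
        raise ValueError("exactly one chip-enable line must be asserted")
    return ce_lines.index(1), address & DIE_MASK


def select_by_two_ce(ce_lines: Tuple[int, int], address: int) -> Tuple[int, int]:
    """2-rank view: a CE line picks the rank, one extra address bit picks the die in it."""
    if sum(ce_lines) != 1:
        raise ValueError("exactly one chip-enable line must be asserted")
    rank = ce_lines.index(1)
    extra_bit = (address >> N) & 1  # assumed to sit just above the N die-local bits
    return rank * 2 + extra_bit, address & DIE_MASK


def select_by_address(address: int) -> Tuple[int, int]:
    """Single-device view: two extra address bits implicitly pick the die."""
    return (address >> N) & 0b11, address & DIE_MASK


# Each call returns (die index, address within that die).
print(select_by_four_ce((0, 0, 1, 0), 5))
print(select_by_two_ce((0, 1), (1 << N) | 7))
print(select_by_address((2 << N) | 9))
```

In all three cases the controller die ends up with the same two pieces of information, a die index and an N-bit die-local address; only the split between chip-enable lines and address bits differs.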
For example, suppose that the external data bus 140 is P conductors wide and has a capacity of R bits per second per pin. This yields a total capacity of P x R bits per second on the external data bus 140. If the internal data buses 170A, 170B, 170C, and 170D are assumed to be identical, each having the same width P but a per-pin capacity of 1/2 R bits per second, the case of FIG. 2A applies. Specifically, two (2) of the memory dies 110A, 110B, 110C, and 110D should be activated simultaneously, so that the aggregate bandwidth of the internal data buses corresponding to the activated memory dies totals P x R, the total capacity of the external data bus.
On the other hand, if each of the internal data buses 170A, 170B, 170C, and 170D has a per-pin capacity of 1/2 R bits per second but twice the width (i.e., 2P conductors), the case of FIG. 2B applies. Specifically, the bandwidth of each internal data bus 170A, 170B, 170C, and 170D is P x R, which matches the capacity of the external data bus 140. Thus, only a single memory die needs to be activated to meet the requirements of the external data bus 140.
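The bandwidth matching in the two cases above reduces to a single division. The sketch below is illustrative only; the function name and the normalized values P = 8 and R = 1 are assumptions chosen for the example.

```python
from fractions import Fraction


def dies_to_activate(ext_width, ext_rate, int_width, int_rate):
    """Number of identical internal buses needed to match the external bus bandwidth."""
    ext_bandwidth = Fraction(ext_width) * Fraction(ext_rate)  # P x R
    int_bandwidth = Fraction(int_width) * Fraction(int_rate)
    # Ceiling division: a partially used internal bus still needs its die activated.
    return -(-ext_bandwidth // int_bandwidth)


P, R = 8, 1  # hypothetical width and normalized per-pin rate

# FIG. 2A case: same width P, half the per-pin rate.
print(dies_to_activate(P, R, P, Fraction(1, 2)))
# FIG. 2B case: double width 2P, half the per-pin rate.
print(dies_to_activate(P, R, 2 * P, Fraction(1, 2)))
```

Exact fractions are used so that a per-pin rate such as 1/2 R does not introduce rounding error into the ceiling division.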
Clearly, it should be appreciated that the use of the controller die 120 provides flexibility in being able to accommodate a wide range of memory die and internal data bus design choices to achieve system requirements.
It should be noted from the above example that during the time that the memory controller 150 writes/reads data to/from a particular one of the memory dies 110A, 110B, 110C, and 110D along the respective internal data buses 170A, 170B, 170C, and 170D, one or more of the other internal data buses 170A, 170B, 170C, and 170D can remain idle. This allows for a reduction in the amount of power consumed by the internal data buses 170A, 170B, 170C, and 170D and the memory dies 110A, 110B, 110C, and 110D connected to the controller die 120 as a whole.
Referring now to the signal flow diagram in FIG. 3, an example of basic signaling that can be used by the controller die 120 and the memory dies 110A, 110B, 110C, and 110D in the context of a read operation is outlined below. First, the controller die 120 receives a global clock signal on the external control bus 130 and external command/address signals synchronized therewith. The external command/address signal includes a first portion that encodes a read command specifying that a read operation is to occur and prepares controller die 120 to receive an address. The second portion of the external command/address signal encodes an address from which data is to be read. Either the address is complete enough to allow the controller die 120 to identify the selected memory die, or the controller die 120 obtains this information from another signal, such as one of the chip enable lines 180. In either case, the controller die 120 identifies the selected memory die. The remainder of the address specifies a read address in the address space of the selected memory die.
After the selected memory die has been identified, the controller die 120 sends internal command/address signals to the selected memory die along an internal control bus (which may be a shared bus or a dedicated bus depending on the configuration). More specifically, the controller die 120 synchronizes internal command/address signals with an internal clock signal, and both are sent to the selected memory die along an internal control bus. A master DLL (not shown) may be provided in the controller die 120 to lock the internal clock signal to the global clock signal. The internal command/address signals include a first portion that encodes a read command specifying that a read operation is to occur and prepares the selected memory die to receive an address. The second portion of the internal command/address signals encodes the aforementioned read address in the address space of the selected memory die.
The selected memory die receives internal command/address signals and internal clock signals along an internal control bus. As mentioned above, the internal command/address signals are synchronized with the internal clock signal. The selected memory die retrieves the data from the memory cell at the read address through its internal circuitry and places the "read data" onto its internal data bus. The selected memory die also controls the generation of a data strobe signal that is enabled when the internal data lines carry valid data. Thus, the internal data lines carry source synchronous data signals. Because the selected memory die takes control of its internal data bus, the data placed on the data lines need not be synchronized with the internal clock signal received by the selected memory die via the internal control bus. The dedicated internal data buses 170A, 170B, 170C, and 170D eliminate the possibility of read data bursts from one memory die interfering with read data bursts from another memory die (which may occur if multiple memory dies sharing a common internal data bus are activated sequentially).
Read data from a selected memory die received on a corresponding internal data bus is captured by the controller die 120 and resynchronized for transmission on the external data bus 140. Proper capture of the read data is achieved by using a clock that is phase shifted by 90 degrees with respect to a data strobe signal received with the read data on the internal data bus. To this end, controller die 120 includes a slave DLL whose frequency is tied to the frequency of the master DLL associated to the global clock signal. The slave DLL is triggered by the rising edge of the data strobe signal and thereafter generates a clock signal that is exactly 90 degrees out of phase with the data strobe signal and at the same frequency as the received data strobe signal.
Controller die 120 may read data from several different selected memory dies in the manner described above. Accordingly, a slave DLL is provided for each of the memory dies 110A, 110B, 110C, and 110D. A buffer (e.g., FIFO, not shown) in the controller die 120 can hold the data until it should be provided on the external data bus 140. Resynchronization can be achieved by using the master DLL described previously. This provides accurate latency control in the case of a read operation, as it allows the controller die 120 to ensure that any read data output onto the external data bus 140 follows a determined number of clock cycles after receipt of a read command. This will be explained in more detail later. The controller die 120 also controls the generation of an external data strobe signal that is activated when the external data lines carry valid data.
It should be noted that the selected memory die does not need a synchronization circuit (e.g., DLL) locked to an internal clock signal or any other clock signal, since the selected memory die does not need to align its read data with any received clock signal. This means that such circuitry can be disabled (which can save power), or omitted entirely (which additionally saves chip real estate). This ability to disable existing synchronization circuits may be provided by programming memory dies 110A, 110B, 110C, and 110D to operate in a "DLL off" mode (as defined for standard DDR2 or DDR3 DRAM devices that conform to JEDEC specifications).
Thus, those skilled in the art will appreciate that the cost of the MCP can be kept low by omitting DLL and ODT circuit blocks from the memory die that can be found in standard DRAM devices. These circuit blocks are not required for point-to-point information transfer on a dedicated internal bus. In addition, since only light loads are encountered in the MCP environment, the output driver size can be reduced.
The use of controller die 120 introduces some latency into the read operation because it takes the selected memory die at least one additional clock cycle before it knows the identity of the read address. However, any adverse consequences resulting from such additional delay are compensated for by the benefits of reduced capacitive loading, reduced power consumption (due to the ability to disable one or several inactive buses and disable ODT and DLL circuits), and reduced DRAM device cost/die size.
Referring now to FIG. 4, a timing diagram illustrating two consecutive read operations shows signals at the controller die 120. CLK represents the global clock signal received from the memory controller over the external control bus 130. CLK may also represent an internal clock signal distributed to memory dies 110A, 110B, 110C, and 110D over a shared internal control bus if a master DLL is used to align the internal clock with the global clock. If the master DLL is not used for this purpose, there may be a phase shift between the internal clock signal and the global clock signal.
EXT_CMD represents an external command/address signal that is provided on the external control bus 130 in synchronization with CLK (in this case, aligned to the falling edge of CLK). EXT_CMD includes a first external read command 410 and a second external read command 420 for processing by controller die 120. The memory die to which a particular read command is directed is determined based on the external command/address signals. For purposes of this example, assume that the memory die targeted by the first external read command 410 is memory die 110B and the memory die targeted by the second external read command 420 is memory die 110A. Each external read command results in a corresponding internal read command destined for the selected memory die. In particular, INT_CMD denotes an internal command/address signal, which is provided on the (shared) internal control bus in synchronization with the internal clock (in this case, by using the master DLL, INT_CMD and the internal clock are both synchronized with CLK so that there is no uncontrolled phase shift, only a one-clock-cycle delay). INT_CMD includes a first internal read command 430 (which is delayed one full clock cycle from the first external read command 410) and a second internal read command 440 (which is also delayed one full clock cycle from the second external read command 420). Only a single INT_CMD signal is shown. This represents the case where all memory dies are connected to a shared internal control bus. In the case of separate internal control buses, there will be multiple internal command/address signals (e.g., INT_CMD1, INT_CMD2, etc.), and the respective internal read commands 430, 440 will appear on the respective internal control bus associated with the addressed memory die.
The first internal read command 430 is processed by the memory die 110B, which retrieves the first read data 450 from the memory location at the desired address. Memory die 110B outputs the first read data 450 in a source synchronous manner onto the internal data lines of internal data bus 170B. That is, when read data is provided on the internal data lines, the memory die 110B also activates the data strobe signal. This may be preceded by a preamble (lasting, for example, one full clock cycle) during which the data strobe signal remains at a low logic level. DQ2[0..N] represents the data on the internal data bus 170B providing the first read data 450, and DQS2 represents the data strobe signal. It should be noted that DQS2 exhibits a full clock cycle preamble 455 at a low logic level.
In much the same way, the second internal read command 440 is processed by the memory die 110A, which retrieves the second read data 460 from the memory location at the desired address. The memory die 110A outputs the second read data 460 in a source synchronous manner onto the internal data lines of the internal data bus 170A. That is, when read data is provided on the internal data lines, the memory die 110A also activates the data strobe signal. This may be preceded by a preamble (lasting, for example, one full clock cycle) during which the data strobe signal remains at a low logic level. DQ1[0..N] represents the data on the internal data bus 170A providing the second read data 460, and DQS1 represents the data strobe signal. It should be noted that DQS1 exhibits a full clock cycle preamble 465 at a low logic level.
As mentioned above, memory die 110A and memory die 110B may not be equipped with circuitry for synchronizing to CLK, or such circuitry may be disabled. As a result, memory die 110A and memory die 110B output their data with an arbitrary phase relative to CLK. This would be the case, for example, when memory die 110A and memory die 110B output their data asynchronously. This results in variations in the "CAS latency" (or "CL") between different memory dies. In particular, the latency between the issuance of the first internal read command 430 by the controller die 120 and the appearance of the first read data 450 on the internal data lines of the internal data bus 170B may differ from the latency between the issuance of the second internal read command 440 by the controller die 120 and the appearance of the second read data 460 on the internal data lines of the internal data bus 170A. In practice, the CAS latency CLn (for memory die 110n) may vary anywhere within the range from the minimum value of CL (CLmin) to the maximum value of CL (CLmax), which may span more than one clock cycle. Factors that may affect CAS latency in a particular instance of an internal read command include manufacturing variations, the distance between the selected memory die and the controller die 120, and local temperature gradients (to mention just a few possibilities).
As shown in FIG. 4, which assumes a DDR (double data rate) mode of operation, first read data 450 provided by memory die 110B includes a burst of four (4) data words at a maximum CAS latency CLmax (which is equal to 3 full clock cycles), while second read data 460 provided by memory die 110A includes a burst of four (4) data words at a minimum CAS latency CLmin (which is slightly greater than 2 full clock cycles in this example).
Controller die 120 receives read data 450, 460 on the two internal data buses 170B, 170A. Specifically, in the DDR mode, the controller die 120 samples the first read data 450 (second read data 460) on the rising and falling edges of the received data strobe signal DQS2 (DQS1) delayed by 90 degrees by the corresponding slave DLL. Some initial training may be required to determine the proper internal timing to enable each slave DLL during the preamble. It should be noted that controller die 120 expects read data to arrive from any selected memory die no earlier than CLmin and no later than CLmax after an internal read command is issued to that memory die.
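The capture timing just described can be sketched numerically. This is a simplified model under stated assumptions, not the slave DLL itself: the function names are hypothetical, time is measured in arbitrary units, and the 90-degree shift is modeled as a quarter of the strobe period added after each strobe edge.

```python
def ddr_sample_times(first_strobe_edge, period, words):
    """Sample instants for a DDR burst: a quarter period after each strobe edge.

    In DDR mode one word is transferred per strobe edge (rising and falling),
    so consecutive edges are half a period apart.
    """
    half, quarter = period / 2, period / 4
    return [first_strobe_edge + i * half + quarter for i in range(words)]


def arrival_in_window(cas_latency, cl_min, cl_max):
    """True if read data arrives within the window the controller die expects."""
    return cl_min <= cas_latency <= cl_max


# Burst of four words, strobe period of 4 time units, first edge at t = 0.
print(ddr_sample_times(0.0, 4.0, 4))
# Hypothetical latencies checked against a CLmin..CLmax window of 2..3 cycles.
print(arrival_in_window(2.25, 2.0, 3.0), arrival_in_window(3.5, 2.0, 3.0))
```

Sampling a quarter period after each edge places the capture point in the middle of each data word, which is why the phase-shifted clock rather than the raw strobe is used for capture.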
A buffer within controller die 120 is able to hold first read data 450 and second read data 460 until they should be provided onto the external data lines of external data bus 140, i.e., at a time that represents a determined delay with respect to the receipt of first external read command 410 or second external read command 420 from memory controller 150. DQ[0..N] represents data on external data bus 140, including first read data 450 and immediately subsequent second read data 460.
Controller die 120 outputs first read data 450 and second read data 460 onto external data bus 140 in a source synchronous manner. That is, the controller die 120 activates the data strobe signal (represented by DQS in FIG. 4) to indicate that the signals present on the external data lines of the external data bus 140 are valid data. This may be preceded by a preamble (lasting, for example, one full clock cycle) during which DQS remains at a low logic level. It should be noted that DQS exhibits a full clock cycle preamble 475 at a low logic level; however, the preamble is needed only once (i.e., before the first read data 450 is output).
Thus, data received on the internal data bus 170n (for the memory die 110n) with the CAS latency CLn is delayed by an additional ((CLmax-CLn) +1) clock cycles before appearing on the external data bus 140.
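The hold-and-release behavior of the buffer can be sketched as a small FIFO keyed by release cycle. This is an illustrative model only: the class and method names are hypothetical, the fixed total latency of 5 clock cycles is taken from the example in this description, and a whole burst is released at once rather than word by word.

```python
from collections import deque

CL_EXT = 5  # total external read latency in clock cycles, per the example in this description


class ReadBuffer:
    """Holds captured bursts until their deterministic release cycle."""

    def __init__(self):
        self._fifo = deque()

    def capture(self, ext_cmd_cycle, words):
        # The release time depends only on when the external command arrived,
        # not on how quickly the memory die returned the data.
        self._fifo.append((ext_cmd_cycle + CL_EXT, list(words)))

    def release(self, cycle):
        """Return the oldest burst if its release cycle has been reached, else None."""
        if self._fifo and self._fifo[0][0] <= cycle:
            return self._fifo.popleft()[1]
        return None


buf = ReadBuffer()
buf.capture(0, ["b0", "b1", "b2", "b3"])  # burst answering a command received at cycle 0
buf.capture(2, ["a0", "a1", "a2", "a3"])  # burst answering a command received at cycle 2
for cycle in range(4, 8):
    print(cycle, buf.release(cycle))
```

With commands two cycles apart and four-word DDR bursts lasting two cycles each, the two releases land at cycles 5 and 7, so the second burst follows the first without a gap.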
Thus, the aggregate total latency CLext between the receipt of an external read command (e.g., 410, 420) at the controller die via the external control bus 130 and the output of data (e.g., 450, 460) on the external data bus 140 may be represented as follows:
From external read command to internal read command:      1
From internal read command to receipt of read data:     + CLn
Equalization delay added by the controller:             + ((CLmax-CLn) +1)
=======================
Aggregate total delay CLext:                            = CLmax + 2 = 5 clock cycles (for CLmax = 3)
Thus, it can be seen that the aggregate total delay CLext is uniform and independent of CLn. From the perspective of memory controller 150, the overall latency can therefore remain the same, even though the individual memory dies 110A, 110B, 110C, and 110D may have different CAS latencies (due to various factors, especially the disabled synchronization circuit or the lack of a synchronization circuit). It should thus be appreciated that the MCP 100 provides a deterministic latency with respect to external read commands without requiring a DLL located on each memory die 110A, 110B, 110C, and 110D. In particular, it should be noted that the aggregate total latency between the issuance of the first external read command 410 by the memory controller 150 and the appearance of the first read data 450 on the external data bus 140 is the same as the aggregate total latency between the issuance of the second external read command 420 by the memory controller 150 and the appearance of the second read data 460 on the external data bus 140. Therefore, the aggregate total delay can be kept consistent.
In the above example, it is assumed that CLmax is exactly equal to three (3) clock cycles. Of course, CLmax may be different in certain implementations, and may not even be an integer number of clock cycles. In this case, the above calculation may be changed to account for the difference between CLmax and the next largest integer. Alternatively, CLmax can be adjusted to the next highest half clock cycle to achieve a CLmax of 3.5 or 4.5. In either case, however, the resulting value of CLext will be independent of CLn.
It should be noted that the use of separate internal data buses 170A, 170B, 170C and 170D overcomes various problems. First, if memory die 110A and memory die 110B share a common internal data bus, and if two internal read commands are issued consecutively as described above, the end of the burst from memory die 110A will collide with the beginning of the burst from memory die 110B. Moreover, since the data strobe signal accompanying a given burst has a longer duration than the burst itself (since the full clock cycle preamble has a low logic level), it is not possible to arrange the bursts in a continuous (gapless) manner using a common internal data bus, which would reduce the usable capacity of such a common internal data bus. However, in the embodiments of MCP100 described herein, these problems do not arise because each memory die has its own internal data bus. Also, in the embodiments of the MCP100 described herein, bursts of data from individual memory dies (which may overlap, be contiguous, or separated by time gaps) are connected to create longer gapless bursts for improved bus utilization.
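As a rough illustration of how bursts arriving on separate internal data buses could be joined into one gapless burst, the sketch below computes output start times. The scheduling rule, the `min_delay` parameter, and the cycle values are assumptions made for illustration, not the controller die's actual logic.

```python
def gapless_schedule(bursts, min_delay=1):
    """Return the output start cycle of each burst so that the bursts
    appear back to back on the external data bus.

    bursts: list of (arrival_cycle, length_in_cycles) in output order,
            each arriving on its own internal data bus.
    min_delay: assumed minimum controller delay between the arrival of a
               burst and the start of its output.
    """
    if not bursts:
        return []
    # Offset of each burst from the start of the concatenated output.
    offsets = []
    total = 0
    for _, length in bursts:
        offsets.append(total)
        total += length
    # Earliest start of the first burst that lets every later burst begin
    # no sooner than min_delay cycles after it starts arriving.
    first_start = max(arrival + min_delay - offset
                      for (arrival, _), offset in zip(bursts, offsets))
    return [first_start + offset for offset in offsets]


overlapping = gapless_schedule([(4, 4), (6, 4)])   # second burst is buffered
contiguous = gapless_schedule([(4, 4), (8, 4)])
separated = gapless_schedule([(4, 4), (10, 4)])    # first burst is held back
print(overlapping, contiguous, separated)
```

In the overlapping case the second burst waits in a buffer until the first has finished; in the separated case the first burst is delayed further so that the second can follow it without a gap.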
From a physical standpoint, and as also shown in FIG. 5, the controller die 120 and the memory dies 110A, 110B, 110C, and 110D can be stacked on top of each other in the MCP 100. The controller die 120 can be smaller than any of the memory dies 110A, 110B, 110C, and 110D, and thus can be placed on top of them, while the memory dies 110A, 110B, 110C, and 110D can themselves be stacked on a package substrate. Wire bonds from the memory dies 110A, 110B, 110C, 110D and the controller die 120 can be connected down to the package substrate to enable external connections and inter-die connections. In one embodiment, a custom memory die can be built with bond pads 550 along the edge of the memory die, as shown in FIG. 5, which can facilitate die stacking. While bond pads can be located along two sides of a memory die, a more advantageous memory die may be one in which bond pads are located along only one side. This allows the dies to be stacked and staggered to expose the bond pads on all dies in the stack, thereby facilitating direct wire bonding to the package substrate without the need for an interposer.
In some configurations, several rows of MCPs are arranged on the front side (and possibly the back side) of a printed circuit board. This may be referred to as a dual in-line memory module (DIMM). DIMM modules are commonly used in PCs, where a user can upgrade memory by adding or replacing modules inserted into motherboard slots. A DIMM module conforming to JEDEC DDR3 standard JESD79-3C has a total of 240 pins and provides a 64-bit or 72-bit data interface.
In other configurations, multiple MCPs may be "registered". In particular, FIG. 6 illustrates a multi-row MCP registered DIMM (RDIMM) system utilizing an MCP RDIMM 601, where the MCP RDIMM 601 has a plurality of MCPs 600A, 600B, 600C, 600D, 600E, 600F, 600G, and 600H mounted on a circuit board. In the example shown, the number of MCPs is eight (8), but this is not a limitation of the present invention. The MCP RDIMM 601 has an interface 640 that may be connected to a memory controller 650 via a motherboard 660. In a typical PC, several DIMM slots are mounted on the motherboard to facilitate system upgrades.
In addition, the MCP RDIMM 601 includes a discrete register chip 610 mounted on the circuit board. The register chip 610 is configured to buffer external command/address signals and global clock signals received via the interface 640 for distribution to the MCPs 600A, 600B, ..., 600H. Specifically, there are two (2) separate intermediate control buses: one (620L) provides command/address and clock signals to the four (4) MCPs on the left (MCPs 600A, 600B, 600C, 600D), and the other (620R) provides command/address and clock signals to the four (4) MCPs on the right (MCPs 600E, 600F, 600G, 600H). A termination resistor network 630L, 630R is placed at the end of each intermediate control bus 620L, 620R to remove reflections and maintain signal integrity. There may be fewer or more than the number shown, depending on the speed of operation and module board design considerations.
The register chip 610 may detect which set of MCPs (i.e., left or right) is being accessed based on the Chip Enable (CE) line or address bits, and drive only the required intermediate control bus (i.e., 620L or 620R). In a standard PC DIMM, all external data buses are activated and therefore both the left and right control buses must be activated.
The external data bus of each MCP is connected directly to the memory controller 650 via the interface 640, without passing through the register chip 610. Specifically, for x8 MCPs (i.e., MCPs with an external data bus 8 bits wide), the external data buses can be connected in byte groups to achieve an x64 module data width. Other groupings are possible, such as nibble groupings implemented using x4 DRAM devices. It is also possible to support DIMM modules with an x72 module data width by using, for example, a total of nine (9) byte-wide MCPs.
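The device counts implied by these groupings follow from simple division, as the sketch below shows. The helper name is hypothetical, and the count for x4 devices is derived arithmetic rather than a figure stated in the description.

```python
def devices_needed(module_width_bits, device_width_bits):
    """Number of devices whose external data buses must be grouped side
    by side to reach the given module data width (hypothetical helper)."""
    if module_width_bits % device_width_bits != 0:
        raise ValueError("module width must be a multiple of device width")
    return module_width_bits // device_width_bits


x64_bytewide = devices_needed(64, 8)   # byte groups, x64 module
x72_bytewide = devices_needed(72, 8)   # byte groups, x72 module
x64_nibble = devices_needed(64, 4)     # nibble groups, x64 module
print(x64_bytewide, x72_bytewide, x64_nibble)
```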
As described above, each of the MCPs 600A, 600B, ..., 600H includes a controller die and a plurality of memory dies. In a given MCP, the controller die also buffers external command/address signals and global clock signals received via the register chip 610. The external data bus of a given MCP is connected directly between the memory controller 650 and the controller die of the given MCP, bypassing the register chip 610.
The register chip 610 includes a Delay Locked Loop (DLL) for generating an internal clock for capturing and regenerating command/address and internal clock signals. The input is latched (or registered) by using the input sampling clock, and the latched signal is clocked out by using the output drive clock. Typically, the output drive clock is automatically adjusted to provide a delay of one (1) clock cycle from input to output through the register chip 610.
The effect of the register chip 610 and the controller die 120 of each MCP on latency is as follows. First, the register chip 610 adds one (1) clock cycle of delay to the command stream, and the controller die 120 adds another one (1) clock cycle of delay to the command stream. On the data path, the controller die 120 adds a delay of one (1) clock cycle to the read data provided to the external data bus by the selected memory die, and a delay of one (1) clock cycle to the write data passing from the external data bus to the selected memory die. Thus, the read data latency of the MCP RDIMM 601 is three (3) clock cycles more than that of an unbuffered DRAM device (non-MCP, unbuffered DIMM) and two (2) clock cycles more than that of a conventional (non-MCP) RDIMM. The write data latency of the MCP RDIMM 601 is one (1) clock cycle more than that of an unbuffered DRAM device (non-MCP, unbuffered DIMM) and is the same as that of a conventional (non-MCP) RDIMM.
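The comparison above can be reproduced with a small bookkeeping sketch. The class and field names are hypothetical, and the rule used for write latency (command-path delay minus write-data-path delay) is an interpretation assumed here because it yields the figures stated in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModulePath:
    """Clock cycles added by buffering on each path (illustrative model)."""
    command_delay: int     # added on the command/address path
    read_data_delay: int   # added to read data on its way out
    write_data_delay: int  # added to write data on its way in

    def extra_read_latency(self):
        # The command arrives late and the data returns late: delays add up.
        return self.command_delay + self.read_data_delay

    def extra_write_latency(self):
        # Assumed rule: a delay on the write data path offsets part of the
        # delay on the command path.
        return self.command_delay - self.write_data_delay


UNBUFFERED = ModulePath(command_delay=0, read_data_delay=0, write_data_delay=0)
CONVENTIONAL_RDIMM = ModulePath(command_delay=1, read_data_delay=0, write_data_delay=0)
# Register chip (1 cycle) plus controller die (1 cycle) on the command path.
MCP_RDIMM = ModulePath(command_delay=2, read_data_delay=1, write_data_delay=1)

read_vs_unbuffered = MCP_RDIMM.extra_read_latency() - UNBUFFERED.extra_read_latency()
read_vs_rdimm = MCP_RDIMM.extra_read_latency() - CONVENTIONAL_RDIMM.extra_read_latency()
write_vs_unbuffered = MCP_RDIMM.extra_write_latency() - UNBUFFERED.extra_write_latency()
write_vs_rdimm = MCP_RDIMM.extra_write_latency() - CONVENTIONAL_RDIMM.extra_write_latency()
print(read_vs_unbuffered, read_vs_rdimm, write_vs_unbuffered, write_vs_rdimm)
```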
In the MCP RDIMM system described above, adding the register chip 610 reduces the load on the memory controller 650 with respect to the external control bus. Also, each of the MCPs 600A, 600B, ..., 600H presents only a single load on the external control bus and on the external data bus. As a result, a greater number of MCPs can be accommodated (and thus greater memory density can be achieved), and the operating frequency can be maximized. In addition, power consumption is lower, and since the load per module is reduced, a higher termination value can be used. Moreover, it will be appreciated that the MCP RDIMM described above uses a smaller module board area than a conventional RDIMM with even half the capacity. This allows for lower module heights and more compact systems, which is particularly beneficial in portable devices and blade servers where a small form factor is a critical requirement.
It will thus be appreciated that MCPs having a controller die for buffering control signals and data signals to a plurality of memory dies have been provided. The memory die and the controller die can be assembled into a stack. The controller die presents a single load to the external memory controller to achieve high performance while reducing power consumption. In particular, power reduction is achieved by providing separate internal data (and possibly control) buses to the individual memory dies, and activating only those buses connected to the active memory devices. Power consumption is also reduced by operating the internal data and control buses in a non-terminated mode. Additional power reduction is achieved by operating the memory die in a DLL disabled mode. Further power reduction is achieved by using VTT termination (rather than split termination) on the external data and control buses.
It should be appreciated that in some embodiments, all or a portion of a semiconductor memory device can be fabricated based on a low-level hardware description obtained using a logic synthesis tool running on a computing device. The logic synthesis tool is configured to read source code (e.g., in a language such as HDL, VHDL, Verilog) containing a functional description of a semiconductor memory device and output a definition of a physical implementation of a circuit suitable for implementing the corresponding function.
Additionally, while the above description has been provided in the context of a DRAM memory device, those skilled in the art will recognize that aspects of the present invention may be applied to other memory types, including SRAM, MRAM, FeRAM, PCRAM, ReRAM, EEPROM, NAND flash, and NOR flash.
In the embodiments described above, the device elements and circuits are shown connected to each other for simplicity. In practical applications of the present invention, elements, circuits, etc. may be connected directly to each other, or they may be indirectly connected to each other through other elements, circuits, etc. necessary for the operation of the devices and apparatuses. Thus, in actual configuration, the circuit elements and circuits described herein may be directly or indirectly coupled or connected to each other.
The embodiments of the invention described above are intended to be examples only. Alterations, modifications and variations may be effected to the particular embodiments by those of skill in the art without departing from the scope of the invention, which is defined solely by the claims appended hereto.

Claims (25)

1. A semiconductor device, comprising:
a plurality of memory dies, wherein each memory die has a synchronization circuit; and
a controller die connected to an internal control bus, the controller die configured to provide an internal read command to a selected one of the memory dies in response to an external read command;
wherein the selected memory die is configured to provide read data to the controller die in response to the internal read command;
wherein each of the plurality of memory dies is configurable to provide the read data to the controller die in response to the internal read command with the synchronization circuit disabled.
2. The semiconductor device of claim 1, wherein the controller die is further configured to add an equalization delay to the read data after receiving the read data from the selected memory die.
3. The semiconductor device of claim 1, wherein the controller is configured to:
select a first memory die of the memory dies as a selected memory die;
receive first read data from the first memory die;
select a second memory die of the memory dies as a selected memory die;
receive second read data from the second memory die; and
output the first read data and the second read data onto an external data bus.
4. The semiconductor device of claim 3, wherein the controller die is further configured to buffer at least an initial portion of the second read data received concurrently with an ending portion of the first read data, to allow the initial portion of the second read data to be output onto the external data bus following the ending portion of the first read data.
5. The semiconductor device of claim 3, wherein the controller die is further configured to: when there is a gap between receipt of an ending portion of the first read data and receipt of an initial portion of the second read data, delay output of the first read data to allow the initial portion of the second read data to follow the ending portion of the first read data on the external data bus without a gap.
6. The semiconductor device of claim 1, wherein each memory die includes a respective internal data bus that independently connects the respective memory die to the controller die.
7. The semiconductor device of claim 6, wherein the controller die is further configured to disable, upon receiving read data output by the selected memory die, the internal data bus of each memory die other than the selected memory die.
8. The semiconductor device of claim 1, wherein all memory dies do not have circuitry configured for on-die termination of an internal data and control bus.
9. The semiconductor device of claim 1, wherein all memory dies comprise circuitry configured for providing on-die termination of internal data and control buses, and all of the memory dies are configured to disable the circuitry for providing on-die termination.
10. The semiconductor device of claim 1, wherein the controller die is further configured to receive a global clock signal on an external control bus to synchronize the read data received from the selected memory die with the global clock signal.
11. The semiconductor device of claim 1, wherein each memory die is a Dynamic Random Access Memory (DRAM) die.
12. The semiconductor device of claim 11, wherein the synchronization circuit is a Delay Locked Loop (DLL).
13. A multi-chip package comprising a plurality of semiconductor memory devices according to claim 1.
14. A semiconductor device, comprising:
a plurality of memory dies, wherein all of the memory dies do not have synchronization circuitry; and
a controller die connected to an internal control bus, the controller die configured to provide an internal read command to a selected one of the memory dies in response to an external read command;
wherein the selected memory die is configured to provide read data to the controller die in response to the internal read command.
15. The semiconductor device of claim 1, wherein the controller die is further configured to add an equalization delay to the read data after receiving the read data from the selected memory die.
16. The semiconductor device of claim 1, wherein the controller is configured to:
select a first memory die of the memory dies as a selected memory die;
receive first read data from the first memory die;
select a second memory die of the memory dies as a selected memory die;
receive second read data from the second memory die; and
output the first read data and the second read data onto an external data bus.
17. The semiconductor device of claim 3, wherein the controller die is further configured to buffer at least an initial portion of the second read data received concurrently with an ending portion of the first read data, to allow the initial portion of the second read data to be output onto the external data bus following the ending portion of the first read data.
18. The semiconductor device of claim 3, wherein the controller die is further configured to: when there is a gap between receipt of an ending portion of the first read data and receipt of an initial portion of the second read data, delay output of the first read data to allow the initial portion of the second read data to follow the ending portion of the first read data on the external data bus without a gap.
19. The semiconductor device of claim 1, wherein each memory die includes a respective internal data bus that independently connects the respective memory die to the controller die.
20. The semiconductor device of claim 6, wherein the controller die is further configured to disable, upon receiving read data output by the selected memory die, the internal data bus of each memory die other than the selected memory die.
21. The semiconductor device of claim 1, wherein all memory dies do not have circuitry configured for on-die termination of an internal data and control bus.
22. The semiconductor device of claim 1, wherein all memory dies comprise circuitry configured for providing on-die termination of internal data and control buses, and all of the memory dies are configured to disable the circuitry for providing on-die termination.
23. The semiconductor device of claim 1, wherein the controller die is further configured to receive a global clock signal on an external control bus to synchronize the read data received from the selected memory die with the global clock signal.
24. The semiconductor device of claim 1, wherein each memory die is a Dynamic Random Access Memory (DRAM) die.
25. A multi-chip package comprising a plurality of semiconductor memory devices according to claim 1.
HK13113677.8A 2010-02-25 2013-12-09 Semiconductor memory device with plural memory die and controller die HK1186270A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US61/308041 2010-02-25
US12/967918 2010-12-14

Publications (1)

Publication Number Publication Date
HK1186270A (en) 2014-03-07
