CN1965486B - Memory compression - Google Patents
Memory compression Download PDFInfo
- Publication number
- CN1965486B CN1965486B CN2005800182352A CN200580018235A CN1965486B CN 1965486 B CN1965486 B CN 1965486B CN 2005800182352 A CN2005800182352 A CN 2005800182352A CN 200580018235 A CN200580018235 A CN 200580018235A CN 1965486 B CN1965486 B CN 1965486B
- Authority
- CN
- China
- Prior art keywords
- value
- paired
- replacement
- difference
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Links
Images
Landscapes
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Complex Calculations (AREA)
Abstract
Exemplary embodiments of the present invention include methods for compressing data for storage in a memory. According to one embodiment, the method forms a set of values based on a monotonically ordered sequence of look-up table values. For one or more pairs of paired values in the set, the exemplary method generates a difference of the values in the pairs. After modifying the set by replacing one of the values in the pair with a value based on the difference, the exemplary method stores a final value based on the modified set in memory. The invention also includes a memory for storing look-up table values. One exemplary memory includes a decoder, an encoder, and one or more patterns of intersecting interconnect lines that interconnect the decoder and the encoder. The pattern of crossing interconnect lines may be implemented on one or more planar layers of conductor tracks vertically interwoven with an isolating material.
Description
Technical field
What the present invention is based on concrete realization has to the microprocessor such as the storage requirement of the fixed data of tracing table or program, makes it possible to realize the low-cost electronic installation of dimension shrinks and cost cutting.
Background technology
Electronic installation is generally used for microprocessor, to carry out from the very simple to the unusual various tasks of complex array.When a device (such as electronic communication device, remote control, other mobile device etc.) when being designed to specific function or function, in case exploitation is accomplished, just be stored in the read-only memory (ROM) usually with these function corresponding programs.In ROM, also can store such as being used for needs and calculate, perhaps other fixed data of the tracing table of the value of some functions of trigonometric function like index or logarithmic function.
The ROM of low cost and compact conformation is the mask model programming ROM.In the mask model programming ROM, the address be encoded into the corresponding many holding wires in address in one on logic level is provided.Address wire is intersected with bit line, and bit line is corresponding with the output word of design, and is provided with a transistor at the infall of every address wire and bit line, and wherein, binary one is the position output to this address design.In fact, this ROM is equal to the huge OR gating to each carry-out bit, in this gating, if address A1.OR.A2.OR.A3. ... Effectively, then determine that it is " 1 ", otherwise, determine that it is " 0 ".
The transistor that uses for " 1 " of expression storage provides input to each many input OR gating.Fig. 1 shows this ROM, wherein, has stored the word of 32 of length in 1024 addresses each.If hope bigger memory, the pattern that can repeat among Fig. 1 forms the continuous multirow of 1024 words of every row, and capable selection wire can be provided, and makes it possible to from selection wire output, and this output is selected specific word with the specific column address line that activates.
The diffusion of the custom design of utilize the appropriate bit patterns that designs, during manufacture process, using and/or contact and/or metallization mask are provided with transistor.Because the mask manufacturing technology is expensive, so it only uses when the content of ROM is supposed to be fixed in the production unit of high power capacity.Because such convenience; Promptly; Can before the known and later filling of the content of hope, make memory; Also can in use revise the content of this memory, select to wipe the more how recent memory of Reprogrammable read-only memory (EAROM, EEROM, RPROM, " flash " memory etc.) and other such as other form of the nonvolatile memory exploitation of ferroelectric storage such as electricity or UV so more generally have precedence over mask model ROM.What say is that the silicon area of mask model programming ROM and cost advantage thereupon do not have fully to surpass the convenience of field programming so far.
Some solutions that when keeping some flexibility, can obtain the advantage of mask model ROM comprise stored program maturing part (promptly; Subprogram); Perhaps in mask model ROM, show to be used for the fixed table or the character set of mathematical function, thereby in the Reprogrammable memory, link and call them through a less program.Like this, replace mask model ROM routine if desired, then only need the routine of this replacement be set in Reprogrammable ROM and this mask model rom version of having mercy on, that is, and the processing of a kind of being known as " repairing ".Yet the area advantage of mask model ROM still is not enough to promote the wide in range use of this technology.Thereby, compare with up to the present mask model ROM, need the compacter and more economical immobilized substance memory of far reaching.
Summary of the invention
Example embodiment of the present invention comprises the method that is used for compressing at memory the data that are used to store.According to an embodiment, said method comprises based on the dull ordered sequence in the look-up table values and forms a class value.To a pair of or more values to pairing in said group, demonstration methods generates the poor of value in the said pairing.Utilizing of replacing in the said value in the said pairing based on a value of said difference to revise after said group, the demonstration methods handle is stored in the memory based on the end value of the group of said modification.
According to an embodiment, the surplus value in said modification pairing remains unchanged.After first time iteration, be stored in the memory based on the end value of said modification group.Alternatively, can carry out additional iteration stores with further packed data.For example, in second time iteration, demonstration methods forms the pairing between the unmodified value, and forms the pairing between the modification value, then is created on the new poor of said value in the new pairing.Subsequently, the demonstration methods utilization is revised present group based on one in the said value in the said pairing of value replacement of said difference.Repeat this process till the iteration of accomplishing pre-determined number.After accomplishing final iteration, be stored in end value in the memory based on the said modification group of value through final iteration generation.
According to another embodiment, utilize the value with generating of the said value that is based in the said pairing to replace surplus value said modification centering.At this for the first time after the iteration, connect each pairing and generate end value with difference.Alternatively, can carry out additional iteration stores with further packed data.For example, in second time iteration, demonstration methods forms the pairing between said and the value, and forms the pairing between the said difference, then is created on the new and poor of said value in each new pairing.Repeat this process till the iteration of accomplishing pre-determined number.After accomplishing final iteration, connect corresponding and generate end value with difference with the pairing of final iteration.
In one embodiment, through utilizing the algorithm of invention, being used to store word length L1 quantity 2
N1ROM from total big or small L1 * 2
N1The position is compressed into storage L2 position 2
N2Word, total L2 * 2
N2The ROM of position, wherein, L2>L1 and N2<N1.
A kind of application is for rule function F (x) provides tracing table, wherein, through with this tracing table of the corresponding binary digit pattern of x addressing, and output is pre-stored in the value F of that position.In this case, compression utilizes their triangle (delta) value to come the replacement value of replacement function according to aforementioned addresses for the first time.Alternatively, the every pair of consecutive value can use they abolishment least significant bit (LSB) and replace with their difference.When function was rule function, the difference between the successive value can be stored position seldom thus much smaller than functional value itself.
For the second time algorithm can comprise and repeats this compression process; It is understood that the array according to first order input even address value; Then separately import even address triangle value array, obtain functional value thus, two the first time poor and four functional values of the poor replacement second time.For the second time poor usually also less than for the first time poor, thus, can utilize than for the first time poor position still less and store, accomplish further compression.Disclose the process of system, be used for butterfly operation, utilized fast Walsh transform to make up the more compression algorithm of high order based on the LSB of wherein having abolished of revising and item.When reading ROM, utilize the inverse transformation reconstructed value.
Second carries out through storing common or storing the upward block (block) of adjacent value of numeral near common highest significant position (MSB) pattern and Different L SB pattern, thus, and the average figure place of every value of reduction storage.If the common highest significant position pattern in a block allows nearly different+l or-1, then can allow block size preferably, that is, and regular block size.。Then, the extra order that is associated with minimum live part confirms whether this common most significant part must be to a concrete searching value increasing or decreasing.Second execution and first execution are compared permission and are more simply rebuild.
The present invention also comprises the memory that is used to store a plurality of look-up table values, wherein, and each look-up table values and the address that comprises a plurality of address marks.An exemplary memory comprises decoder, encoder, and one or more pattern that is made up of cross-connect cable.One enable signal during decoder is exported to decoder through one or more generation in the said a plurality of address marks of decoding, and encoder generates output word based on this enable signal.The pattern that is made up of the cross bonding line is connected to the encoder input to each decoder output.In order to reduce the size of memory, aspects more of the present invention are in order to utilize the vertical dimensions of memory, and the plane layer of being processed by conductor rail that utilizes one or more and isolated material vertically to interweave forms the pattern that is made up of the cross bonding line.
Using to a demonstration of sort memory is even that the value that allows the compressing ram size to be used to store erratic function is the random value of storage such as computer program instructions.In this case, the algorithm of invention is by stored numbers order operation function or data value, and it can be accomplished by the address wire that fixed pattern is replaced on the chip.Then; The compression process of describing to rule function can be used for utilizing the triangle value of they relevant with the hithermost absolute value data value that is stored in any position to replace definite absolute value data value, compares this triangle value of storage with storage absolute value data value and has used less position.If two or more consecutive values are identical, can be compressed into one so, and utilize the input of OR gating to merge the corresponding address line.Each OR gating input needs a transistor, thus, is equivalent to the memory bit that takies according to silicon area.
The reconstruction that utilizes the value that M compression algorithm cross of addressing comprises as long as activate a word that just reads 2 of length L in the M bar neighbor address line.Then, merge it according to the address wire that wherein activates and comprise base value, the first time and the part of high order difference more.Can utilize the M-line to log
2(M) the line priority encoder makes up each group in the M bar address wire, to obtain log
2(M) position is used for systematically enabling to rebuild combinational logic.
Utilize the present invention, through use depend on the algorithm order 2 and 5 between the factor can typically reduce the ROM silicon area with the amount of rebuilding combinational logic.When the data of compression are computer program, need as it is certainly, thereby always taking place accurately, the position rebuilds.
In the second aspect that exemplary memory is carried out, in numerical order, accomplish the compression of ROM random content through relocated address bundle of lines data qualification.Because stored information effectively in displaced pattern, so compare with information bit, the information of in ROM, utilizing less actual bit to store identical figure place is fine.Because can on chip, hold the very many interconnection pattern layers that separate through insulating barrier, so can make a large amount of different displaced patterns of address wire, each displaced pattern all comprises information, thus, utilizes vertical dimensions to store the information of increment.
Description of drawings
Fig. 1 illustration mask model programmable read only memory (ROM).
Fig. 2 illustration one the demonstration rule function figure.
Fig. 3 illustration one the demonstration compression algorithm.
Fig. 4 illustration demonstration rebuild logic.
Fig. 5 illustration the comparison diagram between true monotonic function and the exponential approximation.
Fig. 6 A-6D illustration be used to eliminate the demonstration program for the ground floor of the data of rebuilding storage of dish formula circuit.
Fig. 7 illustration conventional MOS transistor.
Fig. 8 illustration the structure of an exemplary memory circuit.
Fig. 9 A and 9B illustration one the demonstration row-address decoder.
Figure 10 illustration be used for arranging a demonstration program of the data of storage by classified order.
Figure 11 illustration carry out to a demonstration of the memory of the first time compression that utilizes random data.
Figure 12 A-12C illustration be used for one of address decoder demonstration and carry out.
Figure 13 illustration an exemplary preference level encoder.
Figure 14 A has compared exemplary coder and decoder combination to 14B.
Embodiment
At first, Demonstration Application of the present invention is described, wherein, needed tracing table to the value of the rule function (for example, monotonic increasing function) of hoping to be used for specific calculation.Equality (1) is expressed as example function:
F
a(x)=log
2(1+a
-x)(1)
Wherein, " a " is the truth of a matter of logarithm.This function U.S. Patent application the _ _ _ _ _ _ _ _ _ _ _ _ _ _ mentioned in the logarithm arithmetical operation described in number (application attorney docket 4015-5281 and 4015-5287), this U.S. Patent application is incorporated into this by reference.
The truth of a matter be shown be this graph of function of 2 among Fig. 2, the value of x by 512 divide 0 to 16383 between, that is, the value of x has 9 on binary point the right, and has 5 on the binary point left side.To by X
MThe complement code as 512x of expression has been marked and drawed this function, and its scope is 0 to 32 and along with independent variable X
MIncrease and provide the monotonic increasing function value.It is desirable for the x value that provides according to 16384 possibility discrete sheet train values to any, providing precision is the fast access of the functional value of 23 binary location.Alternatively, can use truth of a matter e, in this case, utilize 24 binary location to realize similar precision.
To the latter, do not compress tracing table and comprise 16384 24 words, amount to 393216.The value of word should be rounded to nearest LSB, the fact with the important implication that will describe the back.Yet no matter how those are worth, and all need impose on any compression or the decompression algorithm of these look-up table values, reproduce the value of rounding off of hope with the accurate mode of step-by-step.
Greater than 17.32 situation, when x>24, is the situation of e for the truth of a matter to the x value, and function is with regard to 24 precision and Yan Weiling, and is 2 situation for the truth of a matter, and function is zero.This is appreciated that to for x>24, i.e. x>11xxx.xxxxxxxxx or X
M<00xxxxxxxxxxxx, F (x)=0; Only need X thus
MThe value of>00111111111=4095.Yet this is specific to example function, does not detail here thus; With consideration whole 16384 values are provided on the contrary.
To independent variable X
MIn only the difference between the successive value of 1 least significant difference thereby can use bit representation still less significantly much smaller than function itself.For any smooth function, this observation all is real.Therefore, the single order compression algorithm comprises only to X
MThe even address value store 24 place values, and for odd address, storage is with respect to the increment of last even number value.
Direct calculating to example function shows, the highest increment that runs into is 16376 (to multiply by 2
-24), this can be contained in 14 words.Therefore, ROM can be reconstructed into 8192 24 word and 8192 14 word.For even address X
MOnly need 24 place values, and for odd address X
M24 of needs and 14 place values read and addition.Thus, ROM is compressed to 8 192 * (24+14)=311296 from 16384 * 24=393216 position, has saved 21%, perhaps be compressed to 79% of life size.
Two order algorithms can be derived through using such process, that is, in this process, use 8192 24 place values first array a value and replace the consecutive value of pairing to the triangle value of second array of 8192 14 triangle values dividually.Use for the first time and adopt original function to generate two triangle values of separating; It doubles the triangle value of monospace roughly; Need 15 storages thus; This algorithm is used for the first time to 14 and generate triangle value for the second time the second time of triangle value, and it should need 5 in theory in 0 to 31 scope.Yet, because above-mentioned function is directed against rounding off of nearest 24 words, cause the scope of the triangle value second time to increase to minimum-1 to maximum 33, need 6 storages.This rising is because value can be to round down and adjacent words is rounded up to, so eucalyptus 1 or 2 increases (or reducing) triangle values.
Now, the ROM size reduce in 24,15,14 and 6 place values each all be 4096, amount to 4096 * 59=241664 position, save 39%, or be compressed to 61% of life size.The reconstruction of the value of hoping comprises 4096 * 59 ROM of the highest effective 12 bit addressings of utilization
; With acquisition length difference position 24,15,14, and four sub-words of 6; Then; Minimum effective 2 reconstructions of controlling according to the value of the hope of said four sub-words of using
, as following:
Use F, D2, D1 respectively; And DD representes 24,15,14; And 6 place value; For
LSB=00, the value of this hope is F
01 P+19l
1O F+D2
11 F+D1+D2+DD
Form F+D1 and need 14 adders; Forming F+D2 needs 15 adders, needs 6 adders and form D2+DD.Thus, need 35 full adder unit for rebuilding the value of hoping.To the situation of example function,, and utilize following rule to be shifted into O to 34 to the DD scope of-l to 33 through 15 D2 values by 1 reduction storage:
01 F+D1
10 F+D2+1
11 F+D1+D2+DD
Thus; 6 adders LSB according to
1 perhaps DD be added to D2; And can add a value DD0 or a DD1 usually, to allow being received to positive any scope of DD from negative.
Fig. 3 has described thisly to be used for compress algorithm through the process of obtaining the difference between the consecutive value is carried out iteration.The value that will store of hoping be at first 16384 with f0, f1, f2 ..., f6 ... Deng 24 place values up to f (16384) expression.First application of obtaining poor process keeps even number value constant, still uses each odd number value according to the difference of aforementioned even number value and has replaced.For example, replaced f5 with d4=f5-f4.For the demonstration smooth function, calculate that these are poor by maximum 14 long values, it is 1024 times less than original 24 place values.Even utilize the smooth function be almost a straight line, a kind ofly possibly be, expectation only by scope to become difference between the 1/16384th two values of separating of 16384 times/one less than maximum original value, and shorten 14 thus.Usually, to any smooth function, the shortening of this word length takes place in constraint.Thereby, second the row in, functional value is represented as 8192 pairs of values, in every pair of value one be 24 long, and another be 14 long.
In the third line in Fig. 3, separately show with f0, f2, f4 ... The expression even number value and with d0, d2, d4 ... The difference of expression comes illustration to be directed against residue f value and separately is directed against the application of d value to ask difference algorithm.The 4th the row shown in the result be, per the 2nd even number value, that is, it is 24 place values that per the 4th even number value in the original value remains unchanged this moment, and is becoming 15 potential differences such as the even number value between the f2-f0.14 place values that the d value that each replaces remains unchanged in the same old way, and those just reduce such as the value between the d6-d4 become the position 6 second time poor.Now, original 16384 24 place values have been compressed into 4096 groups and have comprised 24,15, the value of 14 and 6 place values.
Can this process of iteration till only about 2 the long high order differences of majority in value.24 nearest binary location form pattern quite at random to these final high order differences because of original 24 place values are rounded up to or are rounded up to downwards.Total size of memory asks difference algorithm to reduce according to having used how many times iteratively.
Fig. 4 shows and is used for the reconstruction logic of differential pressure compression algorithm for the second time.Comprise the tracing table that 14 bit address is applied to it in the original method of 16384 24 words of storage shown in the top of figure.The differential pressure technology that contracts has been described for the second time in the bottom of figure, and wherein, tracing table reduces into 4096 59 place values.Only utilize the highest effective 12 tracing tables that come addressing to be compressed into 4096 59 place values of original 14 bit address, it defines quaternate desired value effectively and is positioned at its value.The difference of poor, 14 monospaces that comprise 24 absolute values, 15 double spaces and 6 the second time poor said 59 place values be applied to corresponding adder as shown in the figure.Enable the first row adder according to the second minimum effective position; If institute's rheme is " 1 ", then addition 15 potential differences and 24 place values, and 6 of the additions poor and 14 potential differences second time; Otherwise; If enable bit is " 0 ", then transmit 24 place values and not with 15 place value additions, and transmit 14 place values and not with 6 place value additions.If the least significant bit of original 14 bit address is " 1 ", the then output of two adders of secondary series adder addition first row, otherwise it transmits 24 constant place values.Like this, two least significant bits controls of original 14 bit address generate said value according to 4 one group value.
Can further save in D1 through note D2 almost twice, thus, only need store 15 D2 and add that one revises and to be applied to 14 place value INT (D2/2)=move to right (RightShift) (D2), so that obtain the value of the accurate position of D1.This correction is poor with the second time of D1-(D2/2) expression, and it is founded according to directly calculating, and takies scope-1 to 8, needs 4 storages.Then, can further realize this second time of poor 1/4 of the DD that is worth no better than.Thus, can use according to the difference replacement of DD/4 it, it has scope-1 to+1 according to directly calculating, and needs 2 storages, and effectively poor for for the third time, it can be represented with DDD.
Thus, in order to reproduce each in 16384 accurate 24 bit function values in position, the value of storage comprises 4096 groups of quaternate values as follows:
24 place values: F (4i)
15 place values: D2 (4i)=F (4i+2)-F (4i)
6 place values: DD (4i)=(F (4i+3)-F (4i+2))-(F (4i+1)-(F (4i))
2 place values: DDD (4i)=(INT (D2 ((4i)/2))-(F (4i+1)-F (4i))
-INT(DD(4i)/4)
Then, four functional value F (4i), F (4i+1), F (4i+2), F (4i+3) accurately rebuild on ground, position according to following functional value:
F(4i+1)=F(4i)+INT(D2(4i)/2)+INT(DD(4i)/4)
-DDD(4i)
F(4i+2)=F(4i)+D2(4i)
F(4i+3)=F(4i)+INT(D2(4i)/2)+DD(4i)
+INT(DD(4i)/4)-DDD(4i)
Replace 96 ground to replace 4 one group of original 24 place values with amounting to 47, the compressed coefficient was just over 2: 1.
Vice versa, alternatively, can only store 14 D1 and add that a correction is applied to 2D1, to obtain D2.Thus, be possible such as other following arrangement of storage:
24 place value F (4i)
14 place value D1=F (4i+1)-F (4i)
6 place value DD=(F (4i+3)-F (4i+2))-D1
2 place value DDD=(F (4i+2)-F (4i))-(2D1+INT (DD/4))
It can cause a little more simply rebuilding.Should understand, represent to omit the direct loic function of two LSB that use DD such as the computing of INT (DD/4).
Utilizing shown in calculating before the algorithm of homogeneous has not been accomplished the table of compression, will develop the more form of the more system of high order compression based on the Walsh transformation of revising.The Walsh transformation method form pairing consecutive value and with poor, and utilize should and with difference replacement original value.For example, in first time iteration, comprise f0, f1, f2 in order to compress, and the data of f3, replace f0 with s0=f0+f1, replace f1 with f1-f0, replace f2 with s2=f2+f3, and replace f3 with d2=f3-f2.Calculate a pair of value and be called as the butterfly circuit with the circuit of difference.
All be adjacent to follow all differences and arrange.The difference of having seen be than with short word.Then, repeat the process of the block of relevant and value, and process and the value and the difference that then repeat the block of relevant difference dividually respectively be half length of original value, be directed against the algorithm etc. the second time of high order algorithm more with acquisition.High order algorithm to 16384 values is 14 times.At that point, block only respectively has a value, thus, and can not be by further compression.In fact, the 14th time algorithm is 16384 Walsh transformations, and is parked in the intergrade place of full Walsh transformation than the low order algorithm.Connect with final iterative association with value and difference, be stored in the final compressed value in the memory with formation.Should understand that final iteration can be any selected iteration, and thus, needs not be iteration the 14th time.
When form two 24 place values and the time, word length can increase by one.Yet, and always having an identical least significant bit with difference, it can abolish relevant and can drop-out in them thus.Through basis with and utilize below equality (2) instead abolish it, prevent and length increase.
When rebuilding, difference with addition or with subtract each other, and in importing through the carry/borrow that is fed into its LSB first adder/subtractor stage, use that twice this LSB replaces lacking and LSB.Mathematically, the conversion butterfly of modification can be write and become:
SUM(i)=INT((F(2i+1)+F(2i))/2)
DIFF(i)=(2i+1)-F(2i)
Become and rebuild to write:
F(2i+1)=SUM(i)+INT(DIFF(i)/2)+AND(1,DIFF(i))
F(2i)=SUM(i)-INT(DIFF(i)/2)-AND(1,DIFF(i))
Above-mentioned equality is the modification of the known butterfly operation that in Walsh-Fourier transform, uses:
SUM(i)=F(2i+1)+F(2i)
DIFF(i)=F(2i+1)-F(2i)
And it is inversely transformed into:
F(2i+1)=(SUM(i)+DIFF(i))/2
F(2i)=(SUM(i)-DIFF(i))/2
Can utilize the butterfly of above-mentioned modification constitute one fast, the truth of a matter is 2, pseudo-Walsh transformation (pseudo-FWT); So that two consecutive values in groups are transformed into a value and a triangle value; Perhaps four consecutive values in groups are transformed into a value, two triangle values; A triangle value for the second time, etc.Be made into the process of system, with the characteristic of exploring high order algorithm more and carry out more high order algorithm.Provided FORTRAN code below to suitable pseudo-Walsh transformation:
C?F?IS?THE?GROUP?OF?M?VALUES?TO?BE?PSEUDO-WALSH
TRANS?FORMED
C?AND?N=LOG2(M)IS?THE?ORDER?OF?THE?ALGORITHM
SUBROUTINE?PSWLSH(F,M,N)
INTEGER*4F(*),SUM,DIFF
N1=M/2
N2=1
DO?1I=1,N
L0=1
DO?2J=1,N1
L=L0
DO?3K=1,N2
SUM=(F(L+N2)+F(L))/2
DIFF=F(L+N2)-F(L)
F(L+N2)=DIFF
F(L)=SUM
L=L+1
3CONTINUE
L0=L+N2
2CONTINUE
N1=N1/2
N2=2*N2
1CONTINUE
RETURN
END
Following table shows the not amount of the data compression of the pseudo-Walsh algorithm completion of homogeneous example function of utilizing
Position counting in the round parentheses is what when use has utilized the Walsh transformation of standard butterfly operation, to have obtained.The secondary compression section ground of standard Walsh transformation is to cause because of the word length increase with item; And partly be that the expansion of the rounding error in the original tabular value when forming high order difference more causes, its use the butterfly of revising and abolish with LSB the time seldom merit attention.To two kinds of situation, it is when using appropriate algorithm for inversion, to accomplish that the accurate position of original value rebuilds.To the situation of the truth of a matter 2, computing function F
a(x)=log
2(2
-x) to 23 binary location, and to the situation of truth of a matter e, be calculated to 24 binary location, the precision that its expression is similar.
Thereby, utilizing the 14th time final whole array of transform operation, tracing table is reduced to individual character 32085 (23188) bit wides, and it is on average less than 2 of every tabular values; But each value can be reconstructed into 24 (23) precision.Individual character no longer is a table, and does not need the address, because it always " is read ", so these values just in time can hard wire become to rebuild the input of adder.Yet, need 14 grades to rebuild adder, and need about 65000 the one column adder unit of quantity.Because adder unit is greater than memory bit, it does not represent minimum silicon area.Minimum area is to obtain by the figure place of storage and to some compromise proposal between the required adder unit number of value reconstruction.However, it is fully feasible about 4 or 5 silicon area coefficient of diminution still occurring.
What can also comprise is that said method shows creates a kind of machine that calculates any centrifugal pump of smooth function.Memory table has the limitation of disappearance, and it is replaced by the hard wire input to adder tree.This can cause another level to be simplified, and in this is simplified, adds zero adder stage and can be simplified to the carry propagation of only handling from previous stage, and likewise, can be reduced at the adder that always has " 1 " in the input of adder similarly.
Pseudo-FWT gauge pressure compression algorithm can also be used logarithm and the X that the truth of a matter 2 is used in most storage
2Be 9 values 2 when long
-0.X2/ log
e(2) required 512 word tables.This function is near above-mentioned example function, and it can satisfy on the part of independent variable scope.Usefully, on gamut, use this approximation method, come only to store required correction, it has shorter word length.
Utilize the truth of a matter 2, when independent variable from 0.X
2To 1.X
2, 2.X
2Bit patterns to the exponential function of the truth of a matter 2 when changing is utilized a displacement repetition, only need synthesize thus to by 0.X
2The wherein x of expression
2It is the bit patterns of the main scope of 9 place values.Following table has provided will store with the synthetic required total bit of this function, is rounded up to 23 nearest behind the binary point, is used for the independent variable x from O to 511
2Value.
| The algorithm number of times | Number of words | Total word length | Total bit |
| ?O | ?512 | ?23 | ?ll776 |
| ?1 | ?256 | ?37 | ?9472 |
| ?2 | ?128 | ?57 | ?7296 |
| ?3 | ?64 | ?89 | ?5696 |
| ?4 | ?32 | ?139 | ?4448 |
| ?5 | ?16 | ?2l9 | ?3504 |
| ?6 | ?8 | ?358 | ?2864 |
| ?7 | ?4 | ?585 | ?2340 |
| ?8 | ?2 | ?974 | ?1948 |
| ?9 | ?l | ?1608 | ?l608 |
Only utilize to have controlled adder fixing, the hard wire input, under the situation that does not need memory, final row has constituted the machine that is used to calculate this function.
Except that above-mentioned position, need correction, to obtain F to this exponential function
a(x)=log
2(2
-x) or F
s(x)=-log
2(1-2
-x) value, it is used in the Logarithmic Algorithm, is used for the addition or the subtraction of log-domain.
In Fig. 5, drawn the value F that hopes to the situation of the truth of a matter 2
aWith F
sAnd poor between the exponential function.These differences can be stored in the tracing table, and it takes up room less than original function, and chip area is expressed as the triangle area to the right side, bottom of curve below approx.And, because should revise also formation rule function, so this table also can be utilized in this described technique compresses.
In the distortion of compression algorithm, can cancel the reconstruction adder in large quantities.This result with always causing identical 14 LSB who is based on 24 place values that provide and 14 triangle values that provide realizes.Thus, proper effectively, store precalculated 24 with 14 place values and, that is, 14 LSB of cO=fO+dO replace 14 triangle values, that is, dO.Yet, also essential storage extra order, this extra order represent 14 triangle values to 24 place values with whether cause from the 14th to the 15th carry.Thus, replace 14, can cancel first 14 that rebuild adder through storing 15.The 15th is applied to the highest effective 10 carry propagation circuit.The carry propagation circuit is simpler than full adder, and if in application, exist the 15th insertion back one-level addition wherein, then can cancel this carry propagation circuit.This describes by means of Fig. 6.In fact, kept the least significant bit of a pair of original value, but be associated with common most significant part.In addition, extra order representes that whether most significant part must increase progressively when rebuilding the said second right value.
Fig. 6 A shows and is used for according to selecting 14 identical place values of control pair and 24 place values to carry out addition or subtract each other to generate one conventional butterfly circuit of two alternative values.Fig. 6 B shows the butterfly circuit that can be simplified to 14 adders; Being used for 14 LSB to 14 place values and 24 place values carries out addition or subtracts each other; With one among 14 LSB of two alternatives that generate the result; Add operation or subtraction according to 10 MSB that make up 24 place values add a carry or borrow position, generate in 10 MSB patterns of two alternatives.If 14 addition/subtraction devices do not generate borrow or carry, then the MSB pattern can be identical.In Fig. 6 C, two of pre-stored are selected 14 results else simply, store and the corresponding result of poor (subtraction) computing together with 10 MSB with replacing 24 place values, and replace storage of 14 place values ground and and the corresponding result of result.Yet, the 15th need 10 MSB of expression with situation under whether identical or increase progressively 1.Under the situation of difference the 15th always zero, and be illustrated as along the input to selector according to 14 LSB of 24 words.If zero carries out as the transistor that lacks in memory, then zero row do not occupy silicon area.If desired, then in Fig. 6 C, still need the carry propagation device, to increase progressively 10 MSB.
Do not need to be provided with clearly the selector among Fig. 6 C.In memory, it is intrinsic that address bit is selected in many storing values one or another.Therefore, select two alternative 14 (or 15) bit patterns simply according to selection wire, it is the least significant bit of 14 bit address.10 MSB of 13 MSB addressing by the address.Thus, memory comprises 8192 10 words and 16384 14/15 words, amounts to 319488, considers 14 adders of cancellation, and it is above 8192.At last, Fig. 6 D shows through in adder subsequently, postponing to increase progressively 10 MSB with carry digit and the carry digit that will under conventional chance, use of feeding forward simply omits the situation of carry propagation device.
In Fig. 6, being divided into 10 bit positions and 14 bit positions to 24 words results from and shows never greater than the even number of 14 place values and the example function of the triangle value between the odd number addressing value.10 MSB are always identical or only differ from 1 between even number and adjacent odd number addressing value.Similarly, the difference between two parts value of separating is never greater than 15 place values, and the difference between the four part values of separating is never greater than 16 place values.Thus, the individual one group consecutive value in 4 (even 5) also has identical the highest effective 8 except that possible increment 1, has different minimum effective 16 bit positions simultaneously.Thus, it is that memory is set to 4096 8 place values and 16384 17 place values that alternative realizes, whether 8 MSB of the 17th bit representation must increase progressively by 1.Its figure place is reduced into 311296.The table to the different figure places that split that memory is shown is as follows.
| The group size | The MSB figure place | LSB (+1) figure place | The |
| 2 | 10 | 15 | 319488 |
| 4 | 8 | 17 | 311296 |
| 8 | 7 | 18 | 309248 |
| 16 | 6 | 19 | 317440 |
| 32 | 5 | 30 | 330240 |
| 64 | 4 | 21 | 345088 |
This table expression exists the best that minimizes figure place to split.
Now, can the value added LSB pattern of monotone increasing comes this process of iteration in every group through discerning.For example, under last situation, in every group of 64 values, there are not two consecutive values can differ by more than 14 place values.Thus, can be divided into 2 every group 32 groups to a little values, select 14 or a LSB pattern to every couple of storage 7 MSB (it can increase progressively by 1) else with two, whether 7 MAB of the 15th bit representation should increase progressively 1.
Below, show the many groups different modes that is used to represent one group of per 64 21 place value, the total bit of cache.
| The group size | The MSB figure place | LSB (+1) figure place | Figure place to 64 groups | Memory amounts to |
| 2 | 7 | 15 | 1184 | 304128 |
| 4 | 5 | 17 | 1168 | 300032 |
| 8 | 4 | 18 | 1184 | 304128 |
Many groups situation to one group of per 32 value repeats this process generation:
| The group size | The MSB figure place | LSB (+1) figure place | Figure place to 64 groups | Memory amounts to |
| 2 | 6 | 15 | 576 | 297472 |
| 4 | 4 | 17 | 576 | 297472 |
| 8 | 3 | 18 | 588 | 303616 |
Many groups situation to one group of per 16 value obtains:
| The group size | The MSB figure place | LSB (+1) figure place | Figure place to 64 groups | Memory amounts to |
| 2 | 5 | 15 | 280 | 292864 |
| 4 | 3 | 17 | 284 | 296960 |
For the table of any given rule function, can carry out above-mentioned test and really this method will obtain the bank bit of minimal amount.
Said method not only can be used for smooth function, and can be used for piecewise smooth function.For example,, then only need represent a smooth function to the value in per 32 group if utilize many groups of per 32 one class values of above-mentioned arbitrary technique compresses at every turn, that is, and continuous function.So, the discontinuity between on the same group is not unessential.
In order to explain that how can use to more at random data should technology, the structure of typical MOS ROM has been described at first.Fig. 7 shows the structure of MOS transistor.The conductor gate electrode is processed by having dystectic conductor material, so that the tolerance treatment temperature, and utilize the silicon insulating barrier of processing by silicon dioxide usually to separate with silicon substrate.Usually, at first utilize gate electrode structure covering whole base plate also to follow the part of etching except that hoping the gate electrode position and come manufacturing grid and following oxide-insulator thereof.Then, form drain electrode and source electrode through implant impurity on the either side of gate electrode, gate electrode is filled the post of its oneself mask, to guarantee that drain electrode and source electrode are accurately against gate electrode.The impurity of implanting must allow through integrated with the silicon substrate crystal structure under the fusing point that just is increased to temperature silicon.Formed MOS transistor by this method, interconnected them through deposit aluminium interconnection track.
Fig. 8 shows the ROM by this method construct.The alternately striped of source electrode and drain electrode diffusion forms the transistor of its a large amount of source electrodes and drain electrode parallel connection.These transistors are arranged on the place that does not have etching to cross gate electrode structure.Place at hope storage binary one is provided with a transistor, in the place that will store Binary Zero transistor is not set.In fact, in Fig. 8, transistor is shown as including at a gate electrode and the gate electrode from the source electrode wires below the electric leakage polar curve from the source electrode wires above the electric leakage polar curve.The both enables through same strobe pulse, and is actually two transistor parallel connections thus.The number of the electric leakage polar curve that generally speaking, separates is corresponding to figure place.To the active electrode wires of institute of same word all leftward side be connected to that word exercises can signal, it is socketed in ground, to read the word of this row.
The transistor of different lines is corresponding to the different words in the row.Address decoder is accepted the binary address input and an enable signal is applied on all gate electrodes in the row.Enable all positions thus to the particular word of electric leakage polar curve.If in this position, comprise zero, then come drop-down electric leakage polar curve through turn-on transistor, otherwise, if transistor disappearance (corresponding to 0 in the word), then not drop-down electric leakage polar curve.
Can constitute the multirow word, and address wire runs through all row by the top end of to.Be connected to second row-address decoder being directed against the capable source electrode enable line of each word, thereby whole address then becomes the combination of column address and row address.The drop-down suitable row-address decoder that enables the capable source electrode wires of word has been shown in Fig. 9 A and 9B.
Shown in Fig. 9 A, cascade is created module and is comprised that two its source electrodes are shared and be the MOS transistor of " common source electrode (cascoding) input " C.Address wire is connected to a transistorized gate electrode, and its drain electrode produces a compensation address signal, and it is connected to other transistorized gate electrode.This result is that to depend on address wire be height or low to the conduction between arbitrary transistor or source electrode and the drain electrode.The cascade of this establishment module has been shown among Fig. 9 B, and wherein the drain electrode connecting portion DR1 of first device of address bit A1 control and DR2 are connected to the source electrode connecting portion of higher leveled two identical establishment modules being controlled by address bit A2 in this chain structure.Their drain electrode connecting portion is connected to all the establishment module by the pairing of address bit A3 control successively, representing 8 drain electrode connecting portions, etc.Like this, generate from 2 through the binary tree decoding N bit address bit patterns of creating module
NFinal drain electrode connecting portion is the path of the first source electrode connecting portion of " chip enable " to note.
Drop-down chip enable line follow in the drop-down drain electrode connecting portion with enable corresponding one of path; In its capable source of drop-down successively word electrode wires one; It is drop-down successively to the capable electric leakage polar curve of this word, through column address conducting and the capable corresponding word transistor of this word.The sensing amplifier (not shown) detects its bit line and allows to draw the corresponding output enable bit patterns of electric current and generation and address word from power supply.This sensing amplifier compensates the loss through transistorized long-chain formula structure, and to the ppu to memory the logic level output of buffering is provided.
Figure 10 shows the memory that comprises random value and how can arrange by numerical order.Word is classified simply and is arranged by numerical order, keeps being connected to the same output line of address decoder simultaneously, this address wire that need decode intersection as shown in the figure, and it can accommodate suitable multilayer interconnection pattern.Address wire can be arranged by different cross figures again, is used to be connected to the next line of word, and it is probably by diverse order classification.This accomplishes compression at itself, and only is the illustration possibility, in the right value of still reading with each address, arranges the random data on the chip by the level and smooth monotonic increase order of numerical value.
Figure 11 illustration utilize decoder, cross spider pattern, and encoder come execute store to compression first time of random data.Compare with Figure 10, removed each alternative full word of the order of according to value classifying.Now, to the prev word of the word of the removing addressing of address by himself address and abridged word, the address of this abridged word is used for two addresses of OR owing to comprising that an OR gating is adjacent to the address of said prev word.Now, the triangle word that the address wire of abridged word enables to shorten, it will be through rebuilding union in prev word, so that generate the abridged word.The figure place of memory has been reduced to a full word and 1 triangle word to every pair of original full word aspect the cost of two every pair input OR gatings.4 transistors of the maximum existence of two-input OR gating, it can be counted is four.Aspect the silicon area amount of estimate saving, this should with the addition of triangle word length.
The address wire cross figure also needs chip area.Through the OR gating is set below full word again, the interconnection pattern can utilize the metal layer of the insulated from each other or mutual isolation of proper number to run through the top of the area that full word takies.Can utilize the photoetching counting to make this layer.To the object of the invention, suppose the metal layer that can use number basically not limit, and insulating barrier isolates the pattern of cross bonding line and/or cross bonding line, to prevent unnecessary contact.Really, utilize the layer superstructure of interconnection pattern, so that utilize identical chip area, employing vertical dimensions to store more multidata.
In the time possibly not showing significant systematicness, must note at the conceptive of more high order difference algorithm that extends to random data greater than primary difference.Instead, select a full word, and utilize their other word from this group of triangle value replacement of this base value, can improve the efficient of compression as base value to one group of word.For example, can use f0, d1=f1-f0, d2=f2-f0, and d3=f3-f0 replaces quaternate value f0, f1, f2, and f3.Then, the logic OR of 4 addresses must addressing value f0.Four input gatings are 8 transistors to the maximum.The silicon area that these 4 input OR gatings take is to be used for 3 full word f1, f2, and f3 reduces into three triangle word d1, d2, and d3 and the cost saved.4 input OR gatings only twice are more complex than 2 input OR gatings, and current storage is reduced to about 3 times big.Thus, compare, reduced the consumption (overhead) of OR gating with the saving that obtains.Where order identical and that according to value classify thus occurs two data values that no matter will store with being closely adjacent to each other, just utilizes single value to replace them, and does not have the triangle value to need storage.Thereby, not need with the zero corresponding address wire of triangle value, and can utilize AND-OR gating shown in figure 12 to absorb OR in the last level of address decoder to two address wires of the value of single storage.
Figure 12 A shows two decoders that are used for for example being decoded into 8 address bits 256 address wires of Fig. 9 type.Because Fig. 9 has open-drain output, so must add pull up transistor (resistor) to be reset to output the state of known memory between reading.256 line (A0 of a decoder ... A255) with 256 line (B0 of another decoder ... B255) intersect, to obtain 65536 crosspoints.In the time of single one in 65536 lines of must decoding; 2 input AND gatings are set at this tie point place; Owing to can ignore the number of transistors in the 8*256 line decoder by comparison, so shown in Figure 12 B, provided every output line 4 transistorized address decoder complexity.Yet when only needing the OR of two addresses, Figure 12 C shows 8 transistors that how can not utilize extra transistor to reconnect two AND gatings and forms AND-OR.
Some typical DSP programs of utilizing above-mentioned technical Analysis in mobile device, to use are carried out the estimation to saving.First program that comprises 8464 16 bit instructions is detected and only comprises 911 16 different place values.Significantly, many values have been repeated repeatedly, and above-mentionedly only store these and once can in chip area, save coefficient approx with the technology of utilizing the AND-OR address decoder and reach 9, and need not use any triangle value.Second program of 2296 words comes to light and only comprises 567 16 different place values, and allows effective compressibility.The program that merges is compressed to 1397 words from 10760 words.4 transistors in every address that in the address decoder of conventional memory, need in the AND-OR address decoder, use thus, do not increase complexity.more accurate 16 transistors of every word of estimating to increase to 4 transistors in every address of address decoder memory array to the size reduction; To obtain amounting to 20 transistors to every word of conventional memory; It reduces into 4+16/9 transistor, and it approximately representes coefficient of diminution 3.
When the number of words that will store is counted above possible values; For example, store 100 ten thousand 16 words, be apparent that; Utilization enables to realize that by its same word that will store the AND-OR address decoder of 16 specific words merges whole addresses, never need store more than 65536 values.Such memory is benefited from the complicated interconnection pattern that uses more overlapping interconnection layer to handle the AND-OR addressing.Thus, this technology is to utilize vertical dimensions on the chip (more interconnection layers) to realize more effectively methods of storage.
To the latter's situation, even need not store 65536 possible words.Need to be processed in 65536 lines to suitable address wire AND-OR, so that the output of the value of generation from 0 to 65535.A device that is called priority encoder can be used as 65536: 16 line encoders, and it is the inverse function of 16: 65536 line address decoders.Can utilize maximum 6 transistors of every input to construct priority encoder, it is less than 16 of the required every words of others.Figure 13 shows a priority encoder.
One group coding become 0 to N-l N input be divided into first half 0 to N/2-1 and second half N/2 to N-1.Select the corresponding line of pairing half the from first and second, for example, the OR processing together of 0 line as shown in the figure and N/2 line.If any in the input is 1, then the NOR gating provides logical zero.If following input is 1, then input is gone up in the conducting of N transistor npn npn and connection, and it is necessary for 0, with output B.If arbitraryly be input as 1, then P transistor npn npn conducting is and if to go up input be 1 then input to output B in the connection.Thus,, then export the polarity that B has the input of going up, otherwise output B is open circuit if any in the input is 1.If active incoming line is arranged on the first half of input, the output-parallel of then all exporting B forms the OR that lead is connected, and provides 1, otherwise output " 0 ".This is the highest significant position of the coding of hope.Then, repeat this process to N/2, output A0 to A (N/2-1), with second the highest effective carry-out bit of the coding of generate hoping, etc.Thus, be N/2+N/4+N/8 for the required progression of the incoming line of intactly encoding ...=N-1, and every grade comprise 6 transistors (4 transistor NOR gatings add P and N transistor npn npn).
Shown in figure 14, input is just reproduced in the merging of address scrambler and contrary encoder (priority encoder) thereof.Yet,, generate mapping in any 1: 1 (being also known as replacement box or S box, perhaps information preserving coding) if line intersects.If some addresses were handled by AND-OR, more implicit output valve disappearances then generate many than a mapping or an information lossy coding.Thus, can generate any read-only memory that the output word length equals address word length that has in such a way.When this information was the unique difference between Figure 14 A and the 14B, it was stored in address wire intersection or the displaced pattern significantly.Be utilized in 4 transistors of every line and 6 transistors in priority encoder estimated in the address decoder and show saving to the memory that surpasses 1024 10 words.Yet, utilize many interconnection layers, can generate several different interconnection patterns, all share same priority encoder and address decoder, the method for the interconnection pattern of selecting hope is provided.Compose in parallel by P shown in figure 13 and N transistor npn npn, a series of actuating switchs of every line can be used for it.Thus, number M that can the discretionary interconnections pattern, and can utilize every word 2+10/M transistor to generate size M.2
NThe memory of individual N position word when the number of plies increases, shows the efficient of increase.Passable is can use the single-transistor actuating switch, for example through generating power supply-V
1(being used for P type switch) or V
Cc+ V
1(being used for N type switch) is to guarantee the switch conduction to logical one or logical zero level.
Briefly, shown the table of the data of storage, its expression can be compressed monotonic function through only storing a base value of every group of digitally adjacent value, adding to the triangle value according to base value of other value in this group.Alternatively, utilize the butterfly operation of revising, can utilize Walsh transformation to come conversion value in groups.Because memory table disappears,, that is, utilize adder array to merge any monotonic function with fixing input so the latter's limited case causes such performance.Disclose a alternative, comprised the result's of the triangle value addition of store precomputation LSB part,, added that then extra order representes, thus the most reconstruction of cancellation adders if desired to the carry of MS part to storage triangle value.This technological expansion to segmentation monotonic function, wherein, function is the monotonic function to each compressor units.Then, this technology will at first be pressed numerical order storage data through imagination through the relocated address line, extends to the random data to computer program.
When any triangle value was zero, it did not need storage, and also the calculated address line does not come this triangle value of addressing, thus, allowed simplification to need the OR of the address of basic word to handle.Analyze typical DSP program and show, can utilize this technology to accomplish most compressions separately.And, can replace storing all possible output valve ground and generate them through utilizing 6 transistorized priority encoders of every address wire, it is more effective greater than 64 o'clock at possible output valve number.Then information resides in and is connected to address decoder in the interconnection pattern of priority encoder.At last; What illustrate is to construct several this interconnection patterns through adopting vertical dimensions, and be directed against the reduction (every basically word 1 or 2 transistors) of the number of transistors of every word use; Utilize actuating switch to select the pattern of hoping, can store the memory of several groups of random data.
Certainly, under the situation that does not break away from inner characteristic of the present invention, with this concrete set forth those compare, can carry out the present invention otherwise.It is exemplary and nonrestrictive that specific embodiment is regarded as aspect all, and interior whole variations all are included in this with equivalency range to fall into the implication of accompanying claims.
Claims (37)
1. one kind is used for the method for packed data to store at memory, and this method may further comprise the steps:
Form step, it forms the ordered set of look-up table values through by the dullness order look-up table values being resequenced;
Pairing step, it is with the pairing of the consecutive value in the said ordered set, to form a plurality of non-overlapping paired values;
Generate step, it is to a pair of or more to paired value, generates poor between the value in the said paired value;
Replacement step, it reduces the required figure place of storing queries tabular value through replace a value in the said paired value with the difference of correspondence, is compressed with ordered sets with generation; And
Storing step, it is stored in the said ordered sets that is compressed with in the said memory,
Wherein, Said generation step comprises the difference that generates the value in the said paired value and the step of the value sum in the said paired value, said replacement step comprise with replace based on the value of said difference in the value in the said paired value one and with based on said and value replace another the step in the value in the said paired value.
2. method according to claim 1, wherein, the step of revising said set also comprises another the step in the value that keeps in the said paired value.
3. method according to claim 2, this method also comprises the iteration repeating step, said iteration repeating step iteration before said storing step repeats said generation step and said replacement step.
4. method according to claim 3, wherein, one or more in the said pairing is directed against different iteration and difference.
5. method according to claim 1, wherein, the said ordered sets that is compressed with comprises at least one in the said look-up table values.
6. method according to claim 1, this method also comprise and abandon step, said abandon step abandon said and minimum live part.
7. method according to claim 1, this method also comprises the iteration repeating step, said iteration repeating step iteration before said storing step repeats said generation step and said replacement step.
8. method according to claim 7 wherein, is directed against different iteration and difference to the pairing that is worth.
9. method according to claim 1; Wherein, This method comprises that also concatenated value generates step; This concatenated value generate step through serial connection from final iteration with said final iteration in paired value corresponding each said with and said difference generate concatenated value, and said concatenated value is stored as the step of end value.
10. method according to claim 1; This method comprises that also iteration repeating step and concatenated value generate step; Said iteration repeating step iteration repeats said generation step and said replacement step; Up to said with in a summation that comprises the whole look-up table values in the said sequence till, said concatenated value generate step through serial connection from the said of corresponding iteration with and said difference generate concatenated value, and said concatenated value is stored as the step of end value.
11. method according to claim 1; This method also comprises definite step; Said definite step is confirmed one or more combined value based on being directed against a pair of or more combinations with reaching difference to the paired value generation; Wherein, comprise with the minimum live part of said combined value and replace one step in the value in the said paired value to replace one step in the value in the said paired value based on the value of said difference.
12. method according to claim 1; This method also comprises reconstruction procedures; Said reconstruction procedures based on said said one or more value that is compressed with in the ordered sets that is stored in the memory, is rebuild one or more look-up table values according to predetermined reconstruction formula.
13. method according to claim 1, wherein, said look-up table values comprises the value that derives according to monotonic function.
14. method according to claim 13; This method also comprises applying step; Said applying step is used each in said formation step, said generation step, said replacement step and the said storing step, cuts apart the monotonic segment in the segmentation monotonic function.
15. method according to claim 1, wherein, said look-up table values comprises by the tactic non-dull value of dullness.
16. method according to claim 15; This method also comprises step and access step is set; The said step that is provided with is arranged on the said value that is compressed with in the ordered sets on the ground floor of said memory, and said access step is intersected address wire via one on the second layer that is arranged on said memory or more and come in the said value that is compressed with in the ordered sets of access one or more.
17. one kind is used for the method for packed data to store at memory, this method may further comprise the steps:
Form step, it forms the ordered set of look-up table values through arranging look-up table values again by the dullness order:
Pairing step, it is with the pairing of the consecutive value in the said ordered set, to form a plurality of non-overlapping paired values;
Generate step, it is to a pair of or more to paired value, generate the value in the said pairing difference and the value in the said paired value with;
Replacement step, its through with correspondence with and the said paired value of difference replacement in value reduce the required figure place of storing queries tabular value and be compressed with ordered sets, and the required figure place of the reduction said ordered set of storage with generation; And
Storing step, it is stored in the said ordered sets that is compressed with in the said memory.
18. method according to claim 17; Wherein, This method comprises that also concatenated value generates step, and it is said and generate concatenated value with said difference and said concatenated value is stored as the step of end value to each of each paired value through serial connection that this concatenated value generates step.
19. method according to claim 17, this method also comprises the iteration repeating step, and said iteration repeating step iteration before said storing step repeats said generation step and said replacement step.
20. method according to claim 19; Wherein, This method comprises that also concatenated value generates step; It is corresponding said and with said poor from final iteration and pairing said final iteration through serial connection that this concatenated value generates step, generates concatenated value, and said concatenated value is stored as the step of end value.
21. method according to claim 17; This method also comprises the step that rounds off; The said step that rounds off is each and be rounded up to nearest unit; With form one or more revise with, wherein, said replacement step comprises with the difference of correspondence and said modification and replaces the step of the value in the said paired value.
22. method according to claim 21; This method also comprises the iteration repeating step; Said iteration repeating step iteration before the said storing step repeats said generation step, said step and the said replacement step of rounding off, and this method comprises that also concatenated value generates step, and it is corresponding said and with said poor from final iteration and pairing said final iteration through serial connection that said concatenated value generates step; Generate concatenated value, and said concatenated value is stored as the step of end value.
23. method according to claim 17; This method comprises that also combined value generates step; Said combined value generates step and is directed against a pair of or more combined values to said paired value through making up said difference with said generating with value; Wherein, said replacement step comprises to replace the step of the value in the said paired value based on the value of the minimum live part of said and value and said combined value.
24. method according to claim 17, wherein, said look-up table values comprises the value that derives according to monotonic function.
25. method according to claim 24, this method also comprises applying step, and said applying step is used said formation step, said generation step and said replacement step, cuts apart the monotonic segment in the segmentation monotonic function.
26. method according to claim 17, wherein, said look-up table values comprises by the tactic non-dull value of dullness.
27. method according to claim 26; This method also comprises step and access step is set; The said step that is provided with is arranged on the said value that is compressed with in the ordered sets on the ground floor of said memory, and said access step is intersected address wire via one on the second layer that is arranged on said memory or more and come in the said value that is compressed with in the ordered sets of access one or more.
Abandon step 28. method according to claim 17, this method also comprise, said abandon step abandon before the said replacement step said and minimum live part.
29. one kind is used for the method for packed data to store at memory, this method may further comprise the steps:
Form step, it is through arranging the ordered set that look-up table values forms look-up table values by the dullness order again, said ordered set comprise non-overlapping paired value (f0, f1) with (f2, f3);
D0 and d2 generate step, and it generates d0 through the difference of confirming f0 and f1, and adopt generation d2 through the difference of confirming f2 and f3;
S0 and s2 generate step, its through confirm f0 and f1 with generate s0, and through confirm f2 and f3 and generate s2;
Replacement step, it, is replaced f0, reduces the required figure place of storage look-up table values with s2 replacement f2 with s0 with d2 replacement f3 through with d0 replacement f1, is compressed with ordered sets with generation; And
Storing step, it is stored in the said ordered sets that is compressed with in the said memory based on said modification set.
30. method according to claim 29, this method also are included in the following steps before the said storing step:
To the said value that is compressed with in the ordered sets match with generate paired value (f0, f2) with (d0, d2);
Difference through confirming f0 and f2 generates d1, and generates d3 through the difference of confirming d0 and d2; And
With d1 replacement f2 and with d3 replacement d2.
31. method according to claim 29, this method is further comprising the steps of:
Generate first concatenated value through serial connection s0 and d0;
Generate second concatenated value through serial connection s2 and d2; And
Be stored as end value to said concatenated value.
32. method according to claim 29; This method comprises that also generating c0 through combination s0 and d0 also passes through the step that combination s2 and d2 generate c2; Wherein, said replacement step also comprises the step of replacing d2 with s0 replacement f0, with s2 replacement f2, with the minimum live part replacement d0 of c0 and with the minimum live part of c2.
33. method according to claim 29, this method is further comprising the steps of:
To revise value in the set match with generate paired value (s0, s2) with (d0, d2);
Through confirm s0 and s2 with generate s1, and through confirm d0 and d2 with generation s3;
Difference through confirming s0 and s2 generates d1, and generates d3 through the difference of confirming d0 and d2; And
Wherein, said replacement step also comprises the step of replacing d2 with s1 replacement s0, with d1 replacement s2, with s3 replacement d0 and with d3.
34. method according to claim 33, this method also comprises:
Generate the step of concatenated value through serial connection s1, d1, s3 and d3; With
Be stored as end value to said concatenated value.
35. one kind is used for the equipment of packed data to store at memory, this equipment comprises with lower device:
Form device, it forms the ordered set of look-up table values through by the dullness order look-up table values being resequenced;
Contrast means, it is with the pairing of the consecutive value in the said ordered set, to form a plurality of non-overlapping paired values;
Generating apparatus, it is to a pair of or more to paired value, generates poor between the value in the said paired value;
Alternative, it reduces the required figure place of storing queries tabular value through replace a value in the said paired value with the difference of correspondence, is compressed with ordered sets with generation; And
Storage device, it is stored in the said ordered sets that is compressed with in the said memory,
Wherein, Said generating apparatus comprises the difference of the value that is used for generating said paired value and the device of the value sum in the said paired value, said alternative comprise be used for one in the value of replacing said paired value based on the value of said difference and with based on said and value replace another the device in the value in the said paired value.
36. one kind is used for the equipment of packed data to store at memory, this equipment comprises with lower device:
Form device, it forms the ordered set of look-up table values through arranging look-up table values again by the dullness order:
Contrast means, it is with the pairing of the consecutive value in the said ordered set, to form a plurality of non-overlapping paired values;
Generating apparatus, it is to a pair of or more to paired value, generate the value in the said pairing difference and the value in the said paired value with;
Alternative, its through with correspondence with and the said paired value of difference replacement in value reduce the required figure place of storing queries tabular value and be compressed with ordered sets, and the required figure place of the reduction said ordered set of storage with generation; And
Storage device, it is stored in the said ordered sets that is compressed with in the said memory.
37. one kind is used for the equipment of packed data to store at memory, this equipment comprises with lower device:
Form device, it is through arranging the ordered set that look-up table values forms look-up table values by the dullness order again, said ordered set comprise non-overlapping paired value (f0, f1) with (f2, f3);
D0 and d2 generating apparatus, it generates d0 through the difference of confirming f0 and f1, and adopts generation d2 through the difference of confirming f2 and f3;
S0 and s2 generating apparatus, its through confirm f0 and f1 with generate s0, and through confirm f2 and f3 and generate s2;
Alternative, it, is replaced f0, reduces the required figure place of storage look-up table values with s2 replacement f2 with s0 with d2 replacement f3 through with d0 replacement f1, is compressed with ordered sets with generation; And
Storage device, it is stored in the said ordered sets that is compressed with in the said memory based on said modification set.
Applications Claiming Priority (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US57738604P | 2004-06-04 | 2004-06-04 | |
| US60/577,386 | 2004-06-04 | ||
| US11/143,157 US8316068B2 (en) | 2004-06-04 | 2005-06-02 | Memory compression |
| US11/143,157 | 2005-06-02 | ||
| PCT/EP2005/005994 WO2005119920A2 (en) | 2004-06-04 | 2005-06-03 | Memory compression |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN1965486A CN1965486A (en) | 2007-05-16 |
| CN1965486B true CN1965486B (en) | 2012-09-05 |
Family
ID=38083512
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN 200580018166 Pending CN1965292A (en) | 2004-06-04 | 2005-06-02 | Complex logarithmic ALU |
| CN2005800182352A Expired - Lifetime CN1965486B (en) | 2004-06-04 | 2005-06-03 | Memory compression |
Family Applications Before (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN 200580018166 Pending CN1965292A (en) | 2004-06-04 | 2005-06-02 | Complex logarithmic ALU |
Country Status (1)
| Country | Link |
|---|---|
| CN (2) | CN1965292A (en) |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102566963A (en) * | 2010-12-21 | 2012-07-11 | 普天信息技术研究院有限公司 | Method for processing data in field programmable gate array (FPGA) |
| US9411583B2 (en) * | 2011-12-22 | 2016-08-09 | Intel Corporation | Vector instruction for presenting complex conjugates of respective complex numbers |
| US10664277B2 (en) * | 2017-09-29 | 2020-05-26 | Intel Corporation | Systems, apparatuses and methods for dual complex by complex conjugate multiply of signed words |
| CN109190756B (en) * | 2018-09-10 | 2022-02-18 | 中国科学院计算技术研究所 | Arithmetic device based on Winograd convolution and neural network processor comprising same |
| CN109918047A (en) * | 2019-02-01 | 2019-06-21 | 深圳市南方硅谷微电子有限公司 | Modem-based complex modulo method and device |
| CN112748898B (en) * | 2021-02-14 | 2023-03-14 | 成都启英泰伦科技有限公司 | Complex vector computing device and computing method |
| CN118132522B (en) * | 2024-05-10 | 2024-07-19 | 成都联屹科技有限公司 | Data compression device, method and chip |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1145702A (en) * | 1995-01-26 | 1997-03-19 | 世雅企业股份有限公司 | Device and method for encoding data and device and method for decoding data |
| CN1497984A (en) * | 2002-10-18 | 2004-05-19 | 富士通株式会社 | Image compression device and method for frame skipping processing |
-
2005
- 2005-06-02 CN CN 200580018166 patent/CN1965292A/en active Pending
- 2005-06-03 CN CN2005800182352A patent/CN1965486B/en not_active Expired - Lifetime
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1145702A (en) * | 1995-01-26 | 1997-03-19 | 世雅企业股份有限公司 | Device and method for encoding data and device and method for decoding data |
| CN1497984A (en) * | 2002-10-18 | 2004-05-19 | 富士通株式会社 | Image compression device and method for frame skipping processing |
Non-Patent Citations (1)
| Title |
|---|
| S.W.Smith.The scientist and engineer's guide to digital signal processing.California Technical Publishing,1999,481-502. * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN1965486A (en) | 2007-05-16 |
| CN1965292A (en) | 2007-05-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US8478804B2 (en) | Memory compression | |
| US11509418B2 (en) | Polar code encoding method and device | |
| Lai et al. | OBDD-based function decomposition: Algorithms and implementation | |
| US5818728A (en) | Mapping of gate arrays | |
| CN1934789B (en) | A Code Construction Method for Irregularly Shortened LDPC Codes with Good Performance | |
| US11303302B2 (en) | Erasure code calculation method | |
| CN1965486B (en) | Memory compression | |
| JPS6122826B2 (en) | ||
| JPH02504339A (en) | Multiplier-adders in Galois fields and their use in digital signal processors | |
| CN111290697B (en) | Data compression method, encoding circuit and arithmetic device | |
| JP3515036B2 (en) | Interleaving method, interleaving device, turbo coding method, and turbo coding device | |
| EP3886324B1 (en) | Compression and/or decompression of activation data | |
| CN101515839A (en) | Method, device and system for outputting codes | |
| KR101236673B1 (en) | Memory compression | |
| CN110688094A (en) | Remainder operation circuit and method based on parallel cyclic compression | |
| Bertolo et al. | An updated table of binary/ternary mixed covering codes | |
| EP1774657A2 (en) | Memory compression | |
| CN108985439B (en) | A kind of glass typesetting determination method and system | |
| KR0134854B1 (en) | Method and apparatus for designing semiconductor device | |
| KR20160100496A (en) | Improved huffman code method and apprartus thereof by using binary clusters | |
| HK1102651A (en) | Memory compression | |
| JPH0222409B2 (en) | ||
| CN113422611A (en) | Highly parallel encoding method of QC-LDPC encoder | |
| CN111294056B (en) | Data decompression method and coding circuit | |
| CN116028764B (en) | A convolution calculation method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| REG | Reference to a national code |
Ref country code: HK Ref legal event code: DE Ref document number: 1102651 Country of ref document: HK |
|
| C14 | Grant of patent or utility model | ||
| GR01 | Patent grant | ||
| REG | Reference to a national code |
Ref country code: HK Ref legal event code: WD Ref document number: 1102651 Country of ref document: HK |
|
| CX01 | Expiry of patent term | ||
| CX01 | Expiry of patent term |
Granted publication date: 20120905 |