CN1297887C - Processor capable of aligning multiple register data across boundaries and method thereof - Google Patents

Info

Publication number
CN1297887C
Authority
CN
China
Prior art keywords
address
bits
group
temporary register
working storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2003101188147A
Other languages
Chinese (zh)
Other versions
CN1622031A (en)
Inventor
梁伯嵩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sunplus Technology Co Ltd
Original Assignee
Sunplus Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sunplus Technology Co Ltd
Priority to CNB2003101188147A
Publication of CN1622031A
Application granted
Publication of CN1297887C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Executing Machine-Instructions (AREA)

Abstract

The invention provides a processor capable of aligning a plurality of register data by crossing boundaries and a method thereof, wherein a decoding device is used for decoding a multiple shift instruction; a register set having a plurality of registers, each register having N bits; a shifter connects the output contents of the first output end and the second output end of the register set in series to form a 2N bit word, then shifts the 2N bit word by w bits and outputs the first N bits; a control device sets a register group according to the decoded multiple shift instruction, reads out the content of the corresponding register, shifts the content of the read register by w bits by the shifter, and writes the output of the shifter into the register group.

Description

Processor capable of aligning multiple register data across boundaries and method thereof
Technical field
The invention relates to the technical field of data processing, and in particular to a processor capable of aligning the data of multiple registers across boundaries and a method thereof.
Background art
When a processor processes data, whether the data is aligned affects the performance of many key operations, such as string and array operations. As shown in Fig. 1, a piece of data to be processed (ABCDEFGHIJKL) often crosses a memory word boundary. Before a processor can perform string or array operations on this data, it must first carry out many extra operations to restore the data to an aligned form.
To deal with unaligned data, one known method loads the data into the processor and then uses ordinary processor instructions to extract the required data. As shown in Fig. 2, the data at address 100h (ZABC) is first loaded into register R16, and R16 is shifted left by 8 bits to remove the unwanted data (Z). The data at address 104h (DEFG) is then loaded into register R17, and R17 is shifted right by 24 bits to remove the unwanted data (EFG). Finally, an OR operation is performed on R16 and R17 and the result is stored in R16, which now holds the required data (ABCD). By the same steps, the data EFGH and IJKL are loaded into registers R17 and R18 in turn.
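The five-step sequence of Fig. 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `load_word` and `software_align` are hypothetical, memory is modelled as a byte string indexed by address, and big-endian byte order is assumed because the text treats Z as the high-order byte of ZABC.

```python
MASK32 = 0xFFFFFFFF

def load_word(memory: bytes, addr: int) -> int:
    """Load one aligned 32-bit word (big-endian byte order is an assumption)."""
    return int.from_bytes(memory[addr:addr + 4], "big")

def software_align(memory: bytes, addr: int) -> int:
    """Return the word that starts one byte past the aligned address `addr`."""
    r16 = load_word(memory, addr)       # load:  ZABC
    r16 = (r16 << 8) & MASK32           # shift: ABC0 (drops Z)
    r17 = load_word(memory, addr + 4)   # load:  DEFG
    r17 = r17 >> 24                     # shift: 000D (drops EFG)
    return r16 | r17                    # OR:    ABCD

# Hypothetical memory image: the data of Fig. 1 placed at address 100h.
memory = bytes(0x100) + b"ZABCDEFGHIJKL" + bytes(3)
print(software_align(memory, 0x100).to_bytes(4, "big"))  # b'ABCD'
```

Each aligned word costs two loads, two shifts and one OR, which is where the count of five instructions per word comes from.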
As the above description shows, if the unaligned data to be loaded is n words long (one word is 32 bits), the known method needs 5n instructions to describe the read operation and at least 5n instruction cycles to complete it. This makes the program code lengthy, occupies storage space, and increases the burden on the processor, which lowers its efficiency.
To address the lengthy code and poor efficiency of handling unaligned data with ordinary processor instructions, U.S. Patent No. 4,814,976 aligns the data while loading it, reading a piece of data that crosses a boundary in two steps. As shown in Fig. 3, the data at addresses 101h to 103h (ABC) is first loaded into bytes 0, 1 and 2 of register R16; at this point byte 3 of R16 holds X (don't care). The data at address 104h (D) is then loaded into byte 3 of R16, which now holds the required data (ABCD). By the same steps, the data EFGH and IJKL are loaded into registers R17 and R18 in turn.
As the above description shows, if the unaligned data to be loaded is n words long, this method needs 2n instructions to describe the read operation and at least 2n instruction cycles to complete it. Moreover, because the same memory and register locations are read and written repeatedly, the likelihood of a processor pipeline stall increases. Repeatedly reading the same memory location also wastes bus bandwidth, and the resulting delay is especially noticeable in systems without a cache.
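The two-step load of Fig. 3 can be modelled roughly as below. The function names `load_high_part` and `load_low_part` are hypothetical and are not taken from the cited patent; byte 0 is assumed to be the high-order byte of the register, and memory is again a byte string indexed by address.

```python
def load_high_part(reg: int, memory: bytes, addr: int) -> int:
    """First step: copy the bytes from `addr` up to the next word boundary
    into the high-order bytes of the register, leaving the rest unchanged."""
    count = 4 - (addr % 4)
    reg_bytes = bytearray(reg.to_bytes(4, "big"))
    reg_bytes[:count] = memory[addr:addr + count]
    return int.from_bytes(reg_bytes, "big")

def load_low_part(reg: int, memory: bytes, addr: int) -> int:
    """Second step: copy the bytes from the start of the word containing
    `addr` through `addr` into the low-order bytes of the register."""
    offset = addr % 4
    count = offset + 1
    reg_bytes = bytearray(reg.to_bytes(4, "big"))
    reg_bytes[4 - count:] = memory[addr - offset:addr + 1]
    return int.from_bytes(reg_bytes, "big")

# Hypothetical memory image: the data of Fig. 1 placed at address 100h.
memory = bytes(0x100) + b"ZABCDEFGHIJKL" + bytes(3)
r16 = load_high_part(0, memory, 0x101)   # bytes 0..2 = ABC, byte 3 is don't care
r16 = load_low_part(r16, memory, 0x104)  # byte 3 = D
print(r16.to_bytes(4, "big"))            # b'ABCD'
```

Both steps write the same destination register, and consecutive words touch the same memory word twice, which matches the repeated accesses the text identifies as a cause of pipeline stalls and wasted bus bandwidth.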
Summary of the invention
The object of the present invention is to provide a processor capable of aligning the data of multiple registers across boundaries and a method thereof, so as to avoid the lengthy program code and storage overhead of the known techniques, and also to avoid wasting bus bandwidth by repeatedly reading the same memory location.
According to one feature of the present invention, a processor device capable of aligning the data of multiple registers across boundaries is proposed, which mainly comprises:
A decoding device, which decodes a multiple shift instruction;
A register set having a plurality of registers, each of N bits (N is a positive integer), wherein the register set can read registers according to a first address and a second address respectively and output them through a first output terminal and a second output terminal, and can write one of the registers through an input terminal according to a third address;
A shifter, coupled to the first output terminal and the second output terminal of the register set, which concatenates the contents of the first and second output terminals into a 2N-bit word, shifts the 2N-bit word by w bits according to a shift value w (w is a positive integer), and outputs the first N bits of the 2N-bit word; and
A control device, coupled to the decoding device and the register set, which, according to the decoded multiple shift instruction, sets the first address, the second address, the third address and the shift value w, reads out the contents of the corresponding registers so that the shifter shifts them by w bits, and writes the output of the shifter into the register set according to the third address.
The device as described, wherein N is 32.
The device as described, wherein w is one of 8, 16 and 24.
The device as described, wherein the shifter can shift w bits to the left or to the right.
The device as described, wherein the third address is set to be the same as the first address.
The device as described, wherein the second address is set to the address following the first address.
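For a left shift, the shifter output described above can be written as a single expression. The symbols A and B are introduced here only for illustration and denote the N-bit contents of the first and second output terminals; "the first N bits" is read as the high-order half of the 2N-bit word.

```latex
\[
  \mathrm{out}
  = \left\lfloor \frac{\bigl((A \cdot 2^{N} + B) \cdot 2^{w}\bigr) \bmod 2^{2N}}{2^{N}} \right\rfloor
  = \bigl((A \cdot 2^{w}) \bmod 2^{N}\bigr) + \left\lfloor \frac{B}{2^{N-w}} \right\rfloor ,
  \qquad 0 < w < N .
\]
```

With N = 32 and w = 8, the first term drops the high-order byte of A and the second term supplies the high-order byte of B, so A = ZABC and B = DEFG give ABCD.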
According to another feature of the present invention, a method for aligning the data of a plurality of registers across boundaries is proposed. The registers form a register set, each register is N bits (N is a positive integer), and the register set can read registers according to a first address and a second address respectively and output them through a first output terminal and a second output terminal, and can write one of the registers through an input terminal according to a third address. The method mainly comprises the following steps:
(A) setting the first address, the second address, the third address and a shift value w according to a multiple shift instruction;
(B) reading out the contents of the corresponding registers according to the first address and the second address; and
(C) concatenating the register contents read in step (B) into a 2N-bit word, shifting the 2N-bit word by w bits, and writing the first N bits of the shifted 2N-bit word into one of the registers according to the third address.
The method as described, wherein steps (A) to (C) are executed repeatedly until a predetermined number of registers have been shifted.
The method as described, wherein N is 32.
The method as described, wherein w is one of 8, 16 and 24.
The method as described, wherein the shift by w bits in step (C) can be to the left or to the right.
The method as described, wherein the third address is set to be the same as the first address.
The method as described, wherein the second address is set to the address following the first address.
Description of drawings
Fig. 1 is a schematic diagram of a set of unaligned data arranged in memory.
Fig. 2 shows the program code with which a known technique loads a set of unaligned data.
Fig. 3 shows the program code and a register diagram with which another known technique loads a set of unaligned data.
Fig. 4 is a block diagram of the processor of the present invention capable of aligning the data of multiple registers across boundaries.
Fig. 5 is a detailed circuit diagram of the control device 200 of the present invention.
Fig. 6 is a schematic diagram of the operation of the present invention.
Fig. 7 shows an example application of the present invention.
Embodiment
Fig. 4 shows a block diagram of the processor of the present invention capable of aligning the data of multiple registers across boundaries. It comprises a decoding device 100, a control device 200, a register set 300 and a shifter 400. The register set 300 has a plurality of registers 3001, each of N bits (N is a positive integer); in this embodiment N is preferably 32. The register set 300 can read registers 3001 according to a first address 301 and a second address 302 respectively and output them through a first output terminal 310 and a second output terminal 320, and can write one of the registers 3001 through an input terminal 330 according to a third address 303.
The decoding device 100 decodes a multiple shift instruction, which may be either a Multiple Left Shift Instruction (MLSI) or a Multiple Right Shift Instruction (MRSI). The format of the multiple left shift instruction is MLSI Rx, Ry, w, which shifts the contents of the registers in the range x to y, taken as a whole, left by w bits. The format of the multiple right shift instruction is MRSI Rx, Ry, w, which shifts the contents of the registers in the range x to y, taken as a whole, right by w bits. After decoding a multiple shift instruction, the decoding device 100 produces the x, y, L_R* and w signals and outputs them to the control device 200. The L_R* signal indicates the shift direction: when L_R* is 1 the shift is to the left by w bits, and when L_R* is 0 the shift is to the right by w bits.
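The mapping from an instruction to the x, y, L_R* and w signals can be illustrated with a small text-level decoder. The source does not give the binary instruction encoding, so this sketch parses the assembly mnemonic instead; the `DecodedShift` type and `decode` function are hypothetical names.

```python
import re
from typing import NamedTuple

class DecodedShift(NamedTuple):
    x: int    # first register of the range
    y: int    # last register of the range
    l_r: int  # L_R* signal: 1 = shift left, 0 = shift right
    w: int    # shift amount in bits

# Matches the textual forms "MLSI Rx, Ry, w" and "MRSI Rx, Ry, w".
_PATTERN = re.compile(r"^(MLSI|MRSI)\s+R(\d+)\s*,\s*R(\d+)\s*,\s*(\d+)$")

def decode(text: str) -> DecodedShift:
    """Decode an assembly-level multiple shift instruction (illustrative only)."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a multiple shift instruction: {text!r}")
    mnemonic, x, y, w = match.groups()
    return DecodedShift(int(x), int(y), 1 if mnemonic == "MLSI" else 0, int(w))

print(decode("MLSI R16, R19, 8"))  # DecodedShift(x=16, y=19, l_r=1, w=8)
```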
The shifter 400 is coupled to the first output terminal 310 and the second output terminal 320 of the register set 300. It concatenates the contents of the first output terminal 310 and the second output terminal 320 into a 64-bit word, shifts this 64-bit word left or right by w bits (w is a positive integer) according to a shift value w and the L_R* signal, and outputs the first 32 bits of the shifted 64-bit word.
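A minimal behavioural model of the shifter 400 for N = 32 might look like this. The function name `shifter` is hypothetical, and "the first 32 bits" is taken literally as the high-order half of the 64-bit word for both directions; the source does not state which half is selected for a right shift, so that branch is an assumption.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def shifter(first_out: int, second_out: int, w: int, l_r: int) -> int:
    """Concatenate two 32-bit inputs, shift by w bits, return the high 32 bits."""
    word64 = ((first_out & MASK32) << 32) | (second_out & MASK32)
    if l_r == 1:
        shifted = (word64 << w) & MASK64  # shift left, discard overflow
    else:
        shifted = word64 >> w             # shift right, zero fill (assumed)
    return shifted >> 32                  # first (high-order) 32 bits

zabc = int.from_bytes(b"ZABC", "big")
defg = int.from_bytes(b"DEFG", "big")
print(shifter(zabc, defg, 8, 1).to_bytes(4, "big"))  # b'ABCD'
```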
The control device 200 is coupled to the decoding device 100 and the register set 300. According to the decoded x, y, L_R* and w signals, it sets the first address 301, the second address 302 and the third address 303 of the register set 300 and the shift value w, and reads out the contents of the corresponding registers in the register set 300 through the first output terminal 310 and the second output terminal 320.
Fig. 5 is a detailed circuit diagram of the control device 200, which mainly comprises a multiplexer 210, a comparator 220, a first address register 230, an adder 240 and a second address register 250. The multiplexer 210 selects either the x signal produced by the decoding device 100 or the content of the second address register 250. The output of the multiplexer 210 is written into the first address register 230, whose output drives the first address 301 of the register set 300 to access the register 3001 that the first address 301 indicates. The adder 240 adds 1 to the content of the first address register 230 and writes the result into the second address register 250, whose content is used to access the register 3001 that the second address 302 indicates. The comparator 220 compares the content of the first address register 230 with the y signal produced by the decoding device 100; if the content of the first address register 230 is greater than or equal to the y signal, it produces a stop signal (stop_signal).
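The address sequencing performed by the multiplexer, adder and comparator can be sketched as a generator. The function name `address_sequence` is hypothetical, and the way the multiplexer is told to pick x on the first cycle (a `None` sentinel here) is an assumption, since the source does not describe the select signal.

```python
from typing import Iterator, Optional, Tuple

def address_sequence(x: int, y: int) -> Iterator[Tuple[int, int]]:
    """Yield (first address, second address) for each execution cycle."""
    second_addr_reg: Optional[int] = None
    while True:
        # Multiplexer 210: x on the first cycle, second address register afterwards.
        first_addr_reg = x if second_addr_reg is None else second_addr_reg
        # Adder 240: second address = first address + 1.
        second_addr_reg = first_addr_reg + 1
        # Comparator 220: stop once the first address reaches y.
        if first_addr_reg >= y:
            return
        yield first_addr_reg, second_addr_reg

print(list(address_sequence(16, 19)))  # [(16, 17), (17, 18), (18, 19)]
```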
Fig. 6 shows a schematic diagram of the operation of the present invention when executing an MLSI R16, R19, 8 instruction, which shifts the contents of registers R16, R17, R18 and R19 left by 8 bits. At the start of the first execution cycle, the decoding device 100 decodes the instruction and produces the signals x=16, y=19, L_R*=1 and w=8. The multiplexer 210 selects the x signal (=16) produced by the decoding device 100, so the control device 200 sets the first address register 230 to 16 and, through the adder 240, sets the second address register 250 to 17. Because the first address register 230 holds 16, which is less than 19, the comparator 220 does not produce the stop signal (stop_signal). The register set 300 therefore reads the content of register R16 (=ZABC) and the content of R17 (=DEFG) according to the first address 301 (=16) and the second address 302 (=17) respectively, and outputs them to the shifter 400 through the first output terminal 310 and the second output terminal 320.
The shifter 400 concatenates the content of the first output terminal 310 (=ZABC) and the content of the second output terminal 320 (=DEFG) into a 64-bit word (=ZABCDEFG), shifts this 64-bit word left by 8 bits according to the shift value w=8 and the L_R*=1 signal (=ABCDEFG0), and outputs the first 32 bits (=ABCD) of the shifted 64-bit word (=ABCDEFG0). The control device 200 then writes the output of the shifter 400 (=ABCD) into register R16 of the register set 300 according to the third address 303.
At the start of the second execution cycle, the multiplexer 210 selects the content of the second address register 250 (=17), so the control device 200 sets the first address register 230 to 17 and, through the adder 240, sets the second address register 250 to 18. Execution proceeds as in the first execution cycle, so at the end of the second execution cycle register R17 holds EFGH. Likewise, at the end of the third execution cycle register R18 holds IJKL.
At the start of the fourth execution cycle, the multiplexer 210 selects the content of the second address register 250 (=19), so the control device 200 sets the first address register 230 to 19. Because the first address register 230 now holds 19, the comparator 220 produces the stop signal (stop_signal) and execution of the instruction stops; that is, only three execution cycles are needed.
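The walk-through of Fig. 6 can be reproduced with a short simulation. The names `run_mlsi` and `word` are hypothetical, the third address is assumed to equal the first address (as in the dependent claims), and the initial contents of R18 and R19 (HIJK and L???) are assumptions that continue the data of Fig. 1, with `?` standing for unspecified bytes.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def word(text: str) -> int:
    """Pack four ASCII characters into a 32-bit value, first character highest."""
    return int.from_bytes(text.encode("ascii"), "big")

def run_mlsi(regs: dict, x: int, y: int, w: int) -> int:
    """Apply MLSI Rx, Ry, w to `regs` in place; return the execution cycle count."""
    cycles = 0
    first = x
    while first < y:                       # comparator stops when first >= y
        second = first + 1                 # second address follows the first
        word64 = (regs[first] << 32) | regs[second]
        shifted = (word64 << w) & MASK64   # shift the 64-bit word left by w bits
        regs[first] = shifted >> 32        # write the first 32 bits back (third = first)
        cycles += 1
        first = second
    return cycles

regs = {16: word("ZABC"), 17: word("DEFG"), 18: word("HIJK"), 19: word("L???")}
cycles = run_mlsi(regs, 16, 19, 8)
print(cycles, [regs[r].to_bytes(4, "big") for r in (16, 17, 18)])
# 3 [b'ABCD', b'EFGH', b'IJKL']
```

R19 is only read, never written, because the comparator stops the sequence as soon as the first address reaches y.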
Fig. 7 shows an example application of the present invention. To load a set of unaligned data, load instructions (LW) first load the unaligned data into registers R16, R17, R18 and R19, and a single multiple left shift instruction (MLSI) of the present invention then completes the alignment. As shown in Fig. 7, the program code needs only 5 words.
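The instruction counts of the three approaches can be compared with a small calculation. The 5n and 2n figures come from the background section; the expression for the present approach, n + 1 loads plus one MLSI, is an assumption that generalizes the single case of Fig. 7 (four LW plus one MLSI for three aligned words). The function name `instruction_counts` is hypothetical.

```python
def instruction_counts(n: int) -> dict:
    """Instruction counts for loading n unaligned 32-bit words."""
    return {
        "shift_and_or": 5 * n,          # Fig. 2 method: 5n (from the text)
        "two_step_load": 2 * n,         # Fig. 3 method: 2n (from the text)
        "load_then_mlsi": (n + 1) + 1,  # assumed: n + 1 LW plus one MLSI
    }

print(instruction_counts(3))
# {'shift_and_or': 15, 'two_step_load': 6, 'load_then_mlsi': 5}
```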
As the above description shows, the technique of the present invention solves the problems of lengthy program code and storage overhead found in the known techniques, and also avoids wasting bus bandwidth by repeatedly reading the same memory location.
It should be noted that the embodiments above are given only as examples for convenience of explanation. The scope of rights claimed by the present invention is defined by the claims and is not limited to the foregoing embodiments.

Claims (12)

1. A processor device capable of aligning the data of multiple registers across boundaries, mainly comprising: a decoding device, which decodes a multiple shift instruction; a register set having a plurality of registers, each of N bits, wherein the register set can read registers according to a first address and a second address respectively and output them through a first output terminal and a second output terminal, and can write one of the registers through an input terminal according to a third address, N being a positive integer; a shifter, coupled to the first output terminal and the second output terminal of the register set, which concatenates the contents of the first output terminal and the second output terminal into a 2N-bit word, shifts the 2N-bit word by w bits according to a shift value w, w being a positive integer, and outputs the first N bits of the 2N-bit word; and a control device, coupled to the decoding device and the register set, which, according to the decoded multiple shift instruction, sets the first address, the second address, the third address and the shift value w, reads out the contents of the corresponding registers so that the shifter shifts the contents read by w bits, and writes the output of the shifter into the register set according to the third address.
2. The device of claim 1, wherein N is 32.
3. The device of claim 1, wherein w is one of 8, 16 and 24.
4. The device of claim 1, wherein the shifter can shift w bits to the left or to the right.
5. The device of claim 1, wherein the third address is set to be the same as the first address.
6. The device of claim 1, wherein the second address is set to the address following the first address.
7. A method for aligning the data of a plurality of registers across boundaries, the plurality of registers forming a register set, each register being N bits, wherein the register set can read registers according to a first address and a second address respectively and output them through a first output terminal and a second output terminal, and can write one of the registers through an input terminal according to a third address, N being a positive integer, the method mainly comprising the following steps: (A) setting the first address, the second address, the third address and a shift value w according to a multiple shift instruction; (B) reading out the contents of the corresponding registers according to the first address and the second address; (C) concatenating the register contents read in step (B) into a 2N-bit word, shifting the 2N-bit word by w bits, and writing the first N bits of the shifted 2N-bit word into one of the plurality of registers according to the third address; and (D) repeating steps (A) to (C) until a predetermined number of registers have been shifted.
8. The method of claim 7, wherein N is 32.
9. The method of claim 7, wherein w is one of 8, 16 and 24.
10. The method of claim 7, wherein the shift by w bits in step (C) can be to the left or to the right.
11. The method of claim 7, wherein the third address is set to be the same as the first address.
12. The method of claim 7, wherein the second address is set to the address following the first address.
CNB2003101188147A 2003-11-28 2003-11-28 Processor capable of aligning multiple register data across boundaries and method thereof Expired - Fee Related CN1297887C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2003101188147A CN1297887C (en) 2003-11-28 2003-11-28 Processor capable of aligning multiple register data across boundaries and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2003101188147A CN1297887C (en) 2003-11-28 2003-11-28 Processor capable of aligning multiple register data across boundaries and method thereof

Publications (2)

Publication Number Publication Date
CN1622031A CN1622031A (en) 2005-06-01
CN1297887C true CN1297887C (en) 2007-01-31

Family

ID=34761217

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2003101188147A Expired - Fee Related CN1297887C (en) 2003-11-28 2003-11-28 Processor capable of aligning multiple register data across boundaries and method thereof

Country Status (1)

Country Link
CN (1) CN1297887C (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10394735B2 (en) * 2017-01-09 2019-08-27 Nanya Technology Corporation Comparative forwarding circuit providing first datum and second datum to one of first circuit and second circuit according to target address

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4814976A (en) * 1986-12-23 1989-03-21 Mips Computer Systems, Inc. RISC computer with unaligned reference handling and method for the same
WO2003038601A1 (en) * 2001-10-29 2003-05-08 Intel Corporation Method and apparatus for parallel shift right merge of data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4814976A (en) * 1986-12-23 1989-03-21 Mips Computer Systems, Inc. RISC computer with unaligned reference handling and method for the same
US4814976C1 (en) * 1986-12-23 2002-06-04 Mips Tech Inc Risc computer with unaligned reference handling and method for the same
WO2003038601A1 (en) * 2001-10-29 2003-05-08 Intel Corporation Method and apparatus for parallel shift right merge of data

Also Published As

Publication number Publication date
CN1622031A (en) 2005-06-01

Similar Documents

Publication Publication Date Title
CN1203420C (en) Direct memory access controller for moving memory blocks and moving method thereof
US9058253B2 (en) Data tree storage methods, systems and computer program products using page structure of flash memory
JP2534465B2 (en) Data compression apparatus and method
JP3229180B2 (en) Data compression system
JP3225638B2 (en) Apparatus and method for compressing data and data processing system
US20020091905A1 (en) Parallel compression and decompression system and method having multiple parallel compression and decompression engines
CN111966281B (en) Data storage device and data processing method
CN114764407A (en) Method for near memory acceleration for accelerator and dictionary decoding
CN1249604C (en) Online loading process for on site programmable gate array
CN1258140C (en) Device and method for updating contents of flash memory
CN1881455A (en) Method and system for generating error correction codes
CN1390354A (en) Controlling burst sequence in synchronous memories
CN1297887C (en) Processor capable of aligning multiple register data across boundaries and method thereof
TWI244033B (en) Processor capable of cross-boundary alignment of a plurality of register data and method of the same
CN1335958A (en) Variable-instruction-length processing
US7676651B2 (en) Micro controller for decompressing and compressing variable length codes via a compressed code dictionary
TWI695264B (en) A data storage device and a data processing method
CN1508672A (en) Microcontroller IP core
CN115398413A (en) Bit string accumulation
CN1190738C (en) Data processing device and its data read method
CN1208894A (en) Data-processing equipment
WO2020215951A1 (en) Encoding and decoding method and apparatus, computer device and storage medium
CN117194281B (en) Asymmetric access method for indefinite length data in ASIC
CN118778879A (en) Data writing method, electronic device and computer readable storage medium
CN1529234A (en) First-in-first-out register queue device and control method capable of processing variable-length data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20070131

Termination date: 20141128

EXPY Termination of patent right or utility model