US20040019769A1 - Finding a significant bit in a computer data word - Google Patents
Finding a significant bit in a computer data word Download PDFInfo
- Publication number
- US20040019769A1 US20040019769A1 US10/202,960 US20296002A US2004019769A1 US 20040019769 A1 US20040019769 A1 US 20040019769A1 US 20296002 A US20296002 A US 20296002A US 2004019769 A1 US2004019769 A1 US 2004019769A1
- Authority
- US
- United States
- Prior art keywords
- defining
- datum
- mask
- value
- shifted value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/74—Selecting or encoding within a word the position of one or more bits having a specified value, e.g. most or least significant one or zero detection, priority encoders
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30018—Bit or string instructions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30021—Compare instructions, e.g. Greater-Than, Equal-To, MINMAX
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30007—Arrangements for executing specific machine instructions to perform operations on data operands
- G06F9/30036—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
- G06F9/30038—Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30094—Condition code generation, e.g. Carry, Zero flag
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30098—Register arrangements
- G06F9/30101—Special purpose registers
Definitions
- This invention relates generally to processor-based systems and, particularly, to systems that need to locate a most or least significant bit of a computer word.
- the most significant bit may be located by treating the datum as an integer and performing a binary search for the least power of two which is greater than the datum. This result can then be used to directly compute the index of the most significant bit. Although this algorithm may reduce the computational complexity to the log of the datum size, the overhead for testing and branching is high.
- FIG. 1 is a schematic depiction of one embodiment of the present invention.
- FIG. 2 is a flow chart for one embodiment of the present invention.
- a processor-based system 10 may include a processor 12 coupled to a storage 14 .
- the storage 14 may be any type of device capable of storing data including a semiconductor memory, a rotatable disk storage device, or logic.
- the storage 14 may include a plurality of predicate registers 16 a - 16 d.
- the processor 12 may be capable of implementing parallel operations.
- a predicate register stores a single binary value.
- a software program 18 may be stored on the storage 14 in one embodiment of the present invention. In other embodiments, the program 18 may be stored in a storage different from the storage that includes the predicate registers 16 .
- the software program 18 begins by defining a plurality of masks as indicated in block 20 .
- the first mask has alternate bits set to one.
- a second mask has every other pair of bits set to one.
- the next mask has every other group of four bits set to one.
- the next mask has every other set of eight bits set to one, and so on doubling the number of set bits per group in each mask until the number of consecutive set bits is greater than or equal to one-half the length of the datum to be tested.
- a shifted value is defined as indicated in block 22 .
- the original data value that contains the bit being located may be compared to the original value shifted to the right by one.
- each mask is applied to the original value and compared to the shifted value as indicated in block 24 .
- each mask is applied in turn to the value under question and compared to the original value shifted right by one in one embodiment.
- the predicate register 16 is set to one. Otherwise, the predicate register is set to zero. As shown in FIG. 2, the predicate registers 16 may be loaded with the appropriate values as indicated in block 26 . After this has been done for each mask, the predicate registers 16 are moved into a single machine register by using a MOV instruction. The resulting value is the zero-based index of the most significant bit. The same approach may be utilized to find the least significant bit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Executing Machine-Instructions (AREA)
Abstract
Description
- This invention relates generally to processor-based systems and, particularly, to systems that need to locate a most or least significant bit of a computer word.
- In a number of circumstances, it is desirable to locate the most or least significant bit of a computer word. For example, in various arithmetic operations this may be desirable. In particular, it is useful to know where the most significant bit is located to estimate the quotient for division, as well as in a variety of other situations.
- One technique to find a significant bit is to simply look at each bit in order and to stop when the first non-zero bit is found. While this approach is simple, it has a running time proportional to the length of the data.
- Alternatively, the most significant bit may be located by treating the datum as an integer and performing a binary search for the least power of two which is greater than the datum. This result can then be used to directly compute the index of the most significant bit. Although this algorithm may reduce the computational complexity to the log of the datum size, the overhead for testing and branching is high.
- Thus, there is a need for better ways to locate significant bits.
- FIG. 1 is a schematic depiction of one embodiment of the present invention; and
- FIG. 2 is a flow chart for one embodiment of the present invention.
- Referring to FIG. 1, a processor-based
system 10 may include aprocessor 12 coupled to astorage 14. Thestorage 14 may be any type of device capable of storing data including a semiconductor memory, a rotatable disk storage device, or logic. - The
storage 14 may include a plurality of predicate registers 16 a-16 d. In one embodiment of the present invention, theprocessor 12 may be capable of implementing parallel operations. A predicate register stores a single binary value. - A
software program 18 may be stored on thestorage 14 in one embodiment of the present invention. In other embodiments, theprogram 18 may be stored in a storage different from the storage that includes the predicate registers 16. - The
software program 18, shown in FIG. 2, begins by defining a plurality of masks as indicated inblock 20. The first mask has alternate bits set to one. A second mask has every other pair of bits set to one. The next mask has every other group of four bits set to one. The next mask has every other set of eight bits set to one, and so on doubling the number of set bits per group in each mask until the number of consecutive set bits is greater than or equal to one-half the length of the datum to be tested. - For simplicity, hereafter, an embodiment of the invention is described that operates on sixteen bit data values. However, the present invention is not limited to any particular data size.
- Next, a shifted value is defined as indicated in
block 22. The original data value that contains the bit being located may be compared to the original value shifted to the right by one. Then, in parallel, each mask is applied to the original value and compared to the shifted value as indicated inblock 24. In other words, each mask is applied in turn to the value under question and compared to the original value shifted right by one in one embodiment. - If the mask value is greater than the shifted value, the corresponding predicate register 16 is set to one. Otherwise, the predicate register is set to zero. As shown in FIG. 2, the predicate registers 16 may be loaded with the appropriate values as indicated in
block 26. After this has been done for each mask, the predicate registers 16 are moved into a single machine register by using a MOV instruction. The resulting value is the zero-based index of the most significant bit. The same approach may be utilized to find the least significant bit. - The following pseudocode may be utilized in an embodiment seeking the most significant bit of a sixteen bit value, producing the zero-based index of the highest non-zero bit (i.e., the leftmost bit):
/* Constants */ MaskHi8 = 0 × FF00 MaskHi4 = 0 × F0F0 MaskHi2 = 0 × CCCC MaskHi1 = 0 × AAAA shifted_value = original_value > > 1 /* Since there are no data dependencies between each * of the next four lines, they can each be executed * in parallel */ p4 = (MaskHi 8 & original value) > shifted_value p3 = (MaskHi 4 & original value) > shifted_value p2 = (MaskHi 2 & original value) > shifted_value p1 = (MaskHi 1 & original value) > shifted_value /* The line below may be a single instruction, a ‘broadside’ load of the predicate registers HighBitIndex = (p4 < < 3) + (p3 < < 2) + (p2 < < 2) + (p1) - While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.
Claims (21)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/202,960 US6904518B2 (en) | 2002-07-25 | 2002-07-25 | Finding a significant bit in a computer data word |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/202,960 US6904518B2 (en) | 2002-07-25 | 2002-07-25 | Finding a significant bit in a computer data word |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20040019769A1 true US20040019769A1 (en) | 2004-01-29 |
| US6904518B2 US6904518B2 (en) | 2005-06-07 |
Family
ID=30769952
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US10/202,960 Expired - Fee Related US6904518B2 (en) | 2002-07-25 | 2002-07-25 | Finding a significant bit in a computer data word |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US6904518B2 (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080133563A1 (en) * | 2004-11-12 | 2008-06-05 | Justsystems Corporation | Data Processing Device And Data Processing Method |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5841683A (en) * | 1996-09-20 | 1998-11-24 | International Business Machines Corporation | Least significant bit and guard bit extractor |
| US5920493A (en) * | 1995-09-06 | 1999-07-06 | Lucent Technologies, Inc. | Apparatus and method to determine a most significant bit |
| US6172623B1 (en) * | 1999-03-22 | 2001-01-09 | Rise Technology Company | Efficient bit scan mechanism |
| US6721772B1 (en) * | 1999-08-19 | 2004-04-13 | National Semiconductor Corporation | Rounding denormalized numbers in a pipelined floating point unit without pipeline stalls |
-
2002
- 2002-07-25 US US10/202,960 patent/US6904518B2/en not_active Expired - Fee Related
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5920493A (en) * | 1995-09-06 | 1999-07-06 | Lucent Technologies, Inc. | Apparatus and method to determine a most significant bit |
| US5841683A (en) * | 1996-09-20 | 1998-11-24 | International Business Machines Corporation | Least significant bit and guard bit extractor |
| US6172623B1 (en) * | 1999-03-22 | 2001-01-09 | Rise Technology Company | Efficient bit scan mechanism |
| US6721772B1 (en) * | 1999-08-19 | 2004-04-13 | National Semiconductor Corporation | Rounding denormalized numbers in a pipelined floating point unit without pipeline stalls |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080133563A1 (en) * | 2004-11-12 | 2008-06-05 | Justsystems Corporation | Data Processing Device And Data Processing Method |
Also Published As
| Publication number | Publication date |
|---|---|
| US6904518B2 (en) | 2005-06-07 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9223720B2 (en) | Systems and methods for rapidly generating suitable pairs of hash functions | |
| US9009447B2 (en) | Acceleration of string comparisons using vector instructions | |
| US9582474B2 (en) | Method and apparatus for performing a FFT computation | |
| US20070156685A1 (en) | Method for sorting data using SIMD instructions | |
| US5717616A (en) | Computer hardware instruction and method for computing population counts | |
| US10902087B2 (en) | Device and method for accelerating matrix multiply operations as a sum of outer products | |
| US4916649A (en) | Method and apparatus for transforming a bit-reversed order vector into a natural order vector | |
| KR100735944B1 (en) | Method and computer program for single instruction multiple data management | |
| US6904518B2 (en) | Finding a significant bit in a computer data word | |
| JPH07177005A (en) | Bit pattern detector circuit and bit pattern detecting method | |
| US6931518B1 (en) | Branching around conditional processing if states of all single instruction multiple datapaths are disabled and the computer program is non-deterministic | |
| US6708168B2 (en) | Method and apparatus for searching a data stream for character patterns | |
| US20090083507A1 (en) | Shift-add mechanism | |
| US6336113B1 (en) | Data management method and data management apparatus | |
| WO2001071483A2 (en) | Determinaton of a minimum or maximum value in a set of data | |
| US10862485B1 (en) | Lookup table index for a processor | |
| Lee et al. | Processor reordering algorithms toward efficient GEN_BLOCK redistribution | |
| US20080235657A1 (en) | Loop coalescing method and loop coalescing device | |
| US6754685B2 (en) | Dynamic popcount/shift circuit | |
| Kang et al. | Parallel BIST architecture for CAMs | |
| KR101075439B1 (en) | String Matching Device Based on Multi-Core Processor and Its String Matching Method | |
| US6988117B2 (en) | Bit-reversed indexing in a modified harvard DSP architecture | |
| US20220129269A1 (en) | Bit-Packed Array Processing Using SIMD | |
| US5898898A (en) | Collating bits from a byte source | |
| Langr et al. | Block iterators for sparse matrices |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| AS | Assignment |
Owner name: INTEL CORPORATION, CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOORE, STEPHEN F.;REEL/FRAME:013141/0161 Effective date: 20020724 |
|
| FPAY | Fee payment |
Year of fee payment: 4 |
|
| FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
| REMI | Maintenance fee reminder mailed | ||
| LAPS | Lapse for failure to pay maintenance fees | ||
| STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
| STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
| FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20130607 |