[go: up one dir, main page]

US20040019769A1 - Finding a significant bit in a computer data word - Google Patents

Finding a significant bit in a computer data word Download PDF

Info

Publication number
US20040019769A1
US20040019769A1 US10/202,960 US20296002A US2004019769A1 US 20040019769 A1 US20040019769 A1 US 20040019769A1 US 20296002 A US20296002 A US 20296002A US 2004019769 A1 US2004019769 A1 US 2004019769A1
Authority
US
United States
Prior art keywords
defining
datum
mask
value
shifted value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US10/202,960
Other versions
US6904518B2 (en
Inventor
Stephen Moore
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US10/202,960 priority Critical patent/US6904518B2/en
Assigned to INTEL CORPORATION reassignment INTEL CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: MOORE, STEPHEN F.
Publication of US20040019769A1 publication Critical patent/US20040019769A1/en
Application granted granted Critical
Publication of US6904518B2 publication Critical patent/US6904518B2/en
Adjusted expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/74Selecting or encoding within a word the position of one or more bits having a specified value, e.g. most or least significant one or zero detection, priority encoders
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30018Bit or string instructions
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30021Compare instructions, e.g. Greater-Than, Equal-To, MINMAX
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G06F9/30038Instructions to perform operations on packed data, e.g. vector, tile or matrix operations using a mask
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30094Condition code generation, e.g. Carry, Zero flag
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098Register arrangements
    • G06F9/30101Special purpose registers

Definitions

  • This invention relates generally to processor-based systems and, particularly, to systems that need to locate a most or least significant bit of a computer word.
  • the most significant bit may be located by treating the datum as an integer and performing a binary search for the least power of two which is greater than the datum. This result can then be used to directly compute the index of the most significant bit. Although this algorithm may reduce the computational complexity to the log of the datum size, the overhead for testing and branching is high.
  • FIG. 1 is a schematic depiction of one embodiment of the present invention.
  • FIG. 2 is a flow chart for one embodiment of the present invention.
  • a processor-based system 10 may include a processor 12 coupled to a storage 14 .
  • the storage 14 may be any type of device capable of storing data including a semiconductor memory, a rotatable disk storage device, or logic.
  • the storage 14 may include a plurality of predicate registers 16 a - 16 d.
  • the processor 12 may be capable of implementing parallel operations.
  • a predicate register stores a single binary value.
  • a software program 18 may be stored on the storage 14 in one embodiment of the present invention. In other embodiments, the program 18 may be stored in a storage different from the storage that includes the predicate registers 16 .
  • the software program 18 begins by defining a plurality of masks as indicated in block 20 .
  • the first mask has alternate bits set to one.
  • a second mask has every other pair of bits set to one.
  • the next mask has every other group of four bits set to one.
  • the next mask has every other set of eight bits set to one, and so on doubling the number of set bits per group in each mask until the number of consecutive set bits is greater than or equal to one-half the length of the datum to be tested.
  • a shifted value is defined as indicated in block 22 .
  • the original data value that contains the bit being located may be compared to the original value shifted to the right by one.
  • each mask is applied to the original value and compared to the shifted value as indicated in block 24 .
  • each mask is applied in turn to the value under question and compared to the original value shifted right by one in one embodiment.
  • the predicate register 16 is set to one. Otherwise, the predicate register is set to zero. As shown in FIG. 2, the predicate registers 16 may be loaded with the appropriate values as indicated in block 26 . After this has been done for each mask, the predicate registers 16 are moved into a single machine register by using a MOV instruction. The resulting value is the zero-based index of the most significant bit. The same approach may be utilized to find the least significant bit.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

The most or least significant bit of a datum can bet determined using parallel operations. This may result in faster location of the most or least significant bit without necessarily introducing more overhead in some embodiments.

Description

    BACKGROUND
  • This invention relates generally to processor-based systems and, particularly, to systems that need to locate a most or least significant bit of a computer word. [0001]
  • In a number of circumstances, it is desirable to locate the most or least significant bit of a computer word. For example, in various arithmetic operations this may be desirable. In particular, it is useful to know where the most significant bit is located to estimate the quotient for division, as well as in a variety of other situations. [0002]
  • One technique to find a significant bit is to simply look at each bit in order and to stop when the first non-zero bit is found. While this approach is simple, it has a running time proportional to the length of the data. [0003]
  • Alternatively, the most significant bit may be located by treating the datum as an integer and performing a binary search for the least power of two which is greater than the datum. This result can then be used to directly compute the index of the most significant bit. Although this algorithm may reduce the computational complexity to the log of the datum size, the overhead for testing and branching is high. [0004]
  • Thus, there is a need for better ways to locate significant bits.[0005]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic depiction of one embodiment of the present invention; and [0006]
  • FIG. 2 is a flow chart for one embodiment of the present invention.[0007]
  • DETAILED DESCRIPTION
  • Referring to FIG. 1, a processor-based [0008] system 10 may include a processor 12 coupled to a storage 14. The storage 14 may be any type of device capable of storing data including a semiconductor memory, a rotatable disk storage device, or logic.
  • The [0009] storage 14 may include a plurality of predicate registers 16 a-16 d. In one embodiment of the present invention, the processor 12 may be capable of implementing parallel operations. A predicate register stores a single binary value.
  • A [0010] software program 18 may be stored on the storage 14 in one embodiment of the present invention. In other embodiments, the program 18 may be stored in a storage different from the storage that includes the predicate registers 16.
  • The [0011] software program 18, shown in FIG. 2, begins by defining a plurality of masks as indicated in block 20. The first mask has alternate bits set to one. A second mask has every other pair of bits set to one. The next mask has every other group of four bits set to one. The next mask has every other set of eight bits set to one, and so on doubling the number of set bits per group in each mask until the number of consecutive set bits is greater than or equal to one-half the length of the datum to be tested.
  • For simplicity, hereafter, an embodiment of the invention is described that operates on sixteen bit data values. However, the present invention is not limited to any particular data size. [0012]
  • Next, a shifted value is defined as indicated in [0013] block 22. The original data value that contains the bit being located may be compared to the original value shifted to the right by one. Then, in parallel, each mask is applied to the original value and compared to the shifted value as indicated in block 24. In other words, each mask is applied in turn to the value under question and compared to the original value shifted right by one in one embodiment.
  • If the mask value is greater than the shifted value, the corresponding predicate register [0014] 16 is set to one. Otherwise, the predicate register is set to zero. As shown in FIG. 2, the predicate registers 16 may be loaded with the appropriate values as indicated in block 26. After this has been done for each mask, the predicate registers 16 are moved into a single machine register by using a MOV instruction. The resulting value is the zero-based index of the most significant bit. The same approach may be utilized to find the least significant bit.
  • The following pseudocode may be utilized in an embodiment seeking the most significant bit of a sixteen bit value, producing the zero-based index of the highest non-zero bit (i.e., the leftmost bit): [0015]
    /* Constants */
    MaskHi8 = 0 × FF00
    MaskHi4 = 0 × F0F0
    MaskHi2 = 0 × CCCC
    MaskHi1 = 0 × AAAA
    shifted_value = original_value > > 1
    /* Since there are no data dependencies between each
    * of the next four lines, they can each be executed
    * in parallel
    */
    p4 = (MaskHi 8 & original value) > shifted_value
    p3 = (MaskHi 4 & original value) > shifted_value
    p2 = (MaskHi 2 & original value) > shifted_value
    p1 = (MaskHi 1 & original value) > shifted_value
    /* The line below may be a single instruction, a
    ‘broadside’ load of the predicate registers
    HighBitIndex = (p4 < < 3) + (p3 < < 2) + (p2 < < 2) +
    (p1)
  • While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.[0016]

Claims (21)

What is claimed is:
1. A method comprising:
defining a plurality of masks;
receiving a datum and defining a shifted value of the datum;
applying each mask in parallel to the original datum value; and
comparing the result to the shifted value.
2. The method of claim 1 wherein defining a plurality of masks includes defining a plurality of masks including a first mask having alternate bits set to one and a second mask having every other pair of bits set to one.
3. The method of claim 2 wherein defining a plurality of masks includes defining a third mask having every other group of four bits set to one and a fourth mask having other set of eight bits set to one.
4. The method of claim 3 including defining masks until the last mask has groups of bits that are greater than or equal to the number of bits in the datum under test.
5. The method of claim 1 wherein defining a shifted value includes shifting the datum by one.
6. The method of claim 5 wherein defining a shifted value includes shifting the original datum to the right by one.
7. The method of claim 1 wherein, if the result is greater than the shifted value, setting a register to one.
8. The method of claim 7 wherein, if the result is less than the shifted value, setting a register to zero.
9. The method of claim 1 including loading the result of said comparison into predicate registers.
10. The method of claim 9 including moving the results in said predicate registers to a single register.
11. An article comprising a medium storing instructions that, if executed, enable a processor-based system to perform the steps of:
defining a plurality of masks;
receiving a datum and defining a shifted value of the datum;
applying each mask in parallel to the original datum value; and
comparing the result to the shifted value.
12. The article of claim 11 further storing instructions that, if executed, enable the system to perform the step of defining a plurality of masks including a first mask having alternate bits set to one and a second mask having every other pair of bits set to one.
13. The article of claim 12 further storing instructions that, if executed, enable the system to perform the step of defining a third mask having every other group of four bits set to one and a fourth mask having other set of eight bits set to one.
14. The article of claim 11 further storing instructions that, if executed, enable the system to perform the step of defining a shifted value that includes shifting the datum by one.
15. The article of claim 14 further storing instructions that, if executed, enable the system to perform the step of defining a shifted value that includes shifting the original datum to the right by one.
16. The article of claim 11 further storing instructions that, if executed, enable the system to perform the step of, if the result is greater than the shifted value, setting a register to one.
17. The article of claim 16 further storing instructions that, if executed, enable the system to perform the step of, if the result is less than the shifted value, setting a register to zero.
18. The article of claim 11 further storing instructions that, if executed, enable the system to perform the step of loading the result of said comparison into predicate registers.
19. The article of claim 18 further storing instructions that, if executed, enable the system to perform the step of moving the results in said predicate registers to a single register.
20. A system comprising:
a processor; and
a storage coupled to said processor, said storage storing instructions that, if executed, enable the processor to perform the steps of:
defining a plurality of masks;
receiving a datum and defining a shifted value of the datum;
applying each mask in parallel to the original datum value; and
comparing the result to the shifted value.
21. The system of claim 20 wherein said system includes a plurality of predicate registers, the result of said comparison being placed in said predicate registers.
US10/202,960 2002-07-25 2002-07-25 Finding a significant bit in a computer data word Expired - Fee Related US6904518B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/202,960 US6904518B2 (en) 2002-07-25 2002-07-25 Finding a significant bit in a computer data word

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/202,960 US6904518B2 (en) 2002-07-25 2002-07-25 Finding a significant bit in a computer data word

Publications (2)

Publication Number Publication Date
US20040019769A1 true US20040019769A1 (en) 2004-01-29
US6904518B2 US6904518B2 (en) 2005-06-07

Family

ID=30769952

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/202,960 Expired - Fee Related US6904518B2 (en) 2002-07-25 2002-07-25 Finding a significant bit in a computer data word

Country Status (1)

Country Link
US (1) US6904518B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080133563A1 (en) * 2004-11-12 2008-06-05 Justsystems Corporation Data Processing Device And Data Processing Method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5841683A (en) * 1996-09-20 1998-11-24 International Business Machines Corporation Least significant bit and guard bit extractor
US5920493A (en) * 1995-09-06 1999-07-06 Lucent Technologies, Inc. Apparatus and method to determine a most significant bit
US6172623B1 (en) * 1999-03-22 2001-01-09 Rise Technology Company Efficient bit scan mechanism
US6721772B1 (en) * 1999-08-19 2004-04-13 National Semiconductor Corporation Rounding denormalized numbers in a pipelined floating point unit without pipeline stalls

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5920493A (en) * 1995-09-06 1999-07-06 Lucent Technologies, Inc. Apparatus and method to determine a most significant bit
US5841683A (en) * 1996-09-20 1998-11-24 International Business Machines Corporation Least significant bit and guard bit extractor
US6172623B1 (en) * 1999-03-22 2001-01-09 Rise Technology Company Efficient bit scan mechanism
US6721772B1 (en) * 1999-08-19 2004-04-13 National Semiconductor Corporation Rounding denormalized numbers in a pipelined floating point unit without pipeline stalls

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080133563A1 (en) * 2004-11-12 2008-06-05 Justsystems Corporation Data Processing Device And Data Processing Method

Also Published As

Publication number Publication date
US6904518B2 (en) 2005-06-07

Similar Documents

Publication Publication Date Title
US9223720B2 (en) Systems and methods for rapidly generating suitable pairs of hash functions
US9009447B2 (en) Acceleration of string comparisons using vector instructions
US9582474B2 (en) Method and apparatus for performing a FFT computation
US20070156685A1 (en) Method for sorting data using SIMD instructions
US5717616A (en) Computer hardware instruction and method for computing population counts
US10902087B2 (en) Device and method for accelerating matrix multiply operations as a sum of outer products
US4916649A (en) Method and apparatus for transforming a bit-reversed order vector into a natural order vector
KR100735944B1 (en) Method and computer program for single instruction multiple data management
US6904518B2 (en) Finding a significant bit in a computer data word
JPH07177005A (en) Bit pattern detector circuit and bit pattern detecting method
US6931518B1 (en) Branching around conditional processing if states of all single instruction multiple datapaths are disabled and the computer program is non-deterministic
US6708168B2 (en) Method and apparatus for searching a data stream for character patterns
US20090083507A1 (en) Shift-add mechanism
US6336113B1 (en) Data management method and data management apparatus
WO2001071483A2 (en) Determinaton of a minimum or maximum value in a set of data
US10862485B1 (en) Lookup table index for a processor
Lee et al. Processor reordering algorithms toward efficient GEN_BLOCK redistribution
US20080235657A1 (en) Loop coalescing method and loop coalescing device
US6754685B2 (en) Dynamic popcount/shift circuit
Kang et al. Parallel BIST architecture for CAMs
KR101075439B1 (en) String Matching Device Based on Multi-Core Processor and Its String Matching Method
US6988117B2 (en) Bit-reversed indexing in a modified harvard DSP architecture
US20220129269A1 (en) Bit-Packed Array Processing Using SIMD
US5898898A (en) Collating bits from a byte source
Langr et al. Block iterators for sparse matrices

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTEL CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOORE, STEPHEN F.;REEL/FRAME:013141/0161

Effective date: 20020724

FPAY Fee payment

Year of fee payment: 4

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

REMI Maintenance fee reminder mailed
LAPS Lapse for failure to pay maintenance fees
STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20130607