
GB2403310A - Method and Apparatus for the Emulation of High Precision Floating Point Instructions


Info

Publication number
GB2403310A
Authority
GB
United Kingdom
Prior art keywords
floating point
hardware
instructions
result
mantissa
Prior art date
Legal status
Granted
Application number
GB0322325A
Other versions
GB2403310B (en)
GB0322325D0 (en)
Inventor
Paul Walker
Current Assignee
Transitive Ltd
Original Assignee
Transitive Ltd
Priority date
Filing date
Publication date
Application filed by Transitive Ltd filed Critical Transitive Ltd
Publication of GB0322325D0
Priority to US10/726,858 (US7299170B2)
Priority to TW093117994A (TW200506720A)
Priority to PCT/GB2004/002700 (WO2005003959A2)
Publication of GB2403310A
Application granted
Publication of GB2403310B
Anticipated expiration
Legal status: Expired - Lifetime


Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 - Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/483 - Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 - Arrangements for executing specific machine instructions
    • G06F9/30007 - Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 - Arithmetic instructions
    • G06F9/30014 - Arithmetic instructions with variable precision
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 - Arrangements for executing specific programs
    • G06F9/455 - Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504 - Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 - Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 - Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 - Sum of products


Abstract

A high precision floating point emulator and associated method for emulating subject program code on a target machine where the subject machine base operands possess a different precision than the target machine. The high precision floating point emulator is provided for the emulation of subject program code instructions having a higher precision than that supported by the target machine architecture by utilizing intermediate calculations having values with a higher precision than that supported by the target machine.

Description

METHOD AND APPARATUS FOR THE EMULATION OF HIGH PRECISION FLOATING POINT INSTRUCTIONS
The subject invention relates generally to the field of computers and computer software and, more particularly, to an apparatus and method for emulating high precision floating point instructions.
Floating point notation is widely used in digital data processing devices, such as microprocessors, to represent a much larger range of numbers than can be represented in regular binary notation. Various types of floating point notations are used. Typically, a floating point number has a sign bit (s), followed by an exponent field (e) and a mantissa field.
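For illustration only, a minimal C sketch (not from the patent) that unpacks these three fields from an IEEE 754 double, the format discussed below; the variable names are the present editor's:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Unpack an IEEE 754 double: 1 sign bit, 11 exponent bits and a
       52-bit mantissa (fraction) field. */
    int main(void) {
        double x = -6.25;
        uint64_t bits;
        memcpy(&bits, &x, sizeof bits);   /* reinterpret the bit pattern */

        unsigned sign     = (unsigned)(bits >> 63);
        unsigned exponent = (unsigned)((bits >> 52) & 0x7FF);
        uint64_t mantissa = bits & 0xFFFFFFFFFFFFFULL;  /* low 52 bits */

        printf("sign=%u exponent=%u (unbiased %d) mantissa=0x%013llx\n",
               sign, exponent, (int)exponent - 1023,
               (unsigned long long)mantissa);
        return 0;
    }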
Microprocessors typically contain, or work together with, floating point units (FPUs) to perform operations, such as addition and subtraction, on floating point numbers. FPUs have the ability to support complex numerical and scientific calculations on data in floating point format. In order to add or subtract floating point numbers, the decimal points must be aligned. The process is equivalent to addition or subtraction of base ten numbers in scientific notation. Generally, the FPU performs operations on the exponents and mantissas of the values in order to align the decimal points.
Once the decimal points are aligned, the mantissas can be added or subtracted in accordance with the sign bits.
The result may need to be normalized, or left shifted, so that a one is in the most significant bit position of the mantissa. The result may also be rounded. Many different representations can be used for the mantissa and exponent themselves. IEEE Standard 754, entitled "IEEE Standard for Binary Floating-Point Arithmetic" (ANSI/IEEE Std 754-1985), provides a standard used by many CPUs and FPUs which defines formats for representing floating point numbers, representations of special values (e.g., infinity, very small values, NaN), exceptions, rounding modes, and a set of floating point operations that will work identically on any conforming system. The IEEE 754 Standard further specifies the formats for representing floating point values with single precision (32-bit), double precision (64-bit), single-extended precision (up to 80 bits), and double-extended precision (128-bit).
Most microprocessors also include integer units to perform integer operations, such as addition and subtraction. While integer units are common in microprocessors, floating point arithmetic performed using integer operations is much more costly than the equivalent operations performed by floating point hardware. Thus, most microprocessors utilize a combination of FPUs and integer units to perform the necessary calculations. The precision that can be achieved in such calculations is determined by the actual architecture of the FPU and integer unit hardware associated with the microprocessor.
Across the embedded and non-embedded CPU market, one finds predominant Instruction Set Architectures (ISAs) for which large bodies of software exist that could be "Accelerated" for performance, or "Translated" to a myriad of capable processors that could present better cost/performance benefits, provided that they could transparently access the relevant software. One also finds dominant CPU architectures that are locked in time to their ISA and cannot evolve in performance or market reach, and that would benefit from "Synthetic CPU" co-architecture.
It is often desired to run program code written for a computer processor of a first type (a "subject" processor) on a processor of a second type (a "target" processor).
Here, an emulator or translator is used to perform program code translation, such that the subject program is able to run on the target processor. The emulator provides a virtual environment, as if the subject program were running natively on a subject processor, by emulating the subject processor. The precision of the calculations which can be performed on the values of the subject program has conventionally been limited by the hardware architecture of the target processor.
According to the present invention there is provided an apparatus and method as set forth in the appended claims. Preferred features of the invention will be apparent from the dependent claims, and the description which follows.
The following is a summary of various aspects and advantages realizable according to various embodiments of the improved architecture for program code conversion according to the present invention. It is provided as an introduction to assist those skilled in the art to more rapidly assimilate the detailed discussion of the invention that ensues, and is not intended in any way to limit the scope of the claims that are appended hereto.
In particular, the inventors have developed an improved method and apparatus for expediting program code conversion, particularly useful in connection with an emulator which emulates subject program code on a target machine where the subject machine base operands possess a different precision than the target machine. More particularly, a high precision floating point emulator is provided for the emulation of subject program code instructions having a higher precision than that supported by the target machine architecture, by utilizing intermediate calculations having values with a higher precision than that supported by the target machine.
The present invention, both as to its organization and manner of operation, together with further advantages, may best be understood by reference to the following description, taken in connection with the accompanying drawings, in which the reference numerals designate like parts throughout the figures thereof, and wherein:

Figure 1 shows a computing environment including subject and target processor architectures; and

Figure 2 is an operational flow diagram that describes an example of the high precision floating point emulation performed in accordance with a preferred embodiment of the present invention.
The following description is provided to enable any person skilled in the art to make and use the invention and sets forth the best modes contemplated by the inventors of carrying out their invention. Various modifications, however, will remain readily apparent to those skilled in the art, since the general principles of the present invention have been defined herein specifically to provide an improved high precision floating point emulation apparatus.
Referring to Figure 1, an example computing environment is shown including a subject computing environment 1 ("subject machine 1") and a target computing environment 2 ("target machine 2"). In the subject machine 1, subject code 10 is executable natively on a subject processor 12. The subject processor 12 includes a set of subject registers 14. Here, the subject code 10 may be represented in any suitable language with intermediate layers (e.g., compilers) between the subject code 10 and the subject processor 12, as will be familiar to a person skilled in the art.
It is desired in some situations to run the subject code 10 on the target machine 2 of the present invention, which includes a target processor 22 using a set of target registers 24. The two processors 12 and 22 of the subject machine 1 and the target machine 2, respectively, may be inherently non-compatible, such that these two processors 12 and 22 use different instruction sets. The target processor 22 includes a floating point unit 28 for computing floating point operations and an integer unit 26 for performing integer operations. The floating point unit 28 and the integer unit 26 may comprise any of a wide variety of types of hardware units, as known to those skilled in the art, where the floating point unit 28 is preferably IEEE 754 Standard compatible floating point hardware.
The two processors 12 and 22 may operate with different levels of accuracy and precision depending upon their particular architectures as well as the hardware designs of their respective floating point unit 28 and integer unit 26. Hence, a floating point emulator 20 is provided in the target machine 2, in order to emulate high precision instructions from the subject code 10 in the target computing environment 2. High precision refers to a level of precision which is higher than that provided by the target machine 2, where the base operands in instructions of the subject program code have a higher precision than that supported by the target machine 2.
The floating point emulator 20 provides for a higher level of precision during calculations than the target architecture 2 could otherwise provide, thus providing a higher level of accuracy in the emulated instructions.
The floating point emulator 20 is preferably a software component, i.e., a compiled version of the source code implementing the emulator, run in conjunction with an operating system running on the target processor 22, typically a microprocessor or other suitable processing device. It will be appreciated that the structure illustrated in FIG. 1 is exemplary only and that, for example, software, methods and processes according to the invention may be implemented in code residing within or beneath an operating system.
Referring now to Figure 2, an operational flow diagram of a method of performing high precision floating point emulation in accordance with a preferred embodiment of the present invention is illustrated. The high precision floating point emulation algorithm described hereafter refers to a single embodiment of the invention that provides for the emulation of high precision floating point accumulated instructions, while it is understood that the floating point emulator 20 is capable of emulating any type of instruction where the subject machine 1 base operands are at a different precision than the target machine 2. For example, the floating point emulator 20 would allow the emulation of the addition of two double precision values, such as a subject machine's Double(x) + Double(y) operation, on a target machine 2 that only supports single precision floating point operations. With this understanding, and for ease of discussion, the high precision floating point emulation algorithm will be described hereafter with reference to the emulation of high precision floating point accumulated instructions of the form d = ±(a*b ± c), where a, b, c and d are operands which can be expressed as floating point numbers. High precision as referred to in this description means any precision which is higher than that provided by the target machine 2. For instance, if the architecture of the target machine 2 supports IEEE Standard 754 double-precision floating point values, then high precision would refer to any values having a higher precision than double-precision floating point values. It should be noted that the floating point emulator 20 only calculates the intermediate values of the accumulated instructions at high precision; the operands themselves and the result are not at high precision.
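A small C illustration (not from the patent) of why the intermediate precision matters: when the product a*b is rounded to double before the subtraction, a result that depends on the low product bits is lost, whereas an exact intermediate (here supplied by the standard fma() routine, standing in for the emulator's wide integer arithmetic) preserves it:

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double a = 1.0 + 0x1.0p-27;      /* 1 + 2^-27 */
        double c = 1.0 + 0x1.0p-26;      /* the exact a*a equals c + 2^-54 */

        double rounded = a * a - c;      /* product rounded first: prints 0.0 */
        double exact   = fma(a, a, -c);  /* exact intermediate: prints 2^-54 */

        printf("rounded intermediate: %a\n", rounded);
        printf("exact intermediate:   %a\n", exact);
        return 0;
    }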
The high precision floating point algorithm embodied in FIG. 2 utilizes standard integer techniques to perform the calculation of the accumulated instructions in stages, where, at the end of each stage, the intermediate values are tested at runtime to ascertain whether the intermediate values have reached a point at which the hardware (i.e., the target processor 22, integer unit 26, and floating point unit 28) of the target machine 2 has enough precision to finish the calculation without loss of accuracy. To achieve this, the high precision floating point algorithm performed by the floating point emulator 20 works in combination with the integer unit 26 and an IEEE Standard 754 compatible floating point hardware unit 28.
A wide number of integer techniques for performing floating point emulation are known to those skilled in the art; the high precision floating point algorithm of the preferred embodiment described herein accelerates the process by using floating point hardware, namely the floating point unit 28. These process accelerations are referred to as fast exit points hereinafter, because they exit the testing routine performed at runtime to determine whether the intermediate values are at a level such that the target architecture 2 has enough precision to finish the calculation without loss of accuracy, whereupon the hardware of the target architecture 2 is immediately used to perform the necessary calculations.
Fast Exit Points

The floating point emulator 20 begins the high precision floating point algorithm when a floating point accumulated instruction of the form d = ±(a*b ± c) is encountered in step 200. It is determined in step 202 whether any of the three input operands (a, b, c) can be considered a special value, where special values include zero, infinity or NaN (not a number). For each of these special values there is a known result, as dictated by the IEEE Standard 754, that all compatible hardware will produce regardless of the level of precision, and hence there is no need for expensive integer emulation.
Thus, there is a fast exit point for any operands identified as special values to perform the calculation using the target architecture's floating point unit 28 in step 204.
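A hedged C sketch of this special-value fast exit (steps 202/204); the helper names are illustrative rather than the patent's, and the classification relies on the standard math.h predicates:

    #include <math.h>

    /* Zero, infinities and NaNs have IEEE 754-mandated results on any
       compliant FPU, independent of precision, so the hardware can be
       used directly. */
    static int is_special(double x) {
        return x == 0.0 || isinf(x) || isnan(x);
    }

    static int try_special_fast_exit(double a, double b, double c,
                                     double *result) {
        if (is_special(a) || is_special(b) || is_special(c)) {
            *result = a * b - c;   /* step 204: compute on the FPU */
            return 1;              /* fast exit taken */
        }
        return 0;                  /* continue with integer emulation */
    }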
If none of the operands (a, b, c) are special values, it is next determined in step 206 whether the exponent for the result of the multiplication (a*b) overlaps with the exponent of operand c. Two values will overlap if the addition/subtraction of the significant digits of the two values yields a result different from each of the two values. In this context, non-overlapping refers to the fact that either a*b or c is so large as to make the other insignificant. By way of example, in the situation where a particular FPU is only capable of representing 3 significant digits, if the value 3.10 is added to the value 0.01, then it can be seen that both values are important to the result, i.e., performing the addition will yield a result different from the sources, and the sources thus overlap. Contrarily, for the same FPU only capable of representing 3 significant digits, if the value 310 is added to the value 0.01, the result when using 3 significant figures is 310. Thus, in this situation the result is the same as the first source and the two values did not overlap. When the two values fail to overlap, the addition of the values is not required. Thus, if the exponent for the result of the multiplication (a*b) does not overlap with the exponent of operand c, then the floating point algorithm determines that the addition/subtraction is not required, and another fast exit point is provided where the calculation can be performed using the target machine 2's FPU 28 in step 204. It should be noted that the thresholds for determining whether two values overlap can either be variably selected or established by formats well known to those skilled in the art, such as the IEEE Standard 754 floating point double-precision format.
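In code form, the overlap test of step 206 reduces to comparing the exponent distance with the mantissa width. A simplified, symmetric C sketch follows; the PPC pseudo-code later in this description refines the same idea with side-specific thresholds of 53 and 106 bits:

    /* Step 206 sketch: if the exponents of a*b and c are further apart
       than the mantissa can span, one operand cannot affect the other
       and the addition/subtraction may be skipped (fast exit). */
    static int exponents_overlap(int exp_ab, int exp_c, int mantissa_bits) {
        int d = exp_ab > exp_c ? exp_ab - exp_c : exp_c - exp_ab;
        return d <= mantissa_bits;
    }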
When the exponents of the result of (a*b) and c do overlap, the mantissa for the result of the multiplication (a*b) is calculated in step 208 in order to establish how much precision is required by the mantissa. It is determined in step 210 whether the result of the multiplication (a*b) requires more mantissa bits than are provided by the FPU 28 of the target machine 2. For example, when the FPU 28 is capable of handling double-precision numbers, it is known that operands containing 52 mantissa bits are utilized for double-precision values.
If the number of bits required by the mantissa(a*b) is less than or equal to the number of mantissa bits capable of being handled by the FPU 28 (e.g., 52 bits in the case of double-precision numbers), then the FPU 28 has sufficient precision to perform the calculation. Thus, if the mantissa(a*b) requires no more mantissa bits than are provided for by the FPU 28, another fast exit point is provided and the result is calculated using the target architecture's FPU 28 in step 204. In the determination made in step 210, the precision is determined by comparing the spread of the mantissa, namely the number of bit positions between the most and least significant set bits.
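A possible C sketch of the spread measurement used in step 210, assuming a product of up to 106 bits held in unsigned __int128 and GCC/Clang bit-scan builtins; the names and representation are the present editor's, not the patent's:

    #include <stdint.h>

    /* Spread = number of bit positions between the most and least
       significant set bits, inclusive.  Caller guarantees m != 0. */
    static int mantissa_spread(unsigned __int128 m) {
        uint64_t hi64 = (uint64_t)(m >> 64);
        uint64_t lo64 = (uint64_t)m;

        int msb = hi64 ? 127 - __builtin_clzll(hi64)
                       : 63 - __builtin_clzll(lo64);
        int lsb = lo64 ? __builtin_ctzll(lo64)
                       : 64 + __builtin_ctzll(hi64);
        return msb - lsb + 1;
    }

    /* If mantissa_spread(product) <= 53, the FPU has enough precision
       and the fast exit of step 204 can be taken. */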
When the mantissa(a*b) requires more bits than are provided by the FPU 28, the full calculation of a*b is completed using the integer unit 26 in step 212. At this point, the high precision floating point algorithm calculates ±mantissa(a*b) ± mantissa(c) in step 214.
A determination is made in step 216 whether the resulting mantissa is equal to zero, whereupon another fast exit point can be created to use the target architecture 2's most efficient mechanism, namely the FPU 28, to set the final result to 0.0 in step 218. This fast exit point is only valid when the mantissa(a*b) and the mantissa(c) are subtracted from one another to yield a resulting mantissa equal to zero. This removes the need to calculate the final exponent and also bypasses any expensive rounding. However, if the final resulting mantissa is not equal to zero, then the remaining parts of the calculation of a*b + c must be calculated using the integer unit 26 in step 220.
The various fast exit points provided in the high precision floating point algorithm described above provide for faster and more efficient emulation of accumulated instructions by maximizing the use of the FPU 28 for performing floating point arithmetic. By implementing the high precision floating point algorithm of the preferred embodiment, accumulated instructions are calculated at a higher precision than the architecture of the target machine 2 can typically handle for the operands, so that accumulated instructions are effectively calculated with greater precision. If the intermediate result has twice the precision of the sources, then the accumulated instructions can be calculated to an infinite precision (i.e., no loss of accuracy).
For the purposes of illustrating the steps performed by the floating point emulator 20 in implementing the above-described high precision floating point algorithm, the following example is provided without any intention by the inventors of the present invention to limit the scope of their invention to the described example.
This example utilizes an accumulated instruction having three IEEE Standard 754 double-precision floating point values for operands (a, b, c). The IEEE 754 Standard dictates that a double-precision floating point value is 64 bits wide, including 1 sign bit, an 11-bit exponent, and a 52-bit mantissa. The example uses the definitions in the following legend:

Legend
a, b, c : the input operands
a*b, a*b-c : the intermediate operands
sign(x) : the sign part of operand x, represented as a Boolean value
exp(x) : the exponent part of operand x, represented as an integer
man(x) : the mantissa part of operand x, including the implied one, represented as a large integer
FPU(x) : calculate x using the target's floating point unit 28 and exit; these are indicative of a fast exit point
sub(x,y) : x - y
mul(x,y) : x * y
shift_right(x,y) : shift x y places to the right

In this example, a large integer means an integer larger than that provided by the hardware of the target architecture. The last three operations in the legend are represented as functions because they operate on large integers.
The pseudo-code for the high precision floating point algorithm for the accumulated instruction fmsub (i.e., a*b - c) is as follows. Initially, it is determined if any of the operands are special, in order to apply an early exit:

    If (a or b or c) == (±infinity or 0.0 or NaN)
        FPU(a*b - c)
    EndIf

When the operands are not special, it must then be determined whether to apply a different early exit by establishing whether the subtraction operation is significant, i.e., whether it has an effect on the final result within the required emulation accuracy or SMA (Subject Machine Accuracy).
Considering (a*b) - c, the subtraction would not be significant in the following two cases:
1. (a*b) - c == (a*b) : SMA
2. (a*b) - c == -c : SMA
The particular emulation must be taken into account to determine whether the accuracy of the SMA is similar to that of the Target Machine Accuracy (TMA). For instance, for the PPC-P4 emulation, it follows that if the subtract operation is insignificant in SMA then it is also insignificant in TMA.
A quick and efficient way to test for significance is to test for the mere possibility of the subtract operation affecting the result within the greater SMA. If it is determined that the result does not change within SMA, the calculation can be performed natively in TMA without precision loss. However, if there is a chance that the subtract operation could change the result in SMA, the emulation performed by the high precision floating point algorithm continues.
For subtraction, the operand with the lower exponent is first shifted to make its exponent the same as the larger exponent. The mantissas are then subtracted and the exponent remains the same. For the operand with the lower exponent, the initial exponent shift upwards results in the mantissa being shifted downwards. If this results in a zero mantissa then the subtract operation is not significant. A zero mantissa will be produced if the exponent needs to be raised by more than the number of bits of accuracy in the mantissa. Thus:

    if ((higherExp - lowerExp) > MantissaBits)
        Subtraction not significant
    else
        Subtraction possibly significant
    fi

Considering the PPC instruction fmsub, (a*b) - c, the intermediate result of the multiply is accurate to 106 mantissa bits and the final result of the subtraction is accurate to 53 mantissa bits. Therefore a*b has a maximum of 106 mantissa bits and c has a maximum of 53 mantissa bits.
For PPC, the above pseudo-code now becomes:

    if (exp(a*b) > exp(c))
        if (exp(a*b) - exp(c) > 53)
            // Subtraction not significant, take fast exit
            FPU(a*b - c)
        else
            // Subtraction possibly significant, continue emulation
        fi
    else  // exp(a*b) <= exp(c)
        if (exp(c) - exp(a*b) > 106)
            // Subtraction not significant, take fast exit
            FPU(a*b - c)
        else
            // Subtraction possibly significant, continue emulation
        fi
    fi

The calculation of exp(a*b) involves only the quick addition exp(a) + exp(b). This fast exit check can therefore be done before the expensive SMA multiplication of a and b. The mantissa of a*b is then calculated:

    man(a*b) = mul(man(a), man(b))

It is known that a double-precision floating point value has 52 bits for its mantissa (plus the implied one), thus man(x) is 53 bits wide. The result of the multiplication will therefore be a maximum of 106 bits wide. It is then determined whether the extra precision is required by examining the spread of the resulting mantissa. If this mantissa would fit within a double (i.e., 53 bits including the implied one), then the extra precision is not required. This is tested by checking whether the bottom 53 bits of the resulting mantissa were used.
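As a hedged C illustration of this multiply-and-test stage (the patent's own pseudo-code for the test follows below), using the GCC/Clang unsigned __int128 extension to stand in for the legend's "large integer"; the helper names are the present editor's:

    #include <stdint.h>

    /* Two 53-bit significands (implied one included) multiplied with
       integer arithmetic give an exact product of at most 106 bits. */
    static unsigned __int128 man_mul(uint64_t man_a, uint64_t man_b) {
        return (unsigned __int128)man_a * (unsigned __int128)man_b;
    }

    /* If the bottom 53 bits of the product are unused, the product fits
       a double's significand and the FPU fast exit FPU(a*b - c) applies.
       0x1FFFFFFFFFFFFF masks the low 53 bits. */
    static int fits_in_double(unsigned __int128 product) {
        return ((uint64_t)product & 0x1FFFFFFFFFFFFFULL) == 0;
    }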
    If ((man(a*b) & 0x1FFFFFFFFFFFFF) == 0)
        FPU(a*b - c)
    EndIf

    exp(a*b) = exp(a) + exp(b)
    sign(a*b) = sign(a) xor sign(b)

It is now necessary to align a*b and c, in order to perform the subtraction.
    If (exp(a*b) > exp(c))
        shift_right(man(c), exp(a*b) - exp(c))
        exp(a*b-c) = exp(a*b)
    Else
        shift_right(man(a*b), exp(c) - exp(a*b))
        exp(a*b-c) = exp(c)
    EndIf

    If (man(a*b) >= man(c))
        sub(man(a*b), man(c))
        sign(a*b-c) = sign(a*b)
    Else
        sub(man(c), man(a*b))
        sign(a*b-c) = sign(c)
    EndIf

The resulting mantissa is then checked to see if it equals zero:

    If (man(a*b-c) == 0)
        FPU(0.0)
    EndIf

At this point, emulation has either exited via a fast exit point or it has been determined that the full precision is required. The result sign, exponent and mantissa have all been calculated, and the only operation remaining is to convert the result into the subject machine's floating point format, which involves aligning the result and rounding.
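Pulling the alignment, subtraction and zero-test stages above together, a minimal C sketch, assuming unsigned __int128 as the large-integer type (a GCC/Clang extension); the struct layout and names are illustrative, not the patent's:

    #include <stdint.h>

    /* Unpacked wide intermediate value: sign, unbiased exponent and a
       large-integer mantissa. */
    typedef struct { int sign; int exp; unsigned __int128 man; } wide_t;

    /* Align the two mantissas, subtract the smaller from the larger and
       take the sign of the larger-magnitude operand.  The significance
       test performed earlier bounds the shift distance (at most 106),
       so the shifts below stay well-defined. */
    static wide_t wide_sub(wide_t ab, wide_t c) {
        wide_t r;
        if (ab.exp > c.exp) { c.man >>= (ab.exp - c.exp); r.exp = ab.exp; }
        else                { ab.man >>= (c.exp - ab.exp); r.exp = c.exp; }

        if (ab.man >= c.man) { r.man = ab.man - c.man; r.sign = ab.sign; }
        else                 { r.man = c.man - ab.man; r.sign = c.sign; }

        /* r.man == 0 corresponds to the FPU(0.0) fast exit: no exponent
           or rounding work is needed for an exactly zero result. */
        return r;
    }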
As can be seen from the foregoing, the emulator described in the various embodiments above provides for the high precision emulation of subject program code on a target machine where the subject machine base operands possess a different precision than the target machine.
Moreover, the emulation of subject program code instructions having a higher precision than that supported by the target machine architecture is provided by utilizing intermediate calculations having values with a higher precision than that supported by the target machine.
The different structures of the high precision floating point emulation apparatus and method of the present invention are described separately in each of the above embodiments. However, it is the full intention of the inventors of the present invention that the separate aspects of each embodiment described herein may be combined with the other embodiments described herein.
Those skilled in the art will appreciate that various adaptations and modifications of the just described preferred embodiments can be configured without departing from the scope and spirit of the invention. Therefore, it is to be understood that, within the scope of the appended claims, the invention may be practiced other than as specifically described herein.
Although a few preferred embodiments have been shown and described, it will be appreciated by those skilled in the art that various changes and modifications might be made without departing from the scope of the invention, as defined in the appended claims.
Attention is directed to all papers and documents which are filed concurrently with or previous to this specification in connection with this application and which are open to public inspection with this specification, and the contents of all such papers and documents are incorporated herein by reference.
All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive.
Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features.
The invention is not restricted to the details of the foregoing embodiment(s). The invention extends to any novel one, or any novel combination, of the features disclosed in this specification (including any accompanying claims, abstract and drawings), or to any novel one, or any novel combination, of the steps of any method or process so disclosed.

Claims (33)

1. A method of performing high precision emulation of program code instructions for a subject machine on a target machine, comprising: determining if operands in instructions of the program code for the subject machine require a different precision than provided for by the target machine; and applying a floating point emulation algorithm to perform intermediate calculations on the operands of the instructions at a higher precision than the precision supported by the target machine.
2. The method of claim 1, wherein the target machine includes floating point hardware and integer hardware, wherein said floating point emulation algorithm comprises: utilizing floating point hardware on the target machine to perform calculations on the operands of the instructions when it is determined based upon the intermediate calculations that the target machine provides sufficient precision for the calculations required by the instructions; and utilizing integer hardware on the target machine to perform calculations not selected to be performed by the floating point hardware.

3. The method of claim 2, wherein the program code instructions are accumulated instructions that are calculated at a higher precision than the operands capable of being handled by the target machine.

4. The method of claim 3, wherein the program code instructions are floating point accumulated instructions of the form: d = ±(a*b ± c), wherein a, b, c and d are operands expressible as floating point numbers.

5. The method of claim 4, further comprising identifying whether any of the operands (a, b, or c) are special values having a known result that all compatible hardware will produce regardless of the level of precision of said hardware.

6. The method of claim 5, wherein said special values include either zero, infinity, or NaN (not a number), wherein the floating point hardware is utilized to calculate the result of the accumulated instructions when any of the operands (a, b, or c) are identified as special values.

7. The method of claim 4, wherein said floating point emulation algorithm further comprises: determining whether the exponent for the result of the multiplication of (a*b) overlaps with the exponent of c; and utilizing the floating point hardware to calculate the result of the accumulated instructions when the exponent for the result of the multiplication of (a*b) fails to overlap with the exponent of c.

8. The method of claim 7, wherein, when the exponent for the result of the multiplication of (a*b) overlaps with the exponent of c, said floating point emulation algorithm further comprises: determining whether the mantissa for the result of the multiplication (a*b) requires more mantissa bits than provided for by said floating point hardware; and utilizing the floating point hardware to calculate the result of the accumulated instructions when the result of the multiplication (a*b) does not require more mantissa bits than provided for by the floating point hardware.

9. The method of claim 8, said floating point emulation algorithm further comprising computing the full calculation of a*b using the integer hardware when the mantissa for the result of the multiplication (a*b) requires more mantissa bits than provided for by the floating point hardware.

10. The method of claim 9, said floating point emulation algorithm further comprising: determining whether the final resulting mantissa of the mantissa(a*b) - the mantissa(c) equals zero; utilizing the floating point hardware to make the result equal to zero when the resulting mantissa is equal to zero; and calculating the remaining parts of the calculation of a*b + c using the integer hardware when the final resulting mantissa is not equal to zero.

11. A computer-readable storage medium having software resident thereon in the form of computer-readable code executable by a computer to perform the following steps in the high precision emulation of program code instructions for a subject machine on a target machine: determining if operands in instructions of the program code for the subject machine require a different precision than provided for by the target machine; and applying a floating point emulation algorithm to perform intermediate calculations on the operands of the instructions at a higher precision than the precision supported by the target machine.

12. The computer-readable storage medium of claim 11, wherein the target machine includes floating point hardware and integer hardware, wherein said floating point emulation algorithm comprises: utilizing floating point hardware on the target machine to perform calculations on the operands of the instructions when it is determined based upon the intermediate calculations that the target machine provides sufficient precision for the calculations required by the instructions; and utilizing integer hardware on the target machine to perform calculations not selected to be performed by the floating point hardware.

13. The computer-readable storage medium of claim 12, wherein the program code instructions are accumulated instructions that are calculated at a higher precision than the operands capable of being handled by the target machine.

14. The computer-readable storage medium of claim 13, wherein the program code instructions are floating point accumulated instructions of the form: d = ±(a*b ± c), wherein a, b, c and d are operands expressible as floating point numbers.

15. The computer-readable storage medium of claim 14, said computer-readable code further executable for identifying whether any of the operands (a, b, or c) are special values having a known result that all compatible hardware will produce regardless of the level of precision of said hardware.

16. The computer-readable storage medium of claim 15, wherein said special values include either zero, infinity, or NaN (not a number), wherein the floating point hardware is utilized to calculate the result of the accumulated instructions when any of the operands (a, b, or c) are identified as special values.

17. The computer-readable storage medium of claim 14, wherein said floating point emulation algorithm further comprises: determining whether the exponent for the result of the multiplication of (a*b) overlaps with the exponent of c; and utilizing the floating point hardware to calculate the result of the accumulated instructions when the exponent for the result of the multiplication of (a*b) fails to overlap with the exponent of c.

18. The computer-readable storage medium of claim 17, wherein, when the exponent for the result of the multiplication of (a*b) overlaps with the exponent of c, said floating point emulation algorithm further comprises: determining whether the mantissa for the result of the multiplication (a*b) requires more mantissa bits than provided for by said floating point hardware; and utilizing the floating point hardware to calculate the result of the accumulated instructions when the result of the multiplication (a*b) does not require more mantissa bits than provided for by the floating point hardware.

19. The computer-readable storage medium of claim 18, said floating point emulation algorithm further comprising computing the full calculation of a*b using the integer hardware when the mantissa for the result of the multiplication (a*b) requires more mantissa bits than provided for by the floating point hardware.

20. The computer-readable storage medium of claim 19, said floating point emulation algorithm further comprising: determining whether the final resulting mantissa of the mantissa(a*b) - the mantissa(c) equals zero; utilizing the floating point hardware to make the result equal to zero when the resulting mantissa is equal to zero; and calculating the remaining parts of the calculation of a*b + c using the integer hardware when the final resulting mantissa is not equal to zero.

21. In combination: a target processor; and translator code for performing high precision emulation of program code instructions for a subject machine on a target machine, said translator code comprising code executable by said target processor for performing the following steps: determining if operands in instructions of the program code for the subject machine require a different precision than provided for by the target machine; and applying a floating point emulation algorithm to perform intermediate calculations on the operands of the instructions at a higher precision than the precision supported by the target machine.

22. The combination of claim 21, wherein the target machine includes floating point hardware and integer hardware, wherein said floating point emulation algorithm comprises: utilizing floating point hardware on the target machine to perform calculations on the operands of the instructions when it is determined based upon the intermediate calculations that the target machine provides sufficient precision for the calculations required by the instructions; and utilizing integer hardware on the target machine to perform calculations not selected to be performed by the floating point hardware.

23. The combination of claim 22, wherein the program code instructions are accumulated instructions that are calculated at a higher precision than the operands capable of being handled by the target machine.

24. The combination of claim 23, wherein the program code instructions are floating point accumulated instructions of the form: d = ±(a*b ± c), wherein a, b, c and d are operands expressible as floating point numbers.

25. The combination of claim 24, wherein said floating point emulation algorithm comprises identifying whether any of the operands (a, b, or c) are special values having a known result that all compatible hardware will produce regardless of the level of precision of said hardware.

26. The combination of claim 25, wherein said special values include either zero, infinity, or NaN (not a number), wherein the floating point hardware is utilized to calculate the result of the accumulated instructions when any of the operands (a, b, or c) are identified as special values.

27. The combination of claim 24, wherein said floating point emulation algorithm further comprises: determining whether the exponent for the result of the multiplication of (a*b) overlaps with the exponent of c; and utilizing the floating point hardware to calculate the result of the accumulated instructions when the exponent for the result of the multiplication of (a*b) fails to overlap with the exponent of c.

28. The combination of claim 27, wherein, when the exponent for the result of the multiplication of (a*b) overlaps with the exponent of c, said floating point emulation algorithm further comprises: determining whether the mantissa for the result of the multiplication (a*b) requires more mantissa bits than provided for by said floating point hardware; and utilizing the floating point hardware to calculate the result of the accumulated instructions when the result of the multiplication (a*b) does not require more mantissa bits than provided for by the floating point hardware.

29. The combination of claim 28, said floating point emulation algorithm further comprising computing the full calculation of a*b using the integer hardware when the mantissa for the result of the multiplication (a*b) requires more mantissa bits than provided for by the floating point hardware.

30. The combination of claim 29, said floating point emulation algorithm further comprising: determining whether the final resulting mantissa of the mantissa(a*b) - the mantissa(c) equals zero; utilizing the floating point hardware to make the result equal to zero when the resulting mantissa is equal to zero; and calculating the remaining parts of the calculation of a*b + c using the integer hardware when the final resulting mantissa is not equal to zero.

31. A method of performing high precision emulation of program code instructions for a subject machine on a target machine, substantially as hereinbefore described with reference to the accompanying drawings.

32. A computer-readable storage medium having software resident thereon in the form of computer-readable code executable by a computer to perform the high precision emulation of program code instructions for a subject machine on a target machine, substantially as hereinbefore described with reference to the accompanying drawings.

33. In combination, a target processor and translator code for performing high precision emulation of program code instructions for a subject machine on a target machine, said translator code comprising code executable by said target processor, substantially as hereinbefore described with reference to the accompanying drawings.
GB0322325A 2003-06-28 2003-09-24 Method and apparatus for the emulation of high precision floating point instructions Expired - Lifetime GB2403310B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US10/726,858 US7299170B2 (en) 2003-06-28 2003-12-02 Method and apparatus for the emulation of high precision floating point instructions
TW093117994A TW200506720A (en) 2003-06-28 2004-06-21 Method and apparatus for the emulation of high precision floating point instructions
PCT/GB2004/002700 WO2005003959A2 (en) 2003-06-28 2004-06-22 Method and apparatus for the emulation of high precision floating point instructions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GBGB0315350.9A GB0315350D0 (en) 2003-06-28 2003-06-28 Method and apparatus for the emulation of high precision floating point instructions

Publications (3)

Publication Number Publication Date
GB0322325D0 GB0322325D0 (en) 2003-10-22
GB2403310A true GB2403310A (en) 2004-12-29
GB2403310B GB2403310B (en) 2007-03-28

Family

ID=27676415

Family Applications (2)

Application Number Title Priority Date Filing Date
GBGB0315350.9A Ceased GB0315350D0 (en) 2003-06-28 2003-06-28 Method and apparatus for the emulation of high precision floating point instructions
GB0322325A Expired - Lifetime GB2403310B (en) 2003-06-28 2003-09-24 Method and apparatus for the emulation of high precision floating point instructions

Family Applications Before (1)

Application Number Title Priority Date Filing Date
GBGB0315350.9A Ceased GB0315350D0 (en) 2003-06-28 2003-06-28 Method and apparatus for the emulation of high precision floating point instructions

Country Status (2)

Country Link
GB (2) GB0315350D0 (en)
TW (1) TW200506720A (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MX2009007876A (en) * 2007-01-24 2009-07-31 Ibm Field device having an assembly clip for fastening to a fastening area.
GB2447968B (en) * 2007-03-30 2010-07-07 Transitive Ltd Improvements in and relating to floating point operations
US8386755B2 (en) * 2009-07-28 2013-02-26 Via Technologies, Inc. Non-atomic scheduling of micro-operations to perform round instruction

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2294565A (en) * 1994-10-27 1996-05-01 Hewlett Packard Co Floating point arithmetic unit
US5732005A (en) * 1995-02-10 1998-03-24 International Business Machines Corporation Single-precision, floating-point register array for floating-point units performing double-precision operations by emulation
US6138135A (en) * 1998-08-27 2000-10-24 Institute For The Development Of Emerging Architectures, L.L.C. Propagating NaNs during high precision calculations using lesser precision hardware


Also Published As

Publication number Publication date
GB2403310B (en) 2007-03-28
GB0322325D0 (en) 2003-10-22
GB0315350D0 (en) 2003-08-06
HK1068425A1 (en) 2005-04-29
TW200506720A (en) 2005-02-16

Similar Documents

Publication Publication Date Title
US10649733B2 (en) Multiply add functional unit capable of executing scale, round, getexp, round, getmant, reduce, range and class instructions
US11347511B2 (en) Floating-point scaling operation
Schwarz et al. Decimal floating-point support on the IBM System z10 processor
US20110185157A1 (en) Multifunction hexadecimal instruction form system and program product
US7299170B2 (en) Method and apparatus for the emulation of high precision floating point instructions
CN111752526B (en) Floating-point addition
EP3921942B1 (en) Encoding special value in anchored-data element
GB2600915A (en) Floating point number format
EP3912272B1 (en) Anchored data element conversion
EP2354939A1 (en) Method and apparatus providing cobol decimal type arithmetic functions with improved performance
You et al. Translating AArch64 floating-point instruction set to the x86-64 platform
US9703626B2 (en) Recycling error bits in floating point units
US6233595B1 (en) Fast multiplication of floating point values and integer powers of two
US7406589B2 (en) Processor having efficient function estimate instructions
GB2403310A (en) Method and Apparatus for the emulation of High Precision Floating Point Instructions
JP4476210B2 (en) Data processing apparatus and method for obtaining initial estimated value of result value of reciprocal operation
Zurstraßen et al. Efficient RISC-V-on-x64 Floating Point Simulation
US20250085925A1 (en) System emulation of a floating-point dot product operation
Gross Floating-point arithmetic on a reduced-instruction-set processor
WO2000048080A1 (en) Processor having a compare extension of an instruction set architecture
WO2000048080A9 (en) Processor having a compare extension of an instruction set architecture
Xenoulis et al. On-Line Periodic Self-Testing of High-Speed Floating-Point Units in Microprocessors
Stine et al. A Case for Interval Hardware on Superscalar Processors
Wagenbach The Design and Implementation of a 32-bit Arithmetic Logic Unit for an 8-bit Microcontroller as a Culminating Experience
JP2010049614A (en) Computer

Legal Events

Date Code Title Description
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1068425

Country of ref document: HK

REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1068425

Country of ref document: HK

732E Amendments to the register in respect of changes of name or changes affecting rights (sect. 32/1977)

Free format text: REGISTERED BETWEEN 20090702 AND 20090708

PE20 Patent expired after termination of 20 years

Expiry date: 20230923