WO2001023997A1

WO2001023997A1 - Units system and method with underdefined quantities

Info

Publication number: WO2001023997A1
Application number: PCT/US2000/025689
Authority: WO
Inventors: Morgan S. Mcguire
Original assignee: Curl Corporation
Priority date: 1999-09-30
Filing date: 2000-09-20
Publication date: 2001-04-05
Also published as: AU7592700A

Abstract

A system and method create and manipulate variables having both a numeric value and a units designation. The units designation is a vector of unit exponents which are operated upon consistent with operations on values. Exactly defined and underdefined quantities are stored in data structures of values and unit designations. Operations on underdefined quantities may result in expression data structures of operand quantities, operators and unit designations. The system allows the creation of variables having a unit specification and the transparent manipulation of such a variable during conventional numerical and logical operations. The system automatically signals an error condition when an operation is attempted on a set of variables having incompatible units designations. The system also includes both a predetermined dictionary of units and a customizable dictionary of units. In addition, conversion factors can be specified allowing the automatic conversion between variables of the same kind (e.g. distance) but different underlying units (e.g. yards and meters). The invention is implemented on a computer and can be an extension to a programming language resulting in such variables being considered first class data types.

Description

UNITS SYSTEM AND METHOD WITH UNDERDEFINED QUANTITIES

BACKGROUND OF THE INVENTION

Computer languages utilize variables to manipulate, reference and represent data. For instance, in a conventional programming language a variable Temperature can be created, referencing a value stored in memory that represents the temperature of an underlying process being modeled by the program. The computer can perform operations on the variable, changing the value referenced by Temperature. But the variable Temperature does not have an inherent understanding of the units for the data referenced therein. In one example, the computer program simulates a process where the variable Temperature changes. The variable Temperature can be modified by the operation of the program. Temperature can be increased by addition of another value represented by the variable Increase. Temperature may represent a value stored in the Celsius scale, and Increase may represent a value stored in the Fahrenheit scale. If the computer program is not adapted correctly by the programmer to carefully monitor the units for each variable, the results from the operation of the program can be in error. In this example, if Increase is 5° F and Temperature is 0° C, and the values are simply added, the result is 5 which, in either the Celsius or Fahrenheit scale, is incorrect. Without an understanding of the units represented by each variable, the program will operate incorrectly.

As another example, take a program calculating a force on an object. The variable Object-Mass can be assigned the mass of the object. Given that F=MA, an acceleration, here Acceleration, can be applied to the object, creating a force equal to Object-Mass * Acceleration. If this were assigned to a variable Force, again the programmer would be required to know the units in which Object-Mass and Acceleration were denominated so as to create the correct force unit.

SUMMARY OF THE INVENTION

In accordance with the invention, and as claimed in the above noted related application, variables are handled with both a value and a unit designation. More specifically, in accordance with this invention, exactly defined quantities are stored in data structures comprising values and unit designations, and underdefined quantities are stored in data structures comprising values and unit designations. Operations are performed on such quantities, and certain results of such operations on underdefined quantities are stored in expression data structures comprising operand quantities, operators and unit designations. Preferably, the unit designations represent exponents of units corresponding to the values. The unit designations may comprise vectors of exponents corresponding to physical units.

Unit designations may be determined from input unit strings and conversions to standard units may be determined from the unit strings in order to store the values of exactly defined quantities in the standard units.

Preferably, unit designations are checked for compatibility prior to allowing an operation.

An operation on an exactly defined quantity and an underdefined quantity may result in an underdefined quantity when the exactly defined quantity is unitless and the operation is multiplication or division. An operation on two underdefined quantities may result in an underdefined quantity when the operation is addition or subtraction. An operation on two underdefined quantities may result in an exactly defined quantity when the operation is division and the unit designations divide out.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.

Figure 1 illustrates a personal computer on which the invention may be implemented.

Figure 2 shows the internal structure of the personal computer of Figure 1.

Figure 3 illustrates data structures and methods of the present invention.

Figure 4 illustrates the data structure associated with an exactly defined quantity of the present invention.

Figure 5 illustrates the data structure of an underdefined quantity of the invention.

Figure 6 illustrates the data structure of a quantity expression of the invention.

Figure 7 illustrates a process of converting underdefined quantities and quantity expressions to exactly defined quantities.

DETAILED DESCRIPTION OF THE INVENTION

Figure 1 shows an example of a personal computer (PC) on which the present invention may be implemented. As shown, PC 1 includes a variety of peripherals, among them being: i) network connection 2 for interfacing to a network or internet, ii) a fax/modem 4 for interfacing with telecommunication devices (not shown), iii) a display screen 5 for displaying images/video or other information to a user, iv) a keyboard 6 for inputting text and user commands and a mouse 7 for positioning a cursor on display screen 5 and for inputting user commands, and v) a set of disk drives 9 for reading from and writing to a floppy disk, a CDROM and/or a DVD. PC 1 may also have one or more local peripheral devices connected thereto, such as printer 11.

Figure 2 shows the internal structure of PC 1. As illustrated, PC 1 includes mass storage 12, which comprises a computer-readable medium such as a computer hard disk and/or RAID ("redundant array of inexpensive disks"). Mass storage 12 is adapted to store applications 14, databases 15, and operating systems 16. In preferred embodiments of the invention, the operating system 16 is a windowing operating system, such as RedHat® Linux or Microsoft® Windows98, although the invention may be used with other operating systems as well. Among the applications stored in memory 12 is a programming environment 17 and source files. Programming environment 17 compiles the source files written in a language that creates the output generated by the present invention. In the preferred embodiment of the invention, this language is Curl™, developed by Curl Corporation of Cambridge, Massachusetts. The programming language is based upon a language developed at Massachusetts Institute of Technology and presented in "Curl: A Gentle Slope Language for the Web," Worldwide Web Journal, by M. Hostetter et al., Vol II. Issue 2, O'Reilly & Associates, Spring 1997.

PC 1 also includes display interface 20, keyboard interface 21, mouse interface 22, disk drive interface 24, CDROM/DVD drive interface 25, computer bus 26, RAM 27, processor 29, and printer interface 30. Processor 29 preferably comprises a Pentium II® (Intel Corporation, Santa Clara, CA) microprocessor or the like for executing applications, such as those noted above, out of RAM 27. Such applications, including the programming environment and/or the present invention 17, may be stored in memory 12 (as above) or, alternatively, on a floppy disk in disk drive 9. Processor 29 accesses applications (or other data) stored on a floppy disk via disk drive interface 24 and accesses applications (or other data) stored on a CDROM/DVD via CDROM/DVD drive interface 25.

Application execution and other tasks of PC 1 may be initiated using keyboard 6 or mouse 7, commands from which are transmitted to processor 29 via keyboard interface 21 and mouse interface 22, respectively. Output results from applications running on PC 1 may be processed by display interface 20 and then displayed to a user on display 5 or, alternatively, output to a network via network connection 2. To this end, display interface 20 preferably comprises a display processor for forming images based on image data provided by processor 29 over computer bus 26, and for outputting those images to display 5.

The present invention is a system and method for manipulating, storing and retrieving variables that include a value and an underlying representation for the units represented by the value of the variable. Many methods exist for representing conventional values in a computer and these methods are utilized in the invention. Fixed-point or floating-point representations are commonly used to encode numbers. Fixed-point numbers are encoded by storing the integer part and the fractional part of a number as integers in contiguous memory. Floating-point numbers are encoded in a logarithmic fashion, with two fixed-point parts: one to represent the mantissa and one for the exponent. For a given bit size, fixed-point representation typically has greater precision and floating point has greater range. The IEEE floating point is a standard implementation of floating point numbers. Ordinary integers can be considered a subset of fixed-point notation, where 0 bits are devoted to the fractional part of a number.

Strings are commonly used to encode text. A string is a series of characters placed in contiguous memory cells, with either a fixed-point number indicating the number of characters, or a special termination character at the end of the string. Pointers encode references to other values. Pointers are typically encoded as the address in memory of the value they reference. Pointers are useful for sharing a single data structure instance between multiple other structures.

We combine these basic value representations to create a new structure, a Quantity, that is suitable for encoding:

1) Ordinary floating point numbers,

2) Fundamental physical quantities such as measurements of distance or time,

3) Defined, non-physical quantities such as angular measure or percentage,

4) Physical quantities derived from (2) and (3), such as measurements of force or voltage, and

5) Undefined or relative quantities represented by a magnitude and a unit string, such as a number of pixels, dollars, or people.

This structure may be implemented as part of the internal mechanism of the processor, as a basic data type of a language, or as a high level class in an object oriented language. Such a representation may be of practical importance in creating programs that operate on such numbers. For example, scientific data sets and business spreadsheets often contain measurements of the types listed above. The representations described in the next section may be suitable for encoding such numbers in a compact manner that enables efficient computation and detects errors caused by operations on incompatible units. For example, 3inches + 2seconds is an error because the kinds of units in the two quantities cannot be summed.

We divide the set of Quantities into two groups: known and unknown. A known Quantity represents a physical quantity, as defined by the International System (SI). When representing a known physical quantity, operations such as conversion between different types of units, arithmetic, and comparison can be performed, and the proper handling, including error generation, of the units will occur automatically. For example, an error will be generated if you attempt to add 3cm to 2s {3cm + 2s} because the units do not match for this operation; whereas, if the same numbers are multiplied, the units will be multiplied {3cm * 2s = 6cm*s}. A wide range of known unit types are recognized by the library, including all of the SI units.

An unknown Quantity holds a number and a "tag." When an unknown Quantity is used, only a limited number of operations may be performed, such as reading the tag and reading the value. Context sensitive units (pixels), obscure/non-standard units (Smoots), and ill defined units (months) are not recognized and are considered "unknown units" by the system. Quantities with these types of units can be constructed, but the operations that can be performed on them are limited. For example, pixels may be added to pixels, but pixels cannot be divided or multiplied by any other kind of units. To support unknown units, methods are provided to supply additional conversion factors, enabling conversion from unknown types of units to known types where the definition is supplied at conversion time.
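The restrictions on unknown Quantities described above can be sketched as follows. This is a minimal illustration in Python rather than the language of the preferred embodiment; the class name UnknownQuantity, the convert method, and the assumed density of 100 pixels per inch are hypothetical and do not appear in the original text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnknownQuantity:
    """Hypothetical stand-in for an unknown Quantity: a number plus a tag."""
    value: float
    tag: str

    def __add__(self, other):
        # Addition is allowed only between quantities carrying the same tag.
        if not isinstance(other, UnknownQuantity) or other.tag != self.tag:
            raise TypeError(f"cannot add {other!r} to a quantity tagged {self.tag!r}")
        return UnknownQuantity(self.value + other.value, self.tag)

    def convert(self, factor_to_known: float) -> float:
        """Convert using a definition that is supplied only at conversion time."""
        return self.value * factor_to_known


# Pixels may be added to pixels.
width = UnknownQuantity(3, "pixels") + UnknownQuantity(4, "pixels")

# Assumed (illustrative) definition: 100 pixels per inch, given at conversion time.
width_in_inches = width.convert(1 / 100)
```

Because no multiplication between two tagged quantities is defined here, an expression such as pixels times Smoots fails, which mirrors the limited operation set described in the text.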

Once created, Quantities may be used in equations in the same fashion as is natural for a typical numeric value or variable. Thus, the normal arithmetic operators (+, -, *, /, ^) are available, and allow interoperation with other numeric types by accepting either a Quantity or a Numeric Type as an argument. Similarly, the normal comparisons (>, <, >=, <=, =) are also supported between Quantities and between Quantities and numbers.

By using a Quantity to represent a physical quantity in an equation, certain types of errors can be signaled that would not be available if a conventional variable or numeric value were used in the Quantity's place. Thus, performing an inappropriate operation (e.g. {3cm + 4sec}) with a Quantity can result in a runtime or compile time math error, similar in spirit to the way that dividing a numeric value by 0 would result in an error.

In the description below, the following technical terms are used:

Quantity: A Quantity is a value with an associated unit designation. The unit designation includes a Kind of Unit and a Type of Unit as defined below.

Kind of Unit: In general, a Quantity has a Kind of Unit that represents the nature of what the Quantity measures. Two units are of the same Kind of Unit if it is possible to convert between them. Examples of Kinds of Units include distance, time, temperature, unitless etc. Celsius and Kelvin are the same Kind of Unit (temperature) and thus each can be converted to the other representation.

Type of Unit: A Quantity also has a Type of Unit that represents the actual measurement unit for measuring the Quantity's Kind of Unit. Example Types of Units for the Kind of Unit "distance" include m (meters), in (inch), cm (centimeter) etc.

Dictionary: A list of units and their known synonyms is defined by the system for use in Quantities. Table I includes at least a partial listing of known units (and their synonyms) in the current embodiment of the invention.

none value unitless unitless-constant

% percent frame frames f rad radians radian degree deg degrees m meter meters mm millimeter millimeters cm centimeter centimeters km kilometer kilometers in inch inches ft foot feet point points pt gdim gdims mi mile miles

A ampere amp amps Amperes Ampere amperes kelvin Kelvin K kelvins Kelvins s sec seconds second ms msec milliseconds millisecond min minutes minute h hr hour hours kg kilogram Kilogram Kilograms kilograms g gram grams pound pounds lb lbs # avoirdupois-pound avoirdupois-pounds dry-pounds dry-pound candela candelas cd candle candles liter l L liters cc cubic-centimeter cubic-centimeters mL milliliter milliliters

Bq becquerel becquerels Hz hertz dots dot dpi dots-per-inch fps pair brace yoke pr hundred thousand million billion trillion quadrillion quintillion mole moles arcminute arcminutes arcsec arcsecond arcseconds pm picometer picometers bicron bicrons dm decimeter decimeters dkm dekameter dekameters hm hectometer hectometers angstrom angstroms didot didots agate agates cicero ciceros pica picas digit digits button-measure button-measures line lines barleycorn barleycorns nail nails palm palms hand hands pace paces shaftment shaftments link links span spans breadth cubit cubits yd yard yards fth fathom fathoms fath goad goads rod rods pole poles chain chains furlong furlongs fur shackle shackles nmi NM nautical-mile naut-mile nautical-miles league leagues aln alen arpent arpents arshin braccio

Celsius celsius

Fahrenheit fahrenheit watch watches bell bells d day days fortnight fortnights a year y years yr aeon aeons eons eon amu u atomic-mass-unit atomic-mass-units cantar cantars quintal quintals mph miles-per-hour mile-per-hour kph kilometers-per-hour kilometer-per-hour candlepower cp steradian Steradian Steradians steradians sr square-meters square-meter square-miles square-mile acre acres Acres Acre ac are ares a barn barns

CCF gallon gal gallons dry-gallon dry-gal dry-gallons imperial-gallon imperial-gallons acre-foot acre-feet ac-ft acre-inch acre-inches ac-in barrel barrels bbl bo barrel-bulk

BF fbm bd-ft board-foot board-feet bu bushels bushel amber ambers anker ankers aume aumes breakfast-cup breakfast-cups bucket buckets amagat amagats

Bq becquerel becquerels Hz hertz coulomb Coulomb coulombs Coulombs

N newton newtons Newton Newtons J joule joules

V volt Volt Volts volts watt watts Watts Watt W lumen lm lumens lux lx asb apostilb apostilbs foot-candle foot-candles fc amp-hr ampere-hour ampere-hours A-h

At ampere-turn ampere-turns dyne dynes

lbf pdl pi poundal poundals ohm Ohm Ohms ohms kiloPascals kPa kiloPascal atm atmosphere atmospheres bar bars b

Pascals Pa Pascal pascal pascals barye ba baryes

Btu British-Trade-Unit heat-unit heat-units cal calorie calories gram-calorie small-calorie gram-calories small-calories

Calorie Calories Cal kilogram-calorie kilogram-calories large-calorie large-calories

Table I

Known Unit: A unit that is included in the Dictionary.

Unknown Unit: A unit that is not included in the Dictionary.

In one embodiment of the present invention, the system and method for variables representing Quantities is implemented in an object oriented language from Curl Corporation. The language allows the creation of objects and has similar capabilities to those found in languages such as Scheme, Lisp and C++.

A variety of structures are defined and utilized in this embodiment of the invention. The structures include:

Quantity Structure: As illustrated in Figure 3, a Quantity Structure 38, 42 contains three parts: an internal-value, an internal-units pointer and an internal-tag pointer. The internal-tag pointer points to a string 36, 34 that maintains the type of units to be displayed. The internal-units pointer points to a KindOfUnits instance, and represents the kind of units. The internal-value is a 64-bit floating point number that stores either the value of the Quantity in the SI units specified by the internal-units, or the value of the Quantity in internal-tag units if the internal-units pointer is void. The internal-tag pointer is optional and is often disregarded in favor of heuristics and user preferences when printing.

KindOfUnits: A KindOfUnits 40, pointed to by a Quantity's internal-units pointer, stores the exponents on six of the SI fundamental units (meters, kilograms, seconds, candelas, Kelvin, and Ampere), the exponent on radians and an exponent on an amount of substance, for eight exponents in all. A KindOfUnits measuring force (Newtons = kg*m/s²) would be represented as m¹, kg¹, s⁻², cd⁰, K⁰, A⁰, rad⁰ and amount⁰ for a KindOfUnits vector of 11-200000. In the KindOfUnits structure, the exponent for each unit is presented in eight bits, for a total entry of 8x8=64 bits. As discussed below, uniformly designating the units by exponents presented in subunit fields for the available standard units allows mathematical operations to be performed on unit designations as they are performed on values. Such operations both check for compatibility of units in specified operations and determine resultant unit designations.
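The two structures just described might be modeled as below. This is a sketch in Python under stated assumptions, not the embodiment's implementation: the field names mirror the text, but the packed method and the use of a two's complement byte for each signed exponent are assumptions, since the text only says that each exponent occupies eight bits.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Field order follows the text: m, kg, s, cd, K, A, rad, amount of substance.
FIELDS = ("m", "kg", "s", "cd", "K", "A", "rad", "amount")


@dataclass(frozen=True)
class KindOfUnits:
    exponents: Tuple[int, ...]  # eight signed exponents, one per entry in FIELDS

    def packed(self) -> int:
        """Pack the eight exponents into one 64-bit word, 8 bits each.

        Assumption: negative exponents are stored as two's complement bytes.
        """
        word = 0
        for exponent in self.exponents:
            if not -128 <= exponent <= 127:
                raise ValueError("exponent does not fit in eight bits")
            word = (word << 8) | (exponent & 0xFF)
        return word


@dataclass
class Quantity:
    internal_value: float                  # SI value, or tag-unit value if units is None
    internal_units: Optional[KindOfUnits]  # None models a void units pointer
    internal_tag: Optional[str]            # preferred display unit, optional


# Force: m^1, kg^1, s^-2, all other exponents zero.
FORCE = KindOfUnits((1, 1, -2, 0, 0, 0, 0, 0))
five_newtons = Quantity(5.0, FORCE, "N")
```

Packing into a single word is what lets two unit designations be compared with one integer comparison.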

The underlying system defines a set of basic quantities such as listed in Table I. For example, the system includes a quantity centimeter. Further, the system identifies other strings which apply to that basic quantity such as cm and centimeters. The user of the program defines new quantities in terms of the basic quantities. For example, when the user identifies a quantity 3cm, the system relies on the basic quantity centimeter to set up the new quantity 3cm. That process is facilitated by a unit table 32 illustrated in Figure 3.

The unit table comprises pairs of key pointers and conversion value pointers. A key pointer is provided for each string identified by the system. For example, for the cm pair, the key pointer points to the string 34 "cm." Similarly, the key for the centimeter pair points to the string 36 "centimeter." The conversion value pointers for both entries point to the same Quantity 38 installed in the system. That Quantity includes a tag pointer to the string 36, the preferred string for use in printing the value unless otherwise indicated in another Quantity. The Quantity also includes a value which is the scale factor to be applied to the basic SI unit meters to define the Quantity centimeter. From the basic meter Quantity one multiplies by .01 to obtain a centimeter. The units pointer points to a KindOfUnits entry 40 which is used for all distance Quantities defined in terms of meters. The first of the 8 digits in the KindOfUnits vector is the distance digit which, in SI units, is in meters. For all distance Quantities, the meters exponent is 1, and the digit in the KindOfUnits vector 40 is 1.

When the user of the language establishes a quantity 3cm, a new quantity 42 is created. The system takes the tag "cm" and walks through the unit table comparing the tag to each string to which the keys point. When the string 34 is reached, an equality is noted, and the key pointer from the unit table to that string is taken as the tag pointer in the new quantity 42. Further, the corresponding conversion value pointer in the unit table points to the centimeter quantity 38. That quantity includes the conversion factor .01 by which the value 3 is multiplied to provide a value in the quantity 42 of .03. Further, the new quantity shares the KindOfUnits of the basic quantity 38 so the units pointer of Quantity 38 is also inserted as the units pointer of the new quantity 42.
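The construction of the quantity 3cm can be sketched as follows. This Python illustration makes several assumptions: a dictionary lookup stands in for walking the unit table entry by entry, the KindOfUnits is reduced to a plain tuple of exponents, and the names UNIT_TABLE and make_quantity are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Quantity:
    value: float                       # SI value when units is set
    units: Optional[Tuple[int, ...]]   # stands in for the KindOfUnits pointer
    tag: Optional[str]                 # stands in for the tag pointer


DISTANCE = (1, 0, 0, 0, 0, 0, 0, 0)    # meters exponent is 1

# The basic quantity "centimeter": scale factor .01 relative to meters.
CENTIMETER = Quantity(0.01, DISTANCE, "centimeter")

# Several key strings map to the same installed Quantity.
UNIT_TABLE = {"cm": CENTIMETER, "centimeter": CENTIMETER, "centimeters": CENTIMETER}


def make_quantity(magnitude: float, unit_string: str) -> Quantity:
    base = UNIT_TABLE.get(unit_string)
    if base is None:
        # No basic quantity is defined: keep the raw value and a void units pointer.
        return Quantity(magnitude, None, unit_string)
    # Scale into SI units and share the basic quantity's KindOfUnits.
    return Quantity(magnitude * base.value, base.units, unit_string)


three_cm = make_quantity(3, "cm")
```

The new quantity keeps the user's string "cm" as its tag while sharing the single DISTANCE object with the basic quantity, as in Figure 3.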

Certain conversions, such as from degrees F to degrees C cannot be made with a simple scale factor. In those cases, the conversion value pointer in the unit table is not to a Quantity but rather to a routine. That routine performs the necessary conversion and provides the appropriate pointers to the KindOfUnits and string.
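A unit table whose entries are either scale factors or routines might look like the sketch below. It is written in Python under assumptions: temperatures are taken to be stored in kelvin as the SI unit, the table holds (conversion, kind) pairs rather than pointers, and the names fahrenheit_routine and to_si are hypothetical.

```python
DISTANCE = (1, 0, 0, 0, 0, 0, 0, 0)
TEMPERATURE = (0, 0, 0, 0, 1, 0, 0, 0)   # exponent 1 on Kelvin


def fahrenheit_routine(value, inverse=False):
    """Convert Fahrenheit to kelvin, or kelvin to Fahrenheit when inverse=True."""
    if inverse:
        return (value - 273.15) * 9.0 / 5.0 + 32.0
    return (value - 32.0) * 5.0 / 9.0 + 273.15


# Each entry is either a plain scale factor or a conversion routine.
UNIT_TABLE = {
    "cm": (0.01, DISTANCE),
    "Fahrenheit": (fahrenheit_routine, TEMPERATURE),
}


def to_si(magnitude, unit_string):
    """Return the SI value and kind of units for a magnitude in the named unit."""
    conversion, kind = UNIT_TABLE[unit_string]
    if callable(conversion):
        return conversion(magnitude), kind   # offset conversion: call the routine
    return magnitude * conversion, kind      # simple scale factor


freezing_si, freezing_kind = to_si(32, "Fahrenheit")
```

The inverse flag anticipates printing, where a stored SI value has to be turned back into the display unit.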

Any of the quantities listed in Table I can be defined in terms of one or more of the basic KindOfUnits. For example, 1 volt = 1 kg-m²/(sec³-A) for a KindOfUnits vector of 21-300-100. As another example, 1 U.S. dry gallon is defined as 4.404884 liters and 1 liter = .001 meter³. Accordingly, a U.S. dry gallon can be defined by the value .004404884 and the KindOfUnits vector 30000000.

The Quantity structure can also be used in situations where the system does not have an underlying basic quantity previously defined. In that case, the tag pointer serves as a definable units designation and the units pointer is void. The value in the quantity would simply be the value provided by the user in association with that definable units designation.

The conventional operations available for a numeric value are also available for a Quantity, with certain limitations as described below.

Operations on Quantities

Addition and Subtraction. Quantities may be added to each other and subtracted from each other. The limitation is that the kind of units (e.g. distance added to distance, weight to weight etc.) for each Quantity must match for the operation. While the scale used for the units need not match (e.g. meters and feet), the kind of unit must match (e.g. distance and distance). Thus, 1m can be added to 1ft but 1m cannot be added to 1s. The scale need not match because both values are converted to SI units before the operation as discussed above.

For example, the steps performed to add two Quantities include:

1) Retrieving the units for each of the Quantities as found in either the KindOfUnits pointed to by the internal-units or internal-tag if the internal units pointer is null; and

2) Comparing the units retrieved to ensure they are equivalent, and if so, performing the addition on the internal-values. Because of the 64 bit KindOfUnits representation, the comparison of units reduces to a simple comparison of bits in two vectors.
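The two addition steps can be sketched in Python as follows. The names Quantity, units_key, add and UnitMismatchError are hypothetical, and a tuple comparison stands in for the single 64-bit comparison mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class UnitMismatchError(ArithmeticError):
    """Raised when an operation is attempted on incompatible units."""


@dataclass(frozen=True)
class Quantity:
    value: float                       # SI value, or raw value if units is None
    units: Optional[Tuple[int, ...]]   # exponent vector, None for unknown units
    tag: Optional[str] = None


def units_key(q: Quantity):
    # Step 1: use the KindOfUnits if present, otherwise fall back to the tag.
    return q.units if q.units is not None else q.tag


def add(a: Quantity, b: Quantity) -> Quantity:
    # Step 2: the units must be equivalent before the values are added.
    if units_key(a) != units_key(b):
        raise UnitMismatchError(f"cannot add {a.tag} to {b.tag}")
    return Quantity(a.value + b.value, a.units, a.tag)


DISTANCE = (1, 0, 0, 0, 0, 0, 0, 0)
TIME = (0, 0, 1, 0, 0, 0, 0, 0)

one_meter = Quantity(1.0, DISTANCE, "m")
one_foot = Quantity(0.3048, DISTANCE, "ft")   # already stored in meters
one_second = Quantity(1.0, TIME, "s")

total = add(one_meter, one_foot)              # same kind, different scale: allowed
```

Subtraction would follow the same pattern with the values subtracted instead of added.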

Multiplication and Division. Quantities may be multiplied by each other and divided by each other. Each Quantity can have different underlying units (e.g. distance multiplied by mass, distance divided by time etc.) for the operation. In addition, the scale used for the units need not match (e.g. meters and feet). Thus, 1m can be multiplied by 1ft and 1m can be divided by 1s. The resulting Quantity contains the combined units designation (if any) resulting from the operation.

For example, the steps performed to multiply two Quantities include:

1) Retrieving the units for each of the Quantities as found in either the KindOfUnits pointed to by the internal-units or internal-tag if the internal units pointer is null; and

2) Comparing the units retrieved to determine if they are known types of units, and if so, performing the multiplication on the internal-values and performing an addition on the exponent-values found in the KindOfUnits entry pointed to by the internal-units; or

3) Comparing the units retrieved to determine if one is an unknown type of units and the other is unitless, and if so, performing the multiplication on the internal-values and using the internal-tag of the Quantity having the unknown type of units.

As an example of multiplying two known types of units, the multiplication of 3m times 3m/sec will multiply the internal values 3x3. The KindOfUnits entries will present a 1 in the distance field of each and a -1 in the time field of only one. Adding the two entries results in a 2 in the distance field and -1 in the time field for the proper result of m²/sec.
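The multiplication steps and the 3m times 3m/sec example can be sketched in Python as below. The names Quantity, multiply, UNITLESS and UnitMismatchError are hypothetical, and the exponent vectors are plain tuples rather than packed 64-bit words.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


class UnitMismatchError(ArithmeticError):
    """Raised when an operation is attempted on incompatible units."""


@dataclass(frozen=True)
class Quantity:
    value: float
    units: Optional[Tuple[int, ...]]   # None models an unknown type of units
    tag: Optional[str] = None


UNITLESS = (0, 0, 0, 0, 0, 0, 0, 0)


def multiply(a: Quantity, b: Quantity) -> Quantity:
    # Known times known: multiply the values and add the exponents field by field.
    if a.units is not None and b.units is not None:
        combined = tuple(x + y for x, y in zip(a.units, b.units))
        return Quantity(a.value * b.value, combined)
    # Unknown times unitless: multiply the values and keep the unknown tag.
    for unknown, other in ((a, b), (b, a)):
        if unknown.units is None and other.units == UNITLESS:
            return Quantity(unknown.value * other.value, None, unknown.tag)
    raise UnitMismatchError("unknown units may only be scaled by a unitless value")


three_m = Quantity(3.0, (1, 0, 0, 0, 0, 0, 0, 0), "m")
three_m_per_s = Quantity(3.0, (1, 0, -1, 0, 0, 0, 0, 0))
product = multiply(three_m, three_m_per_s)   # value 9, distance 2, time -1

doubled_pixels = multiply(Quantity(3.0, None, "pixels"), Quantity(2.0, UNITLESS))
```

Division would subtract the exponents instead of adding them.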

Logical Operations. Logical operations such as <, <=, >, >= and equal are also supported for Quantities. As in the case for the mathematical operations addition and subtraction, the units specified for each Quantity within a logical operation must match while the scale need not. Thus, 1m > 1ft is allowed but 1m > 1s is not. For example, the steps performed to test one Quantity being greater than a second Quantity include:

1) Retrieving the units for each of the Quantities as found in either the KindOfUnits pointed to by the internal-units or in the internal-tag if the internal units pointer is null; and

2) Comparing the units retrieved to ensure they are equivalent, and if so, performing the logical operation on the internal-values.

Conversion Operations. The units designated for a given Quantity can be converted between unit systems. For instance, a distance Quantity containing a value specified in meters can be converted to feet.

Printing Operations.

There are two steps to printing (including display): selecting the appropriate units to display the Quantity as, and converting the quantity to those units to print. Different sources can provide guidance to what units to use for printing. In one embodiment, the invention determines the units by looking at the following sources in the order given:

1) if the Quantity has a valid internal-tag, it is used.

2) otherwise, the Quantity is converted to the unit specification found in alternate sources. For instance, the user may have a series of preferences that are initialized when the user accesses the program. The preferences may specify a preferred default unit family (e.g. the SI units) or specify a default for each kind of unit (e.g. distance in meters). If the user preferences do not provide information for determining the units to use, then the system can look to information specifying the locale of the computer operating the system. For instance, many operating systems, such as the Microsoft Windows operating system running on a personal computer, are able to track and specify the physical location of the machine. By knowing the location, the invention can determine the units used for that location. So, a European based computer will utilize the metric system of units while American based computers will utilize the U.S. system of units.

The steps to print a Quantity thus are:

1) Look up the conversion factor for the internal-tag in the units table. If the conversion factor is a Quantity, divide x.value by conversion.value and print that number; then print x.tag. If the conversion factor is a procedure, call it with the "inverse" option, that is, call the inverse conversion, and print the result; then print x.tag.
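The unit selection order and the print step can be sketched in Python as follows. The function names choose_tag and format_quantity, the dictionary-based preference and locale tables, and the output format are all assumptions made for illustration; the unit table here maps a tag directly to a scale factor or a routine.

```python
def fahrenheit(value, inverse=False):
    """Fahrenheit to kelvin, or kelvin to Fahrenheit when inverse=True."""
    if inverse:
        return (value - 273.15) * 9.0 / 5.0 + 32.0
    return (value - 32.0) * 5.0 / 9.0 + 273.15


UNIT_TABLE = {"m": 1.0, "cm": 0.01, "ft": 0.3048, "Fahrenheit": fahrenheit}


def choose_tag(tag, kind, user_prefs, locale_defaults):
    """Pick display units: the quantity's own tag, then user preferences, then locale."""
    if tag is not None:
        return tag
    if kind in user_prefs:
        return user_prefs[kind]
    return locale_defaults[kind]


def format_quantity(value_si, tag, unit_table=UNIT_TABLE):
    """Convert an SI value back into the tag's units and render it with the tag."""
    factor = unit_table[tag]
    if callable(factor):
        shown = factor(value_si, inverse=True)   # procedure: call the inverse conversion
    else:
        shown = value_si / factor                # scale factor: divide it back out
    return f"{shown:g}{tag}"


# A quantity with no tag and no user preference falls back to the locale default.
fallback_tag = choose_tag(None, "distance", {}, {"distance": "ft"})
```

For instance, format_quantity(0.03, "cm") undoes the .01 scale factor applied when 3cm was created.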

Compile Time Unit Checking

It is possible to integrate the notion of Quantity types and kinds into the type system to get compile-time checking for argument passing. The first order implementation of this, using known compile time type checking, is just to make Quantity a language type, like integer or string. Declaring an argument of language-type Quantity allows any value stored in a Quantity to be accepted by a procedure:

{define {my-proc x:Quantity}}

{my-proc 3cm} || legal

{my-proc 2ft} || legal

{my-proc 7s} || legal

{my-proc "hi"} || illegal; not a Quantity. Compile time error.

The next level of type checking would be to make a conventional parameterized type declaration for quantities based on the kind of units. Let {Quantity-of kind-of-units} be a declaration for a language type where the Quantity type is unspecified, but the Quantity kind is equal to kind-of-units. Example:

{define-constant distance:KindOfUnits = 1m.kind-of-units}

{define {my-proc x:{Quantity-of distance}}}

{my-proc 3cm} || legal

{my-proc 2ft} || legal

{my-proc 7s} || illegal; wrong kind of units. Compile time error.

{my-proc "hi"} || illegal; not a Quantity. Compile time error.

Note that this requires type inferencing rules to be applied at compile time for the results of arithmetic operators. The type inferencing rules are:

{type-of {* q1 q2}} = {Quantity-of {+ q1.kind-of-units q2.kind-of-units}}

{type-of {/ q1 q2}} = {Quantity-of {- q1.kind-of-units q2.kind-of-units}}

{type-of {+ q1 q2}} = {type-of q1} = {type-of q2}

{type-of {- q1 q2}} = {type-of q1} = {type-of q2}

{type-of {^ q n}} = {Quantity-of {* q.kind-of-units n}}
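The inference rules above amount to arithmetic on exponent vectors, which can be sketched in Python as below. The function name infer_kind and the representation of a kind of units as a tuple are assumptions; a real compiler would apply these rules to static types rather than to runtime values.

```python
from typing import Optional, Tuple

Kind = Tuple[int, ...]


def infer_kind(op: str, k1: Kind, k2: Optional[Kind] = None, n: int = 1) -> Kind:
    """Return the kind of units of the result of applying op to the operands."""
    if op == "*":
        return tuple(a + b for a, b in zip(k1, k2))   # exponents add
    if op == "/":
        return tuple(a - b for a, b in zip(k1, k2))   # exponents subtract
    if op in ("+", "-"):
        if k1 != k2:
            raise TypeError("operands of + and - must have the same kind of units")
        return k1                                     # kind is unchanged
    if op == "^":
        return tuple(a * n for a in k1)               # exponents scale by the power
    raise ValueError(f"unsupported operator {op!r}")


DISTANCE = (1, 0, 0, 0, 0, 0, 0, 0)
TIME = (0, 0, 1, 0, 0, 0, 0, 0)

velocity = infer_kind("/", DISTANCE, TIME)
volume = infer_kind("^", DISTANCE, n=3)
```

A mismatch under + or - is the case that would surface as a compile time error in the scheme described.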

The process of inferring types as a result of arithmetic operations is a standard part of compilation. For example, when an integer and floating point number are multiplied, the compiler must assign a type to the result.

Extensions for Under-defined Quantities

The basic unit handling system described above allows for two kinds of Quantities: known and unknown. These are stored in a single implementation, the Quantity class, where state bits differentiate known from unknown. In this system, a known Quantity has a known value, unit specification, a tag indicating display preferences, and conversion procedures from and to canonical units. An example of a known Quantity is 3meters. This is known because "meters" are well-defined units. An unknown Quantity has only a known value and a tag identifying the type of unknown unit. An example of an unknown Quantity is 3pixels. This is unknown because there is no well-defined conversion factor from "pixels" to canonical distance units ("meters"). Known Quantities can generally be operated on using arithmetic and comparison operators. Unknown Quantities can only be operated on in very limited circumstances because there are no conversion procedures or ways of performing unit-checking for operators without any unit information.

In a further embodiment of the invention, an extension is made to this basic system to allow the addition of partial definitions of units to the Units-Table, and to specify ways that the resulting under-defined units can be used to provide unit-checking. This allows some under-defined units, like "pixels", to be used in a wider set of circumstances than under the basic system. The extension for under-defined units uses the same framework as the basic unit handling system, but adds more classes and redefines the Quantity class to better use Object Oriented design principles to manage the increased complexity of the more powerful system.

Introducing Under-defined Quantities allows more complex expressions to be built using the Quantity system. In this extension, there are five major language types for handling Quantities:

Quantity

ExactlyDefinedQuantity

UnderDefinedQuantity

UnDefinedQuantity

QuantityExpression

Quantity is the base-class for the other four classes, and is the only type that is visible to the programmer. The others are subclasses of Quantity that are used to indicate how much information is known about the units of a quantity. Object oriented design allows these sub-classes to override the methods of the Quantity super-class, so that all provide the same interface to the programmer.

An ExactlyDefinedQuantity is a type specifically for what were previously referred to as known Quantities. An ExactlyDefinedQuantity has a value:float in SI units, a units:KindOfUnits package, and a tag:String that maps to a conversion factor in the global Units-Table. An UnDefinedQuantity is what was previously referred to as an unknown Quantity, and contains a value:float that is not usually in SI units (i.e. it is not canonicalized), and a tag:String that does not map to a conversion factor in the global Units-Table. The kind of units is unknown for such a Quantity, so there is no property carrying unit information.

Figure 4 shows the structure of an ExactlyDefinedQuantity, along with the relevant entry from the Unit-Table. The ExactlyDefinedQuantity 50 contains three storage slots. Each storage slot has a name and a type, which are formatted as name:type in the diagram. The tag and kind-of-units slots are pointers to objects. The objects they point to are shown outside the ExactlyDefinedQuantity structure. Note that the String and the KindOfUnits objects 52 and 54 are shared between multiple data structures. These objects are immutable, so this sharing leads to memory efficiency without any danger of one structure corrupting the state of another. The Unit-Table 56 is depicted as a table with String key entries that map to value entries of type ConversionFactor. As discussed in previous sections, a ConversionFactor 58 may be either a Quantity or a procedure. Both forms specify a KindOfUnits, so that pointer is drawn coming out of the conversion factor box. Note that the ExactlyDefinedQuantity and the ConversionFactor to which its tag maps in the Unit-Table both point at the same KindOfUnits.

An UnderDefinedQuantity has the same representation as an ExactlyDefinedQuantity, but has different semantics. In the Unit-Table, the tag of an UnderDefinedQuantity does not map to a conversion factor; instead it maps to a KindOfUnits. Like the UnDefinedQuantity, the value property is not canonicalized. Such a Quantity is essentially an UnDefinedQuantity where a small amount of extra information is available: the kind of units the tag declares.

UnderDefinedQuantities are used when it is known what kind of units a unit represents, but the conversion factor is subject to change. For example, a number in pixels always measures distance, but a conversion from pixels to meters requires additional information: the screen resolution. Monetary units fall into this category as well because they are subject to changing currency conversion rates. A unit like "pixels" can be defined in the Unit-Table to map to distance units, but with no conversion factor. In a system with such a definition in the Unit-Table, 3pixels is an UnderDefinedQuantity, not an UnDefinedQuantity, and more operations can be performed on it.
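As a rough illustration of how the Unit-Table entry for a tag determines which subclass applies, consider the following Python sketch. It is a simplification under stated assumptions: the ConversionFactor here is a plain scalar factor (the text allows a Quantity or a procedure), the kind-of-units is a tuple of exponents over hypothetical base dimensions, and the table contents and the name classify are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# Assumption: exponents over (length, mass, time).
KindOfUnits = Tuple[int, ...]
DISTANCE: KindOfUnits = (1, 0, 0)


@dataclass(frozen=True)
class ConversionFactor:
    """Simplified stand-in: a scalar factor to canonical units plus a kind."""
    factor: float
    units: KindOfUnits


# Hypothetical table contents. "pixels" is only partially defined:
# its kind of units is known, but no conversion factor is given.
UNIT_TABLE: Dict[str, Union[ConversionFactor, KindOfUnits]] = {
    "meters": ConversionFactor(1.0, DISTANCE),
    "yards": ConversionFactor(0.9144, DISTANCE),
    "pixels": DISTANCE,
}


def classify(tag: str) -> str:
    """Name the Quantity subclass that a value with this tag would use."""
    entry = UNIT_TABLE.get(tag)
    if isinstance(entry, ConversionFactor):
        return "ExactlyDefinedQuantity"   # tag maps to a conversion factor
    if entry is not None:
        return "UnderDefinedQuantity"     # tag maps only to a KindOfUnits
    return "UnDefinedQuantity"            # tag is not in the table at all
```

With this table, "meters" classifies as exactly defined, "pixels" as under-defined, and an unregistered tag such as "widgets" as undefined.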

Figure 5 shows the internal structure of an UnderDefinedQuantity 60 and a relevant entry in the Unit-Table 64. Note that the storage scheme is identical to that of the ExactlyDefinedQuantity. An ExactlyDefinedQuantity contains a tag that maps to a ConversionFactor in the Unit-Table. However, an UnderDefinedQuantity contains a tag that maps to a KindOfUnits 62 in the Unit-Table 64. The lack of a ConversionFactor prevents an UnderDefinedQuantity from being converted to canonical units or to an ExactlyDefinedQuantity. The KindOfUnits entry in the Unit-Table allows unit-checking during operations, however.

A QuantityExpression defers the computation of an algebraic expression containing a mixture of ExactlyDefinedQuantities and UnderDefinedQuantities until the UnderDefinedQuantities inside it can be exactly expressed. A QuantityExpression 66, illustrated in Figure 6, contains an operator together with its inputs, a left-hand operand 70 and a right-hand operand 72, both of type Quantity. It also contains a pointer to a KindOfUnits structure 68 for unit checking purposes. Note that because QuantityExpression is a subclass of Quantity, QuantityExpressions can be nested. That is, the left-operand and right-operand slots of a QuantityExpression may be other QuantityExpressions. There will never be circular dependencies, where following such a chain leads around in a loop; a structure containing one has been corrupted and is an error.

Algebra on ExactlyDefinedQuantities

The algebraic operators for multiplication, division, subtraction, negation, addition, and exponentiation can be applied to ExactlyDefinedQuantities. In each case, the result is an ExactlyDefinedQuantity. These operations are all performed in constant time. Specifically, the units or combination of units does not affect the time needed for the computation, nor do the preferred units of the arguments. The results of the operations 3(kg*m/s^2) * 7(W*s) and 4seconds + 2hours can be computed in the same amount of time as 7s + 3s. Three mechanisms make this possible. First, no unit conversion occurs when performing algebra because all ExactlyDefinedQuantities are converted to canonical units when they are constructed. The Quantities 4seconds and 2hours are both stored in seconds, so no conversion needs to occur when they are added. Second, units are stored in a canonical exponential form inside of a KindOfUnits structure. The result of adding two Quantities has the same KindOfUnits as each operand. Multiplying Quantities adds the values inside the KindOfUnits structure; division subtracts. During algebraic computation, only algebraic operators are applied to KindOfUnits; no string concatenation or other linear time operation is performed to derive the KindOfUnits for the result. Third, the tag is destroyed during multiplication, division, and exponentiation, and is not restored until a heuristic generates a new tag during printing. If the tag were to be preserved during these operations, a linear time string concatenation (plus a linear time unit cancellation) would be performed. An extended implementation could trade off efficiency for usability by performing these slow operations, but the implementation described here does not.
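The three mechanisms can be sketched in Python as follows. This is an illustrative sketch only: the class layout, the TO_CANONICAL table, the make helper, and the choice of (length, mass, time) exponents are assumptions made for the example, not the described implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: exponents over (length, mass, time).
KindOfUnits = Tuple[int, ...]
TIME: KindOfUnits = (0, 0, 1)

# Hypothetical conversion factors to the canonical unit (seconds).
TO_CANONICAL = {"seconds": (1.0, TIME), "hours": (3600.0, TIME)}


@dataclass(frozen=True)
class ExactlyDefinedQuantity:
    value: float            # always stored in canonical units
    units: KindOfUnits      # fixed-length exponent vector
    tag: Optional[str]      # display preference; None once destroyed

    def __add__(self, other: "ExactlyDefinedQuantity") -> "ExactlyDefinedQuantity":
        # Values are already canonical, so only a unit check is needed.
        if self.units != other.units:
            raise TypeError("incompatible units")
        return ExactlyDefinedQuantity(self.value + other.value, self.units, self.tag)

    def __mul__(self, other: "ExactlyDefinedQuantity") -> "ExactlyDefinedQuantity":
        # Exponents add; the tag is dropped rather than concatenated.
        units = tuple(a + b for a, b in zip(self.units, other.units))
        return ExactlyDefinedQuantity(self.value * other.value, units, None)


def make(amount: float, tag: str) -> ExactlyDefinedQuantity:
    """Canonicalize once, at construction time."""
    factor, units = TO_CANONICAL[tag]
    return ExactlyDefinedQuantity(amount * factor, units, tag)


total = make(4, "seconds") + make(2, "hours")     # stored as 7204.0 seconds
product = make(4, "seconds") * make(2, "hours")   # units are seconds squared
```

Neither operation inspects the tags of its operands beyond copying one, which is what keeps the cost independent of the units involved.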

Multiplication eq₃ = eq₁ * eq₂:
eq₃.value = eq₁.value * eq₂.value
eq₃.units = eq₁.units + eq₂.units
eq₃.tag = null

Division eq₃ = eq₁ / eq₂:
eq₃.value = eq₁.value / eq₂.value
eq₃.units = eq₁.units - eq₂.units
eq₃.tag = null

Addition eq₃ = eq₁ + eq₂:
assert(eq₁.units = eq₂.units)
eq₃.value = eq₁.value + eq₂.value
eq₃.units = eq₁.units
eq₃.tag = eq₁.tag

Subtraction eq₃ = eq₁ - eq₂:
assert(eq₁.units = eq₂.units)
eq₃.value = eq₁.value - eq₂.value
eq₃.units = eq₁.units
eq₃.tag = eq₁.tag

Negation eq₃ = -eq₁:
eq₃.value = -eq₁.value
eq₃.units = eq₁.units
eq₃.tag = eq₁.tag

Exponentiation eq₃ = eq₁ ^ eq₂:
assert(eq₂.units = UNITLESS)
eq₃.value = eq₁.value ^ eq₂.value
eq₃.units = eq₁.units * eq₂.value
eq₃.tag = null

Algebra with Mixed Subclasses

Algebraic operators can also be used with any combination of ExactlyDefinedQuantities, UnderDefinedQuantities, and QuantityExpressions. When an UnderDefinedQuantity is involved in a computation, whether explicitly or nested inside a QuantityExpression, there is not enough information to actually perform the algebra and produce a single ExactlyDefinedQuantity as a result. For example, the expression 3pixels + 1meter cannot be fully evaluated into a single ExactlyDefinedQuantity until a conversion factor from pixels to canonical distance units (meters) is known. Because of this lack of information, many algebraic operations on mixed Quantity subclasses result in QuantityExpressions. A QuantityExpression simply maintains pointers to the arguments to the operator until it is provided with conversion factors for all of the UnderDefinedQuantities and forced to evaluate itself. The following table gives the result type of an algebraic operation on mixed types. Note that almost all cases result in a QuantityExpression. Although the exact value of a QuantityExpression is not known until conversion factors are provided to it, the KindOfUnits for a QuantityExpression is known, so in each of these cases, units checking occurs at the point of the algebraic expression.

* UnderDefinedQuantity when the ExactlyDefinedQuantity is unitless and the operator is multiplication or division.

** UnderDefinedQuantity when the operator is addition or subtraction.

*** ExactlyDefinedQuantity when the operator is division and the UnderDefinedQuantities have the same tag, so the units divide out.

When the result of an operator is a QuantityExpression, the operator and its inputs are encapsulated in the QuantityExpression that is the result. The only computations that are performed are a check that the units of the inputs are compatible for the given operator, and determination of the units of the output. The algorithms for applying algebraic operators to mixed Quantity subclasses are given below.

Not explicitly shown are the exceptional cases noted in the above table with asterisks. For these cases, a more constrained result than a QuantityExpression can be produced. These cases occur when one operand is under-defined. The * case refers to an UnderDefinedQuantity being multiplied or divided by a unitless ExactlyDefinedQuantity, such as 15cows * 3. In this case, the result can be computed by multiplying the ExactlyDefinedQuantity's value by the UnderDefinedQuantity's value because no understanding of the conversion factor for the UnderDefinedQuantity is required. The ** case refers to addition or subtraction between UnderDefinedQuantities with identical tags. As in the previous case, the conversion factor is not needed because no conversion takes place. Finally, the *** case refers to division between UnderDefinedQuantities with identical tags, where the units divide out. Again, this can be more constrained because no conversion factors are necessary. All of these cases are essentially optimizations and are not necessary for a naive implementation.
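The three exceptional cases can be sketched in Python as below. The class names Exact and Under, the function names, and the (length, mass, time) exponent layout are assumptions introduced for this illustration; only multiplication, addition, and division are shown.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: exponents over (length, mass, time).
KindOfUnits = Tuple[int, ...]
UNITLESS: KindOfUnits = (0, 0, 0)
DISTANCE: KindOfUnits = (1, 0, 0)


@dataclass(frozen=True)
class Exact:
    """Stand-in for an ExactlyDefinedQuantity (value in canonical units)."""
    value: float
    units: KindOfUnits


@dataclass(frozen=True)
class Under:
    """Stand-in for an UnderDefinedQuantity (value not canonicalized)."""
    value: float
    units: KindOfUnits
    tag: str


def scale(u: Under, e: Exact) -> Under:
    """The * case: multiply by a unitless exactly defined quantity."""
    if e.units != UNITLESS:
        raise TypeError("general case: build a QuantityExpression instead")
    return Under(u.value * e.value, u.units, u.tag)


def add_same_tag(a: Under, b: Under) -> Under:
    """The ** case: addition of two quantities carrying the same tag."""
    if a.tag != b.tag:
        raise TypeError("general case: build a QuantityExpression instead")
    return Under(a.value + b.value, a.units, a.tag)


def divide_same_tag(a: Under, b: Under) -> Exact:
    """The *** case: same-tag division, so the units divide out."""
    if a.tag != b.tag:
        raise TypeError("general case: build a QuantityExpression instead")
    return Exact(a.value / b.value, UNITLESS)


tripled = scale(Under(15.0, DISTANCE, "pixels"), Exact(3.0, UNITLESS))
summed = add_same_tag(Under(3.0, DISTANCE, "pixels"), Under(4.0, DISTANCE, "pixels"))
ratio = divide_same_tag(Under(6.0, DISTANCE, "pixels"), Under(3.0, DISTANCE, "pixels"))
```

In each function the unknown conversion factor never enters the arithmetic, which is why no QuantityExpression is needed.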

Addition qe = q₁ + q₂:
assert(q₁.units = q₂.units)
qe.units = q₁.units
qe.left-operand = q₁
qe.right-operand = q₂
qe.operator = +

Subtraction qe = q₁ - q₂:
assert(q₁.units = q₂.units)
qe.units = q₁.units
qe.left-operand = q₁
qe.right-operand = q₂
qe.operator = -

Negation qe = -q₁:
qe.units = q₁.units
qe.left-operand = NULL
qe.right-operand = q₁
qe.operator = -

Multiplication qe = q₁ * q₂:
qe.units = q₁.units + q₂.units
qe.left-operand = q₁
qe.right-operand = q₂
qe.operator = *

Division qe = q₁ / q₂:
qe.units = q₁.units - q₂.units
qe.left-operand = q₁
qe.right-operand = q₂
qe.operator = /

Exponentiation qe = q₁ ^ eq₂:
assert(eq₂.units = UNITLESS)
qe.units = q₁.units * eq₂.value
qe.left-operand = q₁
qe.right-operand = eq₂
qe.operator = ^

Note that all of these operations are constant time and constant space.
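A minimal Python sketch of the deferred construction is given below. It covers only the four binary operators, uses a single hypothetical Leaf class to stand in for both ExactlyDefinedQuantity and UnderDefinedQuantity operands, and assumes a (length, mass, time) exponent layout; none of these names come from the described system.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Assumption: exponents over (length, mass, time).
KindOfUnits = Tuple[int, ...]
DISTANCE: KindOfUnits = (1, 0, 0)
TIME: KindOfUnits = (0, 0, 1)


@dataclass(frozen=True)
class Leaf:
    """Stand-in for any non-expression Quantity operand."""
    value: float
    units: KindOfUnits
    tag: str


@dataclass(frozen=True)
class QuantityExpression:
    operator: str
    left: "Operand"
    right: "Operand"
    units: KindOfUnits   # known immediately, even though the value is not


Operand = Union[Leaf, QuantityExpression]


def combine(op: str, left: Operand, right: Operand) -> QuantityExpression:
    """Check units and record the operation without evaluating it."""
    if op in ("+", "-"):
        if left.units != right.units:
            raise TypeError("incompatible units")
        units = left.units
    elif op == "*":
        units = tuple(a + b for a, b in zip(left.units, right.units))
    elif op == "/":
        units = tuple(a - b for a, b in zip(left.units, right.units))
    else:
        raise ValueError("unsupported operator: " + op)
    return QuantityExpression(op, left, right, units)


# 3pixels + 1meter is recorded, not computed; its kind of units is distance.
length = combine("+", Leaf(3.0, DISTANCE, "pixels"), Leaf(1.0, DISTANCE, "meters"))
# Expressions nest: (3pixels + 1meter) / 2seconds.
rate = combine("/", length, Leaf(2.0, TIME, "seconds"))
```

Each call allocates one node and touches a fixed number of exponents, matching the constant time and constant space noted above.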

Converting UnderDefinedQuantities and QuantityExpressions to ExactlyDefinedQuantities

UnderDefinedQuantities contain a value and a unit tag, and have a known KindOfUnits. The actual conversion factor between the unit tag and canonical units is subject to change, however. For example, 3pixels is an UnderDefinedQuantity. The system can identify that this represents a distance, but how much distance is unknown. An UnderDefinedQuantity can be converted to an ExactlyDefinedQuantity by providing a conversion factor. The convert-to method performs this function.

Quantity.convert-to has the syntax:

{Quantity.convert-to [tag₁ [tag₂ ...]] [define-tag₁=cf₁ [define-tag₂=cf₂ ...]]}

Strings tag_x are choices of output tags, and define-tag_y are keyword argument names that provide temporary mappings of tags to ConversionFactors cf_y. The result of this method is an ExactlyDefinedQuantity with the same kind-of-units as the original object, a tag chosen from tag_x, and a value equal to the original Quantity's value processed by the appropriate ConversionFactors. For example, the result of:

{3pixels.convert-to "points" pixels=2points}

is 6points. Semantically, the mapping of "pixels" to distance KindOfUnits in the Unit-Table is replaced with a mapping of "pixels" to 2points; 3pixels is converted to "points"; then the original mapping of "pixels" to distance KindOfUnits is restored. If "pixels" had not been bound and the lookup had occurred, an error would have been thrown. Note that the syntax above, with keyword arguments and a variable number of arguments, is specific to the features of the Curl language. In another language, the same purposes could be accomplished using arrays of Strings and values.

Figure 7 illustrates the conversion process based on the type of quantity at 74. With exactly defined quantities, no conversion is required in the process since the conversion was performed when the quantity was first defined. An underdefined quantity is simply converted at 76 with the conversion factor which has been supplied. QuantityExpressions can be converted to ExactlyDefinedQuantities by a recursive process. First the left-operand is converted at 78; then the right-operand is converted at 80. Finally the results are combined at 82 according to the operator. The process is recursive because either operand may be a QuantityExpression as well. In Figure 7, this recursive relationship is denoted by a dotted line.
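The recursive conversion can be sketched in Python as follows. This is a simplified illustration: the class names, the to_canonical function, and the dictionary of temporary conversion factors are assumptions for the example, conversion factors are plain scalars, and unit checking and output-tag selection are omitted. The factor of 0.5 meters per pixel is an arbitrary illustrative value.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Exact:
    """Stand-in for an ExactlyDefinedQuantity; value already canonical."""
    value: float


@dataclass(frozen=True)
class Under:
    """Stand-in for an UnderDefinedQuantity; needs a supplied factor."""
    value: float
    tag: str


@dataclass(frozen=True)
class Expr:
    """Stand-in for a QuantityExpression with two operands."""
    op: str
    left: "Node"
    right: "Node"


Node = Union[Exact, Under, Expr]

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def to_canonical(q: Node, factors: Dict[str, float]) -> float:
    """Return the canonical value of q using temporary conversion factors."""
    if isinstance(q, Exact):
        return q.value                      # converted at construction
    if isinstance(q, Under):
        if q.tag not in factors:
            raise KeyError("no conversion factor bound for " + q.tag)
        return q.value * factors[q.tag]     # apply the supplied factor
    left = to_canonical(q.left, factors)    # convert the left operand
    right = to_canonical(q.right, factors)  # then the right operand
    return OPS[q.op](left, right)           # combine according to operator


# (3pixels + 1meter) * 2, with a hypothetical 0.5 meters per pixel.
expr = Expr("*", Expr("+", Under(3.0, "pixels"), Exact(1.0)), Exact(2.0))
result = to_canonical(expr, {"pixels": 0.5})
```

Passing the factors as an argument plays the role of the temporary mapping: nothing outside the call is modified, so there is no binding to restore afterwards.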

While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.

Claims

CLAIMS

What is claimed is:

1. A method of processing quantities in a data processing system comprising: storing exactly defined quantities in data structures comprising values and unit designations; storing underdefined quantities in data structures comprising values and unit designations; performing operations on quantities; and storing certain results of operations on underdefined quantities in expression data structure comprising operand quantities, operators and unit designations.

2. A method as claimed in claim 1 wherein the unit designations represent exponents of units corresponding to the values.

3. A method as claimed in claim 2 where the unit designations comprise vectors of exponents corresponding to physical units.

4. A method as claimed in claim 1 further comprising checking compatibility of unit designations for the operations.

5. A method as claimed in claim 1 further comprising determining the unit designations of exactly defined and underdefined quantities from unit strings.

6. A method as claimed in claim 5 further comprising determining conversions to standard units from unit strings, and storing values of the exactly defined quantities in the standard units.

7. A method as claimed in claim 1 wherein an operation on an exactly defined quantity and an underdefined quantity results in an underdefined quantity when the exactly defined quantity is unitless and the operation is multiplication or division.

8. A method as claimed in claim 7 wherein the units designations represent exponents of units corresponding to the values.

9. A method as claimed in claim 1 wherein an operation on two underdefined quantities results in an underdefined quantity when the operation is addition or subtraction.

10. A method as claimed in claim 9 wherein the units designations represent exponents of units corresponding to the values.

11. A method as claimed in claim 1 wherein an operation on two underdefined quantities results in an exactly defined quantity when the operation is division and the unit designations divide out.

12. A method as claimed in claim 11 wherein the units designations represent exponents of units corresponding to the values.

13. A data structure representing quantities to be operated upon in a data processing system comprising: exactly defined quantities comprising values and unit designations; underdefined quantities comprising values and unit designations; and expression quantities comprising operand quantities, operators and unit designations.

14. A data structure as claimed in claim 13 wherein the units designations represent exponents of units corresponding to the values.

15. A data structure as claimed in claim 14 where the unit designations comprise vectors of exponents corresponding to physical units.

16. A data structure as claimed in claim 13 further comprising a table for providing unit designations from unit strings.

17. A data structure as claimed in claim 16 further comprising a table for providing conversion to standard values from unit strings.

18. An electromagnetic waveform comprising computer program code, the computer program code comprising a data structure representing quantities, the data structure including: exactly defined quantities comprising values and unit designations; underdefined quantities comprising values and unit designations; and expression quantities comprising operating quantities, operators and unit designations.

19. A computer program product comprising: a computer usable medium for storing data; a set of computer program instructions embodied on the computer usable medium including instructions to: store exactly defined quantities in data structures comprising values and unit designations; store underdefined quantities in data structures comprising values and unit designations; perform operations on quantities; and store certain results of operations on underdefined quantities in expression data structure comprising operand quantities, operators and unit designations.