
HK1173789B - Instruction optimization - Google Patents

Instruction optimization

Info

Publication number
HK1173789B
HK1173789B (application HK13100633.8A)
Authority
HK
Hong Kong
Prior art keywords
instructions
recorded
query
instruction
optimization
Prior art date
Application number
HK13100633.8A
Other languages
Chinese (zh)
Other versions
HK1173789A1 (en)
Inventor
B. De Smet
H. J. M. Meijer
Original Assignee
Microsoft Technology Licensing, LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 12/966,536 (US20120151187A1)
Application filed by Microsoft Technology Licensing, LLC
Publication of HK1173789A1
Publication of HK1173789B

Description

Instruction Optimization
Technical Field
The invention relates to instruction optimization.
Background
A computer program is a set of instructions that describes operations (i.e., acts) performed by a computer or other processor-based device. When the computer program is loaded and executed on computer hardware, the computer will behave in a predetermined manner following the instructions of the computer program. Thus, a computer becomes a special purpose machine that performs tasks described by instructions.
Programmers using one or more programming languages create the instructions that make up a computer program. Typically, source code is specified or edited manually by a programmer and/or with the help of an Integrated Development Environment (IDE) that includes various development services (e.g., editor, debugger, auto-fill, intelligent assistance). By way of example, a programmer may choose to use an object-oriented programming language, in which source code is implemented with program logic specified as interactions between instances of classes or objects. The source code may then be compiled or otherwise transformed into another form for execution by a computer or similar device.
A compiler conventionally generates code for a particular target from source code. For example, some compilers transform source code into native code for execution by a particular machine. Other compilers generate intermediate code from source code, where the intermediate code is then dynamically interpreted or just-in-time (JIT) compiled at runtime, for example, to facilitate execution across computer platforms. Typically, most optimizations of a program are performed at compile time when the source code is compiled into native or intermediate code. However, limited program optimization may also be performed at runtime during code interpretation or JIT compilation.
Disclosure of Invention
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed subject matter. This summary is not an extensive overview. It is not intended to identify key/critical elements or to delineate the scope of the claimed subject matter. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
Briefly, the present invention generally relates to instruction optimization. More specifically, instead of executing program instructions eagerly at runtime, execution may also be deferred and instructions may be recorded. Subsequently or concurrently, the recorded instructions may be optimized using local and/or global optimization techniques. For example, instructions may be removed, reordered, and/or combined based on other recorded instructions. When an instruction needs to be executed, such as to provide a result, an optimized set of instructions is executed that is no worse than the original set of instructions by some metric (e.g., run time, amount of memory).
To the accomplishment of the foregoing and related ends, certain illustrative aspects of the claimed subject matter are described herein in connection with the following description and the annexed drawings. These aspects are indicative of various ways in which the subject matter may be practiced, all of which are within the scope of the claimed subject matter. Other advantages and novel features of the invention may become apparent from the following detailed description when considered in conjunction with the drawings.
Drawings
FIG. 1 is a block diagram of an instruction optimization system.
FIG. 2 is a block diagram of a representative optimization component.
FIG. 3 is a block diagram illustrating the components of an instruction optimization system.
FIG. 4 graphically depicts query operators encoded as types.
FIG. 5 is a flow chart diagram of a method of instruction optimization.
FIG. 6 is a flow diagram of a method of enabling runtime instruction optimization.
FIG. 7 is a schematic block diagram illustrating a suitable operating environment for aspects of the present disclosure.
Detailed Description
The following details are generally directed to instruction optimization. Instructions may be recorded and transformed at runtime, prior to execution, to improve performance of the specified operations. Such transformations may involve removing, reordering, and/or combining instructions. In other words, rather than performing operations immediately, execution may be deferred by recording the operations that need to be performed and optimizing them prior to execution. This may be referred to as immediate instruction optimization. Further, such functionality may correspond to instruction virtualization, in that a layer of indirection is introduced between the specified instructions and the instructions that are actually executed. According to one embodiment, optimization may be performed locally, on a small set of instructions (e.g., a peephole or window). Additionally or alternatively, a larger, i.e., more global, optimization approach may be employed.
Various aspects of the present disclosure will now be described in more detail with reference to the appended drawings, wherein like reference numerals designate like or corresponding elements throughout. It should be understood, however, that the drawings and detailed description relating thereto are not intended to limit the claimed subject matter to the particular form disclosed. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the claimed subject matter.
Referring initially to FIG. 1, an instruction optimization system 100 is shown. As shown, the instruction optimization system 100 receives, retrieves, or otherwise obtains or acquires instructions, or in other words an instruction stream (also referred to as a stream of instructions), and outputs an optimized instruction stream. Such optimization may be performed at runtime, before execution, initiated by an internal or external trigger. Further, the instruction optimization system includes a recording component 110 and an optimization component 120.
The recording component 110 can receive, retrieve, or otherwise obtain or acquire a stream of instructions (a series of instructions specifying one or more actions to be performed, for example) and record the instructions as they are obtained. In a sense, an instruction buffer is created in which instructions are recorded but not executed. The instructions may be recorded on any computer-readable medium.
The optimization component 120 can transform the recorded instructions into an optimized form, e.g., according to algebraic properties, domain-specific information, cost, and the like. As previously described, optimization may be initiated by internal or external triggers or events. By way of example and not limitation, optimization may be triggered when a determined number of instructions have been recorded and/or upon a request for a result produced by the instructions. Upon the occurrence of one or more triggering events, the optimization component 120 can transform the recorded instructions into a preferred form to facilitate optimized execution of the specified actions.
Turning attention to FIG. 2, a representative optimization component 120 is depicted. The optimization component 120 can include a number of subcomponents that perform optimization operations, including but not limited to: a removal component 210, a reordering component 220, and a combining component 230. The removal component 210 can remove or delete instructions. For example, if there is an instruction to add an element to a list and then an instruction to remove the same element from the list, the removal component 210 can remove both instructions because the actions cancel each other.
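As a minimal sketch of the removal component's behavior, the following Python fragment drops an "add" that is immediately cancelled by a "remove" of the same element. The `(op, value)` tuple encoding is an illustrative assumption; the patent does not prescribe an instruction representation (and its own examples are in C#).

```python
# Sketch of a removal pass over recorded instructions: an ("add", x)
# immediately followed by a ("remove", x) cancels out, so both entries
# can be dropped from the recorded buffer. The tuple encoding is a
# hypothetical representation, not taken from the patent.

def remove_cancelling(instructions):
    """Drop adjacent ("add", x) / ("remove", x) pairs that cancel."""
    result = []
    for instr in instructions:
        if result and instr[0] == "remove" and result[-1] == ("add", instr[1]):
            result.pop()  # the pending "add" is cancelled by this "remove"
        else:
            result.append(instr)
    return result

recorded = [("add", 1), ("add", 2), ("remove", 2), ("add", 3)]
optimized = remove_cancelling(recorded)
# The add/remove pair for element 2 cancels, leaving only adds for 1 and 3.
```

A single linear pass suffices here because each cancellation only inspects the most recently recorded instruction, mirroring a peephole of size two.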
The reordering component 220 may reorder the instructions to optimize the computation. In other words, there may be a computational cost associated with the instruction set arrangement that the reordering component 220 may attempt to minimize. For example, execution may be improved by filtering a data set before performing some action, as the data set will likely be reduced by the filtering. More specifically, if the instruction indicates that the sort operation (e.g., OrderBy) is to be performed before the filter operation (e.g., Where), the instruction may be reversed such that the filter operation is performed before the sort operation, such that the sort operation is performed with respect to the potentially reduced data set.
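The reordering described above can be sketched as a rewrite over a recorded plan, swapping a sort that precedes a filter so the filter runs first. The `("orderby", …)`/`("where", …)` encoding is an illustrative assumption, not the patent's representation.

```python
# Sketch of the reordering component: a recorded sort followed by a filter
# is swapped so filtering happens first, over the full data set, and the
# sort only sees the (potentially reduced) survivors.

def reorder_filter_before_sort(instructions):
    """Bubble each ("where", pred) ahead of any adjacent ("orderby", key)."""
    result = list(instructions)
    i = 0
    while i < len(result) - 1:
        if result[i][0] == "orderby" and result[i + 1][0] == "where":
            result[i], result[i + 1] = result[i + 1], result[i]
        else:
            i += 1
    return result

plan = [("orderby", "price"), ("where", "in_stock")]
optimized = reorder_filter_before_sort(plan)
# optimized: [("where", "in_stock"), ("orderby", "price")]
```

This rewrite is only valid because filtering and sorting commute; the reordering component would rely on such algebraic properties of the operators involved.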
The combining component 230 may combine (in other words, splice) two or more instructions into a single instruction. More specifically, a new instruction may be generated that captures multiple instructions and then other instructions may be removed. For example, instead of performing multiple filtering operations that require traversing a data set multiple times, the filtering operations may be combined such that the data set only needs to be traversed once.
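The combining component's filter-fusion can be sketched as conjoining two predicates so the data set is traversed once rather than twice. Predicates are plain Python callables here; the patent's own setting is LINQ query operators.

```python
# Sketch of combining two recorded filter instructions into one: the two
# predicates are fused into a single conjunctive predicate, so one
# traversal of the data set replaces two.

def combine_filters(pred_a, pred_b):
    """Fuse two filter predicates into a single conjunctive predicate."""
    return lambda x: pred_a(x) and pred_b(x)

even = lambda x: x % 2 == 0
div3 = lambda x: x % 3 == 0
fused = combine_filters(even, div3)

# One traversal instead of two:
result = [x for x in range(1, 20) if fused(x)]
# result == [6, 12, 18]
```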
Returning to FIG. 1, the instruction optimization system 100 may operate at runtime prior to execution. Instead of executing instructions immediately, execution may be deferred and instructions may be recorded and optimized. For clarity and ease of understanding, consider the following analogy. Assume that an individual (e.g., a person) is to add three dollar amounts (such as $2.50, $0.25, and $1.50) together. The individual may simply add the amounts together as provided (e.g., $2.50 + $0.25 = $2.75, $2.75 + $1.50 = $4.25). However, this calculation can be made easier by delaying it until all values to be added are obtained, reordering the values, and then performing the calculation. In particular, it is often easier to add half-dollar amounts (e.g., $0.50) than other fractions of a dollar (e.g., $0.75, $0.25). Thus, instead of performing an addition as each amount is seen, the values may simply be recorded. The amounts may then be reordered into $2.50, $1.50, and $0.25, and summed (e.g., $2.50 + $1.50 = $4.00, $4.00 + $0.25 = $4.25). The same result is obtained in a manner that is relatively easy to calculate. The instruction optimization system 100 may provide similar functionality for any machine-executable instruction.
Optimization via the optimization component 120 can also be performed at various levels of granularity. According to one embodiment, optimization may be performed over a small set of instructions (e.g., a peephole or window). For example, optimization may be triggered after each instruction is obtained, operating over the previous "N" adjacent instructions, where "N" is a positive integer. Additionally or alternatively, a more global approach may be employed in which optimization is performed over a large set of instructions. For example, optimization may be initiated immediately prior to execution, such as when results produced from the recorded instructions are requested. In one embodiment, simpler optimizations may be specified for execution over a small set of instructions, while more complex optimizations may be specified for execution over a larger set of instructions to leverage the aggregated knowledge about the instructions. Of course, this is not required. In practice, the optimizations are highly configurable so that a user can specify which optimizations to perform and when they should be performed.
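The peephole trigger described above can be sketched as a recorder that, after each instruction, re-optimizes only the trailing window of N entries. The window size, instruction encoding, and the cancellation pass used as the default optimization are illustrative assumptions.

```python
# Sketch of a windowed (peephole) trigger: after each instruction is
# recorded, an optimization pass runs over only the last N entries of the
# buffer, rather than over the whole recorded stream.

class PeepholeRecorder:
    def __init__(self, window=3, optimize=None):
        self.buffer = []
        self.window = window
        # Default pass: drop adjacent add/remove pairs that cancel.
        self.optimize = optimize or self._cancel_pairs

    @staticmethod
    def _cancel_pairs(instrs):
        out = []
        for ins in instrs:
            if out and ins[0] == "remove" and out[-1] == ("add", ins[1]):
                out.pop()
            else:
                out.append(ins)
        return out

    def record(self, instr):
        self.buffer.append(instr)
        # Re-optimize only the trailing window, keeping the head untouched.
        head, tail = self.buffer[:-self.window], self.buffer[-self.window:]
        self.buffer = head + self.optimize(tail)

r = PeepholeRecorder(window=2)
for ins in [("add", 1), ("add", 2), ("remove", 2)]:
    r.record(ins)
# r.buffer == [("add", 1)]: the add/remove pair for 2 was cancelled
# inside the two-instruction window.
```

Passing a different `optimize` callable corresponds to the configurability the text mentions: the user chooses both which pass runs and, via the window size, when it applies.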
The functionality provided by the instruction optimization system 100 may be implemented in a variety of different ways. In one example, dynamic dispatch may be utilized, where the result of an operation exposes an object (e.g., via virtual methods) with specialized behavior for successive operations. Similarly, a state machine may be employed, in which the acquisition of additional knowledge via instructions moves execution from one node to another based on that knowledge (in other words, state). Of course, these are just two contemplated implementations. Other implementations are possible and will be apparent to those skilled in the art.
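The state-machine idea can be sketched in a few lines: each recorded instruction moves the recorder to a node that knows which follow-up instructions are redundant. The two states and the "a second sort replaces the first" transition rule are illustrative assumptions (the redundant-sort rule itself appears later in the document's LINQ discussion).

```python
# Sketch of a state-machine implementation: seeing an instruction moves
# the recorder to a node encoding what is known so far. Here the "Sorted"
# node knows that a further sort makes the previous one redundant.

class Unsorted:
    def on_orderby(self, key):
        return Sorted(key)

class Sorted:
    def __init__(self, key):
        self.key = key

    def on_orderby(self, key):
        # Already in the sorted state: the previous sort is redundant,
        # so replace it rather than stacking another sorting pass.
        return Sorted(key)

state = Unsorted()
state = state.on_orderby("price")
state = state.on_orderby("name")
# Only the last sort key survives: state.key == "name"
```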
The instruction optimization system 100 can be used alone or in combination with other instruction optimization systems. More specifically, the instruction optimization system 100 may include a plurality of instruction optimization subsystems. As shown in FIG. 3, the instruction optimization system 100 may include two additional instruction optimization subsystems 310 and 320. The instruction optimization system 100 may delegate instructions to the subsystems 310 and 320 to, for example, allow parallel processing of instruction streams. Further, the instruction optimization subsystem 310 may delegate instruction optimization to another instruction optimization subsystem 312. In other words, the instruction optimization system is compositional and accordingly supports parallel as well as recursive processing and the like.
By way of example, and not limitation, the instructions may relate to graphics, and more particularly to rendering polygons. The instructions fed into the instruction optimization system 100 may be divided and distributed to the instruction optimization subsystems 310 and 320, which may, for example, render triangles. Thus, the execution of rendering a polygon may be virtualized in that it is divided into simpler operations, namely rendering multiple triangles that together form the polygon. Also, if it is determined that two polygons overlap, an optimization can be made that results in rendering only one polygon.
According to an exemplary embodiment, optimization may be performed for the query instructions (i.e., operators) that make up a query expression, such as, but not limited to, a language-integrated query (LINQ) expression. A query expression specified in a high-level language may benefit from an optimization strategy that works independently of the back-end query language targeted by the query provider (e.g., Transact-SQL).
At the local level, optimization may be performed on query operators, which are expressed as methods that implement the function of a specified operator (e.g., Select, Where). Moreover, semantic properties of the query operators can be utilized to assist in optimization. Consider the following query expression:
from x in xs where x % 2 == 0 where x % 3 == 0 select x + 1
The query expression is translated into three query operator method calls (e.g., Where, Where, Select). A naive implementation of these operators would result in three iterators being created and executed. All of these iterate separately over the source sequence: two perform filtering (e.g., using "if" statements) and the other performs projection (e.g., by producing a result that calls the selector function "x + 1"). To optimize this expression, the "Where" filters can be combined, and the "Select" projection can be performed as part of the same iteration, as follows:
foreach (var x in xs)
    if (x % 2 == 0 && x % 3 == 0)
        yield return x + 1;
Even though these optimizations work, many more local optimizations can be performed at the query operator level.
Note that the ability to compose queries can easily lead to suboptimal queries. In particular, nested query expressions can be written in an indirect manner, much like creating views in a database product:
Since queries are first-class objects that can be passed around, queries such as the above can reside in different places, with the result that adjacent query operators are not immediately apparent. For example, in the above, "cheapProducts" establishes an ascending sort on price, while "noncountTopTosy" applies another sort, actually making the former sort redundant.
Appendix A provides some exemplary properties that hold for at least some query operators. For the most part, these properties enable local optimizations and typically allow folding two operators into one. These and other optimizations can be achieved by using a virtual dispatch mechanism, where the result of a sequence operator exposes an object with specialized behavior applied to successive sequence operators. For example, the following example code shows how the result of an "OrderBy" call cooperates with "Where" and "OrderBy" operations that directly follow the call:
Each of these operators overrides a virtual method on the base class "Sequence<T>":
the sequence object holds its left-hand sequence object (e.g., the object to which the operator represented by the type applies). The subclasses may override the provided virtual query operator methods for local optimization. "Sequence < T >" implements "ieee numeric < T >" the implementation of which is provided via an abstract attribute called "Source". Here, the sequence operations may be rewritten in accordance with the LINQ query. There are different strategies for creating these "Sequence < T >" objects. For example, the "Sequence < T >" object may be utilized within an existing "IEnumerable < T >" extension method, or may allow a user to explicitly move into the "optimized Sequence" world, such as utilizing an extension method on "IEnumerable < T >".
FIG. 4 graphically illustrates how operators can be encoded as types and how the underlying "IEnumerable<T>" object can be adapted to reflect the optimization of the query operators being invoked. The example illustrates the use of "OrderBy" and "Where" clauses:
from x in source orderby k1 orderby k2 where f1 select x
The query expression is converted into:
from x in source where f1 orderby k2 select x
More specifically, at query execution time, the "source" 400 is encapsulated by its optimized version, "Sequence<T>" 410. Operations on the data can now be performed on "Sequence<T>" 410 using methods that override the virtual methods of the base class "Sequence<T>". When the first operator of the query expression, "OrderBy", is seen, the object "OrderedSequence<T>" 420 is generated, which captures the "OrderBy" query operator with respect to the first key selector "k1". When the second "OrderBy" operator is seen with respect to the second key selector "k2", another "OrderedSequence<T>" object 430 cancels the first ordering and replaces it with the second ordering. In other words, the query execution plan for two "OrderBy" operations includes only the last "OrderBy" operation, since the first ordering can be eliminated as redundant. Subsequently, when the "Where" operator is identified, "OrderedSequence<T>" 440 can exchange the order of filtering and sorting to potentially restrict the data set before sorting is performed. In other words, the query plan for two "OrderBy" operations followed by a "Where" is the "Where" operator followed by the second "OrderBy" operation.
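The virtual-dispatch scheme of FIG. 4 can be sketched in Python (the patent's own example is the C# type "Sequence<T>"; the class and method names below are illustrative, not the patent's API). Each operator returns a new object, and the sorted object's overridden methods rewrite later operators: a second sort replaces the first, and a filter is pushed below the sort.

```python
# Self-contained sketch of the FIG. 4 rewrite: operators are encoded as
# types, and overridden methods on the "ordered" type rewrite follow-up
# operators at query-build time.

class Sequence:
    """Wraps a source; operators return new Sequence objects lazily."""
    def __init__(self, source):
        self.source = source

    def where(self, pred):
        return FilteredSequence(self, pred)

    def order_by(self, key):
        return OrderedSequence(self, key)

    def to_list(self):          # terminal operation triggers execution
        return list(self.source)

class FilteredSequence(Sequence):
    def __init__(self, parent, pred):
        self.parent, self.pred = parent, pred

    def to_list(self):
        return [x for x in self.parent.to_list() if self.pred(x)]

class OrderedSequence(Sequence):
    def __init__(self, parent, key):
        self.parent, self.key = parent, key

    def order_by(self, key):
        # A second sort makes this one redundant: replace, don't stack.
        return OrderedSequence(self.parent, key)

    def where(self, pred):
        # Push the filter below the sort so sorting sees fewer elements.
        return OrderedSequence(self.parent.where(pred), self.key)

    def to_list(self):
        return sorted(self.parent.to_list(), key=self.key)

q = (Sequence([3, 1, 4, 1, 5, 9, 2, 6])
     .order_by(lambda x: -x)        # made redundant by the next sort
     .order_by(lambda x: x)
     .where(lambda x: x % 2 == 0))  # rewritten to run before the sort
# q.to_list() == [2, 4, 6]; the built plan is filter-then-sort.
```

Note that the rewriting happens while the query is being built, by ordinary method dispatch; the terminal `to_list` call merely executes the already-optimized plan.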
Note that optimization can be performed at query build/formulation time (e.g., after compilation but before execution) as a result of invoking the query operator methods. The technique can be used for various query Application Programming Interfaces (APIs), not just "IEnumerable<T>". In particular, the "Sequence<T>" layer is an abstraction over the rewriting of operations and their relative ordering. Simply substituting another type that supports similar operators as the "Source" property type would suffice to rewrite the operations applied to it. For example, the technique can be used to optimize query operators over "IEnumerable<T>" or "IObservable<T>", or their respective homoiconic counterparts "IQueryable<T>" and "IQbservable<T>". For the latter forms, the underlying query provider will be provided with a pre-optimized query in terms of high-level query operations.
To illustrate the general nature of the rewrite mechanism, a similar set of types may be constructed that is isomorphic to "System.String" operations, e.g., to eliminate conflicting operations (e.g., those that cancel each other).
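The string analogue can be sketched as a lazy wrapper that records operations, cancels inverse pairs as they are recorded, and runs the surviving operations only when a terminal operation fires. The tiny API (`reverse`/`upper`, matching Python naming rather than "System.String") and the single cancellation rule are illustrative assumptions.

```python
# Sketch of a lazy string wrapper isomorphic to (a fragment of) string
# operations: mutating operations are recorded, an operation cancels
# against its recorded inverse (reversing twice is a no-op), and a
# terminal operation triggers the remaining, optimized computation.

class LazyString:
    def __init__(self, value, ops=()):
        self.value, self.ops = value, tuple(ops)

    def _push(self, op):
        # Cancel against the most recently recorded inverse instead of
        # stacking the new operation.
        if self.ops and self.ops[-1] == op == "reverse":
            return LazyString(self.value, self.ops[:-1])
        return LazyString(self.value, self.ops + (op,))

    def reverse(self):
        return self._push("reverse")

    def upper(self):
        # Terminal operation: run the surviving ops, then convert.
        s = self.value
        for op in self.ops:
            if op == "reverse":
                s = s[::-1]
        return s.upper()

s = LazyString("abc").reverse().reverse()  # the pair cancels on record
# s.ops == () and s.upper() == "ABC"
```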
In fact, the general pattern is a lazy operation with a trigger that initiates the optimized computation. In the "String" example above, it is "ToUpper". In the "Sequence<T>" case, enumeration of the "Source" property triggers the optimized computation.
Moreover, the optimization mechanism of the present invention is advantageous for any immutable type. A problem with immutable types is that whenever something corresponding to a change needs to be done, a new "thing" (e.g., object, element) needs to be created. For example, there may be many instructions related to creating a new thing, deleting a thing, and creating another new thing. Instead of performing several allocations and deallocations for immutables, all changes can be recorded and then used to create only a single new immutable that captures all changes up to that point.
The above systems, architectures, environments, etc. have been described with reference to interaction between several components. It should be understood that such systems and components may include those components or sub-components specified therein, some of the specified components or sub-components, and/or additional components. Sub-components may also be implemented as components communicatively coupled to other components rather than included within parent components. Further, one or more components and/or sub-components may be combined into a single component providing aggregate functionality. Communication between systems, components, and/or sub-components may be implemented according to a push (push) and/or pull (pull) model. Each component may also interact with one or more other components not specifically described herein for the sake of brevity, but known by those of skill in the art.
Moreover, various portions of the above disclosed systems and the following methods may include or incorporate components, subcomponents, processes, devices, methods or mechanisms based on artificial intelligence, machine learning, or knowledge or rules (e.g., support vector machines, neural networks, expert systems, bayesian belief networks, fuzzy logic, data fusion engines, classifiers). Such components, and others, may automate certain mechanisms or processes performed thereby to make portions of the systems and methods more adaptive as well as efficient and intelligent. By way of example and not limitation, the instruction optimization system 100 can employ such mechanisms to determine or infer optimizations, e.g., from historical or contextual information.
In view of the exemplary systems described above, methodologies that may be implemented in accordance with the disclosed subject matter will be better appreciated with reference to the flow charts of fig. 5-6. While, for purposes of simplicity of explanation, the methodologies are shown and described as a series of blocks, it is to be understood and appreciated that the claimed subject matter is not limited by the order of the blocks, as some blocks may occur in different orders and/or concurrently with other blocks from what is depicted and described herein. Moreover, not all illustrated blocks may be required to implement the methodologies described hereinafter.
Referring to FIG. 5, a methodology 500 that facilitates instruction optimization is illustrated. At reference numeral 510, an instruction is identified, received, retrieved, or otherwise obtained or acquired. At reference numeral 520, the identified instruction is recorded; in other words, the instruction is remembered in some way but not executed. At reference numeral 530, a determination is made as to whether optimization should be performed with respect to the instruction set. Such a determination may be made based on an internal or external trigger. As an example, an internal trigger may be the identification of a particular number of instructions (e.g., optimize after identifying each instruction, optimize after identifying every three instructions). An external trigger may, for example, correspond to a request for data generated or manipulated by an instruction or, more specifically, by an operation specified by the instruction. If it is determined that optimization should be performed ("YES"), the method 500 continues at reference numeral 540, where the recorded set of instructions is optimized. However, if optimization is not desired ("NO"), the method continues at reference numeral 510, where another instruction is identified. It is to be appreciated that the method 500 can be lazy. In other words, rather than eagerly executing instructions as they are identified, instructions may continue to be collected, forming part of the collective knowledge that can be utilized for optimization, until execution is needed, e.g., to produce a result. Thus, method 500 may be referred to as performing immediate instruction optimization.
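The loop of method 500 can be sketched directly: obtain, record, check a trigger, optimize. The count-based trigger and the deduplicating optimizer below are illustrative assumptions standing in for whatever trigger and pass an embodiment configures.

```python
# Sketch of method 500: identify an instruction (510), record it without
# executing (520), check an internal/external trigger (530), and optimize
# the recorded set when the trigger fires (540).

def run_method_500(stream, should_optimize, optimize):
    recorded = []
    for instr in stream:                   # 510: obtain an instruction
        recorded.append(instr)             # 520: record, do not execute
        if should_optimize(recorded):      # 530: trigger check
            recorded = optimize(recorded)  # 540: optimize recorded set
    return recorded

# Illustrative trigger: every three recorded instructions.
every_three = lambda recorded: len(recorded) % 3 == 0
# Illustrative pass: drop duplicate instructions, keeping first occurrence.
dedupe = lambda recorded: list(dict.fromkeys(recorded))

out = run_method_500(["a", "b", "a"], every_three, dedupe)
# out == ["a", "b"]: the duplicate was removed when the trigger fired.
```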
For clarity and understanding of various aspects related to the claimed subject matter, consider the following real-world comparison. Suppose a homeowner instructs a contractor to paint his house and fit a new window during the two weeks the homeowner is on vacation. The contractor may rush out the next day to paint the house and install the new window. Alternatively, the contractor may simply note the instructions to paint the house and install the window, and complete all the work just before the homeowner returns. In other words, the contractor may virtualize the instructions. The homeowner may assume the work starts shortly after the instruction, but it makes no difference to the homeowner when the work is performed (e.g., or how and by whom it is performed). However, in the latter case, the efficiency of work execution may be optimized. For example, assume that the homeowner calls during his vacation and changes the paint color of the house. If the contractor has already painted the house, he will have to repaint it with the new color, i.e., double the work. However, if he has not started, he can simply change the previously noted color and use the new color when he paints the house the first and only time. Further, suppose the homeowner adds an additional task such as repairing the roof. If the contractor has already completed the work, he may have removed all his tools from the work site and thus need to bring them back to repair the roof. However, if the contractor performs the work lazily, he may simply note the added task on his list. The contractor may then wait and complete all the work just before the homeowner returns from vacation. Of course, if the homeowner decides to return home earlier, this may also trigger the contractor to complete all the tasks on the list. Further, it will be appreciated that the contractor need not perform all tasks in person; rather, the contractor may delegate work to one or more subcontractors, who may perform similar lazy optimizations.
FIG. 6 is a flow diagram of a method 600 of enabling runtime instruction optimization. At reference numeral 610, a computer program can be analyzed. For example, the source code may be analyzed during the compilation process. At reference numeral 620, code can be injected into the program, based on the analysis, to support runtime optimization as previously described herein. For example, code that utilizes special types and virtual dispatch to achieve optimization can be injected into or linked to the program. Alternatively, code specifying a state machine, e.g., encoding optimization techniques based on algebraic properties, may be injected into or linked to the program. For example, a runtime library that modifies an existing instruction implementation may be employed.
As used herein, the terms "component," "system," and "engine," as well as forms thereof, refer to a computer-related entity, either hardware, a combination of hardware and software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, an object, an instance, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computer and the computer can be a component. One or more components may reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers.
The word "exemplary" or various forms thereof is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Furthermore, the examples are provided solely for purposes of clarity and understanding and are not meant to limit or restrict the claimed subject matter or relevant portions of the invention in any way. It will be appreciated that a number of additional or alternative examples of varying scope could have been presented, but have been omitted for purposes of brevity.
As used herein, the terms "infer" and "inference" refer generally to the process of reasoning about or inferring states of a system, environment, and/or user from a set of observations as captured via events and/or data. Inference can be employed to identify a specific context or action, or can generate a probability distribution over states, for example. The inference can be probabilistic; that is, the computation of a probability distribution over states of interest based on a consideration of data and events. Inference can also refer to techniques employed for composing higher-level events from a set of events and/or data. Such inference results in the construction of new events or actions from a set of observed events and/or stored event data, whether or not the events are correlated in close temporal proximity, and whether the events and data come from one or several event and data sources. Various classification schemes and/or systems (e.g., support vector machines, neural networks, expert systems, Bayesian belief networks, fuzzy logic, data fusion engines) may be employed to perform automated and/or inferred actions in connection with the claimed subject matter.
Furthermore, to the extent that the terms "includes," "including," "has," "contains," or other variants thereof are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim.
In order to provide a context for the claimed subject matter, FIG. 7 as well as the following discussion are intended to provide a brief, general description of a suitable environment in which the various aspects of the subject matter can be implemented. However, the suitable environment is only an example and is not intended to suggest any limitation as to scope of use or functionality.
While the above disclosed systems and methods can be described in the general context of computer-executable instructions of a program that runs on one or more computers, those skilled in the art will recognize that aspects also may be implemented in combination with other program modules and the like. Generally, program modules include routines, programs, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Moreover, those skilled in the art will appreciate that the above-described systems and methods can be practiced with various computer system configurations, including single-processor, multiprocessor or multi-core processor computer systems, minicomputers, mainframe computers, as well as personal computers, hand-held computing devices (e.g., Personal Digital Assistants (PDAs), telephones, watches), microprocessor-based or programmable consumer or industrial electronic devices, and the like. Aspects may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. However, some, if not all, aspects of the claimed subject matter can be practiced on stand-alone computers. In a distributed computing environment, program modules may be located in one or both of local and remote memory storage devices.
Referring to FIG. 7, an example general purpose computer 710 or computing device (e.g., desktop, laptop, server, handheld device, programmable consumer or industrial electronics, set-top box, gaming system…) is illustrated. The computer 710 includes one or more processors 720, memory 730, a system bus 740, mass storage 750, and one or more interface components 770. The system bus 740 communicatively couples at least the above-described system components. However, it is to be appreciated that in its simplest form, the computer 710 may include one or more processors 720 coupled to memory 730, the one or more processors 720 executing various computer-executable acts, instructions, and/or components stored in the memory 730.
Processor 720 may be implemented with a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any processor, controller, microcontroller, or state machine. Processor 720 may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, a multi-core processor, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
Computer 710 may include or otherwise interact with a variety of computer-readable media to facilitate controlling computer 710 to implement one or more aspects of the claimed subject matter. Computer readable media can be any available media that can be accessed by computer 710 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media.
Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, memory devices (e.g., Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM)…), magnetic storage devices (e.g., hard disk, floppy disk, magnetic cassettes, magnetic tape…), optical disks (e.g., Compact Disk (CD), Digital Versatile Disk (DVD)…), and solid state devices (e.g., Solid State Drive (SSD), flash memory drive (e.g., card, stick, key drive…)…), or any other medium which can be used to store the desired information and which can be accessed by computer 710.
Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
Memory 730 and mass storage 750 are examples of computer-readable storage media. Depending on the exact configuration and type of computing device, memory 730 may be volatile (such as RAM), non-volatile (such as ROM, flash memory…), or some combination of the two. By way of example, the basic input/output system (BIOS), containing the basic routines to transfer information between elements within the computer 710, such as during start-up, may be stored in a non-volatile memory, while a volatile memory may act as an external cache memory to facilitate processing by the processor 720, and the like.
Mass storage 750 includes removable/non-removable, volatile/nonvolatile computer storage media for storage of large amounts of data relative to memory 730. For example, mass storage 750 includes, but is not limited to, one or more devices such as a magnetic or optical disk drive, floppy disk drive, flash memory, solid state drive, or memory stick.
Memory 730 and mass storage 750 may include or have stored therein operating system 760, one or more applications 762, one or more program modules 764, and data 766. Operating system 760 acts to control and allocate resources of the computer 710. Applications 762 include one or both of system and application software and may utilize management of resources by operating system 760 to perform one or more actions through program modules 764 and data 766 stored in memory 730 and/or mass storage 750. Thus, the application 762 may turn a general-purpose computer 710 into a special-purpose machine in accordance with the logic provided thereby.
All or portions of the claimed subject matter can be implemented using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof to control a computer to implement the disclosed functionality. By way of example and not limitation, instruction optimization system 100 or a portion thereof can be an application 762 or form a part of an application 762, and includes one or more modules 764 stored in memory and/or mass storage 750 and data 766, the functionality of which can be implemented when executed by one or more processors 720.
According to a particular embodiment, processor 720 may correspond to a system on a chip (SOC) or similar architecture that includes, or in other words integrates, hardware and software on a single integrated circuit die. Here, processor 720 can include one or more processors and memory at least similar to processor 720 and memory 730. Conventional processors include a minimal amount of hardware and software and rely extensively on external hardware and software. By contrast, an SOC implementation of a processor is more powerful because it embeds hardware and software therein to enable specific functions with minimal or no reliance on external hardware and software. For example, the instruction optimization system 100 and/or associated functionality may be embedded within hardware in a SOC architecture.
The computer 710 also includes one or more interface components 770 that are communicatively coupled to the system bus 740 and facilitate interaction with the computer 710. By way of example, the interface component 770 can be a port (e.g., serial, parallel, PCMCIA, USB, FireWire…) or an interface card (e.g., voice, video…), among others. In one example implementation, the interface component 770 may be embodied as a user input/output interface that enables a user to enter commands and information into the computer 710 through one or more input devices (e.g., a pointing device such as a mouse, trackball, stylus, touch pad, keyboard, microphone, joystick, game pad, satellite dish, scanner, camera, other computer…). In another example implementation, the interface component 770 may be embodied as an output peripheral interface that provides output to a display (e.g., CRT, LCD, plasma…), speakers, printer, and/or other computer or the like. Further, the interface component 770 may be embodied as a network interface that enables communication with other computing devices (not shown), such as over wired or wireless communication links.
What has been described above includes examples of aspects of the claimed subject matter. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the claimed subject matter, but one of ordinary skill in the art may recognize that many further combinations and permutations of the claimed subject matter are possible. Accordingly, the disclosed subject matter is intended to embrace all such alterations, modifications and variations that fall within the spirit and scope of the appended claims.
Appendix A
● Adjacent Where filters: xs.Where(f1).Where(f2) == xs.Where(x => f1(x) && f2(x))
○ Also for derived filtering operators such as OfType
● Adjacent Select projections: xs.Select(p1).Select(p2) == xs.Select(x => p2(p1(x)))
○ Also for derived operators such as Cast
○ Similarly for Zip, assuming an n-ary selector overload exists
● Idempotence of Distinct: xs.Distinct().Distinct() == xs.Distinct()
● Elimination of redundant ordering: xs.OrderBy(k1){.ThenBy(...)}*.OrderBy(k2) == xs.OrderBy(k2)
○ Also holds for the descending variants of these operators
● Commutativity of ordering and filtering: xs.OrderBy(k1).Where(f1) == xs.Where(f1).OrderBy(k1)
○ Rationale: ordering the reduced sequence is cheaper
● N-ary operator flattening:
○ xs.Concat(ys) == EnumerableEx.Concat(xs, ys)
○ EnumerableEx.Concat(xs, ys).Concat(zs) == EnumerableEx.Concat(xs, ys, zs)
○ Also for similar operators with associative properties, such as Union
● Constant propagation:
○ Empty()
○ Similar remarks hold for Return (e.g., with Select) and Throw (unless followed by Catch)
● Cancellation of self-inverse operators:
○ xs.Reverse().Reverse() == xs
○ Holds only when the input sequence is not infinite
● Skip and Take (for m >= 0, n >= 0):
○ xs.Take(m).Take(n) == xs.Take(Math.Min(m, n))
○ xs.Take(m).Skip(n) (where m > n) == xs.Skip(n).Take(m - n)
○ xs.Skip(m).Skip(n) == xs.Skip(m + n)
● Reduction of intermediate allocations:
○ xs.ToArray().ToArray() == xs.ToArray()
○ xs.ToList().ToList() == xs.ToList()
○ In general, a subsequent To operator eliminates the need for the preceding To operator
● Pushing intermediate allocations down:
○ xs.To[Array|List]().[Where|Select|...] == xs.[Where|Select|...].To[Array|List]()
○ Holds only when the input sequence is not infinite
● Reverse combined with OrderBy flips the ordering direction:
○ xs.OrderBy(k1).Reverse() == xs.OrderByDescending(k1)
○ xs.OrderByDescending(k1).Reverse() == xs.OrderBy(k1)
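Taken together, the identities above can be applied mechanically to a recorded operator pipeline. The following is a minimal, illustrative sketch (not the patented implementation): it models a recorded query as a list of (operator, argument) pairs using Python stand-ins for a few of the LINQ operators named above, and rewrites adjacent pairs until a fixed point is reached. The `optimize` and `run` helper names are hypothetical.

```python
# Illustrative sketch only: a recorded query is a list of (operator, argument)
# pairs, e.g. [("Where", f1), ("Take", 5)]. Operator names mirror the LINQ
# operators above; the optimizer is a hypothetical stand-in.

def optimize(ops):
    """Rewrite adjacent operator pairs until a fixed point is reached."""
    changed = True
    while changed:
        changed = False
        out, i = [], 0
        while i < len(ops):
            if i + 1 < len(ops):
                (op1, a1), (op2, a2) = ops[i], ops[i + 1]
                if op1 == op2 == "Where":      # adjacent filters: fuse into a conjunction
                    out.append(("Where", lambda x, f1=a1, f2=a2: f1(x) and f2(x)))
                    i, changed = i + 2, True
                    continue
                if op1 == op2 == "Select":     # adjacent projections: compose selectors
                    out.append(("Select", lambda x, p1=a1, p2=a2: p2(p1(x))))
                    i, changed = i + 2, True
                    continue
                if op1 == op2 == "Distinct":   # Distinct is idempotent
                    out.append(("Distinct", None))
                    i, changed = i + 2, True
                    continue
                if op1 == op2 == "Take":       # Take(m).Take(n) == Take(min(m, n))
                    out.append(("Take", min(a1, a2)))
                    i, changed = i + 2, True
                    continue
                if op1 == op2 == "Skip":       # Skip(m).Skip(n) == Skip(m + n)
                    out.append(("Skip", a1 + a2))
                    i, changed = i + 2, True
                    continue
            out.append(ops[i])
            i += 1
        ops = out
    return ops

def run(ops, xs):
    """Interpret a (possibly optimized) pipeline over a finite sequence."""
    for op, a in ops:
        if op == "Where":
            xs = [x for x in xs if a(x)]
        elif op == "Select":
            xs = [a(x) for x in xs]
        elif op == "Distinct":
            xs = list(dict.fromkeys(xs))
        elif op == "Take":
            xs = list(xs)[:a]
        elif op == "Skip":
            xs = list(xs)[a:]
    return xs

pipeline = [("Where", lambda x: x % 2 == 0),
            ("Where", lambda x: x > 2),
            ("Take", 5),
            ("Take", 3)]
opt = optimize(pipeline)
print(len(opt))             # 2: one fused Where, one fused Take
print(run(opt, range(10)))  # [4, 6, 8]
```

Because both sides of each identity yield the same results, such rewrites can be applied to the recorded instructions at runtime, independent of any target data source query language, as the claims below describe.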

Claims (10)

1. A method of optimizing instructions, comprising:
employing at least one processor (720) configured to execute computer-executable instructions stored in a memory (730) to perform the following acts:
recording a stream of instructions designated for execution, the instructions comprising language-integrated query expressions; and
prior to executing the recorded query instructions, optimizing the recorded query instructions at runtime, the optimizing being independent of a target data source query language, wherein the optimizing is triggered when a determined number of instructions have been recorded and/or upon a request for the results produced by the recorded instructions.
2. The method of claim 1, wherein the recorded query instructions are incrementally optimized when adding instructions to the recorded query instructions.
3. The method of claim 2, wherein the recorded query instructions are globally optimized.
4. The method of claim 1, wherein the recorded query instructions are globally optimized.
5. The method of claim 1, further comprising deferring optimization of the recorded query instructions until results produced by execution of the instructions are requested.
6. An instruction optimization system (100), comprising:
a processor (720) coupled to a memory (730), the processor (720) configured to execute the following computer-executable components stored in the memory (730):
a first component (110) configured to record a query instruction specified for execution, the query instruction comprising a language-integrated query expression; and
a second component (120) configured to optimize the recorded instructions independent of a target data store query language, wherein the first and second components operate at runtime prior to execution of the recorded instructions, and wherein the optimization is triggered when a determined number of instructions have been recorded and/or upon a request for the results produced by the recorded instructions.
7. The system of claim 6, wherein the second component is configured to incrementally optimize the recorded instructions as they are recorded.
8. The system of claim 7, wherein the second component is configured to globally optimize the recorded instructions.
9. The system of claim 6, wherein the second component is configured to globally optimize the recorded instructions when one or more of the recorded instructions are to be executed.
10. The system of claim 6, wherein the second component is configured to defer optimization of the recorded instructions until results produced by execution of the instructions are requested.
HK13100633.8A 2010-12-13 2013-01-15 Instruction optimization HK1173789B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US12/966,536 2010-12-13
US12/966,536 US20120151187A1 (en) 2010-12-13 2010-12-13 Instruction optimization

Publications (2)

Publication Number Publication Date
HK1173789A1 HK1173789A1 (en) 2013-05-24
HK1173789B true HK1173789B (en) 2017-02-03
