CN114428639A - Instruction reduction method and system for bytecode instruction set

Info

Publication number: CN114428639A
Application number: CN202111600610.1A
Authority: CN
Inventors: 石玉平; 郑江东; 王幼君
Original assignee: Beijing Watchdata Co ltd
Current assignee: Beijing Watchdata Co ltd
Priority date: 2021-12-24
Filing date: 2021-12-24
Publication date: 2022-05-03
Anticipated expiration: 2041-12-24
Also published as: CN114428639B

Abstract

The invention relates to an instruction reduction method and system for a bytecode instruction set. The method can: optimize an instruction that assigns a constant to a register, together with an instruction that uses that register as an operand, into a single instruction with a constant operand, or directly generate such an instruction; convert instructions that access an instance field of the current instance into dedicated current-instance field instructions, or generate those directly; encode the sub-constant pool index operand of an instruction immediately adjacent to the opcode, so that the type of the sub-constant pool can be determined from the opcode; separate the variable-length operands of variable-length instructions from the instruction code; generate pseudo-instructions for static field initialization, which replace the traditional static field component; optimize instructions that use registers repeatedly into instructions that reuse a register; and combine a group of functionally similar, infrequently used instructions into one instruction. The method can reduce the length of the bytecode and improve its execution performance.

Description

Instruction reduction method and system for bytecode instruction set

Technical Field

The invention belongs to the technical field of instruction set reduction, and in particular relates to an instruction reduction method and system for a bytecode instruction set.

Background

A virtual machine is an abstract computer produced by a software application or instruction sequence executed by a processor. A virtual machine executes the instruction set it supports, and instruction sets fall into two classes: operand-stack-based and register-based. Compared with a register-based instruction set, an operand-stack-based instruction set uses a stack frame structure: method calls involve pushing and popping, and bytecode execution also involves pushing to and popping from the operand stack. Although its RAM overhead is small, its execution efficiency is low.

Summary of the Invention

In view of the defects in the prior art, the purpose of the invention is to provide an instruction reduction method and system for a register-based bytecode instruction set. The method can optimize an instruction in the initial instructions that assigns a constant to a register, together with an instruction that uses that register as an operand, into a reduced instruction, or directly generate a reduced instruction with a constant; convert instructions that access an instance field of the current instance into dedicated current-instance field instructions; encode the constant pool index operand of an instruction at the position immediately adjacent to the opcode; separate the variable-length operands of each instruction from the instruction code; generate pseudo-instructions for static field initialization, which can replace the traditional static field component; optimize instructions that use registers repeatedly into instructions that reuse a register; and combine a group of functionally similar, infrequently used instructions into one instruction. These steps reduce the length of the bytecode and improve its execution performance.

To achieve the above purpose, the technical solution adopted by the invention is as follows: optimize an instruction in the initial instructions that assigns a constant to a register, together with an instruction that uses that register as an operand, to obtain a reduced instruction that uses the constant operand directly, or directly generate a reduced instruction that uses a constant operand; convert, according to a preset instruction format, each instruction in the initial instructions that accesses an instance field and whose instance is held in the register corresponding to the method's this parameter, to obtain a dedicated instruction for accessing an instance field of the current instance; split the constant pool into multiple sub-constant pools, so that the constant pool index in an instruction is a sub-constant pool index, which allows the instruction to use a shorter index encoding; encode the constant pool index operand of each initial instruction at the position immediately adjacent to the instruction's opcode, so that the opcode of the instruction using a constant pool index can be found from the position of the index, and the type of the sub-constant pool can be determined from the opcode; separate the variable-length operands in the initial instructions from the instruction code; obtain the static field type and initial configured value of each static field in the initial instructions and generate a pseudo-instruction for each static field, where the pseudo-instructions can replace the traditional static field component; optimize instructions that use registers repeatedly to obtain instructions that reuse a register; and combine a group of functionally similar instructions into one instruction, using an operation-type operand in the instruction to specify the operation type of the combined instruction, to obtain the corresponding compound multi-operation instruction.

Further, the instructions covered by this constant-operand optimization include arithmetic and logical operation instructions, compare-with-literal-and-jump instructions, and array member access instructions.

Further, converting, according to the preset instruction format, the instructions in the initial instructions that access an instance field of the object held in the register corresponding to the method's this parameter, to obtain the dedicated current-instance field instructions, includes: deleting from such an instruction, according to the preset instruction format, the operand that specifies the instance, to obtain a dedicated instruction for accessing an instance field of the current instance. The dedicated instruction can be optimized further by letting the register that is to be set, or that receives the returned result, be implied by default.

Further, encoding the constant pool index operand of each initial instruction immediately adjacent to the instruction's opcode includes: uniformly encoding the constant pool index operand of each instruction directly after the instruction code; or uniformly encoding the constant pool index operand of each instruction directly before the instruction code.

An instruction reduction system for a bytecode instruction set includes the following devices: a first merging device, for optimizing an instruction in the initial instructions that assigns a constant to a register together with an instruction that uses the register as an operand, to obtain a reduced instruction that uses the constant operand directly, or for directly generating a reduced instruction that uses a constant operand; an instruction conversion device, for converting, according to a preset instruction format, instructions in the initial instructions that access an instance field of the object held in the register corresponding to the method's this parameter, to obtain dedicated instructions for accessing an instance field of the current instance; a constant pool splitting device, for splitting the constant pool into multiple sub-constant pools, so that the constant pool index in an instruction is a sub-constant pool index; a constant pool index operand encoding device, for encoding the constant pool index operand of each initial instruction immediately adjacent to the instruction's opcode, so that the opcode of the instruction using a constant pool index can be found from the position of the index and the type of the sub-constant pool can be determined from the opcode; an operand separation device, for separating the variable-length operands of each initial instruction from the instruction code; a pseudo-instruction generation device, for obtaining the static field type and initial configured value of each static field in the initial instructions and generating a pseudo-instruction for each static field; a second merging device, for optimizing instructions that use registers repeatedly to obtain instructions that reuse a register; and a compound multi-operation instruction obtaining device, for combining a group of functionally similar instructions into one instruction and using an operation-type operand in the instruction to specify the operation type of the combined instruction, to obtain the corresponding compound multi-operation instruction.

Further, the instruction conversion device includes a register operand deletion unit, for deleting, according to the preset instruction format, the operand that specifies the instance from each instruction that accesses an instance field of the object held in the register corresponding to the method's this parameter, to obtain a dedicated instruction for accessing an instance field of the current instance. The dedicated instruction can be optimized further by letting the register that is to be set, or that receives the returned result, be implied by default.

Further, the constant pool index operand encoding device includes a first encoding unit or a second encoding unit: the first encoding unit uniformly encodes the constant pool index operand of each instruction directly after the instruction code; the second encoding unit uniformly encodes the constant pool index operand of each instruction directly before the instruction code.

The effect of the invention is as follows. With the method of the invention, an instruction in the initial instructions that assigns a constant to a register and an instruction that uses the register as an operand can be optimized into a reduced instruction, or a reduced instruction with a constant can be generated directly; instructions that access an instance field of the current instance are converted into dedicated current-instance field instructions; the constant pool index operand of an instruction is encoded immediately adjacent to the opcode; the variable-length operands of each instruction are separated from the instruction code; pseudo-instructions for static field initialization are generated; instruction length is adjusted according to configuration items; instructions that use registers repeatedly are optimized into instructions that reuse a register; and a group of functionally similar instructions is combined into one instruction. Reducing the initial instructions through these steps shortens the bytecode and improves its execution performance.

Brief Description of the Drawings

FIG. 1 is a flowchart of the method of the invention;

FIG. 2 is a structural diagram of the system of the invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings and specific embodiments.

As shown in FIG. 1, an instruction reduction method for a bytecode instruction set is applied in a user terminal and is executed by application software installed on that terminal. The user terminal is a terminal device that executes the method to convert the initial code instructions, such as a desktop computer, laptop, tablet, or mobile phone. The method includes the following steps:

S110. Optimize the instruction that assigns a constant to a register together with the instruction that uses that register as an operand, to obtain a reduced instruction that uses the constant operand directly, or directly generate a reduced instruction that uses a constant operand.

The initial instructions consist of multiple instructions, each made up of an opcode and operands. The opcode specifies the operation to perform. An instruction may contain one or more operands, or none; an operand is a value the instruction operates on.

The rules for the instruction optimization in this embodiment can be configured in the bytecode instruction set. The bytecode instruction set may be a register-based virtual machine instruction set intended for resource-constrained devices such as smart SEs and secure MCU chips. It supports object-oriented, architecture-neutral programs and is designed specifically for the virtual machines of such devices. Keeping the bytecode as small as possible not only reduces the chip's persistent-storage requirement but also improves code execution efficiency.

Specifically, the registers in this embodiment are 16 bits wide, so one basic data unit in an instruction denotes one 16-bit register. Values of type boolean, byte, short, reference, and returnAddress each occupy one basic data unit, that is, a single register; a value of type int occupies two basic data units and is represented by a pair of consecutively numbered registers.
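The register layout described above can be sketched as follows. This is a minimal illustration only: the names `REGISTER_UNITS` and `split_int` are hypothetical, and placing the high half of an int in the lower-numbered register is an assumption, since the text does not say which register of the pair holds which half.

```python
# Number of 16-bit registers (basic data units) each type occupies,
# as listed in the paragraph above.
REGISTER_UNITS = {
    "boolean": 1,
    "byte": 1,
    "short": 1,
    "reference": 1,
    "returnAddress": 1,
    "int": 2,  # needs a pair of consecutively numbered registers
}


def split_int(value: int) -> tuple:
    """Split a 32-bit int into two 16-bit register values.

    Assumption: the first element goes to register n (high half),
    the second to register n + 1 (low half).
    """
    value &= 0xFFFFFFFF  # keep only the low 32 bits (two's complement)
    return (value >> 16) & 0xFFFF, value & 0xFFFF
```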

An instruction that assigns a constant to a specified register and an instruction that uses that register as an operand can be optimized into a single instruction that uses the constant operand directly, or a reduced instruction that uses a constant operand can be generated directly. In one embodiment, the instructions covered by this optimization include arithmetic and logical operation instructions, compare-with-literal-and-jump instructions, and array member access instructions.

Taking the add operation as an example of the merging process, the following two unoptimized instructions can be optimized into one reduced instruction:

const-c4 rC C: assigns the constant C to register rC (an instruction that assigns a constant to a specified register);

add rA rA rC: adds the values in registers rA and rC and assigns the result to register rA (an instruction that uses registers as operands).

After merging, these reduce to add-c4 rA C, which adds the constant C to the value of register rA and assigns the result to register rA (an instruction that uses a constant operand directly). Other instructions are merged and reduced in a similar way and are not described one by one.
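The merge just described can be sketched as a small peephole pass. This is an illustrative sketch, not the patent's implementation: the tuple representation of instructions and the function name `fold_constants` are hypothetical, and the pass simply assumes that rC is not needed after the add and that the constant fits the -c4 operand (a real pass would check both).

```python
from typing import List, Tuple

Instr = Tuple  # (mnemonic, operand, ...), a hypothetical in-memory form


def fold_constants(code: List[Instr]) -> List[Instr]:
    """Merge `const-c4 rC C` + `add rA rA rC` into `add-c4 rA C`.

    Assumptions (not checked here): rC is dead after the add, and C is
    small enough for the -c4 constant operand.
    """
    out: List[Instr] = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None
        if (
            nxt is not None
            and cur[0] == "const-c4"
            and nxt[0] == "add"
            and nxt[1] == nxt[2]   # destination is also the first source (rA)
            and nxt[3] == cur[1]   # second source is the constant's register (rC)
            and nxt[1] != cur[1]   # rA and rC are different registers
        ):
            out.append(("add-c4", nxt[1], cur[2]))
            i += 2  # both original instructions are consumed
        else:
            out.append(cur)
            i += 1
    return out
```

For example, `fold_constants([("const-c4", "r3", 5), ("add", "r1", "r1", "r3")])` yields the single instruction `("add-c4", "r1", 5)`.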

The arithmetic and logical operation instructions are shown in Table 1.

Table 1

Instruction mnemonic     Instruction encoding     Function
add-c4 rA B              op A|B                   rA = rA + B
add-c8 rAA rBB CC        op AA BB CC              rAA = rBB + CC
sub-c8 rAA rBB CC        op AA BB CC              rAA = rBB - CC
and-c8 rAA rBB CC        op AA BB CC              rAA = rBB & CC
or-c8 rAA rBB CC         op AA BB CC              rAA = rBB | CC
xor-c8 rAA rBB CC        op AA BB CC              rAA = rBB ^ CC

The compare-with-literal-and-jump instructions are shown in Table 2.

Table 2

The array member access instructions are shown in Table 3.

Table 3

Such an instruction either specifies a constant index value, or specifies a constant index value and assigns a specified constant value. Because the reduced instructions obtained in this way need no separate instruction to assign the constant to a register, the bytecode is shorter; and because execution does not have to read the constant from a register, execution performance improves.

S120. Convert, according to the preset instruction format, the instructions in the initial instructions that access an instance field of the object held in the register corresponding to the method's this parameter, to obtain dedicated instructions for accessing an instance field of the current instance.

Specifically, the operand that specifies the instance is deleted, according to the preset instruction format, from each instruction that accesses an instance field of the object held in the register corresponding to the method's this parameter, giving a dedicated instruction for accessing an instance field of the current instance. This dedicated instruction can be optimized further by letting the register that is to be set, or that receives the returned result, be implied by default.

Consider the general-format instruction for accessing an instance field, taking the reading of an object-typed instance field as an example: getfield-o rAA rBB CCcpi is encoded as op AA BB CC and is 4 bytes long, where rAA receives the returned field value, rBB holds the object whose instance field is accessed, and CC is the constant pool index of the instance field, which identifies the particular field in the object. If the object being accessed is in the register corresponding to the this parameter, that is, the register of the method's first parameter, a new instruction format can be designed that omits the rBB operand (the current instance designated by the method's this parameter is implied). The instruction then has one register operand fewer, the bytecode is one byte shorter, and execution saves one register read, which improves performance. If the instruction code also implies the result register by default, a further register operand is saved and the bytecode becomes shorter still.
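The saving from dropping the object operand can be sketched as an encoder that picks the shorter form when the object register is the this register. This is a sketch under stated assumptions: the opcode values, the byte order of the short form, the function name `encode_getfield_o`, and the choice of register 0 for this are all hypothetical; the actual formats are the ones given in Table 4.

```python
OP_GETFIELD_O = 0x50       # hypothetical opcode for the general form
OP_GETFIELD_O_THIS = 0x51  # hypothetical opcode for the current-instance form


def encode_getfield_o(dst: int, obj: int, cp_index: int, this_reg: int = 0) -> bytes:
    """Encode a read of an object-typed instance field.

    General form:          op AA BB CC  (4 bytes)
    Current-instance form: op AA CC     (3 bytes, object register implied)
    """
    if obj == this_reg:
        # The object is the current instance, so the rBB operand is omitted.
        return bytes([OP_GETFIELD_O_THIS, dst, cp_index])
    return bytes([OP_GETFIELD_O, dst, obj, cp_index])
```

With these assumptions, reading a field of the current instance takes 3 bytes instead of 4.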

The dedicated instructions for accessing an instance field of the current instance are shown in Table 4.

Table 4

S130. Split the constant pool into multiple sub-constant pools, so that the constant pool index in an instruction is a sub-constant pool index.

The constant pool described in the JavaCard virtual machine specification covers six reference types: class references, instance field references, static field references, virtual method references, super method references, and static method references. In that specification the six reference types are placed together in a single constant pool. Here the constant pool is instead split into six sub-constant pools, each with its own sub-constant pool index.

To allow a single-byte constant pool index in instructions, the invention introduces the concept of sub-constant pools: each reference type is placed in its own sub-constant pool, and the six sub-constant pools together make up the overall constant pool. The order in which the sub-constant pools are laid out does not matter. The constant pool index in each class of instruction is then an index into the corresponding sub-constant pool.
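Splitting one pool into per-type sub-pools can be sketched as below. The entry representation `(kind, payload)`, the kind names, and the function `split_constant_pool` are hypothetical; the sketch only shows why indices get smaller, since each sub-pool numbers its own entries from zero.

```python
from typing import Dict, List, Tuple

# The six reference types named in the text (identifiers are illustrative).
KINDS = (
    "class",
    "instance_field",
    "static_field",
    "virtual_method",
    "super_method",
    "static_method",
)


def split_constant_pool(
    entries: List[Tuple[str, str]]
) -> Tuple[Dict[str, List[str]], List[Tuple[str, int]]]:
    """Split a single constant pool into one sub-pool per reference type.

    Returns the sub-pools and a remap table where remap[old_index] is
    (kind, index within that kind's sub-pool).
    """
    sub_pools: Dict[str, List[str]] = {kind: [] for kind in KINDS}
    remap: List[Tuple[str, int]] = []
    for kind, payload in entries:
        pool = sub_pools[kind]
        remap.append((kind, len(pool)))  # new index is local to the sub-pool
        pool.append(payload)
    return sub_pools, remap
```

Because an instruction's opcode already implies which kind of reference it uses, only the local index needs to be stored in the instruction.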

S140. Encode the constant pool index operand of each initial instruction immediately adjacent to the instruction's opcode, so that the opcode of the instruction using a constant pool index can be found from the position of the index, and the type of the sub-constant pool can be determined from the opcode.

The encoding position of the constant pool index is fixed, so that the instruction's opcode can be located from that position and the type of the constant pool can then be determined. Instructions for instance creation, instance type conversion, static field access, instance field access, method invocation, and so on all have a constant pool index operand. By encoding that operand immediately adjacent to the opcode (after or before it, depending on the endianness of the instruction encoding), the position of the instruction code can be derived from the position of the index; analyzing the instruction code gives the specific operation type of the instruction, which in turn determines the type of sub-constant pool being accessed.

Because this embodiment splits the constant pool into multiple sub-constant pools, a problem arises when a method's bytecode is linked during application installation: knowing only the position of a constant pool index in an instruction, it is impossible to tell which sub-constant pool the index refers to. If the sub-constant pool index is required to be placed directly after the instruction code (or directly before it), the position of the instruction code is easily found; analyzing the instruction code gives the specific operation type of the instruction, which determines the type of sub-constant pool being accessed. That is, in this application the constant pool index operand of each instruction can be uniformly encoded directly after the instruction code, or uniformly encoded directly before it.

Taking a field access instruction as an example: putfield-b rAA rBB CC must be encoded in the form op CC AA BB, where CC is the constant pool index and must be placed directly after op; the relative order of AA and BB does not matter.
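The link-time lookup this convention enables can be sketched as follows, assuming the index-directly-after-opcode layout. The opcode values, the table `POOL_BY_OPCODE`, and the function `pool_kind_at` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical opcode values mapped to the sub-constant pool they index.
POOL_BY_OPCODE = {
    0x60: "instance_field",  # e.g. a putfield-b style instruction
    0x61: "static_field",
    0x62: "virtual_method",
}


def pool_kind_at(code: bytes, cp_index_pos: int) -> str:
    """Find which sub-constant pool the index at `cp_index_pos` refers to.

    Assumes the index always sits directly after its opcode, so the
    opcode is the byte immediately before the index.
    """
    opcode = code[cp_index_pos - 1]
    return POOL_BY_OPCODE[opcode]
```

For a 4-byte instruction encoded as op CC AA BB, a linker that is handed only the position of CC can therefore recover the opcode and the sub-pool type without decoding the method from its start.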

S150. Separate the variable-length operands of each initial instruction from the instruction code.

Specifically, the variable-length operands are separated from the instruction code and placed together at the end of the method (or at another suitable location), so that the instruction itself only needs to contain a fixed-length offset. Instruction alignment then no longer has to be considered, and instructions can be fetched and executed more efficiently.
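One way to picture this separation is an emitter that writes a fixed-length switch instruction into the code stream and appends its jump table to a separate tail area. This is a sketch only: the 4-byte instruction layout, the 2-byte big-endian offset, the table layout (count byte followed by 2-byte targets), and the name `emit_switch` are assumptions, not the formats of Tables 5 to 7.

```python
from typing import List


def emit_switch(code: bytearray, tail: bytearray, op: int, reg: int,
                targets: List[int]) -> None:
    """Emit a fixed-length switch instruction plus an out-of-line table.

    Instruction (always 4 bytes): op, reg, 2-byte offset into the tail.
    Table (variable length): count byte, then one 2-byte target each.
    """
    offset = len(tail)  # where this instruction's table starts in the tail
    code.extend(bytes([op, reg]) + offset.to_bytes(2, "big"))
    tail.append(len(targets))
    for target in targets:
        tail.extend(target.to_bytes(2, "big"))
```

Every switch instruction has the same length regardless of how many cases it has, so the instruction stream stays uniformly decodable while the tables grow independently at the end of the method.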

The corresponding instructions are shown in Table 5.

Table 5

The data format for the packedswitch instruction is shown in Table 6.

Table 6

The data format for the sparseswitch instruction is shown in Table 7.

Table 7

S160. Obtain the static field type and initial configured value of each static field in the initial instructions, and generate a pseudo-instruction for each static field.

When a class in a high-level language defines static fields with initial values, the compiler generates a special <clinit> method that performs the static field initialization. A high-level language can define array-typed static fields whose arrays have initial values, for which the compiler generally has to generate an array-creation instruction plus instructions that assign initial values to the array members. It can also define primitive-typed static fields with initial values, for which the compiler generally has to generate static field assignment instructions. Under the current JavaCard mechanism for handling the <clinit> method, the code conversion tool simulates execution of the instructions in <clinit>, extracts the static field types and their initial values, and saves them for later conversion steps.

The invention instead generates a special pseudo-instruction from each static field's type and initial configured value: the type and initial value of each static field are recorded first, and a pseudo-instruction in a uniform format is then generated from that information. A pseudo-instruction is an instruction that the virtual machine never actually executes. It is used only at application installation, when the installer reads the initialization data of the corresponding field so that the virtual machine can build the static field data area the application needs at run time. Having the compiler generate these pseudo-instructions simplifies the later conversion tool's handling of the <clinit> method and removes the static field component of the traditional executable, whose function is to record the initialization information of all static fields in the application package.
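The record-then-install flow can be sketched as below. The record layout (type tag, payload length, payload), the tag values, and the names `make_static_init` and `install_static_fields` are all hypothetical; the actual pseudo-instruction format is the one defined in Table 8.

```python
from typing import List, Tuple

# Hypothetical type tags for a few static field types.
TYPE_TAGS = {"byte": 1, "short": 2, "byte[]": 3}


def make_static_init(kind: str, value) -> bytes:
    """Compiler side: build one uniform-format pseudo-instruction."""
    if kind == "byte":
        payload = (value & 0xFF).to_bytes(1, "big")
    elif kind == "short":
        payload = (value & 0xFFFF).to_bytes(2, "big")
    else:  # "byte[]": the initial array contents
        payload = bytes(value)
    return bytes([TYPE_TAGS[kind], len(payload)]) + payload


def install_static_fields(pseudo: bytes) -> List[Tuple[int, bytes]]:
    """Installer side: read the records to build the static data area.

    The records are only parsed at install time; nothing here is
    executed by the virtual machine.
    """
    fields = []
    pos = 0
    while pos < len(pseudo):
        tag, length = pseudo[pos], pseudo[pos + 1]
        fields.append((tag, pseudo[pos + 2:pos + 2 + length]))
        pos += 2 + length
    return fields
```

In this sketch the installer needs neither a separate static field component nor a simulated run of <clinit>: the initial values travel as plain data alongside the code.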

The static field initialization instructions are shown in Table 8.

Table 8

S170. Optimize instructions that use registers repeatedly, to obtain instructions that reuse a register.

For example, an operation can take its operands from two registers and assign the result to a third. Taking add as an example, add rAA rBB rCC computes rAA = rBB + rCC. If the compiler can determine that the current value of one operand register is not used afterwards, say rBB, it can instead generate add rBB rCC, which computes rBB = rBB + rCC. The add rAA rBB rCC instruction is 4 bytes long, whereas add rBB rCC is 3 bytes long. Reusing a register shortens the bytecode, reduces instruction fetch time during execution and so improves performance, and lowers the number of registers required. The register-reuse instructions are shown in Table 9.

Table 9

| Instruction mnemonic | Instruction encoding | Function |
| add-2r rAA rBB | op AA BB | rAA = rAA + rBB |
| and-2r rAA rBB | op AA BB | rAA = rAA & rBB |
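The choice between the 4-byte and the 3-byte form can be sketched as follows. This is only an illustration under stated assumptions: the opcode values `OP_ADD` and `OP_ADD_2R` and the helper names `select_add` and `execute` are hypothetical, and the liveness information (`lhs_dead`) is assumed to come from the compiler's own analysis.

```python
OP_ADD = 0x10     # hypothetical opcode for add rAA rBB rCC
OP_ADD_2R = 0x11  # hypothetical opcode for add-2r rAA rBB


def select_add(dst, lhs, rhs, lhs_dead):
    """Pick the encoding for `dst = lhs + rhs`.

    Returns (encoded bytes, register that holds the result). When the caller
    knows the current value of `lhs` is not needed afterwards, the result is
    written back into `lhs` and the destination byte is dropped.
    """
    if lhs_dead or dst == lhs:
        return bytes([OP_ADD_2R, lhs, rhs]), lhs
    return bytes([OP_ADD, dst, lhs, rhs]), dst


def execute(code, regs):
    """Tiny interpreter for the two forms, to check that they compute the same sum."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == OP_ADD:
            a, b, c = code[pc + 1 : pc + 4]
            regs[a] = regs[b] + regs[c]
            pc += 4
        elif op == OP_ADD_2R:
            a, b = code[pc + 1 : pc + 3]
            regs[a] = regs[a] + regs[b]
            pc += 3
        else:
            raise ValueError(f"unknown opcode {op:#x}")
```

Because the short form leaves the sum in the operand register, `select_add` also returns the register that now holds the result, which the code generator would use for subsequent reads.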

S180. Combine a group of functionally similar instructions into one instruction, and use an operation-type operand in the instruction to specify the operation type of the combined instruction, to obtain the corresponding compound multi-operation instruction.

Once the above instructions have been obtained, a group of functionally similar instructions among them can be combined to obtain the corresponding compound multi-operation instruction. On resource-constrained devices such as smart cards and secure elements, applications mainly use operations on the short data type and rarely use operations on the int data type. Instruction sets are generally designed so that an opcode occupies a single byte (values 0 to 255); if every int operation had its own opcode, fewer opcodes would remain for later extension with commonly used operations. Infrequently used instructions are therefore combined into a single instruction whose operation-type operand specifies the concrete operation, which reduces the number of opcodes the instruction set consumes and reserves opcodes for future extension instructions. The rules for this compound-instruction reduction can be configured in the bytecode instruction set described above. The compound multi-operation instructions are shown in Table 10.
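A minimal sketch of such a compound instruction is given below. The opcode value, the operation-type codes, the two-register operand layout, and the function names are assumptions made for illustration; the actual instructions are those listed in Table 10.

```python
import operator

OP_INT_COMPOUND = 0x80  # hypothetical: the single opcode shared by all int operations

# Hypothetical operation-type codes carried in the operand byte after the opcode.
INT_OPERATIONS = {
    0x00: operator.add,
    0x01: operator.sub,
    0x02: operator.mul,
    0x03: operator.and_,
    0x04: operator.or_,
    0x05: operator.xor,
}


def to_int32(value):
    """Wrap a Python integer to a signed 32-bit int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def encode_int_op(op_type, dst, src):
    """Encode `dst = dst <op> src` as: opcode, operation type, dst, src."""
    return bytes([OP_INT_COMPOUND, op_type, dst, src])


def execute_int_op(instr, regs):
    """Dispatch on the operation-type operand rather than on the opcode."""
    opcode, op_type, dst, src = instr
    if opcode != OP_INT_COMPOUND:
        raise ValueError("not a compound int instruction")
    regs[dst] = to_int32(INT_OPERATIONS[op_type](regs[dst], regs[src]))
```

In this sketch six int operations consume one opcode instead of six, at the cost of one extra operand byte per instruction, which matches the trade-off described above for rarely used operations.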

Table 10

In this embodiment, an instruction in the initial instructions that assigns a constant to a register and an instruction that uses that register as an operand can be optimized into a reduced instruction, or a reduced instruction with a constant operand can be generated directly; instructions that access an instance field of the current instance are converted into instructions dedicated to accessing the current instance's fields; the constant pool index operand of an instruction is encoded immediately adjacent to the instruction's opcode; the variable-length operand of each instruction is separated from the instruction code; pseudo-instructions for static field initialization are generated; instructions that use registers repeatedly are optimized into register-reusing instructions; and a group of functionally similar instructions is combined into one instruction. Reducing the initial instructions through these steps shortens the bytecode and improves its execution performance.

As shown in FIG. 2, an instruction reduction system 100 for a bytecode instruction set can be deployed in a user terminal and is used to carry out any embodiment of the foregoing instruction reduction method. It comprises the following devices: a first merging processing device 110, an instruction conversion device 120, a constant pool splitting device 130, a constant pool index operand encoding device 140, an operand separation device 150, a pseudo-instruction generating device 160, a second merging processing device 170, and a compound multi-operation instruction obtaining device 180.

The first merging processing device 110 is configured to optimize, in the initial instructions, an instruction that assigns a constant to a register together with an instruction that uses that register as an operand, to obtain a reduced instruction that uses the constant operand directly, or to directly generate a reduced instruction that uses a constant operand.

The instructions that assign a constant to a register include arithmetic and logical operation instructions, instructions that jump after comparison with a literal constant, and array member access instructions.
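The merging performed by this device can be sketched as a peephole pass over a straight-line block. This is an illustrative sketch only: the tuple representation of instructions, the mnemonics `const`, `add`, and `add-const`, and the assumption that temporary registers are not live at the end of the block are all hypothetical and not taken from the source.

```python
def reads(instr):
    """Registers read by an instruction in this toy representation."""
    op = instr[0]
    if op == "const":      # ("const", dst, literal)
        return ()
    if op == "add":        # ("add", dst, src1, src2)
        return instr[2:4]
    if op == "add-const":  # ("add-const", dst, src, literal)
        return (instr[2],)
    raise ValueError(f"unknown instruction {op}")


def is_dead_after(block, index, reg):
    """True if `reg` is overwritten before it is read again after `index`.
    Assumes temporaries are not live at the end of the block."""
    for instr in block[index + 1 :]:
        if reg in reads(instr):
            return False
        if instr[1] == reg:
            return True
    return True


def merge_constants(block):
    """Fuse `const rT, c` + `add rA, rB, rT` into `add-const rA, rB, c`."""
    out, i = [], 0
    while i < len(block):
        cur = block[i]
        nxt = block[i + 1] if i + 1 < len(block) else None
        if (
            cur[0] == "const"
            and nxt is not None
            and nxt[0] == "add"
            and nxt[3] == cur[1]   # the constant register is the second source
            and nxt[2] != cur[1]   # and is not also the first source
            and (nxt[1] == cur[1] or is_dead_after(block, i + 1, cur[1]))
        ):
            out.append(("add-const", nxt[1], nxt[2], cur[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out
```

The pair is only fused when the constant register is not read again, since removing the `const` instruction would otherwise change the program's behavior.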

The instruction conversion device 120 is configured to convert, according to a preset instruction format, each instruction in the initial instructions that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain an instance-field-specific instruction for accessing the current instance.

In a specific embodiment, the instruction conversion device 120 includes a register operand deletion unit, configured to delete, according to the preset instruction format, the operand that specifies the instance from each instruction that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain the instance-field-specific instruction for accessing the current instance. In the instance-field-specific instruction of the current instance, the register to be set, or the register that receives the returned result, may be specified by default.
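The operand deletion can be sketched as follows, using a toy tuple representation. The mnemonics `iget`, `iput`, and their `-this` variants, as well as the function name, are hypothetical; the sketch also assumes that the register holding this is never reassigned within the method.

```python
def specialize_this_access(block, this_reg):
    """Rewrite (op, value_reg, object_reg, field_index) into the shorter
    (op-this, value_reg, field_index) form when object_reg holds `this`.

    Assumes `this_reg` keeps the `this` reference for the whole method.
    """
    out = []
    for instr in block:
        op = instr[0]
        if op in ("iget", "iput") and instr[2] == this_reg:
            # Drop the object register operand: the instruction itself now
            # implies that the current instance is being accessed.
            out.append((op + "-this", instr[1], instr[3]))
        else:
            out.append(instr)
    return out
```

Each rewritten instruction carries one operand fewer, so in an encoding with one byte per register operand it would be one byte shorter.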

The constant pool splitting device 130 is configured to split the constant pool into multiple sub-constant pools; the constant pool index in an instruction is a sub-constant-pool index.

The constant pool index operand encoding device 140 is configured to encode the constant pool index operand of each instruction in the initial instructions at the position immediately adjacent to the instruction's opcode, so that the opcode of the instruction using a constant pool index can be obtained from the position of that index, and the type of the sub-constant pool can be determined from the opcode.

In a specific embodiment, the constant pool index operand encoding device 140 includes a first encoding unit or a second encoding unit.

The first encoding unit is configured to uniformly encode the constant pool index operand of each instruction after the specified code; the second encoding unit is configured to uniformly encode the constant pool index operand of each instruction before the instruction code.
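A minimal sketch of how the index position leads back to the opcode and then to the sub-constant pool is shown below. It assumes the variant in which a one-byte index immediately follows the opcode; the opcode values, instruction lengths, and sub-pool names in `OPCODES` are hypothetical and chosen only for illustration.

```python
# Hypothetical opcode table: opcode -> (sub-constant pool kind or None, total length).
OPCODES = {
    0x20: ("field", 3),   # op, field index, register
    0x21: ("method", 2),  # op, method index
    0x22: ("class", 3),   # op, class index, register
    0x01: (None, 3),      # op, AA, BB (no constant pool index)
}


def pool_kind_at(code, index_pos):
    """Given the position of an index operand, the opcode is the byte just
    before it, and the opcode alone tells which sub-constant pool is meant."""
    kind, _length = OPCODES[code[index_pos - 1]]
    return kind


def pool_references(code):
    """Scan a code array and list (index position, index value) for every
    instruction that carries a sub-constant-pool index."""
    refs, pc = [], 0
    while pc < len(code):
        kind, length = OPCODES[code[pc]]
        if kind is not None:
            refs.append((pc + 1, code[pc + 1]))
        pc += length
    return refs
```

Because the index always sits at a fixed offset from the opcode, a tool that later resolves the indices needs only the position of each index, not a separate record of which sub-constant pool it belongs to.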

The operand separation device 150 is configured to separate the variable-length operand of each instruction in the initial instructions from the instruction code.

The pseudo-instruction generating device 160 is configured to obtain the static field type and initial configuration value of each static field in the initial instructions, and to generate a pseudo-instruction corresponding to each static field.

The second merging processing device 170 is configured to optimize instructions that use registers repeatedly to obtain register-reusing instructions.

The compound multi-operation instruction obtaining device 180 is configured to combine a group of functionally similar instructions into one instruction, and to use an operation-type operand in the instruction to specify the operation type of the combined instruction, to obtain the corresponding compound multi-operation instruction.

In this embodiment, an instruction in the initial instructions that assigns a constant to a register and an instruction that uses that register as an operand can be optimized into a reduced instruction, or a reduced instruction with a constant operand can be generated directly; instructions that access an instance field of the current instance are converted into instructions dedicated to accessing the current instance's fields; the constant pool index operand of an instruction is encoded immediately adjacent to the instruction's opcode; the variable-length operand of each instruction is separated from the instruction code; pseudo-instructions for static field initialization are generated; instruction lengths are adjusted according to configuration items; instructions that use registers repeatedly are optimized into register-reusing instructions; and a group of functionally similar instructions is combined into one instruction. Reducing the initial instructions through these steps shortens the bytecode and improves its execution performance.

Those skilled in the art should understand that the method and system of the present invention are not limited to the embodiments described in the detailed description; the specific description above is intended only to explain the present invention, not to limit it. Other embodiments derived by those skilled in the art from the technical solutions of the present invention likewise fall within the scope of its technical innovation. The scope of protection of the present invention is defined by the claims and their equivalents.

Claims

1. An instruction reduction method for a bytecode instruction set, characterized in that it comprises the following steps:

optimizing, in the initial instructions, an instruction that assigns a constant to a register together with an instruction that uses that register as an operand, to obtain an instruction that uses the constant operand directly, or directly generating an instruction that uses a constant operand;

converting, according to a preset instruction format, each instruction in the initial instructions that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain an instance-field-specific instruction for accessing the current instance;

splitting the constant pool into multiple sub-constant pools, the constant pool index in an instruction being a sub-constant-pool index;

encoding the constant pool index operand of each instruction in the initial instructions at the position immediately adjacent to the instruction's opcode, so that the opcode of the instruction using a constant pool index is obtained from the position of that index, and the type of the sub-constant pool can be determined from the opcode;

separating the variable-length operand from the instruction code in each variable-length instruction;

obtaining the static field type and initial configuration value of each static field, and generating a pseudo-instruction corresponding to each static field declaration, the pseudo-instruction being able to replace the traditional static field component;

optimizing instructions that use registers repeatedly to obtain register-reusing instructions;

combining a group of functionally similar, infrequently used instructions into one instruction, and using an operation-type operand in the instruction to specify the operation type of the combined instruction, to obtain the corresponding compound multi-operation instruction.

2. The instruction reduction method for a bytecode instruction set according to claim 1, characterized in that the instructions that assign a constant to a register include arithmetic and logical operation instructions, instructions that jump after comparison with a literal constant, and array member access instructions.

3. The instruction reduction method for a bytecode instruction set according to claim 1, characterized in that converting, according to the preset instruction format, the instruction that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain the instance-field-specific instruction for accessing the current instance, comprises:

deleting, according to the preset instruction format, the operand that specifies the instance from the instruction that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain the instance-field-specific instruction for accessing the current instance; the instance-field-specific instruction of the current instance can be further optimized so that the register to be set, or the register that receives the returned result, is specified by default.

4. The instruction reduction method for a bytecode instruction set according to claim 1, wherein encoding the constant pool index operand of each instruction having a constant pool index at the position immediately adjacent to the instruction's opcode comprises:

uniformly encoding the constant pool index operand of the instruction after the specified code;

or uniformly encoding the constant pool index operand of the instruction before the instruction code.

5. An instruction reduction system for a bytecode instruction set, characterized in that it comprises the following devices:

a first merging processing device, configured to optimize, in the initial instructions, an instruction that assigns a constant to a register together with an instruction that uses that register as an operand, to obtain an instruction that uses the constant operand directly, or to directly generate an instruction that uses a constant operand;

an instruction conversion device, configured to convert, according to a preset instruction format, each instruction in the initial instructions that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain an instance-field-specific instruction for accessing the current instance;

a constant pool splitting device, configured to split the constant pool into multiple sub-constant pools, the constant pool index in an instruction being a sub-constant-pool index, which allows the instruction to be encoded with a shorter index;

a constant pool index operand encoding device, configured to encode the constant pool index operand of an instruction at the position immediately adjacent to the instruction's opcode, so that the opcode of the instruction using a constant pool index is obtained from the position of that index, and the type of the sub-constant pool can be determined from the opcode;

an operand separation device, configured to separate the variable-length operand of an instruction from the instruction code;

a pseudo-instruction generating device, configured to obtain the static field type and initial configuration value of each static field, and to generate a pseudo-instruction corresponding to each static field declaration, the pseudo-instruction being able to replace the traditional static field component;

a second merging processing device, configured to optimize instructions that use registers repeatedly to obtain register-reusing instructions;

a compound multi-operation instruction obtaining device, configured to combine a group of functionally similar, infrequently used instructions into one instruction, and to use an operation-type operand in the instruction to specify the operation type of the combined instruction, to obtain the corresponding compound multi-operation instruction.

6. The instruction reduction system for a bytecode instruction set according to claim 5, characterized in that the instructions that assign a constant to a register include arithmetic and logical operation instructions, instructions that jump after comparison with a literal constant, and array member access instructions.

7. The instruction reduction system for a bytecode instruction set according to claim 5, wherein the instruction conversion device comprises:

a register operand deletion unit, configured to delete, according to the preset instruction format, the operand that specifies the instance from each instruction that accesses an instance field and whose object is the register corresponding to the method's this parameter, to obtain the instance-field-specific instruction for accessing the current instance; the instance-field-specific instruction of the current instance can be further optimized so that the register to be set, or the register that receives the returned result, is specified by default.

8. The instruction reduction system for a bytecode instruction set according to claim 5, wherein the constant pool index operand encoding device comprises a first encoding unit or a second encoding unit:

the first encoding unit is configured to uniformly encode the constant pool index operand of each instruction after the specified code;

the second encoding unit is configured to uniformly encode the constant pool index operand of each instruction before the instruction code.