CN109089140A

CN109089140A - Voice control method and device

Info

Publication number: CN109089140A
Application number: CN201710448427.1A
Authority: CN
Inventors: 吴鹏鹏
Original assignee: BEIJING UNION VOOLE TECHNOLOGY Co Ltd
Current assignee: BEIJING UNION VOOLE TECHNOLOGY Co Ltd
Priority date: 2017-06-14
Filing date: 2017-06-14
Publication date: 2018-12-25

Abstract

This application relates to the fields of computer technology and the Internet, and discloses a voice control method and device that form a dynamic voice library, thereby reducing the data volume of the voice library and relieving pressure on the server. In the method, an intelligent terminal obtains the first scene interface currently displayed and obtains a first voice library related to the first scene interface; the intelligent terminal sends the first voice library to a speech recognition device; the intelligent terminal receives a first voice instruction and sends it to the speech recognition device, the first voice instruction instructing the speech recognition device to search the first voice library for a first control instruction matching the first voice instruction; and the intelligent terminal receives the first control instruction returned by the speech recognition device and performs the operation corresponding to the first control instruction.

Description

Voice control method and device

Technical field

This application relates to the fields of computer technology and the Internet, and in particular to a voice control method and device.

Background art

With the development of intelligent terminal technology, different types of intelligent terminals have come into use, smart TVs being one example. In recent years, applications on smart TVs have grown steadily richer and human-computer interaction modes more diverse, making the smart TV the center of home entertainment. Today, in addition to the traditional TV remote control, interaction modes such as voice control, gesture operation, face recognition, and touch control have all been applied on smart TVs to varying degrees, and each of these technologies continues to develop and mature. Voice control of a smart TV relies on speech recognition technology: sound is received through a microphone and then analyzed by computer, its frequency and spectrum are compared with pre-stored instructions, and the instruction to be executed is finally determined.

In the prior art, one voice control method is as follows: a voice database covering different scene interfaces is established on a cloud voice server, and a voice controller establishes a communication connection with the cloud voice server. The voice controller receives the user's voice command and retrieves from the voice database the scene interface that matches the voice command; the cloud voice server receives the user's selection instruction and performs the operation corresponding to the selection instruction. Voice interaction with the intelligent terminal is thus realized through the voice controller.

The drawback of this scheme, however, is that voice databases for the different scene interfaces must be established on the cloud voice server in advance. In general, one voice database is established for each scene interface, and an intelligent terminal usually involves a relatively large number of scene interfaces. As a result, the voice databases of the different scene interfaces are hard to control, and excessive redundant code is executed when a voice database is called. In addition, the data volume of the voice databases is very large, so the cloud voice server carries a heavy load and is under considerable pressure.

Summary of the invention

The embodiments of this application provide a voice control method and device, to solve the problems that, when an intelligent terminal is controlled by voice, the data volume of the voice database is excessive and the server load is heavy.

The specific technical solutions provided by the embodiments of this application are as follows:

A voice control method, comprising: an intelligent terminal obtains the first scene interface currently displayed, and obtains a first voice library related to the first scene interface;

The intelligent terminal sends the first voice library to a speech recognition device;

The intelligent terminal receives a first voice instruction and sends the first voice instruction to the speech recognition device, where the first voice instruction instructs the speech recognition device to search the first voice library for a first control instruction matching the first voice instruction;

The intelligent terminal receives the first control instruction returned by the speech recognition device, and performs the operation corresponding to the first control instruction.

Optionally, the intelligent terminal sending the first voice library to the speech recognition device comprises:

The intelligent terminal sends the first voice library to the speech recognition device in response to an enabling operation for the event of inputting the first voice instruction; or

The intelligent terminal sends the first voice library to the speech recognition device when a historical scene interface is switched to the first scene interface, where the historical scene interface is the scene interface that the intelligent terminal displayed before displaying the first scene interface.

Optionally, performing the operation corresponding to the first control instruction comprises:

The intelligent terminal calls script code related to the first control instruction, where the script code is used to cause a server to provide the intelligent terminal with the data information needed for the operation;

Wherein the intelligent terminal and the server follow the application mode of the browser/server (B/S) network structure.

Optionally, after performing the operation corresponding to the first control instruction, the method further comprises:

The intelligent terminal obtains the second scene interface displayed after the update, and obtains a second voice library related to the second scene interface;

The intelligent terminal sends the second voice library to the speech recognition device;

The intelligent terminal receives a second voice instruction and sends the second voice instruction to the speech recognition device, where the second voice instruction instructs the speech recognition device to search the second voice library for a second control instruction matching the second voice instruction;

The intelligent terminal receives the second control instruction returned by the speech recognition device, and performs the operation corresponding to the second control instruction.

Optionally, after obtaining the first voice library corresponding to the first scene information, the method further comprises:

The intelligent terminal caches the first voice library;

After performing the operation corresponding to the control instruction, the method further comprises:

If the intelligent terminal determines that the scene interface displayed after the update is still the first scene interface, it sends the cached first voice library to the speech recognition device;

The intelligent terminal receives a third voice instruction and sends the third voice instruction to the speech recognition device, where the third voice instruction instructs the speech recognition device to search the first voice library for a third control instruction matching the third voice instruction;

The intelligent terminal receives the third control instruction returned by the speech recognition device, and performs the operation corresponding to the third control instruction.

A voice control device, comprising: a processing unit, configured to obtain the first scene interface currently displayed and to obtain a first voice library related to the first scene interface;

A transmission unit, configured to send the first voice library obtained by the processing unit to a speech recognition device;

A receiving unit, configured to receive a first voice instruction;

The transmission unit is further configured to send the first voice instruction to the speech recognition device, where the first voice instruction instructs the speech recognition device to search the first voice library for a first control instruction matching the first voice instruction;

The receiving unit is further configured to receive the first control instruction returned by the speech recognition device;

The processing unit is further configured to perform the operation corresponding to the first control instruction.

Optionally, the processing unit is configured to:

Send the first voice library to the speech recognition device through the transmission unit in response to an enabling operation for the event of inputting the first voice instruction; or

Send the first voice library to the speech recognition device through the transmission unit when a historical scene interface is switched to the first scene interface, where the historical scene interface is the scene interface that the intelligent terminal displayed before displaying the first scene interface.

Optionally, the processing unit is configured to:

Call script code related to the first control instruction, where the script code is used to cause the server to provide the processing unit with the data information needed for the operation;

Wherein the device and the server follow the application mode of the browser/server (B/S) network structure.

Optionally, the processing unit is further configured to, after performing the operation corresponding to the first control instruction, obtain the second scene interface displayed after the update and obtain a second voice library related to the second scene interface;

The transmission unit is further configured to send the second voice library to the speech recognition device;

The receiving unit is further configured to receive a second voice instruction and send the second voice instruction to the speech recognition device, where the second voice instruction instructs the speech recognition device to search the second voice library for a second control instruction matching the second voice instruction, and to receive the second control instruction returned by the speech recognition device;

The processing unit is further configured to perform the operation corresponding to the second control instruction.

Optionally, the device further comprises:

A cache unit, configured to cache the first voice library after the processing unit obtains the first voice library corresponding to the first scene information;

The processing unit is further configured to, after performing the operation corresponding to the control instruction, send the first voice library cached by the cache unit to the speech recognition device if it determines that the scene interface displayed after the update is still the first scene interface;

The receiving unit is further configured to receive a third voice instruction;

The transmission unit is further configured to send the third voice instruction to the speech recognition device, where the third voice instruction instructs the speech recognition device to search the first voice library for a third control instruction matching the third voice instruction;

The receiving unit is further configured to receive the third control instruction returned by the speech recognition device;

The processing unit is further configured to perform the operation corresponding to the third control instruction.

Brief description of the drawings

Fig. 1 is a schematic diagram of the application system architecture in an embodiment of this application;

Fig. 2 is a schematic flowchart of the voice control method in an embodiment of this application;

Fig. 3 is a schematic structural diagram of the voice control device in an embodiment of this application.

Detailed description of the embodiments

The embodiments of this application are described in detail below with reference to the accompanying drawings.

As shown in Fig. 1, the system architecture to which the embodiments of this application apply includes an intelligent terminal 101, a server 102, and a speech recognition device 103. The intelligent terminal 101 acts as an intermediate hub and interacts with the server 102 and the speech recognition device 103 respectively. The intelligent terminal 101 is a device capable of human-computer interaction; specifically, it can receive a voice instruction input by a user and perform the related operation according to that voice instruction. The intelligent terminal 101 can display a scene interface, which includes displayed elements such as icons, buttons, and text. For example, the intelligent terminal 101 may be a smart TV, a handheld device such as a smartphone, an in-vehicle device, and so on. The server 102 can provide the intelligent terminal 101 with each scene interface the user needs and the scene information corresponding to each scene interface. The scene information includes a voice library formed from the voice instructions corresponding to the elements in the scene interface; the voice library may also be called a voice database. In one possible implementation, the intelligent terminal 101 and the server 102 follow the application mode of the browser/server (Browser/Server, B/S) network structure. The B/S network structure is a network structure mode that emerged with the rise of the Web. The Web browser is the most important application software on the intelligent terminal 101. The B/S network structure concentrates the core of the system's functionality on the server 102, which simplifies the development, maintenance, and use of the system. The intelligent terminal 101 exchanges data with the server 102 through the browser. The speech recognition device 103 can be regarded as a speech recognition tool: it compares a received voice instruction against the voice library by fuzzy matching in order to recognize the voice instruction.
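The scene information and the fuzzy matching described above can be sketched as follows. This is only an illustrative sketch, not the implementation of this application: the `SceneInfo` shape, the instruction names such as `OPEN_TV_SERIES`, and the use of recognized text rather than audio are all assumptions, and the matching relies on Python's standard `difflib.get_close_matches` as a stand-in for whatever fuzzy matching the speech recognition device actually performs.

```python
from dataclasses import dataclass, field
from difflib import get_close_matches
from typing import Dict, Optional


@dataclass
class SceneInfo:
    """Scene information the server returns for one scene interface (hypothetical shape)."""
    scene_id: str
    # Voice library: phrase for each on-screen element -> control instruction.
    voice_library: Dict[str, str] = field(default_factory=dict)


def fuzzy_match(voice_library: Dict[str, str], recognized_text: str,
                cutoff: float = 0.6) -> Optional[str]:
    """Return the control instruction whose phrase is closest to the text, or None."""
    # Compare case-insensitively so that "TV" and "tv" are treated alike.
    by_folded = {phrase.casefold(): instruction
                 for phrase, instruction in voice_library.items()}
    close = get_close_matches(recognized_text.casefold(), list(by_folded),
                              n=1, cutoff=cutoff)
    return by_folded[close[0]] if close else None


# Hypothetical home-page scene with three voice-controllable elements.
home = SceneInfo("home", {
    "enter TV series column": "OPEN_TV_SERIES",
    "enter variety column": "OPEN_VARIETY",
    "open settings": "OPEN_SETTINGS",
})
```

Calling `fuzzy_match(home.voice_library, "enter tv series colum")` tolerates the small transcription error and still selects the TV series entry, while text unrelated to every phrase yields `None`. The `cutoff` value is an arbitrary choice for illustration.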

Based on the system architecture shown in Fig. 1, the embodiments of this application provide a voice control method and device. The intelligent terminal provides the speech recognition device with the voice library related to the scene interface currently displayed, and the speech recognition device searches the received voice library for the control instruction related to a voice instruction. Because the intelligent terminal, while displaying the current scene interface, receives voice instructions related to that scene interface, having the speech recognition device search only the received voice library greatly narrows the search scope, so a match can be found faster and more accurately. Moreover, the intelligent terminal dynamically sends the speech recognition device the voice library related to the latest scene interface, so that the most recently received voice instruction always remains strongly correlated with the search scope; this forms a dynamic voice library. In addition, the intelligent terminal does not need to obtain the voice libraries of all scene interfaces, only the voice library related to the scene interface currently displayed, which reduces the pressure on the server and makes the voice control process more flexible and controllable.

The voice control method provided by the embodiments of this application is described in further detail below with reference to the accompanying drawings.

As shown in Fig. 2, the specific flow of the voice control method provided by the embodiments of this application is as follows.
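The effect of narrowing the search scope can be illustrated with a small sketch. The scene names, phrases, and control instruction names below are hypothetical and chosen only for illustration; the point is that a per-scene voice library holds fewer candidates than a combined library for all scenes, and that a phrase shared by several scenes resolves to the instruction that belongs to the scene currently displayed.

```python
from typing import Dict

# Hypothetical per-scene voice libraries: phrase -> control instruction.
LIBRARIES: Dict[str, Dict[str, str]] = {
    "home": {"tv series": "OPEN_TV_SERIES", "variety": "OPEN_VARIETY"},
    "tv_series": {"next page": "TV_SERIES_NEXT_PAGE", "play": "PLAY_SELECTED_SERIES"},
    "variety": {"next page": "VARIETY_NEXT_PAGE", "play": "PLAY_SELECTED_SHOW"},
}


def candidates_static() -> int:
    """Entries to search if the libraries of all scenes were loaded in advance."""
    return sum(len(library) for library in LIBRARIES.values())


def candidates_dynamic(current_scene: str) -> int:
    """Entries to search when only the current scene's library is registered."""
    return len(LIBRARIES[current_scene])


def resolve(current_scene: str, phrase: str) -> str:
    """Look a phrase up in the current scene's library only."""
    return LIBRARIES[current_scene][phrase]
```

In this toy data set the combined search space has six entries while each scene contributes only two, and "next page" maps to a different control instruction depending on whether the TV series scene or the variety scene is displayed.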

Step 201: the intelligent terminal obtains the first scene interface currently displayed, and obtains a first voice library related to the first scene interface.

Specifically, after obtaining the first voice library related to the first scene interface, the intelligent terminal caches the first voice library locally.

Step 202: the intelligent terminal sends the first voice library to the speech recognition device.

Specifically, the intelligent terminal sends the first voice library to the speech recognition device when the following trigger condition is met: condition one and/or condition two.

Condition one: the intelligent terminal sends the first voice library to the speech recognition device in response to an enabling operation for the event of inputting the first voice instruction.

For example, the intelligent terminal is a smart TV, and the user inputs the first voice instruction through the talk key of the remote control. When the intelligent terminal detects that the talk key has been pressed, it sends the first voice library to the speech recognition device.

Condition two: the intelligent terminal sends the first voice library to the speech recognition device when a historical scene interface is switched to the first scene interface, where the historical scene interface is the scene interface that the intelligent terminal displayed before displaying the first scene interface.

That is, as soon as the intelligent terminal detects that the displayed scene interface has switched to the first scene interface, it is triggered to send the first voice library to the speech recognition device.
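The two trigger conditions can be sketched as two event handlers on the terminal side. The class and method names (`Terminal`, `on_talk_key_pressed`, `show_scene`, `RecordingRecognizer`) are hypothetical, and the recognizer here is a stand-in that only records what it receives. Treating the very first display as a scene switch is an assumption of this sketch, not something the text specifies.

```python
from typing import Dict, List, Optional, Tuple


class RecordingRecognizer:
    """Stand-in for the speech recognition device; it only records registrations."""

    def __init__(self) -> None:
        self.registered: List[Tuple[str, Dict[str, str]]] = []

    def register_library(self, scene_id: str, library: Dict[str, str]) -> None:
        self.registered.append((scene_id, dict(library)))


class Terminal:
    """Intelligent terminal that sends its cached voice library on either trigger."""

    def __init__(self, recognizer: RecordingRecognizer,
                 send_on_talk_key: bool = True,
                 send_on_scene_switch: bool = False) -> None:
        self.recognizer = recognizer
        self.send_on_talk_key = send_on_talk_key          # condition one
        self.send_on_scene_switch = send_on_scene_switch  # condition two
        self.scene_id: Optional[str] = None
        self.library: Dict[str, str] = {}

    def show_scene(self, scene_id: str, library: Dict[str, str]) -> None:
        previous = self.scene_id
        self.scene_id, self.library = scene_id, library   # cache locally
        # Assumption: the very first display also counts as a switch.
        if self.send_on_scene_switch and previous != scene_id:
            self._send_library()

    def on_talk_key_pressed(self) -> None:
        if self.send_on_talk_key:
            self._send_library()

    def _send_library(self) -> None:
        if self.scene_id is not None:
            self.recognizer.register_library(self.scene_id, self.library)
```

With condition one only, nothing is sent until the talk key is pressed, so a scene the user never speaks to costs no transfer. With condition two only, the library is already registered by the time the user starts speaking, at the cost of one transfer per scene switch. Both flags can be enabled together, matching the "and/or" in the text.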

Step 203: the intelligent terminal receives a first voice instruction and sends the first voice instruction to the speech recognition device, where the first voice instruction instructs the speech recognition device to search the first voice library for a first control instruction matching the first voice instruction.

Step 204: the intelligent terminal receives the first control instruction returned by the speech recognition device.

Step 205: the intelligent terminal performs the operation corresponding to the first control instruction.

For example, the intelligent terminal is a smart TV and the first control instruction instructs the intelligent terminal to open a variety show; the intelligent terminal then performs the operation of opening the variety show.

In one possible implementation, under the B/S network structure, the intelligent terminal calls script code related to the first control instruction, and the script code is used to cause the server to provide the intelligent terminal with the data information needed for the operation. For example, the script code is JavaScript code.
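The idea of calling a script tied to a control instruction can be sketched as a dispatch table. The text names JavaScript running in a browser; the sketch below uses Python callables purely as a stand-in, and `FakeServer`, `make_scripts`, `execute`, the resource names, and the returned data shape are all hypothetical.

```python
from typing import Callable, Dict


class FakeServer:
    """Stand-in for the B/S server; returns the data an operation needs."""

    def fetch(self, resource: str) -> dict:
        return {"resource": resource, "items": ["item-1", "item-2"]}


def make_scripts(server: FakeServer) -> Dict[str, Callable[[], dict]]:
    """Map each control instruction to the script that requests data from the server."""
    return {
        "OPEN_VARIETY": lambda: server.fetch("variety"),
        "OPEN_TV_SERIES": lambda: server.fetch("tv_series"),
    }


def execute(control_instruction: str,
            scripts: Dict[str, Callable[[], dict]]) -> dict:
    """Run the script registered for a control instruction and return its data."""
    script = scripts.get(control_instruction)
    if script is None:
        raise KeyError(f"no script registered for {control_instruction!r}")
    return script()
```

For instance, `execute("OPEN_VARIETY", make_scripts(FakeServer()))` returns the data for the variety resource, which the terminal would then render. An unknown control instruction raises `KeyError` in this sketch; how a real terminal should react to an unrecognized instruction is not addressed in the text.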

In one possible implementation, after performing the operation corresponding to the first control instruction, the intelligent terminal may replace the first scene interface currently displayed with another scene interface, or it may continue to display the first scene interface, that is, no change of scene interface occurs. Depending on which of these two cases applies, the intelligent terminal may perform the following operations.

In the first possible case, the intelligent terminal updates the display to a second scene interface and obtains a second voice library related to the second scene interface. The intelligent terminal sends the second voice library to the speech recognition device. The intelligent terminal receives a second voice instruction and sends it to the speech recognition device, where the second voice instruction instructs the speech recognition device to search the second voice library for a second control instruction matching the second voice instruction. The intelligent terminal receives the second control instruction returned by the speech recognition device and performs the operation corresponding to the second control instruction.

In the second possible case, if the intelligent terminal determines that the scene interface displayed after the update is still the first scene interface, it sends the cached first voice library to the speech recognition device. The intelligent terminal receives a third voice instruction and sends it to the speech recognition device, where the third voice instruction instructs the speech recognition device to search the first voice library for a third control instruction matching the third voice instruction. The intelligent terminal receives the third control instruction returned by the speech recognition device and performs the operation corresponding to the third control instruction.
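The two cases differ only in where the voice library to be sent comes from, which can be sketched as follows. `CountingServer` and `LibraryHolder` are hypothetical names; the server stand-in counts requests so that the saving from the cache is visible. Holding only the library of the scene currently displayed is a deliberate simplification that mirrors the text, which speaks of caching the first voice library.

```python
from typing import Dict, Optional


class CountingServer:
    """Stand-in for the server; counts how often a voice library is requested."""

    def __init__(self, libraries: Dict[str, Dict[str, str]]) -> None:
        self.libraries = libraries
        self.requests = 0

    def get_voice_library(self, scene_id: str) -> Dict[str, str]:
        self.requests += 1
        return self.libraries[scene_id]


class LibraryHolder:
    """Terminal-side cache holding the library of the scene currently displayed."""

    def __init__(self, server: CountingServer) -> None:
        self.server = server
        self.scene_id: Optional[str] = None
        self.cached: Optional[Dict[str, str]] = None

    def library_to_send(self, displayed_scene_id: str) -> Dict[str, str]:
        # Second case: the scene is unchanged, so reuse the cached library.
        if displayed_scene_id == self.scene_id and self.cached is not None:
            return self.cached
        # First case: the scene changed, so obtain the new scene's library.
        self.scene_id = displayed_scene_id
        self.cached = self.server.get_voice_library(displayed_scene_id)
        return self.cached
```

Asking for the library of the same scene twice in a row costs one server request, and switching to another scene costs one more.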

In summary, the intelligent terminal sends the speech recognition device the voice library related to the latest scene interface each time, which narrows the scope within which the speech recognition device recognizes voice instructions, avoids loading superfluous voice libraries, lightens the load on the server, and keeps the control process simple and effective.

The voice control method shown in Fig. 2 is illustrated below with an example. Assume that the intelligent terminal is a smart TV and that the smart TV and the server operate under the B/S network structure.

The smart TV initializes the scene interface. Specifically, the smart TV requests from the server the home page to be displayed and obtains the voice library related to the home page scene interface. On the home page scene interface, when the user presses the talk key of the remote control in order to input a voice instruction, the smart TV immediately sends the voice library related to the home page to the speech recognition device, and this voice library is registered in the speech recognition device. For example, the user says "enter the TV series column". After the user finishes inputting the voice instruction, the smart TV detects that the talk key has been released and sends the voice instruction to the speech recognition device. The speech recognition device fuzzy-matches the voice instruction in the voice library it has just received and returns the recognition result to the smart TV. From the returned result, the smart TV learns the task the user wants performed, namely entering the TV series column. The smart TV then calls JavaScript code so that the server sends the data related to the TV series column to the smart TV, and the smart TV displays the scene interface related to the TV series column.
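The walk-through above can be traced end to end in a short sketch. Everything here is a simplification made for illustration: the `Server`, `Recognizer`, and `SmartTV` classes, the scene and instruction names, and the routing table are hypothetical; the voice instruction is modeled as text rather than audio; fuzzy matching again uses the standard `difflib.get_close_matches`; and loading the next scene stands in for the JavaScript call to the server.

```python
from difflib import get_close_matches
from typing import Dict, List, Optional


class Server:
    """Stand-in server holding per-scene voice libraries and one route."""
    PAGES: Dict[str, Dict[str, str]] = {
        "home": {"enter TV series column": "OPEN_TV_SERIES"},
        "tv_series": {"play the first episode": "PLAY_FIRST_EPISODE"},
    }
    # Only the route needed for this walk-through is defined.
    ROUTES: Dict[str, str] = {"OPEN_TV_SERIES": "tv_series"}

    def page(self, scene_id: str) -> Dict[str, str]:
        return self.PAGES[scene_id]

    def scene_for(self, control_instruction: str) -> str:
        return self.ROUTES[control_instruction]


class Recognizer:
    """Stand-in speech recognition device working on text."""

    def __init__(self) -> None:
        self.library: Dict[str, str] = {}

    def register(self, library: Dict[str, str]) -> None:
        self.library = dict(library)

    def recognize(self, text: str) -> Optional[str]:
        close = get_close_matches(text, list(self.library), n=1, cutoff=0.6)
        return self.library[close[0]] if close else None


class SmartTV:
    def __init__(self, server: Server, recognizer: Recognizer) -> None:
        self.server, self.recognizer = server, recognizer
        self.trace: List[str] = []
        self._load("home")  # initialize the home page scene interface

    def _load(self, scene_id: str) -> None:
        self.scene_id = scene_id
        self.library = self.server.page(scene_id)
        self.trace.append(f"show:{scene_id}")

    def talk_key_down(self) -> None:
        # Key pressed: register the current scene's library right away.
        self.recognizer.register(self.library)
        self.trace.append(f"register:{self.scene_id}")

    def talk_key_up(self, utterance: str) -> None:
        # Key released: send the finished voice instruction for matching.
        instruction = self.recognizer.recognize(utterance)
        self.trace.append(f"recognized:{instruction}")
        if instruction is not None:
            self._load(self.server.scene_for(instruction))
```

Running `tv = SmartTV(Server(), Recognizer())`, then `tv.talk_key_down()` and `tv.talk_key_up("enter TV series column")`, leaves the TV on the TV series scene, and `tv.trace` records the order of events: show home, register home, recognize the instruction, show the TV series scene.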

Based on the same inventive concept as the voice control method shown in Fig. 2, and as shown in Fig. 3, the embodiments of this application further provide a voice control device 300, which is used to perform the voice control method shown in Fig. 2. The voice control device 300 includes a processing unit 301, a transmission unit 302, and a receiving unit 303, where:

The processing unit 301 is configured to obtain the first scene interface currently displayed and to obtain a first voice library related to the first scene interface;

The transmission unit 302 is configured to send the first voice library obtained by the processing unit 301 to the speech recognition device;

The receiving unit 303 is configured to receive a first voice instruction;

The transmission unit 302 is further configured to send the first voice instruction to the speech recognition device, where the first voice instruction instructs the speech recognition device to search the first voice library for a first control instruction matching the first voice instruction;

The receiving unit 303 is further configured to receive the first control instruction returned by the speech recognition device;

The processing unit 301 is further configured to perform the operation corresponding to the first control instruction.

Optionally, the processing unit 301 is configured to:

Send the first voice library to the speech recognition device through the transmission unit 302 in response to an enabling operation for the event of inputting the first voice instruction; or send the first voice library to the speech recognition device through the transmission unit 302 when a historical scene interface is switched to the first scene interface, where the historical scene interface is the scene interface that the intelligent terminal displayed before displaying the first scene interface.

Optionally, the processing unit 301 is configured to:

Call script code related to the first control instruction, where the script code is used to cause the server to provide the processing unit 301 with the data information needed for the operation;

Wherein the device and the server follow the application mode of the browser/server (B/S) network structure.

Optionally, the processing unit 301 is further configured to, after performing the operation corresponding to the first control instruction, obtain the second scene interface displayed after the update and obtain a second voice library related to the second scene interface;

The transmission unit 302 is further configured to send the second voice library to the speech recognition device;

The receiving unit 303 is further configured to receive a second voice instruction and send the second voice instruction to the speech recognition device, where the second voice instruction instructs the speech recognition device to search the second voice library for a second control instruction matching the second voice instruction, and to receive the second control instruction returned by the speech recognition device;

The processing unit 301 is further configured to perform the operation corresponding to the second control instruction.

Optionally, the device further comprises:

A cache unit 304, configured to cache the first voice library after the processing unit 301 obtains the first voice library corresponding to the first scene information;

The processing unit 301 is further configured to, after performing the operation corresponding to the control instruction, send the first voice library cached by the cache unit 304 to the speech recognition device if it determines that the scene interface displayed after the update is still the first scene interface;

The receiving unit 303 is further configured to receive a third voice instruction;

The transmission unit 302 is further configured to send the third voice instruction to the speech recognition device, where the third voice instruction instructs the speech recognition device to search the first voice library for a third control instruction matching the third voice instruction;

The receiving unit 303 is further configured to receive the third control instruction returned by the speech recognition device;

The processing unit 301 is further configured to perform the operation corresponding to the third control instruction.

Those skilled in the art should understand that the embodiments of this application may be provided as a method, a system, or a computer program product. Accordingly, this application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.

This application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of this application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Although preferred embodiments of this application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of this application.

Obviously, those skilled in the art can make various modifications and variations to the embodiments of this application without departing from the spirit and scope of the embodiments of this application. Thus, if these modifications and variations of the embodiments of this application fall within the scope of the claims of this application and their technical equivalents, this application is also intended to cover them.

Claims

1. A voice control method, characterized by comprising:

An intelligent terminal obtains the first scene interface currently displayed, and obtains a first voice library related to the first scene interface;

The intelligent terminal receives a first voice instruction and sends the first voice instruction to the speech recognition device, where the first voice instruction instructs the speech recognition device to search the first voice library for a first control instruction matching the first voice instruction;

The intelligent terminal receives the first control instruction returned by the speech recognition device, and performs the operation corresponding to the first control instruction.

2. The method according to claim 1, characterized in that the intelligent terminal sending the first voice library to the speech recognition device comprises:

The intelligent terminal sends the first voice library to the speech recognition device in response to an enabling operation for the event of inputting the first voice instruction; or

The intelligent terminal sends the first voice library to the speech recognition device when a historical scene interface is switched to the first scene interface, where the historical scene interface is the scene interface that the intelligent terminal displayed before displaying the first scene interface.

3. The method according to claim 1 or 2, characterized in that performing the operation corresponding to the first control instruction comprises:

The intelligent terminal calls script code related to the first control instruction, where the script code is used to cause a server to provide the intelligent terminal with the data information needed for the operation;

Wherein the intelligent terminal and the server follow the application mode of the browser/server (B/S) network structure.

4. The method according to claim 1 or 2, characterized in that, after performing the operation corresponding to the first control instruction, the method further comprises:

The intelligent terminal obtains the second scene interface displayed after the update, and obtains a second voice library related to the second scene interface;

The intelligent terminal receives a second voice instruction and sends the second voice instruction to the speech recognition device, where the second voice instruction instructs the speech recognition device to search the second voice library for a second control instruction matching the second voice instruction;

The intelligent terminal receives the second control instruction returned by the speech recognition device, and performs the operation corresponding to the second control instruction.

5. The method according to claim 1 or 2, characterized in that, after obtaining the first sound bank corresponding to the first scene information, the method further comprises:

The intelligent terminal caches the first sound bank;

If the intelligent terminal determines that the scene interface displayed after the update is still the first scene interface, the intelligent terminal sends the cached first sound bank to the speech recognition apparatus;

The intelligent terminal receives a third phonetic order and sends the third phonetic order to the speech recognition apparatus, the third phonetic order being used to instruct the speech recognition apparatus to search the first sound bank for a third control instruction matching the third phonetic order;

The intelligent terminal receives the third control instruction returned by the speech recognition apparatus, and executes the operation corresponding to the third control instruction.
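The caching step of claim 5 might look like the sketch below, which rebuilds a sound bank only when the scene interface has not been seen before. The `SoundBankCache` class, the `build_sound_bank` callable, and the `build_count` counter are hypothetical names introduced for illustration; the claim states only that the first sound bank is cached and reused while the displayed scene interface is unchanged.

```python
from typing import Callable, Dict

SoundBank = Dict[str, str]


class SoundBankCache:
    """Hypothetical per-scene cache of sound banks on the terminal."""

    def __init__(self, build_sound_bank: Callable[[str], SoundBank]) -> None:
        self._build = build_sound_bank
        self._cache: Dict[str, SoundBank] = {}
        self.build_count = 0  # counts how often a bank was actually built

    def get(self, scene_interface: str) -> SoundBank:
        # Build the bank on first use; reuse the cached one afterwards.
        if scene_interface not in self._cache:
            self._cache[scene_interface] = self._build(scene_interface)
            self.build_count += 1
        return self._cache[scene_interface]
```

If the scene interface displayed after an operation is still the first scene interface, calling `get` again returns the cached object, so the terminal can send it to the speech recognition apparatus without regenerating it.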

6. A voice control device, characterized by comprising:

A processing unit, configured to obtain the first scene interface currently displayed, and to obtain a first sound bank related to the first scene interface;

A receiving unit, configured to receive a first phonetic order;

The transmission unit is further configured to send the first phonetic order to the speech recognition apparatus, the first phonetic order being used to instruct the speech recognition apparatus to search the first sound bank for a first control instruction matching the first phonetic order;

7. The device as claimed in claim 6, characterized in that the processing unit is configured to:

In response to an operation enabling the event for inputting the first phonetic order, send the first sound bank to the speech recognition apparatus through the transmission unit; or

When a historical scene interface is switched to the first scene interface, send the first sound bank to the speech recognition apparatus through the transmission unit, wherein the historical scene interface is the scene interface that the intelligent terminal displayed before displaying the first scene interface.

8. The device as claimed in claim 6 or 7, characterized in that the processing unit is configured to:

Call a scripting language related to the first control instruction, the scripting language being used to cause the server to provide the processing unit with the data information required for the operation;

Wherein the device and the server conform to the application mode of the browser/server (B/S) network structure.

9. The device as claimed in claim 6 or 7, characterized in that the processing unit is further configured to, after executing the operation corresponding to the first control instruction, obtain the second scene interface displayed after the update, and obtain a second sound bank related to the second scene interface;

The receiving unit is further configured to receive a second phonetic order, the second phonetic order being sent to the speech recognition apparatus and being used to instruct the speech recognition apparatus to search the second sound bank for a second control instruction matching the second phonetic order; and to receive the second control instruction returned by the speech recognition apparatus;

10. The device as claimed in claim 6 or 7, characterized by further comprising:

A cache unit, configured to cache the first sound bank after the processing unit obtains the first sound bank corresponding to the first scene information;

The processing unit is further configured to, after executing the operation corresponding to the control instruction, if it is determined that the scene interface displayed after the update is still the first scene interface, send the first sound bank cached by the cache unit to the speech recognition apparatus;

The receiving unit is further configured to receive a third phonetic order;

The transmission unit is further configured to send the third phonetic order to the speech recognition apparatus, the third phonetic order being used to instruct the speech recognition apparatus to search the first sound bank for a third control instruction matching the third phonetic order;