CN115527531A - Equipment control method, device, equipment and storage medium - Google Patents

Equipment control method, device, equipment and storage medium

Info

Publication number
CN115527531A
Authority
CN
China
Prior art keywords
text information
target
module
identification information
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210823600.2A
Other languages
Chinese (zh)
Inventor
余海超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Coocaa Network Technology Co Ltd
Original Assignee
Shenzhen Coocaa Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Coocaa Network Technology Co Ltd
Priority to CN202210823600.2A
Publication of CN115527531A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/485 End-user interface for client configuration
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An embodiment of the invention relates to a device control method, apparatus, device, and storage medium. The method comprises: acquiring first text information and identification information of each module in the current interface of the device to obtain a first text information set and an identification information set; acquiring second text information corresponding to a voice signal of a user; determining, from the first text information set, target first text information that matches the second text information; determining, from the identification information set, target identification information corresponding to the target first text information; and controlling the target module to start according to the target identification information. The target module can thus be quickly matched from the current interface and started according to the user's voice signal, without recognizing the voice intention through a voice recognition model. This simplifies the voice recognition process, improves the efficiency of voice recognition and voice control, and improves the user experience, so that any module the user can see in the current interface can be invoked by speaking its text.

Description

Equipment control method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of voice control, in particular to a method, a device, equipment and a storage medium for controlling equipment.
Background
When a user controls a smart home appliance with a voice command, the user's intention must be recognized by a model, and the corresponding module is then controlled according to the recognized intention.
In the prior art, every voice command must be recognized globally through a voice recognition model, and the content on the current interface cannot be controlled quickly by voice. As a result, the existing voice recognition process is cumbersome and inefficient, real-time generalized text query over the current interface is not achieved, and real-time dynamic update registration is not achieved.
Disclosure of Invention
In view of this, in order to solve the technical problems of tedious voice recognition process and low efficiency, embodiments of the present invention provide a method, an apparatus, a device, and a storage medium for controlling a device.
In a first aspect, an embodiment of the present invention provides a method for controlling a device, including:
acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
acquiring second text information corresponding to a voice signal of a user;
determining target first text information matched with the second text information from the first text information set;
determining target identification information corresponding to the target first text information from the identification information set;
and controlling the target module to be started according to the target identification information.
In a possible embodiment, the obtaining the first text information of each module in the current interface of the device includes:
acquiring a text corresponding to a selectable module in the current interface of the equipment as the first text information;
the acquiring of the second text information corresponding to the voice signal of the user includes:
receiving a voice signal of a user in a preset mode;
second text information is extracted from the speech signal.
In one possible embodiment, the determining, from the first set of text information, the target first text information that matches the second text information includes:
generating an information list based on the first text information set;
determining, according to a shortest path algorithm, a plurality of matching degrees between the second text information and a plurality of pieces of first text information in the information list;
and determining the target first text information according to the plurality of matching degrees.
In one possible embodiment, the determining the target first text information according to the plurality of matching degrees includes:
determining a plurality of confidence levels corresponding to the plurality of matching degrees;
determining, from the plurality of confidence levels, the confidence level that is the largest and greater than a set threshold as a target confidence level;
and determining the first text information corresponding to the target confidence level as the target first text information.
In one possible embodiment, the method further comprises:
when target first text information matching the second text information is not determined from the first text information set, identifying the second text information through a speech intention recognition model;
and controlling the target module to be started according to the identification result.
In one possible embodiment, the method further comprises:
generating an association relationship among the module, the first text information and the identification information;
the determining target identification information corresponding to the target first text information from the identification information set includes:
determining, according to the association relationship, target identification information corresponding to the target first text information from the identification information set;
the controlling the target module to be started according to the target identification information includes:
determining a target module corresponding to the target identification information according to the association relationship;
and controlling the target module to be started.
In one possible embodiment, the method further comprises:
when the target module is a module not in the current interface, generating display information;
and displaying the display information in the current interface.
In a second aspect, an embodiment of the present invention provides a device control apparatus, including:
the acquisition module is used for acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
the acquisition module is further used for acquiring second text information corresponding to the voice signal of the user;
the processing module is used for determining target first text information matched with the second text information from the first text information set;
the processing module is further configured to determine target identification information corresponding to the target first text information from the identification information set;
and the control module is used for controlling the target module to be started according to the target identification information.
In a third aspect, an embodiment of the present invention provides an apparatus, including: a processor and a memory, the processor being configured to execute a control program of the apparatus stored in the memory to implement the control method of the apparatus of any one of the above first aspects.
In a fourth aspect, an embodiment of the present invention provides a storage medium, where one or more programs are stored, and the one or more programs are executable by one or more processors to implement the method for controlling an apparatus according to any one of the first aspects.
According to the control scheme of the equipment provided by the embodiment of the invention, a first text information set and an identification information set are obtained by acquiring first text information and identification information of each module in a current interface of the equipment; acquiring second text information corresponding to a voice signal of a user; determining target first text information matched with the second text information from the first text information set; determining target identification information corresponding to the target first text information from the identification information set; and the target module is controlled to be started according to the target identification information, so that the target module can be quickly matched and controlled from the current interface according to the first text information spoken by the user, the voice recognition process is simplified, the voice recognition and voice control efficiency is improved, and the user experience is improved, so that the user can control the module in the current interface scene of the equipment through voice.
Drawings
Fig. 1 is a schematic flowchart of a method for controlling a device according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another method for controlling a device according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a control device of an apparatus according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
For the convenience of understanding of the embodiments of the present invention, the following description will be further explained with reference to specific embodiments, which are not to be construed as limiting the embodiments of the present invention.
Fig. 1 is a schematic flowchart of a method for controlling a device according to an embodiment of the present invention, and as shown in fig. 1, the method specifically includes:
S11, acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
The control method provided by the embodiment of the invention is applied to smart home equipment provided with a display screen. The target module is controlled by acquiring the first text information on the current screen of the device and the second text information in the speech uttered by the user.
In this embodiment, the modules displayed in the current interface of the device are determined. The name of each module is extracted as its first text information, and the first text information of all modules forms the first text information set. The unique identifier of each module is extracted as its identification information, and the identification information of all modules forms the identification information set. The first text information may be the text corresponding to a module displayed on the current interface. For example, if the device is a television, the modules on its current interface may include recommended movies, categories in the navigation bar, and television functions, and the movie names, category names, and function names serve as the first text information. The identification information may be a unique ID or an address of each module displayed on the current interface. In other words, this embodiment acquires and identifies only the modules in the interface that the device is currently displaying.
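The collection step described above can be illustrated with a minimal Python sketch. The `UiModule` type, the `collect_interface_info` function, and the sample IDs and labels are hypothetical names chosen for illustration; the source does not specify how the device exposes its interface modules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UiModule:
    """Hypothetical stand-in for one module shown on the current interface."""
    module_id: str  # unique ID or address (identification information)
    label: str      # displayed name (first text information)


def collect_interface_info(modules):
    """Return (first text information set, identification information set).

    Both lists keep the on-screen order, so position i in one list
    refers to the same module as position i in the other.
    """
    first_texts = [m.label for m in modules]
    ids = [m.module_id for m in modules]
    return first_texts, ids


# Assumed example: a television home screen with three modules.
current_interface = [
    UiModule("nav.movies", "Movies"),
    UiModule("nav.tv", "TV Series"),
    UiModule("func.settings", "Settings"),
]
first_text_set, id_set = collect_interface_info(current_interface)
```

Only modules passed in as the current interface are collected, which mirrors the restriction to what the device is currently displaying.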
S12, acquiring second text information corresponding to the voice signal of the user;
In this embodiment, the device may receive the user's audio as a voice signal and extract from it the content related to a module in the current interface as the second text information, which represents the user's control instruction for the device.
Specifically, the user speaks a preset voice instruction to wake up the voice control function of the device. Once awake, the device receives the audio uttered by the user through its voice recognition device as a voice signal, recognizes the voice signal, and takes the recognized text as the second text information.
S13, determining target first text information matched with the second text information from the first text information set;
In this embodiment, the device sends a connection request to the voice cloud backend and, once the connection succeeds, sends the first text information set, the identification information set, and the second text information to it. The voice cloud backend registers a real-time information list for the first text information set and determines, from that list, the first text information that best matches the second text information as the target first text information.
Specifically, the matching may proceed as follows: keywords related to a module are extracted from the second text information, the similarity between the keywords and each piece of first text information in the real-time information list is determined, and the first text information whose similarity is the highest and greater than a set threshold is taken as the target first text information.
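The keyword-and-similarity matching described above might look like the following sketch. It is illustrative only: the command prefixes, the function names, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions, since the source does not name a specific similarity function. The 0.8 threshold is borrowed from the example value given later in the description.

```python
from difflib import SequenceMatcher

# Assumed command prefixes that are stripped to leave the module keyword.
COMMAND_PREFIXES = ("open ", "play ", "go to ")


def extract_keyword(second_text):
    """Drop a leading command word so only the module-related part remains."""
    text = second_text.strip().lower()
    for prefix in COMMAND_PREFIXES:
        if text.startswith(prefix):
            return text[len(prefix):]
    return text


def best_match(second_text, first_texts, threshold=0.8):
    """Return the first text with the highest similarity above the threshold."""
    if not first_texts:
        return None
    keyword = extract_keyword(second_text)
    scored = [
        (SequenceMatcher(None, keyword, text.lower()).ratio(), text)
        for text in first_texts
    ]
    score, text = max(scored)
    return text if score > threshold else None
```

With a list such as `["Movies", "TV Series", "Settings"]`, the utterance "open movies" resolves to "Movies", while an utterance naming nothing on screen yields `None`.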
S14, determining target identification information corresponding to the target first text information from the identification information set;
In this embodiment, each piece of identification information corresponds to a unique module, and each module corresponds to a unique piece of first text information. An association relationship between the first text information and the identification information is generated when they are acquired, and the identification information associated with the target first text information is determined to be the target identification information.
And S15, controlling the target module to be started according to the target identification information.
In this embodiment, since the target identification information is the unique ID of the module, the target module can be determined from the plurality of modules on the current interface according to the target identification information, a control instruction of the target module is generated, and the target module is controlled to be started according to the control instruction.
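As a rough illustration of this step, the sketch below builds a control instruction addressed by unique ID and dispatches it to the matching module. The instruction format, the function names, and the handler table are hypothetical; the source only states that an instruction is generated and the target module is started.

```python
def make_control_instruction(target_id, action="start"):
    """Build a control instruction addressed to one module by its unique ID."""
    return {"target_id": target_id, "action": action}


def execute(instruction, modules_by_id):
    """Dispatch the instruction to the module registered under its target ID."""
    handler = modules_by_id[instruction["target_id"]]
    return handler(instruction["action"])


# Assumed handler table: each module on the interface exposes one callable.
modules_by_id = {
    "nav.movies": lambda action: f"movies:{action}",
    "nav.tv": lambda action: f"tv:{action}",
}
result = execute(make_control_instruction("nav.movies"), modules_by_id)
```

Because the ID is unique, the lookup selects exactly one module among those on the current interface.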
For example, the first text information set obtained from the current interface includes the names of modules such as "Movies", "TV Series", and "Variety Shows". If the obtained second text information is "open movies", "movies" is extracted from it as the keyword, the first text information matching the keyword is determined to be that of the movie module, the position of the movie module in the interface is determined according to the identification information corresponding to the movie module, and the movie module is controlled to open.
According to the control method provided by the embodiment of the invention, first text information and identification information of each module in the current interface of the device are acquired to obtain a first text information set and an identification information set; second text information corresponding to a voice signal of a user is acquired; target first text information matching the second text information is determined from the first text information set; target identification information corresponding to the target first text information is determined from the identification information set; and the target module is controlled to start according to the target identification information. The target module can thus be quickly matched from the current interface and started according to the user's voice signal, without recognizing the voice intention through a voice recognition model. This simplifies the voice recognition process, improves the efficiency of voice recognition and voice control, and improves the user experience, so that any module the user can see in the current interface can be invoked by speaking its text.
Fig. 2 is a schematic flowchart of another method for controlling a device according to an embodiment of the present invention, where as shown in fig. 2, the method specifically includes:
S21, receiving a voice signal of a user in a preset mode; extracting second text information from the voice signal; acquiring texts corresponding to selectable modules in the current interface of the equipment as the first text information, and acquiring identification information of each module;
In this embodiment, the preset mode is a mode in which voice control can be performed; the second text information is the text corresponding to the user's voice instruction; a selectable module is a module in the currently displayed interface that the user can select or click, and whose corresponding text the user can say in spoken language; and the first text information is the text that the module displays in the current interface.
Specifically, the user initiates a voice conversation and wakes up the device through far-field wake-up, whereupon the device enters the preset mode. The device starts its voice module to receive the user's audio as a voice signal. The voice module then calls a system interface to acquire the text corresponding to each selectable module in the current interface as first text information, obtaining the first text information set, and acquires the unique ID of each module as identification information, obtaining the identification information set. The voice module sends the voice signal, the first text information set, and the identification information set to the voice cloud background server, which recognizes the received voice signal and extracts the second text information from it.
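The upload from the voice module to the voice cloud background server could be packaged as in the sketch below. The JSON layout, the field names (`session_id`, `modules`, `audio_hex`), and the hex encoding of the audio are assumptions made for illustration; the source does not describe the wire format.

```python
import json


def build_registration_payload(session_id, first_texts, ids, audio_bytes):
    """Bundle the voice signal with the two sets for upload to the cloud.

    The two lists are paired by position, so they must have equal length.
    """
    if len(first_texts) != len(ids):
        raise ValueError("each first text needs exactly one identification")
    payload = {
        "session_id": session_id,
        "modules": [{"id": i, "text": t} for i, t in zip(ids, first_texts)],
        "audio_hex": audio_bytes.hex(),  # assumed encoding of the voice signal
    }
    return json.dumps(payload, ensure_ascii=False)
```

Sending the sets together with each utterance is one way to keep the registered list in step with whatever the screen shows at that moment.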
S22, generating an association relationship among the module, the first text information and the identification information;
In this embodiment, each module corresponds to unique identification information and to unique first text information. After receiving the identification information and the first text information, the voice cloud background server therefore generates an association relationship between the identification information and the first text information, as well as between each of them and the module.
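One simple way to hold such an association relationship is a pair of lookup tables, as in the sketch below. The `Association` record and the `build_associations` function are hypothetical; the uniqueness check reflects the statement that each module has unique identification information and unique first text information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One row linking a module, its first text, and its identification."""
    module: str
    first_text: str
    identification: str


def build_associations(records):
    """Index (module, first_text, identification) triples by text and by ID."""
    by_text, by_id = {}, {}
    for module, text, ident in records:
        if text in by_text or ident in by_id:
            raise ValueError("first text and identification must be unique")
        row = Association(module, text, ident)
        by_text[text] = row
        by_id[ident] = row
    return by_text, by_id
```

With both indexes available, the later steps can go from target first text information to target identification information, and from there to the target module, with two dictionary lookups.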
S23, generating an information list based on the first text information set; determining, according to a shortest path algorithm, a plurality of matching degrees between the second text information and a plurality of pieces of first text information in the information list;
S24, determining a plurality of confidence levels corresponding to the plurality of matching degrees; determining, from the plurality of confidence levels, the confidence level that is the largest and greater than a set threshold as a target confidence level; determining the first text information corresponding to the target confidence level as the target first text information;
In this embodiment, after receiving the first text information set, the voice cloud background server registers a real-time information list of the first text information. It then matches the second text information against the real-time information list using a shortest path detection algorithm, determining a matching degree between the second text information and each piece of first text information to obtain a plurality of matching degrees. The matching degrees are scored to obtain a corresponding plurality of confidence levels, and the maximum confidence level is selected. The server judges whether the maximum confidence level is greater than a set threshold (for example, 0.8); if it is, the maximum confidence level is taken as the target confidence level, and the first text information corresponding to it is determined to be the target first text information.
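The scoring and threshold test can be sketched as follows. The mapping from a matching degree to a confidence level is an assumption (here a path cost normalized by length), as the source does not say how the scoring is done; only the rule "largest confidence, and greater than the threshold" and the example threshold of 0.8 come from the text.

```python
def to_confidence(cost, length):
    """Assumed scoring: turn a matching cost into a confidence in [0, 1]."""
    if length == 0:
        return 0.0
    return max(0.0, 1.0 - cost / length)


def select_target(confidences, threshold=0.8):
    """Pick the first text with the largest confidence, if above threshold.

    confidences maps each first text to its confidence level.
    """
    if not confidences:
        return None
    best_text = max(confidences, key=confidences.get)
    return best_text if confidences[best_text] > threshold else None
```

Returning `None` when the best candidate does not clear the threshold leaves room for a separate handling path when nothing on screen matches.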
The shortest path detection algorithm of this embodiment supports efficient real-time matching: the information lists of hundreds of modules on one page can be registered at once, the matching computation time is as low as 1 millisecond, and the matching is generalized and fault-tolerant.
For example, when the first text information of a module in the registered real-time information list is "Huang Rihua version of Tian Long Ba Bu", the user can simply say "Tian Long Ba Bu", "open Tian Long Ba Bu", or "Huang Rihua Tian Long Ba Bu", and the shortest path detection algorithm still matches that first text information. The user does not have to speak the first text information in full.
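The source does not disclose the details of its shortest path detection algorithm. As a stand-in, the sketch below uses a token-level alignment grid, where the matching cost is the cheapest path from the top-left to the bottom-right corner, and where skipping a registered token that the user did not say is cheap. The function name, the cost values, and the romanized titles are assumptions for illustration only.

```python
def alignment_cost(spoken, registered, skip_cost=0.1):
    """Cheapest path through the alignment grid of spoken vs. registered tokens.

    Moves: diagonal (match costs 0, mismatch costs 1), down (a spoken token
    with no counterpart costs 1), right (an unspoken registered token costs
    skip_cost, so partial titles stay cheap).
    """
    s = spoken.lower().split()
    r = registered.lower().split()
    dist = [[0.0] * (len(r) + 1) for _ in range(len(s) + 1)]
    for i in range(1, len(s) + 1):
        dist[i][0] = float(i)
    for j in range(1, len(r) + 1):
        dist[0][j] = j * skip_cost
    for i in range(1, len(s) + 1):
        for j in range(1, len(r) + 1):
            step = 0.0 if s[i - 1] == r[j - 1] else 1.0
            dist[i][j] = min(
                dist[i - 1][j - 1] + step,   # align the two tokens
                dist[i - 1][j] + 1.0,        # spoken token not in the title
                dist[i][j - 1] + skip_cost,  # title token left unspoken
            )
    return dist[len(s)][len(r)]


def closest_title(spoken, titles):
    """Return the registered title with the lowest alignment cost."""
    return min(titles, key=lambda title: alignment_cost(spoken, title))
```

Under these assumed costs, a shortened utterance such as "Tian Long Ba Bu" aligns far more cheaply with the full registered title than with an unrelated entry, which is the kind of tolerance the example describes.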
In a possible implementation, when no target first text information matching the second text information is determined from the first text information set, this indicates that what the user said does not refer to a module of the current interface, so the shortest path detection matching algorithm returns a no-match result. In this case, the second text information is recognized by the standard voice intention recognition model, and the control instruction corresponding to the recognition result is returned to the voice module of the device.
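The fallback order can be sketched as below. Both callables are placeholders: `match_on_interface` stands for the on-screen matching step and `recognize_intent` for the standard voice intention recognition model, neither of which is specified in code form by the source.

```python
def resolve(second_text, match_on_interface, recognize_intent):
    """Try the current-interface match first; fall back to the intent model.

    match_on_interface returns a target or None (the no-match result);
    recognize_intent is only called when no on-screen match exists.
    """
    target = match_on_interface(second_text)
    if target is not None:
        return {"source": "interface", "target": target}
    return {"source": "intent_model", "target": recognize_intent(second_text)}
```

Keeping the intent model as the second step means it is skipped entirely whenever the utterance names something visible on screen.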
S25, determining target identification information corresponding to the target first text information from the identification information set according to the association relationship; determining a target module corresponding to the target identification information according to the association relationship; and controlling the target module to be started.
In this embodiment, the identification information corresponding to the target first text information is determined from the association relationship as the target identification information, and the target module corresponding to the target identification information is determined. The server generates a control instruction corresponding to the target first text information and returns the control instruction and the target identification information to the voice module of the device. After receiving them, the voice module calls the system interface of the current interface so that, according to the target identification information, the target module executes the control instruction and is started.
In a possible implementation, when the module corresponding to the second text information spoken by the user is not a module of the current interface, the target module that the user wants to open is not in the current interface. In this case, display information is generated and displayed in the current interface to remind the user that the module corresponding to the voice input is not in the current interface, so that the user can speak again, or to ask the user whether to open that module.
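A minimal sketch of this branch follows. The response format, the function name, and the wording of the prompt are hypothetical; the source only states that display information is generated and shown when the target is not in the current interface.

```python
def plan_response(target_id, target_name, current_interface_ids):
    """Start an on-screen module directly; otherwise produce display information."""
    if target_id in current_interface_ids:
        return {"action": "start", "target_id": target_id}
    message = (
        f'"{target_name}" is not on the current screen. '
        "Say the command again, or confirm to open it."
    )
    return {"action": "display", "message": message}
```

The display branch covers both outcomes mentioned in the text: the user may repeat the command, or confirm opening the off-screen module.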
In the control method of the device provided by this embodiment, a voice signal of a user is received in a preset mode; second text information is extracted from the voice signal; texts corresponding to selectable modules in the current interface of the device are acquired as first text information, and identification information of each module is acquired; an association relationship among the module, the first text information and the identification information is generated; an information list is generated based on the first text information set; a plurality of matching degrees between the second text information and a plurality of pieces of first text information in the information list are determined according to a shortest path algorithm; a plurality of confidence levels corresponding to the plurality of matching degrees are determined; the confidence level that is the largest and greater than a set threshold is determined as the target confidence level; the first text information corresponding to the target confidence level is determined as the target first text information; target identification information corresponding to the target first text information is determined from the identification information set according to the association relationship; a target module corresponding to the target identification information is determined according to the association relationship; and the target module is controlled to be started.
Real-time data registration is achieved through a real-time registration mechanism between the cloud and the device. Combining the cloud-side shortest path detection matching algorithm with this real-time registration capability enables efficient, generalized matching of text similarity. A module can therefore be selected directly, without clicking it with a remote controller and without recognizing the voice content through a model, which improves the voice recognition and voice control efficiency for the current interface and simplifies the voice recognition process.
Fig. 3 is a schematic structural diagram of a control device of an apparatus according to an embodiment of the present invention, which specifically includes:
the acquisition module 31 is configured to acquire first text information and identification information of each module in the current interface of the device, so as to obtain a first text information set and an identification information set;
the acquisition module 31 is further configured to acquire second text information corresponding to a voice signal of a user;
a processing module 32, configured to determine, from the first text information set, target first text information that matches the second text information;
the processing module 32 is further configured to determine target identification information corresponding to the target first text information from the identification information set;
and the control module 33 is used for controlling the target module to be started according to the target identification information.
In a possible implementation, the acquisition module 31 is specifically configured to: acquire the text corresponding to a selectable module in the current interface of the device as the first text information;
receive a voice signal of a user in a preset mode;
and extract second text information from the voice signal.
In a possible embodiment, the processing module 32 is specifically configured to: generate an information list based on the first text information set;
determine, according to a shortest path algorithm, a plurality of matching degrees between the second text information and a plurality of pieces of first text information in the information list;
and determine the target first text information according to the plurality of matching degrees.
In a possible embodiment, the processing module 32 is specifically configured to: determine a plurality of confidence levels corresponding to the plurality of matching degrees;
determine, from the plurality of confidence levels, the confidence level that is the largest and greater than a set threshold as a target confidence level;
and determine the first text information corresponding to the target confidence level as the target first text information.
In one possible embodiment, the processing module 32 is specifically configured to identify the second text information through a speech intention recognition model when the target first text information matching the second text information is not determined from the first text information set;
the control module 33 is specifically configured to control the target module to be started according to the recognition result.
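The fallback described above can be sketched as a two-stage lookup. Both callables are placeholders: `match_on_screen` stands for the matching against the first text information set, and `recognize_intent` stands for the speech intention recognition model, whose interface is not described in the source. The name `resolve_command` and the tagged return value are assumptions made for illustration.

```python
from typing import Any, Callable, Optional, Tuple


def resolve_command(
    second_text: str,
    match_on_screen: Callable[[str], Optional[str]],
    recognize_intent: Callable[[str], Any],
) -> Tuple[str, Any]:
    """Try the on-screen match first; fall back to intent recognition on a miss."""
    target = match_on_screen(second_text)
    if target is not None:
        return ("on_screen", target)
    # No target first text information was determined, so defer to the intent model.
    return ("intent", recognize_intent(second_text))
```

The tag in the returned tuple lets the control step tell a direct on-screen hit apart from a result produced by the intent model.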
In a possible embodiment, the processing module 32 is specifically configured to generate an association relationship among the module, the first text information, and the identification information;
determining target identification information corresponding to the target first text information from the identification information set according to the association relationship;
determining a target module corresponding to the target identification information according to the association relationship;
the control module 33 is specifically configured to control the target module to be started.
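One way to picture the association relationship is as a pair of lookup tables linking text to identification information and identification information to module. This is a minimal sketch under that assumption; the `Module` and `Association` classes, their fields, and the `start` callback are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class Module:
    identification: str        # identification information of the module
    text: str                  # first text information shown in the interface
    start: Callable[[], str]   # hypothetical callback that starts the module


class Association:
    """Links module, first text information, and identification information."""

    def __init__(self, modules: Iterable[Module]) -> None:
        modules = list(modules)
        self._id_by_text: Dict[str, str] = {m.text: m.identification for m in modules}
        self._module_by_id: Dict[str, Module] = {m.identification: m for m in modules}

    def identification_for(self, target_text: str) -> str:
        return self._id_by_text[target_text]

    def module_for(self, identification: str) -> Module:
        return self._module_by_id[identification]
```

With such a structure, the processing step resolves text to identification information and the control step resolves identification information to the module it starts.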
In a possible embodiment, the processing module 32 is specifically configured to generate display information when the target module is a module not in the current interface;
and displaying the display information in the current interface.
The apparatus for controlling a device provided in this embodiment may be the apparatus shown in fig. 3, and may perform all the steps of the method for controlling the device shown in fig. 1 and 2, so as to achieve the technical effect of the method for controlling the device shown in fig. 1 and 2.
Fig. 4 is a schematic structural diagram of an apparatus according to an embodiment of the present invention, where the apparatus 400 shown in fig. 4 includes: at least one processor 401, memory 402, at least one network interface 404, and other user interfaces 403. The various components in the device 400 are coupled together by a bus system 405. It is understood that the bus system 405 is used to enable connection communication between these components. The bus system 405 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 405 in fig. 4.
The user interface 403 may include a display, a keyboard, or a pointing device (e.g., a mouse, a trackball, a touch pad, or a touch screen).
It will be appreciated that the memory 402 in embodiments of the invention may be either volatile memory or non-volatile memory, or may include both. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 402 described herein is intended to include, without being limited to, these and any other suitable types of memory.
In some embodiments, memory 402 stores the following elements, executable units or data structures, or a subset thereof, or an expanded set thereof: an operating system 4021 and application programs 4022.
The operating system 4021 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is configured to implement various basic services and process hardware-based tasks. The application programs 4022 include various application programs, such as a Media Player (Media Player), a Browser (Browser), and the like, for implementing various application services. A program for implementing the method according to the embodiment of the present invention may be included in the application 4022.
In this embodiment of the present invention, by calling a program or an instruction stored in the memory 402, specifically, a program or an instruction stored in the application 4022, the processor 401 is configured to execute the method steps provided by the method embodiments, for example, including:
acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
acquiring second text information corresponding to a voice signal of a user;
determining target first text information matched with the second text information from the first text information set;
determining target identification information corresponding to the target first text information from the identification information set;
and controlling the target module to be started according to the target identification information.
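The five steps above can be sketched end to end. This is an illustration only: the standard-library `difflib.SequenceMatcher` ratio stands in for the shortest-path matching described in the embodiments, speech-to-text is assumed to have already produced `second_text`, and the function name `control_device`, the `(text, identification)` pairs, and the default threshold are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def control_device(
    modules: List[Tuple[str, str]],  # (first text information, identification information)
    second_text: str,                # text already extracted from the user's voice signal
    threshold: float = 0.6,          # assumed value; the source only says "a set threshold"
) -> Optional[str]:
    """Return the identification information of the module to start, or None."""
    # Step 1: build the first text information set and the identification information set.
    first_texts = [text for text, _ in modules]
    identifications = [ident for _, ident in modules]
    if not first_texts:
        return None

    # Step 3: score each first text against the spoken text (stand-in similarity).
    scores = [SequenceMatcher(None, text, second_text).ratio() for text in first_texts]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] <= threshold:
        return None

    # Step 4: look up the identification information of the matched text.
    return identifications[best]
```

Step 5, starting the module, is left to the caller, which can use the returned identification information to launch the target module or fall back to another strategy when `None` is returned.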
The method disclosed in the above embodiments of the present invention may be applied to the processor 401, or implemented by the processor 401. The processor 401 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in the processor 401 or by instructions in the form of software. The processor 401 may be a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present invention may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software elements in the decoding processor. The software elements may be located in RAM, flash memory, ROM, PROM, EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory 402, and the processor 401 reads the information in the memory 402 and completes the steps of the method in combination with its hardware.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field-Programmable Gate Arrays (FPGAs), general-purpose processors, controllers, micro-controllers, microprocessors, other electronic units configured to perform the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by means of units performing the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
The device provided in this embodiment may be the device shown in fig. 4, and may perform all the steps of the control method of the device shown in fig. 1-2, so as to achieve the technical effect of the control method of the device shown in fig. 1-2, and for brevity, it is not described herein again.
An embodiment of the invention also provides a storage medium (a computer-readable storage medium) that stores one or more programs. The storage medium may include volatile memory, such as random access memory; it may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid state disk; it may also comprise a combination of the above kinds of memory.
When the one or more programs in the storage medium are executed by one or more processors, the control method of the device executed on the device side described above is realized.
The processor is configured to execute a control program of the device stored in the memory to implement the following steps of the control method of the device executed on the device side:
acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
acquiring second text information corresponding to a voice signal of a user;
determining target first text information matched with the second text information from the first text information set;
determining target identification information corresponding to the target first text information from the identification information set;
and controlling the target module to be started according to the target identification information.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of controlling a device, comprising:
acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
acquiring second text information corresponding to a voice signal of a user;
determining target first text information matched with the second text information from the first text information set;
determining target identification information corresponding to the target first text information from the identification information set;
and controlling the target module to be started according to the target identification information.
2. The method of claim 1, wherein the obtaining first text information of each module in the current interface of the device comprises:
acquiring a text corresponding to a selectable module in the current interface of the equipment as the first text information;
the acquiring of the second text information corresponding to the voice signal of the user includes:
receiving a voice signal of a user in a preset mode;
and extracting the second text information from the voice signal.
3. The method of claim 1, wherein the determining the target first textual information from the first set of textual information that matches the second textual information comprises:
generating an information list based on the first text information set;
determining, according to a shortest path algorithm, a matching degree between the second text information and each of a plurality of pieces of first text information in the information list;
and determining the target first text information according to the matching degrees.
4. The method according to claim 3, wherein said determining the target first text information according to the plurality of matching degrees comprises:
determining a plurality of confidence degrees for the plurality of matching degrees;
determining, from the plurality of confidence degrees, the confidence degree that is the largest and also exceeds a set threshold as a target confidence degree;
and determining the first text information corresponding to the target confidence degree as the target first text information.
5. The method of claim 1, further comprising:
when target first text information matching the second text information is not determined from the first text information set, identifying the second text information through a speech intention recognition model;
and controlling the target module to be started according to the recognition result.
6. The method of claim 4, further comprising:
generating an association relationship among the module, the first text information and the identification information;
the determining target identification information corresponding to the target first text information from the identification information set comprises:
determining target identification information corresponding to the target first text information from the identification information set according to the association relationship;
the controlling the target module to be started according to the target identification information comprises:
determining a target module corresponding to the target identification information according to the association relationship;
and controlling the target module to be started.
7. The method of claim 1, further comprising:
when the target module is a module not in the current interface, generating display information;
and displaying the display information in the current interface.
8. A control apparatus of a device, characterized by comprising:
the acquisition module is used for acquiring first text information and identification information of each module in the current interface of the equipment to obtain a first text information set and an identification information set;
the acquisition module is further used for acquiring second text information corresponding to the voice signal of the user;
the processing module is used for determining target first text information matched with the second text information from the first text information set;
the processing module is further configured to determine target identification information corresponding to the target first text information from the identification information set;
and the control module is used for controlling the target module to be started according to the target identification information.
9. An apparatus, comprising: a processor and a memory, the processor being configured to execute a control program of the apparatus stored in the memory to implement the control method of the apparatus of any one of claims 1 to 7.
10. A storage medium characterized in that the storage medium stores one or more programs executable by one or more processors to implement a control method of an apparatus according to any one of claims 1 to 7.
CN202210823600.2A 2022-07-12 2022-07-12 Equipment control method, device, equipment and storage medium Pending CN115527531A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210823600.2A CN115527531A (en) 2022-07-12 2022-07-12 Equipment control method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210823600.2A CN115527531A (en) 2022-07-12 2022-07-12 Equipment control method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115527531A true CN115527531A (en) 2022-12-27

Family

ID=84695262

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210823600.2A Pending CN115527531A (en) 2022-07-12 2022-07-12 Equipment control method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115527531A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116403581A (en) * 2023-03-30 2023-07-07 成都赛力斯科技有限公司 Vehicle-mounted voice interaction method, device, computer equipment and storage medium
CN116403581B (en) * 2023-03-30 2025-06-24 重庆赛力斯凤凰智创科技有限公司 In-vehicle voice interaction method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
US10706854B2 (en) Dialog management with multiple applications
EP3724875B1 (en) Text independent speaker recognition
US20220180868A1 (en) Streaming Action Fulfillment Based on Partial Hypotheses
JP6588637B2 (en) Learning personalized entity pronunciation
US20200151258A1 (en) Method, computer device and storage medium for impementing speech interaction
CN109741737B (en) Voice control method and device
CN105931644A (en) Voice recognition method and mobile terminal
US12217751B2 (en) Digital signal processor-based continued conversation
WO2019062090A1 (en) Method and apparatus for controlling service device to perform service operation, device, and medium
CN112669842A (en) Man-machine conversation control method, device, computer equipment and storage medium
US20240212687A1 (en) Supplemental content output
CN112185374A (en) Method and device for determining voice intention
US10861453B1 (en) Resource scheduling with voice controlled devices
CN115527531A (en) Equipment control method, device, equipment and storage medium
US20230188488A1 (en) Voice user interface sharing of content
CN113241067B (en) Voice interaction method and system and voice interaction equipment
US12267286B1 (en) Sharing of content
KR20220135398A (en) Speech procssing method and apparatus thereof
CN107577728B (en) User request processing method and device
CN115662430A (en) Input data analysis method and device, electronic equipment and storage medium
CN113360607A (en) Information query method, device and storage medium
US20210049215A1 (en) Shared Context Manager for Cohabitating Agents
WO2025137255A1 (en) Utilizing generative model in generating summary of long-form content
CN115273903A (en) Information pushing method and device, electronic equipment and storage medium
CN120048249A (en) Method, apparatus, device and medium for translating voice data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination