CN111429907B

CN111429907B - Voice service mode switching method, device, equipment and storage medium

Info

Publication number: CN111429907B
Application number: CN202010220646.6A
Authority: CN
Inventors: 李扬; 李士岩
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2020-03-25
Filing date: 2020-03-25
Publication date: 2023-10-20
Anticipated expiration: 2040-03-25
Also published as: CN111429907A

Abstract

This application discloses a voice service mode switching method, device, equipment and storage medium, and relates to the field of intelligent voice technology. The specific implementation is: identifying the current service scenario of an avatar; determining, according to the current service scenario, the target service mode corresponding to that scenario; and switching the service mode of the avatar to the target service mode and performing voice interaction with the user according to the target service mode. In the embodiments of this application, the service mode is switched according to the service scenario, so that the avatar better fits the current service scenario and is more readily accepted by users, thereby providing services to users accurately and improving user experience.

Description

Voice service mode switching method, device, equipment and storage medium

Technical field

This application relates to the field of computer technology, and in particular to intelligent voice technology.

Background

With the development of information technology, intelligent voice technology has become the most convenient and effective means for people to obtain information and communicate. For example, intelligent voice robots are often placed in shopping malls, exhibition halls and similar venues. Users can interact with the robots by voice, and the robots answer users' questions, promote products, chat with users, and so on, providing convenience to users.

Existing intelligent voice robots usually perform fixed functions in fixed scenarios. A fixed corpus is usually configured in advance, so the questions and answers are fixed: the robot answers whatever the user asks, and some answers may be inappropriate. The interaction mode between the robot and the user is relatively rigid and lacks a human touch, and the user experience is poor.

Summary of the invention

This application provides a voice service mode switching method, device, equipment and storage medium, so as to conduct voice interaction with users in a suitable service mode in different service scenarios, provide services to users accurately, and improve user experience.

A first aspect of this application provides a voice service mode switching method, including:

identifying the current service scenario of an avatar;

determining, according to the current service scenario, a target service mode corresponding to the current service scenario, wherein the target service mode includes at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy of the avatar;

switching the service mode of the avatar to the target service mode, and performing voice interaction with the user according to the target service mode.

A second aspect of this application provides a voice service mode switching device, including:

a scenario recognition module, configured to identify the current service scenario of an avatar;

a service mode determination module, configured to determine, according to the current service scenario, a target service mode corresponding to the current service scenario, wherein the target service mode includes at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy of the avatar;

a processing module, configured to switch the service mode of the avatar to the target service mode, and to perform voice interaction with the user according to the target service mode.

A third aspect of this application provides an electronic device, including:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the method described in the first aspect.

A fourth aspect of this application provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to execute the method described in the first aspect.

A fifth aspect of this application provides a computer program including program code. When a computer runs the computer program, the program code executes the method described in the first aspect.

An embodiment of the above application has the following advantages or beneficial effects: the current service scenario of the avatar is identified; the target service mode corresponding to the current service scenario is determined according to the current service scenario; and the service mode of the avatar is switched to the target service mode, with voice interaction performed with the user according to the target service mode. In the embodiments of this application, the service mode is switched according to the service scenario, so that the avatar better fits the current service scenario and is more readily accepted by users, thereby providing services to users accurately and improving user experience.

Other effects of the above optional implementations are described below in conjunction with specific embodiments.

Description of the drawings

The accompanying drawings are provided for a better understanding of the present solution and do not limit this application. In the drawings:

Figure 1 is a system schematic diagram of a voice service mode switching method provided by an embodiment of this application;

Figure 2 is a schematic diagram of a voice service mode switching method provided by an embodiment of this application;

Figure 3 is a schematic diagram of a voice service mode switching method provided by another embodiment of this application;

Figure 4 is a schematic diagram of a voice service mode switching method provided by another embodiment of this application;

Figure 5 is a block diagram of a voice service mode switching device provided by an embodiment of this application;

Figure 6 is a block diagram of an electronic device used to implement the voice service mode switching method according to an embodiment of this application.

Detailed description

Exemplary embodiments of this application are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this application. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.

To make the technical solution of this application clear, the prior art is first described in detail. Existing intelligent voice robots usually perform fixed functions in fixed scenarios. A fixed corpus is usually configured in advance, so the questions and answers are fixed: the robot answers whatever the user asks, and some answers may be inappropriate. The interaction mode between the robot and the user is relatively rigid and lacks a human touch, and the user experience is poor.

To solve the above problems, this application pre-configures different service modes for different service scenarios, such as sales, customer service, guidance, consultation, and companionship scenarios. A suitable service mode can then be selected according to the current service scenario for voice interaction with the user, so that services are provided to users accurately and user experience is improved.

Further, this application uses an avatar shown on a display device to interact with the user by voice, which makes it easier to switch service modes. In different service modes, at least one of the avatar's appearance, action strategy, interaction logic, and speech strategy may differ.

The voice service mode switching method provided by this application can be applied to the intelligent voice system shown in Figure 1. The system includes a display device 111 and a control device 112, and may further include sensors, a speaker, and so on (not shown in Figure 1). The sensors may include a sensor that captures sound, a sensor that captures images, and the like. The display device 111 may be used to display the avatar 113. The control device 112 is used to execute the voice service mode switching method of this application, that is, to identify the current service scenario of the avatar, determine the target service mode according to the current service scenario, switch the service mode of the avatar to the target service mode, and perform voice interaction with the user 120 according to the target service mode. Optionally, the control device 112 may be integrated with the display device 111.

The voice service mode switching process of this application is described in detail below with reference to specific embodiments.

An embodiment of this application provides a voice service mode switching method. Figure 2 is a flow chart of the voice service mode switching method provided by this embodiment. The method may be executed by the control device of the intelligent voice system. As shown in Figure 2, the specific steps of the voice service mode switching method are as follows:

S201. Identify the current service scenario of the avatar.

In this embodiment, the avatar is shown on a fixed display device, can capture the user's voice through sensors, and interacts with the user by voice to provide voice interaction services. The display device may be installed in a storefront, shopping mall, lobby, visitor center, park, or similar location. While an avatar is serving users, different service scenarios arise. For example, the avatar on a display device in a shopping mall may encounter users who want to make an inquiry, chat, or ask for directions. In this embodiment, each such situation is treated as one service scenario. Optionally, the service scenarios may include, but are not limited to, any of the following: a sales scenario, a customer service scenario, a guidance scenario, a consultation scenario, and a companionship scenario. Providing richer and finer-grained service scenarios makes more precise service easier to achieve.

Optionally, the avatar may be shown on an ordinary display, or on a transparent display device such as an air screen. The avatar may be a three-dimensional character, so that the user feels the character is standing in front of them, which gives a more realistic visual effect and improves user experience. Other display methods may of course be used and are not enumerated here. Using an avatar makes it more convenient to switch between service modes, in particular to switch the avatar's appearance and action strategy.

In this embodiment, the current service scenario may be determined by obtaining the user's intent or by other means, so that different service modes are used to serve the user in different service scenarios.

S202. According to the current service scenario, determine the target service mode corresponding to the current service scenario.

The target service mode includes at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy of the avatar.

In this embodiment, once the current service scenario has been determined, the corresponding target service mode can be determined from it. That is, different service scenarios correspond to different service modes, and across service modes at least one of the avatar's appearance, action strategy, interaction logic, and speech strategy may differ.

For example, in the service mode for the sales scenario, the avatar dresses relatively formally, possibly in the same outfit as the store staff, and smiles gently. Its movements suit the sales scenario: gestures should be gentle, not too large, and not too frequent. The interaction logic follows that of a sales scenario, defining when a user's question may be answered directly, when it should be answered tactfully, and when it should not be answered; this logic can be configured in advance. The speech strategy composes utterances from a pre-configured corpus for the sales scenario and uses a timbre, tone, and intonation suited to sales. As another example, in the service mode for the companionship scenario, the avatar dresses more casually and may smile more freely, including exaggerated laughter, and its movements may also be more exaggerated. The interaction logic follows that of a companionship scenario: the avatar may answer user questions, banter, and tell jokes; this logic can likewise be configured in advance. The speech strategy composes utterances from a pre-configured corpus for the companionship scenario and uses a timbre, tone, and intonation suited to companionship.

S203. Switch the service mode of the avatar to the target service mode, and perform voice interaction with the user according to the target service mode.

In this embodiment, after the target service mode of the avatar is determined, the avatar's service mode is switched and voice interaction with the user proceeds in the target service mode, so that the avatar better fits the current service scenario, is more readily accepted by users, and provides services to users accurately.

The voice service mode switching method provided by this embodiment identifies the current service scenario of the avatar; determines, according to the current service scenario, the target service mode corresponding to that scenario; and switches the service mode of the avatar to the target service mode, performing voice interaction with the user according to the target service mode. In this embodiment, the service mode is switched according to the service scenario, so that the avatar better fits the current service scenario and is more readily accepted by users, thereby providing services to users accurately and improving user experience.
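Steps S201 to S203 can be sketched as a single per-turn routine. This is a minimal illustration under stated assumptions, not the patent's implementation: `ServiceMode`, `Avatar`, `handle_turn`, and the `identify_scenario` callback are hypothetical names, and `reply` is a placeholder for real dialogue generation and speech synthesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ServiceMode:
    """Hypothetical container for the four mode components named in the text."""
    appearance: str
    action_strategy: str
    interaction_logic: str
    speech_strategy: str


class Avatar:
    """Hypothetical avatar that holds its current service mode."""

    def __init__(self, mode: ServiceMode) -> None:
        self.mode = mode

    def switch_to(self, mode: ServiceMode) -> None:
        self.mode = mode

    def reply(self, utterance: str) -> str:
        # Placeholder: a real system would generate and synthesize speech here.
        return f"({self.mode.speech_strategy}) reply to: {utterance}"


def handle_turn(
    avatar: Avatar,
    utterance: str,
    identify_scenario: Callable[[str], str],
    modes: Dict[str, ServiceMode],
) -> str:
    scenario = identify_scenario(utterance)  # S201: identify the scenario
    target = modes[scenario]                 # S202: look up the target mode
    avatar.switch_to(target)                 # S203: switch the mode ...
    return avatar.reply(utterance)           # ... and interact in that mode
```

The scenario identifier is passed in as a callback so that any of the recognition approaches described in the text (voice intent, image recognition, face recognition) could be plugged in.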

Building on the above embodiment, as shown in Figure 3, determining the target service mode corresponding to the current service scenario according to the current service scenario in S202 may specifically include:

S301. According to the current service scenario, determine at least one of the target appearance, target action strategy, target interaction logic, and target speech strategy of the avatar corresponding to the current service scenario;

S302. Determine the at least one of the target appearance, target action strategy, target interaction logic, and target speech strategy as the target service mode.

In this embodiment, to serve users more accurately, the target service mode is determined according to the current service scenario, and may include at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy. When a mode switch is needed, only one or more of the avatar's appearance, action strategy, interaction logic, and speech strategy need to change, so service modes are switched more flexibly and naturally.

More specifically, determining the target service mode corresponding to the current service scenario according to the current service scenario includes:

determining, according to the current service scenario and a correspondence between preset service scenarios and preset service modes, the preset service mode corresponding to the current service scenario as the target service mode;

wherein the correspondence between preset service scenarios and preset service modes includes a correspondence between each preset service scenario and at least one of a preset appearance, a preset action strategy, a preset interaction logic, and a preset speech strategy of the avatar.

In this embodiment, the preset service modes corresponding to different preset service scenarios can be configured in advance. That is, at least one of the avatar's preset appearance, preset action strategy, preset interaction logic, and preset speech strategy is configured for each preset service scenario, which yields the correspondence between each preset service scenario and the configured items. The target service mode can then be determined from the current service scenario and this correspondence. Configuring the preset service modes in advance allows the avatar to switch accurately to a suitable service mode when the service scenario changes, and to serve users precisely.
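The preset correspondence described above can be sketched as a lookup table in which each scenario lists only the components it overrides, reflecting the "at least one of" wording. This is an illustrative sketch: the table contents, `PRESET_MODES`, and `target_mode` are assumptions, and leaving unlisted components at their current values is one possible reading rather than something the text prescribes.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class ServiceMode:
    appearance: str
    action_strategy: str
    interaction_logic: str
    speech_strategy: str


# Hypothetical preset correspondence: scenario -> components to override.
PRESET_MODES: Dict[str, Dict[str, str]] = {
    "sales": {
        "appearance": "formal uniform",
        "action_strategy": "small, gentle gestures",
        "speech_strategy": "sales corpus, polite tone",
    },
    "companionship": {
        "appearance": "casual clothes",
        "action_strategy": "exaggerated gestures allowed",
        "interaction_logic": "jokes allowed",
        "speech_strategy": "companionship corpus, relaxed tone",
    },
}


def target_mode(current: ServiceMode, scenario: str) -> ServiceMode:
    """Return the mode for `scenario`, changing only the configured components."""
    overrides = PRESET_MODES.get(scenario, {})  # unknown scenario: no change
    return replace(current, **overrides)
```

For example, switching to `"sales"` from a chat mode changes the appearance, action strategy, and speech strategy but keeps the current interaction logic, because the sales entry does not configure one.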

Building on any of the above embodiments, identifying the current service scenario of the avatar in S201 may use interaction scenario recognition methods to determine which of the sales, customer service, guidance, consultation, companionship, or other scenarios the current service scenario is.

In an optional embodiment, as shown in Figure 4, identifying the current service scenario of the avatar in S201 may specifically include:

S401. Determine the user's intent according to the user's voice command;

S402. Identify the current service scenario of the avatar according to the user's intent.

In this embodiment, while the user is talking with the avatar, the current service scenario can be determined from the captured voice command. Speech recognition and semantic understanding are performed on the voice command to determine the user's intent, from which it can be determined whether the user needs sales, customer service, guidance, consultation, or companionship, and hence what the avatar's current service scenario is. For example, if the user asks where the restroom is, the user's intent is determined to be getting directions to the restroom from the avatar, and the current service scenario is determined to be the guidance scenario. As another example, if the user asks for product information, the user's intent is determined to be buying the product or learning about it, and the current service scenario is determined to be the sales scenario.
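S401 and S402 can be sketched as two small functions. This is a simplified stand-in: keyword matching on an already transcribed utterance replaces the speech recognition and semantic understanding the text refers to, and the intent names, phrase lists, and the `"consultation"` fallback are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical intent vocabulary; a real system would use an NLU model.
INTENT_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "ask_location": ("where is", "how do i get to"),
    "ask_product": ("price", "how much", "in stock"),
}

# Hypothetical mapping from intent to service scenario.
INTENT_TO_SCENARIO: Dict[str, str] = {
    "ask_location": "guidance",
    "ask_product": "sales",
}


def determine_intent(transcript: str) -> Optional[str]:
    """S401: map a transcribed voice command to an intent, or None."""
    lowered = transcript.lower()
    for intent, phrases in INTENT_KEYWORDS.items():
        if any(phrase in lowered for phrase in phrases):
            return intent
    return None


def identify_scenario(transcript: str, default: str = "consultation") -> str:
    """S402: map the intent to a service scenario, falling back to `default`."""
    intent = determine_intent(transcript)
    return INTENT_TO_SCENARIO.get(intent, default)
```

With these tables, a question about the restroom location resolves to the guidance scenario and a question about a product's price resolves to the sales scenario, matching the two examples in the text.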

In another optional embodiment, identifying the current service scenario of the avatar in S201 may specifically include:

if a product is recognized in the user's hand, determining that the current service scenario of the avatar is the sales scenario.

In this embodiment, when the user is interacting with the avatar or is in front of it, images of the user can be captured and recognized. If a product is recognized in the user's hand, the current service scenario of the avatar is determined to be the sales scenario. The target service mode is then determined to be the service mode corresponding to the sales scenario, the avatar's service mode is switched to that mode, and the avatar explains information about the product to the user in that mode. In this embodiment, when the avatar is in a service mode for a non-sales scenario and a product is recognized in the user's hand, it automatically switches to the service mode for the sales scenario. This allows products to be promoted to users more intelligently, providing accurate product promotion services and improving user experience.
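The rule above reduces to a check over the output of an image recognizer. The sketch below assumes a hypothetical `Detection` record produced by some upstream vision component; the record's fields and the `"product"` label are illustrative and not taken from the source.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Detection:
    """Hypothetical output of an upstream image recognizer."""
    label: str          # e.g. "product", "phone", "bag"
    held_in_hand: bool  # whether the object is in the user's hand


def scenario_from_image(detections: Iterable[Detection], current: str) -> str:
    """Return "sales" if a product is held in hand, else keep `current`."""
    for detection in detections:
        if detection.label == "product" and detection.held_in_hand:
            return "sales"
    return current
```

Keeping the current scenario when no product is detected means this check can run alongside other recognition paths without overriding them.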

Performing face recognition on the user, and determining the current service scenario of the avatar according to the face recognition result.

In this embodiment, when the user is interacting with the avatar or is in front of it, face recognition can be performed on the user so that different service modes are used for different users. For example, in a greeting scenario, different greeting service modes can be used for different users. That is, the greeting scenario is divided into finer-grained scenarios corresponding to different users, so that more flexible and targeted service modes meet the needs of different users, improving service quality and user experience.

Further, in this embodiment a correspondence between preset user identity information and preset service scenarios can be configured, and determining the current service scenario of the avatar according to the face recognition result may then include:

determining user identity information according to the face recognition result; and determining, according to the correspondence between preset user identity information and preset service scenarios, the preset service scenario corresponding to the user identity as the current service scenario of the avatar.

In this embodiment, the user identity information may include the user's gender, age, and similar information, and may also include user information entered in advance, such as the user's name or occupation. The correspondence between preset user identity information and preset service scenarios may relate categories of users to preset service scenarios, for example by age group or by gender. It may also be refined down to a correspondence between an individual user and a preset service scenario, so that services are provided in a more targeted way and the needs of different users are met.
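The identity-to-scenario correspondence can be sketched as a two-level lookup that checks a per-user entry first and falls back to a category such as age group. All names, table entries, age bands, and scenario labels below are hypothetical, and the identity record is assumed to come from an upstream face recognition step that is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class UserIdentity:
    """Hypothetical result of face recognition plus pre-entered user data."""
    user_id: Optional[str] = None
    age: Optional[int] = None


# Hypothetical per-user correspondence (the most specific level).
PER_USER: Dict[str, str] = {"vip-001": "greeting_vip"}

# Hypothetical per-category correspondence: inclusive age bands.
BY_AGE_BAND: List[Tuple[Tuple[int, int], str]] = [
    ((0, 12), "greeting_child"),
    ((60, 200), "greeting_senior"),
]


def scenario_for(identity: UserIdentity, default: str = "greeting_general") -> str:
    """Pick the preset scenario for a user: individual first, then category."""
    if identity.user_id in PER_USER:
        return PER_USER[identity.user_id]
    if identity.age is not None:
        for (low, high), scenario in BY_AGE_BAND:
            if low <= identity.age <= high:
                return scenario
    return default
```

Checking the individual entry before the category is a design choice made here so that a specific configuration for one user takes precedence over the group default.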

Building on any of the above embodiments, switching the service mode of the avatar to the target service mode in S203 may specifically include:

if a preset condition is met, switching the service mode of the avatar to the target service mode in real time;

if the preset condition is not met, keeping the service mode of the avatar unchanged;

wherein the preset condition is a preset time condition and/or a preset environmental condition.

In this embodiment, the service mode may be switched in real time whenever the service scenario changes. Alternatively, preset conditions may be defined in advance, so that the service mode is switched only when the preset conditions are met and is left unchanged otherwise, which makes mode switching better suited to actual conditions and more regulated. For example, if the current time falls within a first preset time period, the avatar may switch service modes in real time; if the current time does not fall within the first preset time period (for example, it falls within a second preset time period), the avatar may keep its service mode unchanged. For instance, the avatar may be designated to keep the service mode corresponding to the sales scenario during the second preset time period; when a user wants to chat or wants companionship, the avatar does not switch modes and may decline the request, for example by replying "I am working and cannot chat with you." As another example, when the display device showing the avatar is located in front of a counter, the avatar is required to keep the service mode corresponding to the sales scenario, and only after the display device is moved to another location may the service mode be switched in real time according to the service scenario. Optionally, multiple display devices may be provided, and the display device on which the avatar appears is determined according to the user position collected by sensors, so that the avatar follows the user and provides accompanying service. When the avatar moves to a given environment, that is, when it appears on the display device located in that environment (for example, the display device in front of the counter), the avatar's service mode may be set to remain unchanged, for example kept as the service mode corresponding to the sales scenario; when the avatar moves to another environment, it may be set to switch service modes in real time according to the service scenario.
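The preset-condition gate can be sketched as a predicate evaluated before each switch. The sketch assumes one particular combination, requiring both the time condition and the environmental condition to hold, whereas the text allows either or both; the window boundaries, the `"counter"` location label, and the function names are hypothetical.

```python
from datetime import time
from typing import FrozenSet, Tuple


def may_switch(
    now: time,
    location: str,
    switch_window: Tuple[time, time] = (time(12, 0), time(14, 0)),
    locked_locations: FrozenSet[str] = frozenset({"counter"}),
) -> bool:
    """Return True if both the time and the environment allow a mode switch."""
    start, end = switch_window
    in_window = start <= now <= end                 # time condition
    unlocked = location not in locked_locations     # environmental condition
    return in_window and unlocked


def next_mode(current_mode: str, target_mode: str, now: time, location: str) -> str:
    """Switch to `target_mode` only if the preset conditions are met."""
    return target_mode if may_switch(now, location) else current_mode
```

When `next_mode` keeps the current mode, the caller could respond with a polite refusal such as the one quoted in the text; that behavior is left outside this sketch.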

On the basis of the above embodiments, during voice interaction with the user according to the target service mode, the interaction logic need not follow a fixed procedure; that is, the interaction does not proceed mechanically step by step. For example, when a user is applying for a certificate and needs consultation or guidance, the avatar selects appropriate, non-procedural interaction logic to communicate with the user, rather than being required to cover topics in a fixed order. If the user suddenly asks about the weather, the service scenario is determined to have changed and the service mode is switched to the one corresponding to the chat scenario; once the user has asked about the weather and resumes the certificate application, the service mode is switched back. This flexible switching process provides better service to the user and improves the user experience.
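One way to picture the "switch away, then switch back" behavior in the weather example is a small stack of modes. The sketch below is only an assumption about how such bookkeeping might be done; the class `ModeSession` and the mode names are hypothetical and do not come from this application.

```python
class ModeSession:
    """Tracks the active service mode and lets a side request interrupt it temporarily."""

    def __init__(self, initial_mode: str) -> None:
        self._stack = [initial_mode]

    @property
    def current(self) -> str:
        return self._stack[-1]

    def interrupt(self, mode: str) -> None:
        """Switch to another mode while remembering the one being interrupted."""
        if mode != self.current:
            self._stack.append(mode)

    def resume(self) -> str:
        """Return to the interrupted mode, if any; the base mode is never popped."""
        if len(self._stack) > 1:
            self._stack.pop()
        return self.current
```

For instance, a session started in a certificate-handling mode can call `interrupt("chat")` when the user asks about the weather and `resume()` when the user returns to the application.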

An embodiment of the present application provides a voice service mode switching device. FIG. 5 is a structural diagram of the voice service mode switching device provided by an embodiment of the present invention. As shown in FIG. 5, the voice service mode switching device 500 includes a scene recognition module 501, a service mode determination module 502, and a processing module 503.

The scene recognition module 501 is configured to identify the current service scenario of the avatar;

The service mode determination module 502 is configured to determine, according to the current service scenario, the target service mode corresponding to the current service scenario, wherein the target service mode includes at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy of the avatar;

The processing module 503 is configured to switch the service mode of the avatar to the target service mode and to perform voice interaction with the user according to the target service mode.
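The division of work among the three modules can be sketched as a small pipeline. This is an illustrative sketch under stated assumptions: `ServiceMode`, `VoiceServiceModeSwitcher`, the injected `recognize_scene` callable, and the fallback of keeping the current mode for an unknown scene are hypothetical choices, not details given in this application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class ServiceMode:
    # At least one field is expected to be set; None means "not specified by this mode".
    appearance: Optional[str] = None
    action_strategy: Optional[str] = None
    interaction_logic: Optional[str] = None
    speech_strategy: Optional[str] = None


class VoiceServiceModeSwitcher:
    def __init__(self, recognize_scene: Callable[[str], str],
                 mode_table: Dict[str, ServiceMode],
                 initial_mode: ServiceMode) -> None:
        self._recognize_scene = recognize_scene  # plays the role of module 501
        self._mode_table = mode_table            # lookup used by module 502
        self.mode = initial_mode

    def determine_mode(self, scene: str) -> ServiceMode:
        """Role of module 502: map the scene to a mode; keep the current mode if unknown."""
        return self._mode_table.get(scene, self.mode)

    def handle(self, utterance: str) -> ServiceMode:
        """Role of module 503: switch to the target mode and hand it to the dialogue layer."""
        scene = self._recognize_scene(utterance)
        self.mode = self.determine_mode(scene)
        return self.mode
```

Passing the recognizer in as a callable keeps the sketch independent of any particular speech or vision stack; the actual voice interaction step is left to whatever dialogue layer consumes the returned mode.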

On the basis of the above embodiments, the service mode determination module 502 is configured to:

determine, according to the current service scenario, at least one of the target appearance, target action strategy, target interaction logic, and target speech strategy of the avatar corresponding to the current service scenario;

determine the at least one of the target appearance, target action strategy, target interaction logic, and target speech strategy as the target service mode.

On the basis of any of the above embodiments, optionally, the scene recognition module 501 is configured to:

determine the user's intention according to the user's voice command;

identify the current service scenario of the avatar according to the user's intention.
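The two steps above (voice command to intention, intention to service scenario) can be sketched with plain lookup tables. The keyword matching below is a deliberately naive stand-in for a real intent model, and the intent names, keywords, and the default scenario are all hypothetical; only the scenario names are taken from the list used elsewhere in this document.

```python
from typing import Optional

# Hypothetical keyword lists standing in for a real intent recognizer.
INTENT_KEYWORDS = {
    "purchase": ("buy", "price", "discount"),
    "directions": ("where", "how do i get"),
    "small_talk": ("weather", "joke"),
}

# Hypothetical mapping from intention to service scenario.
INTENT_TO_SCENARIO = {
    "purchase": "sales",
    "directions": "guidance",
    "small_talk": "companion",
}


def detect_intent(transcript: str) -> Optional[str]:
    """Return the first intent whose keywords appear in the transcript, else None."""
    text = transcript.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return None


def scenario_from_transcript(transcript: str, default: str = "consultation") -> str:
    """Map a recognized transcript to a service scenario, falling back to a default."""
    return INTENT_TO_SCENARIO.get(detect_intent(transcript), default)
```

The transcript is assumed to come from an upstream speech recognizer, which is outside the scope of this sketch.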

if a product in the user's hand is recognized, determine that the current service scenario of the avatar is the sales scenario;

The processing module 503 is configured to:

switch the service mode of the avatar to the service mode corresponding to the sales scenario, and explain information about the product to the user according to the service mode corresponding to the sales scenario.

On the basis of the above embodiments, when determining the current service scenario of the avatar according to a face recognition result, the scene recognition module 501 is configured to:

determine user identity information according to the face recognition result;

determine, according to a correspondence between preset user identity information and preset service scenarios, the preset service scenario corresponding to the user identity as the current service scenario of the avatar.
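The correspondence between preset user identity information and preset service scenarios amounts to a two-stage lookup, sketched below. The identity labels, the `identity_db` dictionary keyed by a face identifier, and the default scenario for unrecognized users are assumptions made for illustration; face recognition itself is treated as an upstream step that yields the identifier.

```python
from typing import Dict, Optional

# Hypothetical preset correspondence between identity categories and scenarios.
IDENTITY_TO_SCENARIO = {
    "returning_customer": "sales",
    "first_time_visitor": "guidance",
    "child": "companion",
}


def scenario_for_face(face_id: Optional[str], identity_db: Dict[str, str],
                      default: str = "consultation") -> str:
    """Resolve a face identifier to an identity, then to its preset service scenario."""
    identity = identity_db.get(face_id) if face_id is not None else None
    return IDENTITY_TO_SCENARIO.get(identity, default)
```

Falling back to a default scenario for unknown faces is a design choice of this sketch; the text above does not say what happens when no preset identity matches.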

On the basis of any of the above embodiments, the processing module 503 is configured to:

On the basis of any of the above embodiments, the current service scenario includes any one of the following: a sales scenario, a customer service scenario, a guidance scenario, a consultation scenario, and a companion scenario.

On the basis of any of the above embodiments, the avatar is displayed on a transparent display device.

The voice service mode switching device provided in this embodiment can be used to execute the voice service mode switching method embodiments shown in FIGS. 2-4 above; its specific functions are not described again here.

The voice service mode switching device provided in this embodiment identifies the current service scenario of the avatar; determines, according to the current service scenario, the target service mode corresponding to the current service scenario; switches the service mode of the avatar to the target service mode; and performs voice interaction with the user according to the target service mode. In this embodiment, the service mode is switched according to the service scenario, so that the avatar better fits the current service scenario and is more readily accepted by users, which allows services to be provided to users accurately and improves the user experience.

According to embodiments of the present application, the present application also provides an electronic device and a readable storage medium.

FIG. 6 is a block diagram of an electronic device for the voice service mode switching method according to an embodiment of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present application described and/or claimed herein.

As shown in FIG. 6, the electronic device includes one or more processors 601, a memory 602, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and can be mounted on a common motherboard or in other ways as required. The processor can process instructions executed within the electronic device, including instructions stored in or on the memory for displaying graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices can be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multi-processor system). FIG. 6 takes one processor 601 as an example.

The memory 602 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor executes the voice service mode switching method provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the voice service mode switching method provided by the present application.

As a non-transitory computer-readable storage medium, the memory 602 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the voice service mode switching method in the embodiments of the present application (for example, the scene recognition module 501, the service mode determination module 502, and the processing module 503 shown in FIG. 5). By running the non-transitory software programs, instructions, and modules stored in the memory 602, the processor 601 executes the various functional applications and data processing of the server, that is, implements the voice service mode switching method of the above method embodiments.

The memory 602 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the electronic device for the voice service mode switching method, and the like. In addition, the memory 602 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 602 optionally includes memories located remotely from the processor 601, and these remote memories can be connected over a network to the electronic device for the voice service mode switching method. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The electronic device for the voice service mode switching method may further include an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603, and the output device 604 may be connected by a bus or in other ways; FIG. 6 takes connection by a bus as an example.

The input device 603 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for the voice service mode switching method. Examples of the input device include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 604 may include a display device, an auxiliary lighting device (for example, an LED), a tactile feedback device (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and can transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented using high-level procedural and/or object-oriented programming languages and/or assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, and programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein can be implemented in a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user computer having a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.

According to the technical solution of the embodiments of the present application, the current service scenario of the avatar is identified; the target service mode corresponding to the current service scenario is determined according to the current service scenario; the service mode of the avatar is switched to the target service mode; and voice interaction with the user is performed according to the target service mode. In this embodiment, the service mode is switched according to the service scenario, so that the avatar better fits the current service scenario and is more readily accepted by users, which allows services to be provided to users accurately and improves the user experience.

The present application also provides a computer program including program code; when a computer runs the computer program, the program code executes the voice service mode switching method described in the above embodiments.

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in the present application can be achieved; no limitation is imposed herein.

The above specific embodiments do not limit the scope of protection of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims

1. A voice service mode switching method, characterized by comprising:

Identify the current service scenario of the avatar;

According to the current service scenario, a target service mode corresponding to the current service scenario is determined; wherein the target service mode includes at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy of the avatar; wherein, in different service modes, at least one of the appearance, action strategy, interaction logic, and speech strategy of the avatar is different;

Switch the service mode of the avatar to the target service mode, and perform voice interaction with the user according to the target service mode;

Switching the service mode of the avatar to the target service mode includes:

If the preset conditions are met, switch the service mode of the avatar to the target service mode in real time;

If the preset conditions are not met, the service mode of the virtual image remains unchanged;

Wherein, the preset conditions are preset time conditions and/or preset environmental conditions.

2. The method according to claim 1, wherein determining the target service mode corresponding to the current service scenario according to the current service scenario includes:

According to the current service scenario, determine at least one of the target appearance, target action strategy, target interaction logic, and target speech strategy of the avatar corresponding to the current service scenario;

At least one of the target appearance, target action strategy, target interaction logic, and target speech strategy is determined as the target service mode.

3. The method according to claim 2, wherein determining the target service mode corresponding to the current service scenario according to the current service scenario includes:

According to the current service scenario and the corresponding relationship between the preset service scenario and the preset service mode, determine the preset service mode corresponding to the current service scenario as the target service mode;

Wherein, the corresponding relationship between the preset service scenario and the preset service mode includes: a correspondence between the preset service scenario and at least one of a preset appearance, a preset action strategy, a preset interaction logic, and a preset speech strategy of the avatar.

4. The method according to claim 1, characterized in that identifying the current service scenario of the virtual image includes:

Determine user intent based on user voice commands;

Identify the current service scenario of the avatar according to the user intention.

5. The method according to claim 1, characterized in that identifying the current service scenario of the virtual image includes:

If the product in the user's hand is identified, the current service scenario of the avatar is determined to be a sales scenario;

Switching the service mode of the avatar to the target service mode and performing voice interaction with the user according to the target service mode includes:

Switch the service mode of the avatar to the service mode corresponding to the sales scenario, and explain the relevant information of the product to the user according to the service mode corresponding to the sales scenario.

6. The method according to claim 1, characterized in that identifying the current service scenario of the virtual image includes:

Perform face recognition on the user, and determine the current service scenario of the virtual image based on the face recognition result.

7. The method according to claim 6, characterized in that determining the current service scenario of the avatar according to the face recognition result includes:

Determine user identity information based on face recognition results;

According to the corresponding relationship between the preset user identity information and the preset service scenario, the preset service scenario corresponding to the user identity is determined as the current service scenario of the avatar.

8. The method according to any one of claims 1 to 7, characterized in that the current service scenario includes any one of the following: sales scenario, customer service scenario, guidance scenario, consultation scenario, and companion scenario.

9. The method according to any one of claims 1 to 7, characterized in that the virtual image is displayed on a transparent display device.

10. A voice service mode switching device, characterized by comprising:

Scene recognition module, used to identify the current service scenario of the virtual image;

A service mode determination module, configured to determine a target service mode corresponding to the current service scenario according to the current service scenario; wherein the target service mode includes at least one of a target appearance, a target action strategy, a target interaction logic, and a target speech strategy of the avatar; wherein, in different service modes, at least one of the appearance, action strategy, interaction logic, and speech strategy of the avatar is different;

A processing module, configured to switch the service mode of the avatar to the target service mode, and perform voice interaction with the user according to the target service mode;

The processing module is used for:

11. The device according to claim 10, characterized in that the service mode determination module is configured to:

12. The device according to claim 11, characterized in that the service mode determination module is configured to:

13. The device according to claim 10, characterized in that the scene recognition module is used for:

Determine user intent based on user voice commands;

14. The device according to claim 10, characterized in that the scene recognition module is used for:

The processing module is used for:

15. The device according to claim 10, characterized in that the scene recognition module is used for:

16. The device according to claim 15, characterized in that when the scene recognition module performs face recognition on the user and determines the current service scene of the avatar according to the face recognition result, it is specifically used to:

Determine user identity information based on face recognition results;

17. The device according to any one of claims 10 to 16, characterized in that the current service scenario includes any one of the following: sales scenario, customer service scenario, guidance scenario, consultation scenario, and companion scenario.

18. The device according to any one of claims 10 to 16, wherein the virtual image is displayed on a transparent display device.

19. An electronic device, characterized in that it includes:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein,

The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can perform the method according to any one of claims 1-9.

20. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to execute the method according to any one of claims 1-9.