WO2016177262A1 - Collaboration method for intelligent conference and conference terminal

Info

Publication number: WO2016177262A1
Application number: PCT/CN2016/079202
Authority: WO
Inventors: 刘源
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2015-05-06
Filing date: 2016-04-13
Publication date: 2016-11-10
Anticipated expiration: 2017-11-06
Also published as: CN104836981A; CN104836981B

Abstract

Provided are a collaboration method for an intelligent conference and a conference terminal. The method comprises: a conference terminal receiving multimedia information about a conference sent by a remote device; according to the multimedia information about the conference and a pre-set multimedia content, the conference terminal determining whether the user needs to participate in the conference; the conference terminal determining a participating state of the user; and if the conference terminal determines that the user needs to participate in the conference and the participating state of the user is an absent state, the conference terminal sending participation reminding information to the user, the participation reminding information being used to prompt the user to participate in the conference. The method provided in the embodiments of the present invention can accurately control the participation time of a user, so that the user will not miss the content of a conference, thereby improving the participation efficiency of the user.

Description

Intelligent conference collaboration method and conference terminal

This application claims priority to Chinese Patent Application No. 201510227117.8, filed with the Chinese Patent Office on May 6, 2015 and entitled "Collaboration method for intelligent conference and conference terminal", which is incorporated herein by reference in its entirety.

Technical field

Embodiments of the present invention relate to communications technologies, and in particular, to a collaboration method for an intelligent conference and a conference terminal.

Background

Busy professionals often need to attend several conferences at the same time, yet a person can attend only one traditional face-to-face meeting at a time. Telephone and video conferencing systems allow a user to join several remote conferences simultaneously, but it is difficult for one person to keep track of the progress of all of them. In some conferences, a large part of the content has little to do with a given participant, who must nevertheless attend. Such a participant spends most of the time listening to others speak or doing unrelated work, while the content that the participant actually cares about, or the periods in which the conference needs the participant's involvement, are short and occur at unpredictable times. As a result, participants waste a great deal of time.

Because the content that concerns a participant often occupies only a small, concentrated part of a conference, participants try to save time by attending only during selected time periods. With this approach, however, the moment to join is hard to control: an earlier agenda item may run late or finish early, so the participant fails to join in the right period and misses the content of interest. Participation efficiency is therefore low.

Summary of the invention

Embodiments of the present invention provide a collaboration method for an intelligent conference and a conference terminal, to solve the technical problem in the prior art that the time at which a participant joins a conference cannot be controlled, so that the participant misses the conference content of interest and participation efficiency is low.

In a first aspect, an embodiment of the present invention provides a collaboration method for an intelligent conference, including:

receiving, by a conference terminal, multimedia information of a conference sent by a remote device, where the multimedia information of the conference includes at least one of voice information, image information, and text information, and the conference terminal is a terminal that has joined the conference;

determining, by the conference terminal according to the multimedia information of the conference and preset multimedia content, whether the conference requires the user to participate, where the preset multimedia content includes at least one of: voice content that the user follows, image content that the user follows, associated information of the image content that the user follows, and text content that the user follows;

determining, by the conference terminal, a participating state of the user, where the participating state of the user includes an absent state or a user participation conflict state; and

if the conference terminal determines that the conference requires the user to participate and the participating state of the user is the absent state, sending, by the conference terminal, participation reminding information to the user, where the participation reminding information is used to prompt the user to participate in the conference.
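
The four steps of the first aspect can be sketched as follows. This is only an illustration under stated assumptions, not the claimed implementation: the multimedia information is reduced to a text transcript, the preset multimedia content is reduced to a set of keywords, and the names `PresetContent`, `conference_needs_user`, and `handle_media` are hypothetical. The `PRESENT` state is also an assumption, since the text above names only the absent state and the user participation conflict state.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class ParticipatingState(Enum):
    PRESENT = "present"    # assumption: not named in the first aspect
    ABSENT = "absent"
    CONFLICT = "conflict"  # user participation conflict state


@dataclass(frozen=True)
class PresetContent:
    # Hypothetical text-only stand-in for the preset multimedia content.
    keywords: FrozenSet[str]


def conference_needs_user(transcript: str, preset: PresetContent) -> bool:
    """Return True if any preset keyword occurs in the received text."""
    text = transcript.lower()
    return any(keyword.lower() in text for keyword in preset.keywords)


def handle_media(transcript: str, preset: PresetContent,
                 state: ParticipatingState) -> Optional[str]:
    """Return reminder text when the user is needed and absent, else None."""
    needed = conference_needs_user(transcript, preset)
    if needed and state is ParticipatingState.ABSENT:
        return "Your attention is needed in the conference."
    return None
```

A reminder is produced only when both conditions hold; a needed user in the conflict state gets no reminder from this sketch.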

With reference to the first aspect, in a first possible implementation of the first aspect, the determining, by the conference terminal according to the multimedia information of the conference and the preset multimedia content, whether the conference requires the user to participate includes:

determining, by the conference terminal according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference, where the preset conference information is information indicating that the user is required to participate in the conference; and

if the preset conference information exists in the multimedia information of the conference, determining, by the conference terminal, that the conference requires the user to participate.

With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the determining, by the conference terminal according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference further includes:

if the conference terminal determines that the preset conference information does not exist in the multimedia information of the conference, determining, by the conference terminal, whether first multimedia information that the user follows exists in the multimedia information of the conference;

when the first multimedia information exists in the multimedia information of the conference, determining, by the conference terminal according to the first multimedia information and a first mapping relationship, a user attention degree of the first multimedia information, where the first mapping relationship is a correspondence between different content in the preset multimedia content and user attention degrees; and

determining, by the conference terminal, whether the user attention degree of the first multimedia information is greater than a preset user attention degree threshold; if yes, determining that the conference requires the user to participate; if no, determining that the conference does not require the user to participate.
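
The two-stage decision of the first and second implementations can be illustrated with a short sketch. It assumes that detected content has already been reduced to string labels, that the first mapping relationship is a dictionary from label to a numeric attention degree, and that when several followed items are detected the highest attention degree is used; none of these choices comes from the text, and the function name `needs_user` is hypothetical.

```python
from typing import Dict, Iterable, Set


def needs_user(detected_items: Iterable[str],
               preset_conference_info: Set[str],
               attention_map: Dict[str, float],
               attention_threshold: float) -> bool:
    """Decide whether the conference requires the user to participate."""
    items = list(detected_items)

    # Stage 1: preset conference information means the user is required.
    if any(item in preset_conference_info for item in items):
        return True

    # Stage 2: look up the attention degree of followed content
    # (the "first multimedia information") in the first mapping relationship.
    degrees = [attention_map[item] for item in items if item in attention_map]
    if not degrees:
        return False  # no followed content was detected

    # Assumption: take the highest degree when several items are detected.
    return max(degrees) > attention_threshold
```

With `attention_map = {"project-x": 0.9, "lunch": 0.2}` and a threshold of 0.5, a detected `"project-x"` requires the user while a detected `"lunch"` does not.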

With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the determining, by the conference terminal according to the multimedia information of the conference and the preset multimedia content, whether the first multimedia information that the user follows exists in the multimedia information of the conference includes:

detecting and identifying, by the conference terminal, the multimedia information of the conference to obtain a recognition result, and performing feature matching between the recognition result and the preset multimedia content to obtain a feature matching result;

determining, by the conference terminal, whether the feature matching result is greater than a preset matching threshold;

if yes, determining, by the conference terminal, that the first multimedia information exists in the multimedia information of the conference; and

if no, determining, by the conference terminal, that the first multimedia information does not exist in the multimedia information of the conference.

With reference to any one of the first aspect to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the method further includes:

if the conference terminal determines that the conference requires the user to participate and the participating state of the user is the absent state, determining, by the conference terminal, a form of the participation reminding information, where the form of the participation reminding information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form;

in this case, the sending, by the conference terminal, participation reminding information to the user includes:

sending, by the conference terminal, the participation reminding information to the user according to the form of the participation reminding information.
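
The five reminder forms listed in the fourth implementation map naturally onto an enumeration. The sketch below is an assumption-laden illustration: `ReminderForm` and `build_reminders` are hypothetical names, and actual delivery (popping up an interface, playing audio, sending an instant message) is replaced by returning `(form, text)` pairs.

```python
from enum import Enum
from typing import Iterable, List, Tuple


class ReminderForm(Enum):
    INTERFACE = "interface"
    IMAGE = "image"
    VIDEO = "video"
    AUDIO = "audio"
    INSTANT_MESSAGE = "instant_message"


def build_reminders(forms: Iterable[ReminderForm],
                    text: str) -> List[Tuple[ReminderForm, str]]:
    """Pair the reminder text with each chosen form, in declaration order."""
    chosen = set(forms)
    if not chosen:
        # The text requires at least one form to be selected.
        raise ValueError("at least one reminder form is required")
    return [(form, text) for form in ReminderForm if form in chosen]
```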

With reference to any one of the first aspect to the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the method further includes:

determining, by the conference terminal, a feedback policy according to whether the conference requires the user to participate and according to the participating state of the user, where the feedback policy is used to indicate the type of feedback data that the conference terminal sends to the remote device; and

sending, by the conference terminal, the feedback data to the remote device according to the feedback policy, where the feedback data includes at least one of: video content for interacting with the remote device, audio content for interacting with the remote device, and text content for interacting with the remote device.

With reference to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the determining, by the conference terminal, a feedback policy according to whether the conference requires the user to participate and according to the participating state of the user includes:

if the conference terminal determines that the conference requires the user to participate and the participating state of the user is the absent state, the feedback policy determined by the conference terminal is to send first feedback data to the remote device, where the first feedback data is used to indicate to the remote device that the conference terminal is notifying the user to join the conference.

With reference to the fifth possible implementation of the first aspect, in a seventh possible implementation of the first aspect, the determining, by the conference terminal, a feedback policy according to whether the conference requires the user to participate and according to the participating state of the user includes:

if the conference terminal determines that the conference requires the user to participate and the participating state of the user is the absent state, the feedback policy determined by the conference terminal is to send second feedback data to the remote device, where the second feedback data is used to show the remote device conference content that is preset by the user and related to the conference.

With reference to the fifth possible implementation of the first aspect, in an eighth possible implementation of the first aspect, the determining, by the conference terminal, a feedback policy according to whether the conference requires the user to participate and according to the participating state of the user includes:

if the conference terminal determines that the conference does not require the user to participate and the participating state of the user is the absent state, the feedback policy determined by the conference terminal is to send third feedback data to the remote device, where the third feedback data is used to indicate to the remote device that the user is participating in the conference.

With reference to the fifth possible implementation of the first aspect, in a ninth possible implementation of the first aspect, the determining, by the conference terminal, a feedback policy according to whether the conference requires the user to participate and according to the participating state of the user includes:

if the conference terminal determines that the conference requires the user to participate and the participating state of the user is the user participation conflict state, the feedback policy determined by the conference terminal is to send fourth feedback data to the remote device and to record the current conference content of the conference, where the fourth feedback data is used to indicate to the remote device that the participating state of the user is the user participation conflict state.
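
The sixth to ninth implementations together amount to a small decision table over two inputs: whether the user is required, and the participating state. A minimal sketch follows. The names `State`, `Feedback`, and `choose_feedback` are hypothetical. Because the sixth and seventh implementations apply to the same condition, the sketch picks between them with a `prefer_preset_content` flag, which is an assumption; the combination "not required, conflict state" is not addressed in the text, so the sketch returns no feedback for it.

```python
from enum import Enum
from typing import Optional, Tuple


class State(Enum):
    ABSENT = "absent"
    CONFLICT = "conflict"


class Feedback(Enum):
    NOTIFYING_USER = 1         # first feedback data
    PRESET_CONTENT = 2         # second feedback data
    SHOW_AS_PARTICIPATING = 3  # third feedback data
    CONFLICT_NOTICE = 4        # fourth feedback data


def choose_feedback(required: bool, state: State,
                    prefer_preset_content: bool = False
                    ) -> Tuple[Optional[Feedback], bool]:
    """Return (feedback type, whether to record current conference content)."""
    if required and state is State.ABSENT:
        if prefer_preset_content:
            return Feedback.PRESET_CONTENT, False
        return Feedback.NOTIFYING_USER, False
    if not required and state is State.ABSENT:
        return Feedback.SHOW_AS_PARTICIPATING, False
    if required and state is State.CONFLICT:
        # The ninth implementation also records the current conference content.
        return Feedback.CONFLICT_NOTICE, True
    return None, False  # combination not covered by the text
```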

With reference to any one of the second to the ninth possible implementations of the first aspect, in a tenth possible implementation of the first aspect, if the first multimedia information that the user follows includes face information of a first participant in the conference, and the associated information of the image content that the user follows in the preset multimedia content is identity information of the first participant, the detecting and identifying, by the conference terminal, the multimedia information of the conference to obtain a recognition result, and performing feature matching between the recognition result and the preset multimedia content to obtain a feature matching result specifically includes:

detecting, by the conference terminal, the multimedia information of the conference to determine the face position and face size of each participant in the multimedia information;

performing, by the conference terminal, feature extraction based on the face position and face size of the participant in the multimedia information to obtain face features of the participant;

matching, by the conference terminal, the face features of each participant against a preset face information library to determine a first matching degree;

determining, by the conference terminal, that the recognition result is the identity information, in the face information library, of the participant whose first matching degree is greater than a preset first threshold; and

matching, by the conference terminal, the identity information of the participant whose first matching degree is greater than the preset first threshold against the identity information of the first participant in the preset multimedia content to obtain the feature matching result.
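
The matching stages of the tenth implementation can be sketched once face detection and feature extraction have produced a feature vector per participant. Everything concrete here is an assumption: face features are plain lists of floats, the first matching degree is cosine similarity, the face information library is a dictionary from identity to reference vector, and the names `identify` and `followed_participant_present` are hypothetical. Detection and feature extraction themselves are outside the sketch.

```python
import math
from typing import Dict, Iterable, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(face_feature: Sequence[float],
             face_library: Dict[str, Sequence[float]],
             first_threshold: float) -> Optional[str]:
    """Return the best identity whose matching degree exceeds the threshold."""
    best_identity, best_score = None, first_threshold
    for identity, reference in face_library.items():
        score = cosine_similarity(face_feature, reference)
        if score > best_score:
            best_identity, best_score = identity, score
    return best_identity


def followed_participant_present(face_features: Iterable[Sequence[float]],
                                 face_library: Dict[str, Sequence[float]],
                                 first_threshold: float,
                                 followed_identity: str) -> bool:
    """Check whether any recognised identity equals the followed participant."""
    return any(identify(feature, face_library, first_threshold) == followed_identity
               for feature in face_features)
```

Here the final feature matching result is reduced to a boolean identity comparison; a graded score compared with the preset matching threshold would fit the third implementation equally well.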

With reference to any one of the second to the ninth possible implementations of the first aspect, in an eleventh possible implementation of the first aspect, if the first multimedia information that the user follows includes first text information, and the text content that the user follows is the first text information, the detecting and identifying, by the conference terminal, the multimedia information of the conference to obtain a recognition result, and performing feature matching between the recognition result and the preset multimedia content to obtain a feature matching result specifically includes:

detecting, by the conference terminal, the multimedia information of the conference to determine the position and size of a text block area in the multimedia information of the conference;

obtaining, by the conference terminal, a text block in the multimedia information of the conference according to the position and size of the text block area in the multimedia information of the conference;

matching, by the conference terminal, the text block in the multimedia information of the conference against a preset text information library to determine a second matching degree;

determining, by the conference terminal, that the recognition result is the text information in the text information library whose second matching degree is greater than a preset second threshold; and

matching, by the conference terminal, the text information in the text information library whose second matching degree is greater than the preset second threshold against the first text information in the preset multimedia content to obtain the feature matching result.
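
The library-matching steps of the eleventh implementation can be illustrated with the standard library's `difflib.SequenceMatcher`, used here as a stand-in for the second matching degree. This is a sketch under assumptions: text blocks have already been extracted as strings (possibly with recognition errors), the text information library is a list of strings, and the names `recognise_text` and `first_text_found` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def recognise_text(text_blocks: Iterable[str],
                   text_library: Iterable[str],
                   second_threshold: float) -> List[str]:
    """Return library entries that some text block matches above the threshold."""
    blocks = list(text_blocks)
    recognised = []
    for entry in text_library:
        if any(similarity(block, entry) > second_threshold for block in blocks):
            recognised.append(entry)
    return recognised


def first_text_found(text_blocks: Iterable[str],
                     text_library: Iterable[str],
                     second_threshold: float,
                     first_text: str) -> bool:
    """Check whether the followed text is among the recognised entries."""
    return first_text in recognise_text(text_blocks, text_library, second_threshold)
```

Matching against a library first lets a noisy block such as "Q3 budget revlew" be normalised to the library entry "Q3 budget review" before it is compared with the text the user follows.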

With reference to any one of the second to the ninth possible implementations of the first aspect, in a twelfth possible implementation of the first aspect, if the first multimedia information that the user follows includes first text information, and the text content that the user follows is the first text information, the detecting and identifying, by the conference terminal, the multimedia information of the conference to obtain a recognition result, and performing feature matching between the recognition result and the preset multimedia content to obtain a feature matching result specifically includes:

detecting, by the conference terminal, the multimedia information of the conference to determine the position and size of a text block area in the multimedia information of the conference, and obtaining a text block in the multimedia information of the conference;

determining, by the conference terminal, the recognition result according to geometric features of the text block; and

matching, by the conference terminal, the recognition result against the first text information in the preset multimedia content to obtain the feature matching result.

With reference to the eleventh or the twelfth possible implementation of the first aspect, in a thirteenth possible implementation of the first aspect, if the first multimedia information includes first data information that is in the multimedia information of the conference, is related to the first text information, and is of neither the text type nor the participant face type, the detecting and identifying, by the conference terminal, the multimedia information of the conference to obtain a recognition result, and performing feature matching between the recognition result and the preset multimedia content to obtain a feature matching result further includes:

detecting, by the conference terminal, the multimedia information of the conference to determine other data information in the multimedia information of the conference, that is, data information of neither the text type nor the participant face type;

determining, by the conference terminal according to the first text information and the other data information, a degree of relevance between the other data information and the first text information;

determining, by the conference terminal, that the recognition result is the data information, among the other data information, whose degree of relevance is greater than a preset third threshold; and

matching, by the conference terminal, the data information whose degree of relevance is greater than the preset third threshold against the first data information to obtain the feature matching result.

With reference to any one of the fifth to the thirteenth possible implementations of the first aspect, in a fourteenth possible implementation of the first aspect, the feedback data is feedback content preset by the user, or data generated according to the feedback policy and the feedback content preset by the user.

In a second aspect, an embodiment of the present invention provides a conference terminal, including:

a receiving module, configured to receive multimedia information of a conference sent by a remote device, where the multimedia information of the conference includes at least one of voice information, image information, and text information, and the conference terminal is a terminal that has joined the conference;

a first determining module, configured to determine, according to the multimedia information of the conference and preset multimedia content, whether the conference requires the user to participate, where the preset multimedia content includes at least one of: voice content that the user follows, image content that the user follows, associated information of the image content that the user follows, and text content that the user follows;

a second determining module, configured to determine a participating state of the user, where the participating state of the user includes an absent state or a user participation conflict state; and

a sending module, configured to send participation reminding information to the user when the first determining module determines that the conference requires the user to participate and the second determining module determines that the participating state of the user is the absent state, where the participation reminding information is used to prompt the user to participate in the conference.
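
The module structure of the second aspect can be pictured as a small class that wires the four modules together. This is a loose sketch, not the terminal's actual architecture: the two determining modules are injected as callables, the sending module is replaced by appending to a list, the multimedia information is a plain string, and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ConferenceTerminal:
    # First determining module: does the conference require the user?
    needs_user: Callable[[str], bool]
    # Second determining module: is the user in the absent state?
    user_is_absent: Callable[[], bool]
    # Stand-in for the sending module's output channel.
    sent_reminders: List[str] = field(default_factory=list)

    def receive(self, media: str) -> None:
        """Receiving module: handle media sent by the remote device."""
        if self.needs_user(media) and self.user_is_absent():
            self.sent_reminders.append("Please join the conference.")
```

Injecting the determining modules as callables keeps the sketch independent of how recognition and state detection are carried out.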

With reference to the second aspect, in a first possible implementation of the second aspect, the first determining module includes:

a first determining unit, configured to determine, according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference, and, if the preset conference information exists in the multimedia information of the conference, determine that the conference requires the user to participate, where the preset conference information is information indicating that the user is required to participate in the conference.

With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the first determining module further includes:

a second determining unit, configured to: when the first determining unit determines that the preset conference information does not exist in the multimedia information of the conference, determine whether first multimedia information that the user follows exists in the multimedia information of the conference; and when determining that the first multimedia information that the user follows exists in the multimedia information of the conference, determine a user attention degree of the first multimedia information according to the first multimedia information and a first mapping relationship, where the first mapping relationship is a correspondence between different content in the preset multimedia content and user attention degrees; and

a judging unit, configured to determine whether the user attention degree of the first multimedia information is greater than a preset user attention degree threshold; if yes, determine that the conference requires the user to participate; if no, determine that the conference does not require the user to participate.

With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the second determining unit is specifically configured to: detect and identify the multimedia information of the conference to obtain a recognition result; perform feature matching between the recognition result and the preset multimedia content to obtain a feature matching result; and determine whether the feature matching result is greater than a preset matching threshold; if yes, determine that the first multimedia information exists in the multimedia information of the conference; if no, determine that the first multimedia information does not exist in the multimedia information of the conference.

With reference to any one of the second aspect to the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the conference terminal further includes:

a third determining module, configured to determine a form of the participation reminding information when the first determining module determines that the conference requires the user to participate and the second determining module determines that the participating state of the user is the absent state, where the form of the participation reminding information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form; and

the sending module is specifically configured to send the participation reminding information to the user according to the form of the participation reminding information.

With reference to any one of the second aspect to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the conference terminal further includes a fourth determining module;

the fourth determining module is configured to determine a feedback policy according to whether the conference requires the user to participate and according to the participating state of the user, where the feedback policy is used to indicate the type of feedback data that the conference terminal sends to the remote device; and

the sending module is further configured to send the feedback data to the remote device according to the feedback policy, where the feedback data includes at least one of: video content for interacting with the remote device, audio content for interacting with the remote device, and text content for interacting with the remote device.

With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the fourth determining module is specifically configured to: if the conference requires the user to participate and the participation state of the user is the absent state, determine that the feedback policy is to send first feedback data to the remote device, where the first feedback data is used to indicate to the remote device that the conference terminal is notifying the user to join the conference.

With reference to the fifth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the fourth determining module is specifically configured to: if the conference requires the user to participate and the participation state of the user is the absent state, determine that the feedback policy is to send second feedback data to the remote device, where the second feedback data is used to show the remote device conference content that is preset by the user and related to the conference.

With reference to the fifth possible implementation of the second aspect, in an eighth possible implementation of the second aspect, the fourth determining module is specifically configured to: if the conference does not require the user to participate and the participation state of the user is the absent state, determine that the feedback policy is to send third feedback data to the remote device, where the third feedback data is used to indicate to the remote device that the user is participating in the conference.

With reference to the fifth possible implementation of the second aspect, in a ninth possible implementation of the second aspect, the fourth determining module is specifically configured to: if the conference requires the user to participate and the participation state of the user is the user participation conflict state, determine that the feedback policy is to send fourth feedback data to the remote device and to record the current content of the conference, where the fourth feedback data is used to indicate to the remote device that the participation state of the user is the user participation conflict state.
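The feedback policies of the sixth to ninth implementations can be read as a small decision table. The following Python sketch is only an illustration under stated assumptions: the names `State`, `Feedback`, `choose_feedback` and the `prefer_preset_content` flag are hypothetical and do not appear in the source, and the flag is merely one way to choose between the sixth and seventh implementations, which share the same condition.

```python
from enum import Enum


class State(Enum):
    ABSENT = "absent"
    CONFLICT = "conflict"
    PARTICIPATING = "participating"


class Feedback(Enum):
    NONE = 0                 # no feedback policy described for this case
    NOTIFYING_USER = 1       # first feedback data
    PRESET_CONTENT = 2       # second feedback data
    USER_PARTICIPATING = 3   # third feedback data
    CONFLICT_NOTICE = 4      # fourth feedback data


def choose_feedback(needs_user, state, prefer_preset_content=False):
    """Return (feedback type, whether to record the current conference content)."""
    if needs_user and state is State.CONFLICT:
        # Ninth implementation: report the conflict and record the conference.
        return Feedback.CONFLICT_NOTICE, True
    if state is State.ABSENT:
        if needs_user:
            # Sixth or seventh implementation (hypothetical selection flag).
            if prefer_preset_content:
                return Feedback.PRESET_CONTENT, False
            return Feedback.NOTIFYING_USER, False
        # Eighth implementation: the conference does not need the user.
        return Feedback.USER_PARTICIPATING, False
    return Feedback.NONE, False
```

Returning the recording flag together with the feedback type keeps the ninth implementation's two actions (send feedback, record content) in one decision.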

With reference to any one of the second possible implementation to the ninth possible implementation of the second aspect, in a tenth possible implementation of the second aspect, if the first multimedia information that the user pays attention to includes face information of a first participant in the conference, and the associated information of the image content of interest to the user in the preset multimedia content is identity information of the first participant, the second determining unit specifically includes:

a first detecting subunit, configured to detect the multimedia information of the conference and determine the face position and face size of each participant in the multimedia information;

a feature extraction subunit, configured to perform feature extraction according to the face position and face size of each participant in the multimedia information, to obtain the face features of the participant;

a first matching subunit, configured to match the face features of each participant against a preset face information database to determine a first matching degree;

a first determining subunit, configured to determine that the recognition result is the identity information of the participants whose first matching degree in the face information database is greater than a preset first threshold;

a second matching subunit, configured to match the identity information of the participants whose first matching degree in the face information database is greater than the preset first threshold against the identity information of the first participant in the preset multimedia content, to obtain the feature matching result.
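The matching steps above (first matching degree, first threshold, identity match) can be sketched as follows. This is a minimal illustration, not the patented implementation: face detection and feature extraction are assumed to have already produced numeric feature vectors, cosine similarity is an assumed choice for the "first matching degree", and all function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Assumed stand-in for the first matching degree between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recognize_identities(participant_features, face_db, first_threshold):
    """Return identities in face_db whose matching degree exceeds the threshold."""
    recognized = set()
    for feature in participant_features:
        for identity, reference in face_db.items():
            if cosine_similarity(feature, reference) > first_threshold:
                recognized.add(identity)
    return recognized


def feature_match(recognized, identities_of_interest):
    """Return the recognized identities that the user has marked as of interest."""
    return recognized & set(identities_of_interest)
```

For example, with `face_db = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}` and one detected feature vector `[0.9, 0.1]`, a threshold of 0.8 recognizes only `"alice"`.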

With reference to any one of the second possible implementation to the ninth possible implementation of the second aspect, in an eleventh possible implementation of the second aspect, if the first multimedia information that the user pays attention to includes first text information, and the text content of interest to the user is the first text information, the second determining unit specifically includes:

a second detecting subunit, configured to detect the multimedia information of the conference and determine the position and size of a text block area in the multimedia information of the conference;

an obtaining subunit, configured to obtain the text block in the multimedia information of the conference according to the position and size of the text block area in the multimedia information of the conference;

a third matching subunit, configured to match the text block in the multimedia information of the conference against a preset text information library to determine a second matching degree;

a second determining subunit, configured to determine that the recognition result is the text information in the text information library whose second matching degree is greater than a preset second threshold;

a fourth matching subunit, configured to match the text information in the text information library whose second matching degree is greater than the preset second threshold against the first text information in the preset multimedia content, to obtain the feature matching result.
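The text path above can be sketched in the same way. The sketch assumes the text blocks have already been extracted as strings (text detection and recognition are out of scope here) and uses the similarity ratio from Python's standard `difflib` module as an assumed stand-in for the "second matching degree"; the function names are hypothetical.

```python
from difflib import SequenceMatcher


def recognize_text(text_blocks, text_library, second_threshold):
    """Return library entries whose similarity to any text block exceeds the threshold."""
    recognized = []
    for block in text_blocks:
        for entry in text_library:
            # Case-insensitive similarity in [0, 1], used here as the matching degree.
            degree = SequenceMatcher(None, block.lower(), entry.lower()).ratio()
            if degree > second_threshold and entry not in recognized:
                recognized.append(entry)
    return recognized


def feature_match(recognized, first_text):
    """True if the text the user cares about is among the recognized entries."""
    return first_text in recognized
```

With a text block "Project Budget", a library containing "project budget" and "roadmap", and a threshold of 0.8, only "project budget" is recognized.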

With reference to any one of the second possible implementation to the ninth possible implementation of the second aspect, in a twelfth possible implementation of the second aspect, if the first multimedia information that the user pays attention to includes first text information, and the text content of interest to the user is the first text information, the second determining unit specifically includes:

a third detecting subunit, configured to detect the multimedia information of the conference, determine the position and size of a text block area in the multimedia information of the conference, and obtain the text block in the multimedia information of the conference;

a third determining subunit, configured to determine the recognition result according to geometric features of the text block;

a fifth matching subunit, configured to match the recognition result against the first text information in the preset multimedia content, to obtain the feature matching result.

With reference to the eleventh possible implementation of the second aspect or the twelfth possible implementation of the second aspect, in a thirteenth possible implementation of the second aspect, if the first multimedia information includes first data information that is in the multimedia information of the conference, is related to the first text information, and is of a type other than the text type and the participant face type, the second determining subunit further includes:

a fourth detecting subunit, configured to detect the multimedia information of the conference and determine the other data information in the multimedia information of the conference, that is, data information other than the text type and the participant face type;

a fourth determining subunit, configured to determine, according to the first text information and the other data information, the degree of correlation between the other data information and the first text information;

a fifth determining subunit, configured to determine that the recognition result is the data information, among the other data information, whose degree of correlation is greater than a preset third threshold;

a sixth matching subunit, configured to match the data information, among the other data information, whose degree of correlation is greater than the preset third threshold against the first data information, to obtain the feature matching result.

With reference to any one of the fifth possible implementation to the thirteenth possible implementation of the second aspect, in a fourteenth possible implementation of the second aspect, the feedback data is feedback content preset by the user, or data generated according to the feedback policy and the feedback content preset by the user.

In the collaboration method for an intelligent conference and the conference terminal provided by the embodiments of the present invention, the conference terminal determines, according to the received multimedia information of the conference and the preset multimedia content, whether the conference requires the user to participate, and also determines the participation state of the user. When the conference terminal determines that the conference requires the user to participate and the participation state of the user is the absent state, the conference terminal sends the user participation reminder information that prompts the user to join the conference. Because the conference terminal determines whether the current conference requires the user and combines this with the user's current participation state when deciding to send the reminder, the time at which the user joins the conference can be controlled accurately, so that the user does not miss the conference content of interest and participates more efficiently.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of the implementation framework of an intelligent conference collaboration robot device according to an embodiment of the present invention;

FIG. 2 is a schematic networking diagram of an intelligent video conference collaboration system according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of Embodiment 1 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart of Embodiment 2 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of Embodiment 3 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 6 is a schematic flowchart of Embodiment 4 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 7 is a schematic flowchart of Embodiment 5 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 8 is a schematic flowchart of Embodiment 6 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 9 is a schematic flowchart of Embodiment 7 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 10 is a schematic flowchart of Embodiment 8 of the collaboration method for an intelligent conference according to an embodiment of the present invention;

FIG. 11 is a schematic structural diagram of Embodiment 1 of the conference terminal according to an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of Embodiment 2 of the conference terminal according to an embodiment of the present invention;

FIG. 13 is a schematic structural diagram of Embodiment 3 of the conference terminal according to an embodiment of the present invention;

FIG. 14 is a schematic structural diagram of Embodiment 4 of the conference terminal according to an embodiment of the present invention;

FIG. 15 is a schematic structural diagram of Embodiment 5 of the conference terminal according to an embodiment of the present invention;

FIG. 16 is a schematic structural diagram of Embodiment 6 of the conference terminal according to an embodiment of the present invention;

FIG. 17 is a schematic structural diagram of Embodiment 7 of the conference terminal according to an embodiment of the present invention;

FIG. 18 is a schematic structural diagram of Embodiment 8 of the conference terminal according to an embodiment of the present invention;

FIG. 19 is a schematic structural diagram of Embodiment 9 of the conference terminal according to an embodiment of the present invention.

Detailed description

To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The method in the embodiments of the present invention is applicable to an intelligent video conference collaboration system, which may be integrated in a conference terminal. The conference terminal may include, but is not limited to, a communication device such as a mobile phone, a personal digital assistant (PDA), a tablet computer, a portable device (for example, a portable computer), or an ordinary personal computer (PC). It may also be an intelligent conference collaboration robot device based on the intelligent video conference collaboration system, whose implementation framework is shown in FIG. 1. In FIG. 1, the bottom layer is the hardware layer of the robot device. Above the hardware layer are the related device drivers and protocol stacks: the protocol stacks may include a network protocol stack, such as a wireless fidelity (WiFi) protocol stack or a third-generation mobile communication network (3G) protocol stack, and a sensor protocol stack, and the device drivers may include a GPS driver. Above the device drivers and protocol stacks is the operating system layer. Above the operating system is the application layer, which includes some functional modules of the intelligent conference collaboration robot and the intelligent video conference collaboration system. The embodiments of the present invention do not limit the form of the terminal.

In the traditional way of attending conferences, a user has to attend the whole conference. Even if only a small part of the conference is relevant to the user, the user must sit through the entire conference or risk missing that part, which makes the user's participation very inefficient. Therefore, in the present invention, the intelligent video conference collaboration system may be integrated in a terminal. When the terminal integrated with the system receives a connection request sent by a remote device, it can join the conference in place of the user (that is, the local participant). In other words, the user can arrange for the terminal to attend one conference, or several conferences at the same time, as the user's stand-in. FIG. 2 is a networking diagram of the intelligent video conference collaboration system according to an embodiment of the present invention.

In the traditional conference mode, participants meet face to face or attend through an audio and video conference system. In the embodiments of the present invention, the terminal integrated with the intelligent video conference collaboration system attends the conference as an intelligent agent in place of the participant. The intelligent video conference collaboration system can be regarded as a logical entity: it may be physically implemented as software or as a hardware device, or deployed as a functional module in the conference client software of the terminal. The terminal integrated with the system can interact with the participant it replaces through a graphical interface, voice, video, and the like.

When deployed, the intelligent video conference collaboration system can interact with multiple voice or video conferences at the same time. These conferences may be created by one or more video conference systems based on hard terminals or soft terminals. A terminal based on the intelligent video conference collaboration system can emulate a participant and can receive video, audio, and data information from multiple voice or video conferences simultaneously. For example, the intelligent video conference collaboration system may itself be implemented as conference client software, and a terminal on which it is installed can access one or more video conferences over the network as a participant. As another example, the intelligent video conference collaboration system may be deployed as independent software on the same computer as the conference client software, and communicate with the conference client software through an inter-process communication mechanism to obtain the video, audio, and data information of the conference.

The method in the embodiments of the present invention is intended to solve the technical problem in the prior art that the time at which a participant joins a conference cannot be controlled, so that the participant misses the conference content of interest and participates inefficiently.

The technical solutions of the present invention are described in detail below with specific embodiments. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments.

FIG. 3 is a schematic flowchart of Embodiment 1 of the collaboration method for an intelligent conference according to an embodiment of the present invention. The method may be executed by a terminal integrated with the foregoing intelligent video conference collaboration system (hereinafter referred to as the conference terminal or the local device). As shown in FIG. 3, the method includes:

S101: The conference terminal receives the multimedia information of the conference sent by a remote device. The multimedia information of the conference includes at least one of voice information, image information, and text information. The conference terminal is a terminal that has joined the conference.

Specifically, after the conference terminal receives a conference connection indication sent by the remote device, the conference terminal accesses the conference, so that it can take part in the conference as a virtual participant. That is, the conference terminal can receive the multimedia information of the conference that the remote device sends through the conference server, and this multimedia information may include at least one of voice information, image information, and text information. Optionally, the image information may be transmitted by the remote device to the conference terminal in the form of video. The image information may be face information of the participants in the conference, or information about other pictures involved in the conference (for example, pictures in the slides shared in the conference). The text information may be the text content of the slides shared in the conference.

S102: The conference terminal determines, according to the multimedia information of the conference and preset multimedia content, whether the conference requires the user to participate. The preset multimedia content includes at least one of: voice content of interest to the user, image content of interest to the user, associated information of the image content of interest to the user, and text content of interest to the user.

Specifically, after the conference terminal receives the multimedia information of the conference, it determines, according to that multimedia information and the preset multimedia content, whether the current conference requires the user to participate. The preset multimedia content includes at least one of: voice content of interest to the user, image content of interest to the user, associated information of the image content of interest to the user, and text content of interest to the user. Optionally, the conference terminal may determine the degree of correlation between the multimedia information of the conference and the preset multimedia content, and decide according to that degree of correlation whether the current conference requires the user to participate. Alternatively, the conference terminal may determine the proportion of the preset multimedia content in the multimedia information of the conference, and decide according to that proportion. Alternatively, the conference terminal may determine whether any item of the preset multimedia content is present in the multimedia information of the conference. In short, any manner of determining, according to the multimedia information of the conference and the preset multimedia content, whether the conference requires the user to participate is acceptable, and the embodiments of the present invention do not limit which manner is used.
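Two of the determination manners mentioned for S102 (presence of any preset item, and proportion of preset content) can be sketched as below. The sketch makes a strong simplifying assumption that the conference's multimedia information has already been reduced to a list of recognized items such as keywords or identities; the function names and the `min_ratio` parameter are hypothetical.

```python
def contains_any(conference_items, preset_items):
    """Presence check: does any recognized item appear in the preset content?"""
    return any(item in preset_items for item in conference_items)


def preset_ratio(conference_items, preset_items):
    """Proportion of recognized items that belong to the preset content."""
    if not conference_items:
        return 0.0
    hits = sum(1 for item in conference_items if item in preset_items)
    return hits / len(conference_items)


def needs_user(conference_items, preset_items, min_ratio=None):
    """Use the presence check by default, or the proportion rule if min_ratio is given."""
    preset = set(preset_items)
    if min_ratio is None:
        return contains_any(conference_items, preset)
    return preset_ratio(conference_items, preset) >= min_ratio
```

For instance, with recognized items `["budget", "schedule", "budget", "lunch"]` and preset content `{"budget"}`, the presence check succeeds and the proportion is 0.5.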

Optionally, the associated information of the image content of interest to the user may be information related to that image content. For example, if the image content of interest to the user is a face image, the associated information included in the preset multimedia content may be the identity information corresponding to that face.

Optionally, the preset multimedia content may be stored in a memory inside the conference terminal, or in an external device connected to the conference terminal. In the latter case, after obtaining the multimedia information of the conference, the conference terminal can retrieve the preset multimedia content from the external device to make the determination, which relieves the storage pressure on the internal memory of the conference terminal.

S103: The conference terminal determines the participation state of the user. The participation state of the user includes the absent state or the user participation conflict state.

Optionally, the conference terminal may determine whether the user is participating according to whether the user's voice is detected, and may determine whether the user is participating, or whether the user is in the conference environment, according to whether a camera can capture an image of the user. Optionally, the participation state of the user may include the absent state (the user is in the conference environment but is not taking part in the conference discussion, or the user is not in the conference environment) or the user participation conflict state; it may also include a state in which the user is currently taking part in the conference discussion. It should be noted that "the user is in the conference environment" may mean that, after the conference terminal has joined the conference in place of the user, the user is not following the conference content but is doing something else important; as long as the camera can capture an image of the user, the user is considered to be in the conference environment. "The user is in the absent state" may mean that the conference terminal has joined the conference in place of the user and the user is in the conference environment but is not taking part in the discussion, or that the user is not in the conference environment. Whether the user is taking part in the discussion may be determined by using a microphone or sound detection software to detect whether the user's voice is received. The "user participation conflict state" may mean that the current conference requires the user to participate but the user is attending another conference at that moment, in which case the conference terminal determines that the participation state of the user is the user participation conflict state. For example, for an audio conference, if the conference terminal determines that the current conference requires the user to participate but detects that the microphone is already on and the user is already taking part in the discussion of another conference, the conference terminal considers the participation state of the user to be the user participation conflict state.
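The state definitions above can be condensed into a small classification function. This is a sketch under assumptions: the three boolean inputs stand for signals the terminal is assumed to obtain from its microphone, sound detection software, or conference client, and the function and parameter names are hypothetical. The camera signal is left out because, as described, it only distinguishes two sub-cases of the absent state.

```python
ABSENT = "absent"
CONFLICT = "conflict"
PARTICIPATING = "participating"


def participation_state(voice_in_this_conference, in_other_conference, needs_user):
    """Classify the user's participation state from assumed detection signals."""
    if voice_in_this_conference:
        # The user's voice is detected in this conference's discussion.
        return PARTICIPATING
    if needs_user and in_other_conference:
        # The conference needs the user, who is busy in another conference.
        return CONFLICT
    # In the room but silent, or not in the room at all.
    return ABSENT
```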

It should be noted that S102 and S103 need not be performed in any particular order, and the two may also be performed at the same time.

S104: If the conference terminal determines that the conference requires the user to participate and the participation state of the user is the absent state, the conference terminal sends participation reminder information to the user. The participation reminder information is used to prompt the user to join the conference.

Optionally, the participation reminding information sent by the conference terminal to the user may take any form, as long as it can notify the user to participate in the conference in time; the form of the participation reminding information is not limited in this embodiment of the present invention.

In the collaboration method for an intelligent conference provided in this embodiment of the present invention, the conference terminal determines, according to the received multimedia information about the conference and the preset multimedia content, whether the conference requires the user to participate, and the conference terminal also determines the participation state of the user. When the conference terminal determines that the conference requires the user to participate and the participation state of the user is the absent state, the conference terminal sends, to the user, participation reminding information used to prompt the user to participate in the conference. In the method provided in this embodiment, the conference terminal determines whether the current conference requires the user to participate and combines this with the user's current participation state to decide whether to send the participation reminding information, so that the time at which the user joins the conference can be controlled accurately, the user does not miss the conference content, and the user's participation efficiency is improved.

FIG. 4 is a schematic flowchart of Embodiment 2 of the collaboration method for an intelligent conference according to an embodiment of the present invention. This embodiment concerns the specific process by which the conference terminal determines, according to the multimedia information about the conference and the preset multimedia content, whether the conference requires the user to participate. On the basis of the foregoing embodiment, S102 specifically includes:

S201: The conference terminal determines, according to the multimedia information about the conference and the preset multimedia content, whether preset conference information exists in the multimedia information about the conference; the preset conference information is information used to indicate that the user needs to participate in the conference. If yes, S202 is performed; if no, S203 is performed.

Optionally, the preset multimedia content may include the preset conference information. The preset conference information may be information in audio form, video form, image form, or text form, or a combination of two or more of these forms. The preset conference information is information indicating that the user needs to participate in the current conference; that is, as long as the conference terminal determines that the multimedia information about the conference contains the preset conference information included in the preset multimedia content, it can determine that the current conference requires the user to participate.

Optionally, the conference terminal may determine whether the preset conference information exists in the multimedia information about the conference by determining whether the multimedia information about the conference is identical or mostly identical to the preset conference information in the preset multimedia content. For example, the conference terminal may compute the similarity between the multimedia information about the conference and the preset conference information; when the similarity is greater than a set threshold, the conference terminal determines that the preset conference information exists in the multimedia information about the conference. For another example, the preset conference information may be the user's name XXX; when the conference terminal detects that the multimedia information about the conference includes speech such as "XXX will now present the relevant conference content", the conference terminal determines that the conference requires the user to participate. For yet another example, the preset conference information may be a topic A that the user has preset as a topic the user needs to present in the conference; when the conference terminal detects that the multimedia information about the conference includes A, the conference terminal determines that the conference requires the user to participate. In short, this embodiment of the present invention does not limit the form in which the preset conference information is expressed, as long as it can indicate that the current conference requires the user to participate.
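One possible way to realise the check in S201 on recognised text is sketched below. It is a minimal illustration under stated assumptions: the embodiment does not prescribe a similarity measure, so the character-level ratio from Python's standard `difflib.SequenceMatcher` is used as a stand-in, and the function name `contains_preset_info` and the default threshold of 0.8 are hypothetical.

```python
from difflib import SequenceMatcher


def contains_preset_info(recognized_text: str,
                         preset_items: list[str],
                         similarity_threshold: float = 0.8) -> bool:
    """Return True if any preset item (e.g. the user's name or a topic)
    appears in the recognised text, or if the text as a whole is
    mostly identical to a preset item."""
    text = recognized_text.lower()
    for item in preset_items:
        needle = item.lower()
        # Case 1: the preset item occurs verbatim in the conference text.
        if needle in text:
            return True
        # Case 2: the text is "mostly the same" as the preset item
        # (assumed measure: SequenceMatcher ratio above the threshold).
        if SequenceMatcher(None, text, needle).ratio() > similarity_threshold:
            return True
    return False
```

For instance, with the preset name "Alice" (a placeholder for XXX), the sentence "Alice will present the conference content" satisfies the first case, while a nearly identical topic title satisfies the second.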

S202: The conference terminal determines that the conference requires the user to participate.

S203: The conference terminal determines whether first multimedia information that the user pays attention to exists in the multimedia information about the conference. If yes, S204 is performed; if no, S207 is performed.

Specifically, when the conference terminal determines that the preset conference information does not exist in the multimedia information about the conference, the conference terminal further determines whether first multimedia information that the user pays attention to exists in the multimedia information about the conference. For this determination, reference may be made to Embodiment 3 shown in FIG. 5, which specifically includes the following steps:

S301: The conference terminal detects and recognizes the multimedia information about the conference to obtain a recognition result, and performs feature matching between the recognition result and the preset multimedia content to obtain a feature matching result.

Specifically, after receiving the multimedia information about the conference sent by the remote device, the conference terminal detects and recognizes the multimedia information and determines text information, character strings, picture information, or the like that the conference terminal can recognize, as the recognition result. It should be noted that the conference terminal may use different detection and recognition methods for different forms of multimedia information. For example, when the multimedia information about the conference includes only image information, the conference terminal extracts feature parameters of the image so that it can determine the content of the image according to those feature parameters and thereby obtain the recognition result. Optionally, the feature parameters of the image information may be the brightness, saturation, color, geometric features, resolution, pixel features, and the like of the image. When the multimedia information about the conference includes audio information, the conference terminal determines the actual meaning of the speech in the audio information by using speech recognition technology and thereby obtains the recognition result.

After the conference terminal obtains the recognition result of the multimedia information about the conference, the conference terminal performs feature matching between the recognition result and the preset multimedia content to obtain the feature matching result. The feature matching result may be the degree to which the recognition result is identical, related, or similar to the image content or text content in the preset multimedia content (that is, a matching degree or correlation degree).

S302: The conference terminal determines whether the feature matching result is greater than a preset matching threshold. If yes, S303 is performed; if no, S304 is performed.

Specifically, the feature matching result determined by the conference terminal may be the degree to which the recognition result is identical, related, or similar to the preset multimedia content (that is, the matching degree or correlation degree). The conference terminal determines whether this matching degree or correlation degree is greater than the preset matching threshold (the further the matching degree or correlation degree exceeds the matching threshold, the more closely the recognition result is related to the preset multimedia content), and determines, according to the result, whether first multimedia information that the user pays attention to exists in the multimedia information about the conference.

S303: The conference terminal determines that the first multimedia information exists in the multimedia information about the conference.

Specifically, if the feature matching result is greater than the preset matching threshold, the conference terminal determines that the first multimedia information exists in the multimedia information about the conference. The first multimedia information may be image data, text information, audio information, or the like that the user pays attention to.

Optionally, after the conference terminal determines that first multimedia information that the user pays attention to exists in the multimedia information about the conference, the conference terminal may, according to a certain policy, perform processing such as format conversion or compression on the first multimedia information (for example, image information, voice information, or text information that the user pays attention to) and save the processed first multimedia information so that the user can view it later. For example, after the conference terminal determines that the first multimedia information is image information or voice information that the user pays attention to, it may record the first multimedia information as files and save them, and the user can search these files and play them on demand. For another example, after the conference terminal determines that the first multimedia information is text information that the user pays attention to, it may save the text information directly as a file. For yet another example, after the conference terminal determines that the first multimedia information is audio information that the user pays attention to, it may convert the audio data into conference minutes in text form and save them.

S304: The conference terminal determines that the first multimedia information does not exist in the multimedia information about the conference.

In summary, S301 to S304 describe in detail how the conference terminal determines whether first multimedia information that the user pays attention to exists in the multimedia information about the conference.
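Steps S301 to S304 can be illustrated on text recognition results with the sketch below. The embodiment leaves the matching measure open, so the word-level Jaccard overlap used here is an assumed stand-in, and the names `matching_degree` and `find_first_multimedia_info` are hypothetical. The function returns the best-matching preset content when the matching degree exceeds the threshold, and `None` otherwise.

```python
from typing import Optional


def matching_degree(recognition_result: str, preset_content: str) -> float:
    """Assumed matching measure: Jaccard overlap of lower-cased word sets."""
    a = set(recognition_result.lower().split())
    b = set(preset_content.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_first_multimedia_info(recognition_result: str,
                               preset_contents: list[str],
                               match_threshold: float) -> Optional[str]:
    """S301: match the recognition result against each preset content.
    S302: compare the best matching degree with the threshold.
    S303/S304: return the matched preset content, or None if none qualifies."""
    best_content, best_degree = None, 0.0
    for content in preset_contents:
        degree = matching_degree(recognition_result, content)
        if degree > best_degree:
            best_content, best_degree = content, degree
    if best_degree > match_threshold:  # strictly greater, as in S302
        return best_content            # S303: first multimedia information exists
    return None                        # S304: it does not exist
```

Image or audio input would first have to be converted into a comparable recognition result (for example by image feature extraction or speech recognition), which is outside the scope of this sketch.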

S204: The conference terminal determines a user attention degree of the first multimedia information according to the first multimedia information and a first mapping relationship; the first mapping relationship is a correspondence between different content in the preset multimedia content and user attention degrees.

Specifically, following S203, after the conference terminal determines that first multimedia information that the user pays attention to exists in the multimedia information about the conference, the conference terminal determines the user attention degree of the first multimedia information according to the first multimedia information and the first mapping relationship. The first mapping relationship may be a correspondence between different content in the preset multimedia content and user attention degrees. Optionally, in the first mapping relationship, different content in the preset multimedia content may correspond to different user attention degrees, or some different content in the preset multimedia content may correspond to the same user attention degree.

Optionally, the first mapping relationship may be preset in a memory inside the conference terminal, or may be preset in an external device connected to the conference terminal. In the latter case, when the conference terminal determines that the first multimedia information exists in the multimedia information about the conference, it may retrieve the first mapping relationship from the external device to determine the user attention degree of the first multimedia information (this relieves the caching pressure on the internal memory of the conference terminal).

S205: The conference terminal determines whether the user attention degree of the first multimedia information is greater than a preset user attention degree threshold. If yes, S206 is performed; if no, S207 is performed.

S206: The conference terminal determines that the conference requires the user to participate.

S207: The conference terminal determines that the conference does not require the user to participate.

Specifically, after determining the user attention degree of the first multimedia information, the conference terminal determines whether that user attention degree is greater than the preset user attention degree threshold. When the user attention degree of the first multimedia information is greater than the preset threshold, the conference terminal determines that the current conference requires the user to participate; when it is less than or equal to the preset threshold, the conference terminal determines that the current conference does not require the user to participate.
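The overall decision flow of S201 to S207 can be summarised in a short sketch. The function name `conference_needs_user`, the representation of the first mapping relationship as a Python dictionary, and the choice to treat unmapped content as having attention degree 0 are all assumptions made for illustration.

```python
from typing import Optional


def conference_needs_user(has_preset_conference_info: bool,
                          first_multimedia_info: Optional[str],
                          attention_map: dict[str, float],
                          attention_threshold: float) -> bool:
    """Hypothetical walk through S201 to S207."""
    # S201 -> S202: preset conference information is present.
    if has_preset_conference_info:
        return True
    # S203 -> S207: nothing the user pays attention to was found.
    if first_multimedia_info is None:
        return False
    # S204: look up the user attention degree in the first mapping relationship
    # (assumption: content missing from the mapping has attention degree 0).
    attention = attention_map.get(first_multimedia_info, 0.0)
    # S205 -> S206 / S207: strictly greater than the threshold means "needed".
    return attention > attention_threshold
```

For example, with the mapping `{"budget review": 0.9, "hiring plan": 0.3}` and a threshold of 0.5, detecting "budget review" leads to S206, whereas detecting "hiring plan" leads to S207.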

In the collaboration method for an intelligent conference provided in this embodiment of the present invention, the conference terminal receives the multimedia information about the conference sent by the remote device and determines, according to the multimedia information about the conference and the preset multimedia content, whether preset conference information exists in the multimedia information about the conference. When no preset conference information exists, the conference terminal further determines whether first multimedia information that the user pays attention to exists in the multimedia information about the conference and, if it does, determines whether the current conference requires the user to participate by comparing the user attention degree of the first multimedia information with the preset user attention degree threshold. In the method provided in this embodiment, the conference terminal determines whether the current conference requires the user to participate and combines this with the user's current participation state to decide whether to send the participation reminding information, so that the time at which the user joins the conference can be controlled accurately, the user does not miss the conference content, and the user's participation efficiency is improved. In addition, the conference terminal can recognize not only the audio content that the user pays attention to in the conference but also first multimedia information such as image content and text content that the user pays attention to, so that the user can obtain relatively comprehensive conference information when reviewing the content of a conference from which the user was absent, which further improves the user's participation efficiency.

On the basis of the foregoing embodiment, optionally, before S104, the method may further include: if the conference terminal determines that the conference requires the user to participate and the participation state of the user is the absent state, the conference terminal determines the form of the participation reminding information, where the form of the participation reminding information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form. In this case, S104 is specifically: the conference terminal sends the participation reminding information to the user in the determined form.

Optionally, the form of the participation reminding information that the conference terminal determines to send to the user may include at least one of an interface form, an image form, a video form, an audio form, and an instant message form. For example, when the user is in the conference environment but is not taking part in the discussion (for instance, the user is directly in front of the computer display and is therefore in the conference environment, but is doing something else; in this case the participation state of the user is also the absent state), the conference terminal may remind the user by means of a special interface, a special color, a flashing interface, or the like. Alternatively, the conference terminal may remind the user by playing a piece of video or audio; optionally, the video or audio played may be audio data or video data that the conference terminal has identified as being of interest to the user, or audio data or video data generated inside the conference terminal or retrieved by the conference terminal from an external device. When the user is not in the conference environment (for instance, the user is not within the range that the camera can capture, or the user is not in front of the computer display), the conference terminal may remind the user to take part in the conference discussion by means of a short message, an email, or another instant message.
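The choice of reminder form described above can be sketched as a simple selection function. The function name `choose_reminder_forms` and the string labels for the forms are hypothetical; the embodiment only requires that at least one of the listed forms be used.

```python
def choose_reminder_forms(user_in_conference_environment: bool) -> list[str]:
    """Pick candidate reminder forms for a user in the absent state."""
    if user_in_conference_environment:
        # The user can see or hear the terminal: a highlighted or flashing
        # interface, or a short video or audio clip, can attract attention.
        return ["interface", "video", "audio"]
    # The user is away from the terminal: reach the user through a short
    # message, an email, or another instant message.
    return ["instant_message"]
```

A terminal could then send the reminder in any one, or several, of the returned forms.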

FIG. 6 is a schematic flowchart of Embodiment 4 of the collaboration method for an intelligent conference according to an embodiment of the present invention. This embodiment concerns the specific process in which the conference terminal, while participating in the conference on the user's behalf, detects that the conference requires the user to participate and sends feedback data to the user's peer device. Optionally, there is no restriction on the timing relationship between the method in FIG. 6 and S104 in the foregoing embodiment; the two may be performed simultaneously. On the basis of the foregoing embodiment, further, after S103, the method also includes:

S401: The conference terminal determines a feedback policy according to whether the conference requires the user to participate and according to the participation state of the user; the feedback policy is used to indicate the type of feedback data that the conference terminal sends to the remote device.

S402: The conference terminal sends the feedback data to the remote device according to the feedback policy; the feedback data includes at least one of video content for interacting with the remote device, audio content for interacting with the remote device, and text content for interacting with the remote device.

Specifically, the type of feedback data that the feedback policy instructs the conference terminal to send to the remote device may be any one of the first feedback data, second feedback data, third feedback data, and fourth feedback data described below. Optionally, the conference terminal may build a decision tree according to whether the conference requires the user to participate and according to the participation state of the user, and obtain the feedback policy through decision tree learning. Optionally, the feedback data may be feedback content preset by the user, or may be data generated according to the feedback policy and the feedback content preset by the user; the feedback data may include at least one of video content for interacting with the remote device, audio content for interacting with the remote device, and text content for interacting with the remote device. Optionally, the feedback policy may also be used to indicate the form of the feedback data. For example, if the feedback data preset by the user is image data but the feedback policy indicates that the feedback data should be in text format, the conference terminal converts the user's preset feedback data from image format into text format and then sends it to the remote device.

The conference terminal may determine the feedback policy, according to whether the conference requires the user to participate and according to the participation state of the user, in the following four manners:

First manner: If the conference terminal determines that the conference requires the user to participate and the participation state of the user is the absent state, the feedback policy determined by the conference terminal is to send first feedback data to the remote device; the first feedback data is used to indicate to the remote device that the conference terminal is notifying the user to join the conference.

Specifically, when the conference terminal determines that the participation state of the user is the absent state and that the current conference requires the user to participate, the conference terminal determines that the feedback policy is to send first feedback data to the devices at the other conference sites (that is, the remote devices); the first feedback data is used to indicate to those devices that the conference terminal is notifying the user to join the conference. Optionally, the first feedback data sent by the conference terminal to the devices at the other conference sites may be prompt information informing the other sites that the user is currently absent and is being notified to join the conference in time; the prompt information may be text, speech, or a video clip. For example, prompt text may be superimposed on the image displayed at the other sites, prompt speech may be mixed into the audio stream sent to the other sites, or a prompt image may be superimposed on the image sent to the other sites.

Second manner: If the conference terminal determines that the conference requires the user to participate and the participation state of the user is the absent state, the feedback policy determined by the conference terminal is to send second feedback data to the remote device; the second feedback data is used to present to the remote device conference content that is preset by the user and related to the conference.

Specifically, the conference terminal may store second feedback data preset by the user for the conference discussion; the second feedback data is used to present to the remote device conference content that is preset by the user and related to the conference. In some scenarios, the user knows in advance which topics the user will be involved in discussing in the conference or which views the user will express, so the user may record audio or video files in advance, or store related text information (for example, shared slides) or other files in advance, as the second feedback data. Then, when the conference terminal determines that the participation state of the user is the absent state and that the conference requires the user to take part in the discussion, the conference terminal determines that the feedback policy is to send the second feedback data preset by the user to the devices at the other conference sites.

Third manner: If the conference terminal determines that the conference does not require the user to participate and the participation state of the user is the absent state, the feedback policy determined by the conference terminal is to send third feedback data to the remote device; the third feedback data is used to indicate to the remote device that the user is participating in the conference.

Specifically, when the conference terminal determines that the participation state of the user is the absent state (that is, the user is currently in the conference environment but is not taking part in the discussion, or the user is not in the conference environment) and that the current conference does not require the user to participate, the conference terminal may determine that the feedback policy is to send third feedback data to the devices at the other conference sites. The third feedback data is used to indicate to the remote device that the user is participating in the conference. Optionally, the third feedback data may be a static image of the user or a video of the user in the conference room, so that participants at the other sites take the user to be in the conference environment and participating in the current conference.

Fourth manner: If the conference terminal determines that the conference requires the user to participate and the participation state of the user is the participation conflict state, the feedback policy determined by the conference terminal is to send fourth feedback data to the remote device and to record the current content of the conference; the fourth feedback data is used to indicate to the remote device that the participation state of the user is the participation conflict state.

Specifically, the conference terminal may detect whether there is a conflict among the multiple conferences in which the user participates. When a conflict occurs, the conference terminal determines that the participation state of the user is the participation conflict state. If the conference terminal also determines that the user is participating in another conference at that moment and cannot join, in time, the conference that currently requires the user, the conference terminal determines that the feedback policy is to send fourth feedback data to the site of the current conference, in order to inform that site of the user's current state, or to prompt the participants at that site to wait for a while or to proceed with other agenda items first. At the same time, the conference terminal may record the current conference content at these sites (for example, what other participants say), so that when the user later joins in person the user can first review what the other participants have said and thus get up to speed quickly. Optionally, the fourth feedback data may be in audio, video, or text form, which is not limited in this embodiment of the present invention.
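The four manners above can be condensed into a small selection function. This is a sketch under stated assumptions: the embodiment gives the first and second manners the same triggering condition and does not say how to choose between them, so the flag `has_preset_content` (whether the user has stored preset conference content) is introduced here purely as an illustrative tie-breaker. The names `Feedback` and `determine_feedback_policy`, and the string values for the participation state, are likewise hypothetical.

```python
from enum import Enum
from typing import Optional


class Feedback(Enum):
    FIRST = "terminal is notifying the user to join"
    SECOND = "user's preset conference content"
    THIRD = "indication that the user is participating"
    FOURTH = "user is in a participation conflict"


def determine_feedback_policy(conference_needs_user: bool,
                              participation_state: str,
                              has_preset_content: bool = False) -> Optional[Feedback]:
    """participation_state is one of "absent", "conflict", "participating"."""
    if conference_needs_user and participation_state == "conflict":
        # Fourth manner: report the conflict (the terminal also records
        # the current conference content, which is not modelled here).
        return Feedback.FOURTH
    if conference_needs_user and participation_state == "absent":
        # First or second manner; the tie-breaker is an assumption.
        return Feedback.SECOND if has_preset_content else Feedback.FIRST
    if not conference_needs_user and participation_state == "absent":
        # Third manner: show a static image or recorded video of the user.
        return Feedback.THIRD
    # The user is participating in person: no substitute feedback is needed.
    return None
```

The explicit conditions could equally be replaced by a learned decision tree, as the embodiment allows; the hand-written branches simply make the four cases easy to inspect.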

It should be noted that once the user joins the conference in person, the user takes over control of the conference from the conference terminal. The conference terminal then no longer simulates the user's participation, but sends the user's actual real-time audio, video, and other data streams to the other participating parties. When the user leaves the conference discussion, the conference terminal can again participate in the conference on the user's behalf. For example, suppose the conference terminal is attending three conferences on behalf of the user, and one of them later requires the user to participate. The conference terminal sends participation reminding information to the user, and after receiving it the user takes over that conference (that is, the user attends the conference in person). Optionally, for a video conference, if after sending the participation reminding information the conference terminal detects that the camera has captured real video of the user, or detects an actual join operation by the user (for example, the user clicks a certain control), the conference terminal replaces the third feedback data being sent to the remote device (for example, a recorded video of the user in a conference room) with the real-time video captured by the camera.
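The handover described above can be sketched as a small state switch. This is a minimal illustration only: the `Session` class, the two source labels, and the callback names are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """One conference the terminal attends for the user (hypothetical model)."""
    name: str
    user_present: bool = False  # True once the user has joined in person

    def outgoing_source(self) -> str:
        # While the user is away, the terminal keeps sending the recorded
        # (simulated) feedback; once the user is present it sends the live feed.
        return "live_camera" if self.user_present else "recorded_feedback"


def on_join_detected(session: Session, face_in_camera: bool, join_clicked: bool) -> None:
    """Hand control to the user if either join signal is observed."""
    if face_in_camera or join_clicked:
        session.user_present = True


def on_user_left(session: Session) -> None:
    """The terminal resumes participating on the user's behalf."""
    session.user_present = False
```

Either signal (real video of the user, or an explicit join operation) is treated as sufficient here, matching the "or" in the description.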

In the collaboration method for an intelligent conference provided by the embodiments of the present invention, the conference terminal obtains the participation status of the user by detecting whether the user is attending. Based on the participation status of the user and on whether the current conference requires the user to participate, the conference terminal determines which type of feedback data to send to the remote device (or to the other sites) and whether to send participation reminding information to the user, so that the user can join the conference in time and interact with the participants at the other sites. The method therefore prevents the user from missing conference content relevant to the user and improves the user's participation efficiency. In addition, the conference terminal can identify not only the audio content of interest to the user in the conference, but also first multimedia information such as the image content and text content of interest to the user. When the user needs to review the content of a conference that the user missed, the user can therefore obtain relatively comprehensive conference information, which further improves the user's participation efficiency.

FIG. 7 is a schematic flowchart of Embodiment 5 of the collaboration method for an intelligent conference according to an embodiment of the present invention. This embodiment concerns the specific process by which the conference terminal obtains the feature matching result when the first multimedia information of interest to the user includes the face information of a first participant in the conference, and the associated information of the image content of interest to the user in the preset multimedia content is the identity information of the first participant. On the basis of the foregoing embodiments, the foregoing S301 specifically includes:

S501: The conference terminal detects the multimedia information of the conference and determines the face position and face size of each participant in the multimedia information.

Specifically, the conference terminal can perform detection on the received multimedia information of the conference. Generally, the multimedia information of the conference may be a segment of video, and the conference terminal can identify the face position and face size of every participant in the video images. The face detection that the conference terminal performs on the multimedia information of the conference is based on the pattern features of the faces contained in the video images, such as histogram features, color features, template features, structural features, and Haar features. That is, the conference terminal can detect these pattern features by means of image processing techniques and use them to determine the face position and face size.
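As an illustration of one of the pattern features named above, the following sketch computes a two-rectangle Haar-like feature with an integral image (summed-area table). It shows only how a single feature value is obtained. It is not a face detector, and the function names are assumptions made for this example.

```python
def integral_image(img):
    """Summed-area table with one extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_vertical(ii, x, y, w, h):
    """Lower-half sum minus upper-half sum of a window (h must be even)."""
    half = h // 2
    upper = rect_sum(ii, x, y, w, half)
    lower = rect_sum(ii, x, y + half, w, half)
    return lower - upper
```

A large positive value means the upper half of the window is darker than the lower half, which is the kind of contrast (for example, eyes above cheeks) that Haar-based detectors typically evaluate at many positions and scales.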

S502: The conference terminal performs feature extraction based on the face position and face size of each participant in the multimedia information, and obtains the facial features of the participant.

Specifically, after the conference terminal determines the face positions and face sizes of all participants from the multimedia information of the conference, it performs feature extraction based on them. Feature extraction is in effect the process of building a feature model of the face; for example, a geometric-feature method can be used. Because a face is composed of parts such as the eyes, nose, mouth, and chin, the geometric description of these parts and of the structural relationships among them can serve as important features for recognizing a face. These features are called geometric features. Feature data useful for face classification can be obtained from the shape descriptions of the facial organs and the distances between them; the feature components usually include the Euclidean distances between feature points, curvatures, angles, and the like. Using the geometric-feature method, the conference terminal can determine the facial features of all participants.
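A minimal sketch of such geometric features is given below, assuming four 2-D landmark points are already available. The landmark names, the choice of features, and the normalization by inter-eye distance are assumptions made for this example; the source does not specify them.

```python
import math


def geometric_features(landmarks):
    """Build a small feature vector from named 2-D landmark points.

    The landmark names are hypothetical; the source does not list them.
    """
    left_eye = landmarks["left_eye"]
    right_eye = landmarks["right_eye"]
    nose = landmarks["nose"]
    mouth = landmarks["mouth"]

    # Euclidean distances, normalized by the inter-eye distance so that the
    # features do not depend on the detected face size.
    eye_dist = math.dist(left_eye, right_eye)
    nose_to_mouth = math.dist(nose, mouth) / eye_dist
    eye_mid = ((left_eye[0] + right_eye[0]) / 2, (left_eye[1] + right_eye[1]) / 2)
    mid_to_nose = math.dist(eye_mid, nose) / eye_dist

    # Angle at the nose between the directions to the two eyes.
    a = math.atan2(left_eye[1] - nose[1], left_eye[0] - nose[0])
    b = math.atan2(right_eye[1] - nose[1], right_eye[0] - nose[0])
    angle = abs(a - b)
    if angle > math.pi:
        angle = 2 * math.pi - angle

    return [nose_to_mouth, mid_to_nose, angle]
```

Because the distances are ratios and the angle is scale-free, the same face detected at two different sizes yields the same vector in this sketch.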

S503: The conference terminal matches the facial features of each participant against a preset face information library and determines a first matching degree.

Specifically, after the conference terminal obtains the facial features of all participants, it can match these facial features against the preset face information library and determine the first matching degree between each participant's facial features and the face information in the library. Optionally, the preset face information library may be created by the user loading, through loading software, the faces and other information of persons who are likely to attend into the processor of the conference terminal. The preset face information library may include multiple pieces of face information, each corresponding to the identity information of one participant. For example, suppose the conference terminal determines the facial features of four participants (A, B, C, and D) from the multimedia information of the conference. The conference terminal then matches the facial features of these four participants against each piece of face information in the face information library and determines the respective first matching degrees.

S504: The conference terminal determines that the recognition result is the identity information of the participant in the face information library whose first matching degree is greater than a preset first threshold.

Specifically, continuing the above example, suppose the conference terminal determines that the first matching degree between the facial features of participant A and the face information of A' in the face information library is greater than the first threshold, while the first matching degrees of participants B, C, and D against every piece of face information in the library are all less than the first threshold. The conference terminal then determines that the recognition result is the identity information of the participant corresponding to the face information of A' in the face information library.
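Steps S503 and S504 can be sketched as a thresholded best-match lookup. The source does not say how the first matching degree is computed; cosine similarity between feature vectors is used here purely as an assumed stand-in, and the function names are hypothetical.

```python
import math


def matching_degree(a, b):
    """Cosine similarity, used here as an assumed 'first matching degree'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(participant_features, face_library, first_threshold):
    """Map each participant to the library identity whose matching degree
    is highest and greater than the threshold; others are left out."""
    result = {}
    for participant, features in participant_features.items():
        best_identity, best_degree = None, first_threshold
        for identity, reference in face_library.items():
            degree = matching_degree(features, reference)
            if degree > best_degree:
                best_identity, best_degree = identity, degree
        if best_identity is not None:
            result[participant] = best_identity
    return result
```

In the A, B, C, D example above, only participant A would appear in the returned mapping, paired with the identity stored for A'.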

Optionally, S503 and S504 may instead be performed by a network device that can communicate with the conference terminal, such as a network server. In that case, the terminal sends the facial features of each participant to the network device. The network device matches the facial features of each participant against a face information library preset on the network device, obtains the first matching degree, determines that the final recognition result is the identity information of the participant in the face information library whose first matching degree is greater than the preset first threshold, and sends that identity information to the conference terminal.

S505: The conference terminal matches the identity information of the participant in the face information library whose first matching degree is greater than the preset first threshold against the identity information of the first participant in the preset multimedia content, and obtains the feature matching result.

Specifically, continuing the above example, the conference terminal determines whether the identity information of the participant corresponding to the face information of A' matches the identity information of the first participant in the preset multimedia content, that is, whether the two are the same or similar. The conference terminal thereby obtains the feature matching result and determines, according to the feature matching result, whether the current conference requires the user to participate. For how the conference terminal determines from the feature matching result whether the current conference requires the user to participate, refer to the descriptions in Embodiment 2 and Embodiment 3 above; details are not repeated here.

FIG. 8 is a schematic flowchart of Embodiment 6 of the collaboration method for an intelligent conference according to an embodiment of the present invention. This embodiment concerns the specific process by which the conference terminal obtains the feature matching result when the first multimedia information of interest to the user includes first text information in the conference and the text content of interest to the user is the first text information. On the basis of the foregoing embodiments, the foregoing S301 specifically includes:

S601: The conference terminal detects the multimedia information of the conference and determines the position and size of each text block region in the multimedia information of the conference.

S602: The conference terminal obtains the text blocks in the multimedia information of the conference according to the positions and sizes of the text block regions in the multimedia information of the conference.

Specifically, the conference terminal can perform detection on the received multimedia information of the conference. Generally, the multimedia information of the conference may be a segment of video, and the conference terminal can detect the positions and sizes of all text block regions present in the video images and thereby determine all text blocks in the multimedia information of the conference. Optionally, the conference terminal may detect the positions and sizes of the text block regions by using a texture-based method, an edge-detection-based method, or a connected-component-based method. For example, the texture-based method obtains the position and size of a text block in the video image mainly from the difference between the texture characteristics of the background and those of the characters, and thereby obtains all text blocks in the multimedia information of the conference.
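Of the three detection approaches mentioned above, the connected-component one is the simplest to sketch. The example below assumes the frame has already been binarized (1 for candidate text pixels, 0 for background) and returns one bounding box per 4-connected region; the binarization step and the function name are assumptions, not part of the source.

```python
from collections import deque


def text_block_boxes(binary):
    """Return (x, y, width, height) for each 4-connected region of 1-pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0][x0] != 1 or seen[y0][x0]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes
```

A practical detector would usually also merge nearby character-sized boxes into lines and discard regions whose size or aspect ratio is implausible for text.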

S603: The conference terminal matches the text blocks in the multimedia information of the conference against a preset text information library and determines a second matching degree.

S604: The conference terminal determines that the recognition result is the text information in the text information library whose second matching degree is greater than a preset second threshold.

Specifically, the conference terminal can use a template matching method to correlate the characters in all text blocks determined from the multimedia information of the conference with the preset text information library, calculate the second matching degree between the characters in a text block and each piece of text information in the library, and take as the recognition result the text information in the library whose second matching degree with the characters in the text block is greater than the preset second threshold. It should be noted that the text information in the text information library is text that the conference terminal can recognize (for example, character strings recognizable by the conference terminal), whereas the characters in the text block are not recognizable by the conference terminal. The conference terminal therefore uses the template matching method to find, in the text information library, text information that has the same or a similar meaning to the characters of the text block (that is, text information whose second matching degree is greater than the second threshold) and takes it as the recognition result.
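The template matching in S603 and S604 can be illustrated with binary glyph bitmaps. The source does not define how the second matching degree is calculated; the fraction of agreeing pixels is used below only as an assumed, simple measure, and the library layout (text mapped to a template bitmap of the same size as the block) is hypothetical.

```python
def match_score(block, template):
    """Fraction of pixels on which two equally sized binary bitmaps agree."""
    total = 0
    same = 0
    for block_row, template_row in zip(block, template):
        for a, b in zip(block_row, template_row):
            total += 1
            if a == b:
                same += 1
    return same / total


def recognize(block, text_library, second_threshold):
    """Return library texts whose score exceeds the threshold, best first."""
    scored = [(match_score(block, template), text) for text, template in text_library.items()]
    scored.sort(reverse=True)
    return [text for score, text in scored if score > second_threshold]
```

With a slightly noisy bitmap of the letter T and templates for T and L, only "T" would be returned at a threshold of 0.8 in this sketch.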

S605: The conference terminal matches the text information in the text information library whose second matching degree is greater than the preset second threshold against the first text information in the preset multimedia content, and obtains the feature matching result.

Specifically, the conference terminal matches the text information in the text information library whose second matching degree is greater than the preset second threshold against the first text information in the preset multimedia content, that is, it determines whether the two are the same or similar. The conference terminal thereby determines the feature matching result and determines, according to the feature matching result, whether the current conference requires the user to participate. For how the conference terminal determines from the feature matching result whether the current conference requires the user to participate, refer to the descriptions in Embodiment 2 and Embodiment 3 above; details are not repeated here.

FIG. 9 is a schematic flowchart of Embodiment 7 of the collaboration method for an intelligent conference according to an embodiment of the present invention. This embodiment concerns another specific process by which the conference terminal obtains the feature matching result when the first multimedia information of interest to the user includes first text information in the conference and the text content of interest to the user is the first text information. On the basis of the foregoing embodiments, the foregoing S301 specifically includes:

S701: The conference terminal detects the multimedia information of the conference, determines the position and size of each text block region in the multimedia information of the conference, and obtains the text blocks in the multimedia information of the conference.

Specifically, for S701, refer to the description of S601 above; details are not repeated here.

S702: The conference terminal determines the recognition result according to the geometric features of the text blocks.

Specifically, the conference terminal can use a geometric feature extraction method to extract geometric features of the characters in a text block, such as the end points, bifurcation points, and concave and convex parts of the characters, as well as line segments in the horizontal, vertical, oblique, and other directions, closed loops, and the like. Based on the positions of these features and the relationships among them, the conference terminal makes a combined logical judgment and determines text information that it can recognize, which serves as the recognition result.
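Two of the geometric features listed above, end points and bifurcation points, can be sketched on a one-pixel-wide stroke skeleton by counting neighbors. The sketch assumes the character has already been thinned to a skeleton and, as a simplification, considers only the four horizontal and vertical neighbors, so it suits strokes made of horizontal and vertical segments; the function name is hypothetical.

```python
def stroke_points(skeleton):
    """Classify skeleton pixels by their number of 4-connected neighbors.

    Returns (endpoints, bifurcations) as lists of (x, y) in scan order:
    one neighbor marks an end point, three or more mark a bifurcation.
    """
    h, w = len(skeleton), len(skeleton[0])
    endpoints, bifurcations = [], []
    for y in range(h):
        for x in range(w):
            if skeleton[y][x] != 1:
                continue
            neighbours = 0
            for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if 0 <= nx < w and 0 <= ny < h and skeleton[ny][nx] == 1:
                    neighbours += 1
            if neighbours == 1:
                endpoints.append((x, y))
            elif neighbours >= 3:
                bifurcations.append((x, y))
    return endpoints, bifurcations
```

For a skeleton shaped like the letter T, this yields three end points and one bifurcation point, and the logical combination step described above could use such counts and positions to tell T apart from, say, L (two end points, no bifurcation).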

S703: The conference terminal matches the recognition result against the first text information in the preset multimedia content and obtains the feature matching result.

Specifically, after the conference terminal converts the characters of the text block into text information that it can recognize by using the geometric feature extraction method, it matches this text information against the first text information in the preset multimedia content, that is, it determines whether the two are the same or similar. The conference terminal thereby determines the feature matching result and determines, according to the feature matching result, whether the current conference requires the user to participate. For how the conference terminal determines from the feature matching result whether the current conference requires the user to participate, refer to the descriptions in Embodiment 2 and Embodiment 3 above; details are not repeated here.

FIG. 10 is a schematic flowchart of Embodiment 8 of the collaboration method for an intelligent conference according to an embodiment of the present invention. This embodiment concerns another specific process by which the conference terminal obtains the feature matching result when the first multimedia information of interest to the user includes first data information that is related to the first text information and that is of neither the text type nor the participant-face type in the multimedia information of the conference. On the basis of the foregoing embodiments, the foregoing S301 specifically includes:

S801: The conference terminal detects the multimedia information of the conference and determines the other data information in the multimedia information of the conference, that is, the data information that is of neither the text type nor the participant-face type.

Specifically, in the method of this embodiment, the first multimedia information of interest to the user is no longer text information or the face information of a participant, but first data information in the multimedia information of the conference that is of neither of these types and that is related to the first text information of Embodiment 7 above. Optionally, when the first text information is a description of a picture, the first data information is in effect content related to the picture described by the first text information. For example, when the first text information describes the color of a picture, the first data information may be a feature such as the color depth or brightness of that picture; when the first text information is an introduction to a GUI, the first data information may be attribute information such as the color, brightness, or size of that GUI.

S802: The conference terminal determines, according to the first text information and the other data information, the degree of relevance between the other data information and the first text information.

S803: The conference terminal determines that the recognition result is the data information, among the other data information, whose degree of relevance is greater than a preset third threshold.

Specifically, after the conference terminal determines, from the multimedia information of the conference, the other data information that is of neither the text type nor the participant-face type, it determines whether the other data information is related to the first text information, determines the degree of relevance, and takes as the recognition result the data information whose degree of relevance is greater than the preset third threshold. For example, suppose the first data information is the brightness of the first text information of interest to the user (assumed here to be a piece of word art). If, among the data information of neither the text type nor the participant-face type that the conference terminal determines from the multimedia information of the conference, there is a piece of data information that is the brightness or color of the word art, the conference terminal determines that the degree of relevance between the two is greater than the third threshold and takes that data information as the recognition result.
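The relevance test in S802 and S803 can be sketched as follows. The source does not define how the degree of relevance is computed; the Jaccard overlap between descriptive tags used here is an assumption chosen only to make the thresholding concrete, and the tag vocabulary and function names are hypothetical.

```python
def relevance(text_keywords, item_tags):
    """Jaccard overlap between two tag collections (an assumed measure)."""
    a, b = set(text_keywords), set(item_tags)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def select_relevant(text_keywords, other_items, third_threshold):
    """Keep the names of other data items whose relevance to the first
    text information exceeds the third threshold."""
    return [
        name
        for name, tags in other_items.items()
        if relevance(text_keywords, tags) > third_threshold
    ]
```

In the word-art example, an item tagged with the word art and its brightness would pass the threshold, while unrelated data such as audio volume would not.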

S804: The conference terminal matches the data information, among the other data information, whose degree of relevance is greater than the preset third threshold against the first data information, and obtains the feature matching result.

A person of ordinary skill in the art will understand that all or some of the steps of the foregoing method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

FIG. 11 is a schematic structural diagram of Embodiment 1 of a conference terminal according to an embodiment of the present invention. As shown in FIG. 11, the conference terminal may include a receiving module 10, a first determining module 11, a second determining module 12, and a sending module 13.

The receiving module 10 is configured to receive multimedia information of a conference sent by a remote device; the multimedia information of the conference includes at least one of voice information, image information, and text information; the conference terminal is a terminal that has joined the conference;

The first determining module 11 is configured to determine, according to the multimedia information of the conference and preset multimedia content, whether the conference requires the user to participate; the preset multimedia content includes at least one of: voice content of interest to the user, image content of interest to the user, associated information of the image content of interest to the user, and text content of interest to the user;

The second determining module 12 is configured to determine the participation status of the user; the participation status of the user includes an absent state or a user participation conflict state;

The sending module 13 is configured to send participation reminding information to the user when the first determining module 11 determines that the conference requires the user to participate and the second determining module 12 determines that the participation status of the user is the absent state; the participation reminding information is used to prompt the user to participate in the conference.

The conference terminal provided by this embodiment of the present invention can perform the foregoing method embodiments. Its implementation principles and technical effects are similar, and details are not repeated here.

FIG. 12 is a schematic structural diagram of Embodiment 2 of the conference terminal according to an embodiment of the present invention. On the basis of the embodiment shown in FIG. 11, further, as shown in FIG. 12, the first determining module 11 includes a first determining unit 111, configured to determine, according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference, and to determine that the conference requires the user to participate if the preset conference information exists in the multimedia information of the conference. The preset conference information is information indicating that the user is required to participate in the conference.

Still further, the first determining module 11 also includes a second determining unit 112 and a judging unit 113. The second determining unit 112 is configured to determine, when the first determining unit 111 determines that the preset conference information does not exist in the multimedia information of the conference, whether first multimedia information of interest to the user exists in the multimedia information of the conference, and, when such first multimedia information exists, to determine the user attention degree of the first multimedia information according to the first multimedia information and a first mapping relationship. The first mapping relationship is the correspondence between different content in the preset multimedia content and user attention degrees. The judging unit 113 is configured to judge whether the user attention degree of the first multimedia information is greater than a preset user attention threshold; if so, it determines that the conference requires the user to participate; if not, it determines that the conference does not require the user to participate.
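The two-stage decision made by units 111, 112, and 113 can be sketched as one function. It assumes that recognition has already reduced the conference's multimedia information to a list of content items; the function and parameter names, and the treatment of several items of interest (any one above the threshold suffices), are assumptions made for this example.

```python
def conference_needs_user(recognized_items, preset_conference_info,
                          attention_map, attention_threshold):
    """Decide whether the conference requires the user to participate.

    recognized_items: content items recognized in the conference media.
    preset_conference_info: items that directly signal the user is needed.
    attention_map: the 'first mapping relationship', content -> attention degree.
    """
    # Stage 1 (first determining unit 111): preset conference information
    # present in the media means the user is needed.
    for item in recognized_items:
        if item in preset_conference_info:
            return True

    # Stage 2 (units 112 and 113): look up the attention degree of any item
    # of interest and compare it with the user attention threshold.
    for item in recognized_items:
        if item in attention_map and attention_map[item] > attention_threshold:
            return True

    return False
```

Items that are absent from the mapping are treated as not being of interest to the user, so they never trigger participation on their own.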

Still further, the second determining unit 112 is specifically configured to detect and recognize the multimedia information of the conference to obtain a recognition result, perform feature matching between the recognition result and the preset multimedia content to obtain a feature matching result, and judge whether the feature matching result is greater than a preset matching threshold; if so, it determines that the first multimedia information exists in the multimedia information of the conference; if not, it determines that the first multimedia information does not exist in the multimedia information of the conference.

FIG. 13 is a schematic structural diagram of Embodiment 3 of the conference terminal provided by an embodiment of the present invention. On the basis of the embodiment shown in FIG. 11 or FIG. 12, as shown in FIG. 13, the conference terminal further includes a third determining module 14, configured to determine the form of the participation reminder information when the first determining module 11 determines that the conference requires the user to participate and the second determining module 12 determines that the participation status of the user is the absent state; the form of the participation reminder information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form;

the sending module 13 is then specifically configured to send the participation reminder information to the user according to the form of the participation reminder information.

It should be noted that FIG. 13 shows only the structure based on the embodiment of FIG. 12. FIG. 13 may of course also be based on the embodiment of FIG. 11; the present invention illustrates just one of these options.

FIG. 14 is a schematic structural diagram of Embodiment 4 of the conference terminal provided by an embodiment of the present invention. On the basis of the embodiment shown in FIG. 11, FIG. 12, or FIG. 13, as shown in FIG. 14, the conference terminal further includes a fourth determining module 15, configured to determine a feedback policy according to whether the conference requires the user to participate and the participation status of the user; the feedback policy indicates the type of feedback data that the conference terminal sends to the remote device;

the sending module 13 is then further configured to send the feedback data to the remote device according to the feedback policy. The feedback data includes at least one of video content, audio content, and text content for interacting with the remote device. The feedback data is feedback content preset by the user, or data generated according to the feedback policy and the feedback content preset by the user.

Optionally, the fourth determining module 15 is specifically configured to: if the conference requires the user to participate and the participation status of the user is the absent state, determine that the feedback policy is to send first feedback data to the remote device; the first feedback data indicates to the remote device that the conference terminal is notifying the user to join the conference.

Optionally, the fourth determining module 15 is specifically configured to: if the conference requires the user to participate and the participation status of the user is the absent state, determine that the feedback policy is to send second feedback data to the remote device; the second feedback data shows the remote device conference content that the user has preset and that is related to the conference.

Optionally, the fourth determining module 15 is specifically configured to: if the conference does not require the user to participate and the participation status of the user is the absent state, determine that the feedback policy is to send third feedback data to the remote device; the third feedback data indicates to the remote device that the user is participating in the conference.

Optionally, the fourth determining module 15 is specifically configured to: if the conference requires the user to participate and the participation status of the user is the user participation conflict state, determine that the feedback policy is to send fourth feedback data to the remote device and to record the current conference content of the conference; the fourth feedback data indicates to the remote device that the participation status of the user is the user participation conflict state.
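
The four optional feedback policies above amount to a small decision table, which the following Python sketch tries to capture. The enum names and the `has_preset_content` flag are hypothetical; the embodiment presents the first and second feedback data as alternatives for the same condition and does not say how a terminal chooses between them, so the flag is only one assumed way of doing so.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    ABSENT = "absent"      # absent state
    CONFLICT = "conflict"  # user participation conflict state


class Feedback(Enum):
    NOTIFYING_USER = 1               # first feedback data
    PRESET_CONTENT = 2               # second feedback data
    USER_SHOWN_AS_PARTICIPATING = 3  # third feedback data
    USER_IN_CONFLICT = 4             # fourth feedback data


def choose_feedback(required: bool, status: Status,
                    has_preset_content: bool = False) -> Optional[Feedback]:
    """Pick the feedback type from (conference requires user, participation status)."""
    if required and status is Status.ABSENT:
        # Assumption: prefer the user's preset content when it is available.
        return Feedback.PRESET_CONTENT if has_preset_content else Feedback.NOTIFYING_USER
    if not required and status is Status.ABSENT:
        return Feedback.USER_SHOWN_AS_PARTICIPATING
    if required and status is Status.CONFLICT:
        # The embodiment also records the current conference content in this case.
        return Feedback.USER_IN_CONFLICT
    return None  # combination not covered by the optional cases above
```

Returning `None` for the uncovered combination (conference not required, user in conflict) is a design choice of this sketch, not something stated in the embodiment.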

It should be noted that FIG. 14 shows only the structure based on the embodiment of FIG. 13. FIG. 14 may of course also be based on the embodiment of FIG. 11 or FIG. 12; the present invention illustrates just one of these options.

FIG. 15 is a schematic structural diagram of Embodiment 5 of the conference terminal provided by an embodiment of the present invention. On the basis of the embodiment shown in FIG. 12, FIG. 13, or FIG. 14, as shown in FIG. 15, if the first multimedia information that the user pays attention to includes face information of a first participant in the conference, and the associated information of the image content that the user pays attention to in the preset multimedia content is identity information of the first participant, then the second determining unit 112 specifically includes a first detecting subunit 1121, a feature extraction subunit 1122, a first matching subunit 1123, a first determining subunit 1124, and a second matching subunit 1125.

The first detecting subunit 1121 is configured to detect the multimedia information of the conference and determine the face position and face size, in the multimedia information, of each participant attending the conference;

the feature extraction subunit 1122 is configured to perform feature extraction on the face position and face size of the participant in the multimedia information to obtain the face features of the participant;

the first matching subunit 1123 is configured to match the face features of each participant against a preset face information library to determine a first matching degree;

the first determining subunit 1124 is configured to determine that the recognition result is the identity information, in the face information library, of the participants whose first matching degree is greater than a preset first threshold;

the second matching subunit 1125 is configured to match the identity information, in the face information library, of the participants whose first matching degree is greater than the preset first threshold against the identity information of the first participant in the preset multimedia content, to obtain the feature matching result.
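
The matching stages handled by subunits 1123 to 1125 can be sketched in Python as follows. Face detection and feature extraction (subunits 1121 and 1122) are assumed to have already produced one feature vector per participant. Cosine similarity as the "first matching degree", the library contents, and the threshold value are all assumptions; the embodiment does not name a specific similarity measure.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical preset face information library: identity -> reference feature vector.
FACE_LIBRARY: Dict[str, List[float]] = {
    "alice": [1.0, 0.0, 0.2],
    "bob": [0.0, 1.0, 0.1],
}
FIRST_THRESHOLD = 0.8  # assumed preset first threshold


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in for the first matching degree between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(face_features: Sequence[float]) -> List[str]:
    """Recognition result: library identities whose matching degree exceeds the threshold."""
    return [name for name, ref in FACE_LIBRARY.items()
            if cosine_similarity(face_features, ref) > FIRST_THRESHOLD]


def feature_matching_result(participants: List[Sequence[float]],
                            followed_identity: str) -> float:
    """1.0 if any recognized participant is the identity the user follows, else 0.0."""
    for features in participants:
        if followed_identity in identify(features):
            return 1.0
    return 0.0
```

A binary feature matching result is the simplest choice here; a real terminal might instead carry the matching degree forward and compare it against the preset matching threshold.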

It should be noted that FIG. 15 shows only the structure based on the embodiment of FIG. 14. FIG. 15 may of course also be based on the embodiment of FIG. 12 or FIG. 13; the present invention illustrates just one of these options.

FIG. 16 is a schematic structural diagram of Embodiment 6 of the conference terminal provided by an embodiment of the present invention. On the basis of the embodiment shown in FIG. 12, FIG. 13, or FIG. 14, as shown in FIG. 16, if the first multimedia information that the user pays attention to includes first text information, and the text content that the user pays attention to is the first text information, then the second determining unit 112 specifically includes a second detecting subunit 1126, an obtaining subunit 1127, a third matching subunit 1128, a second determining subunit 1129, and a fourth matching subunit 1130.

The second detecting subunit 1126 is configured to detect the multimedia information of the conference and determine the position and size of each text block region in the multimedia information of the conference;

the obtaining subunit 1127 is configured to obtain the text blocks in the multimedia information of the conference according to the position and size of the text block regions in the multimedia information of the conference;

the third matching subunit 1128 is configured to match the text blocks in the multimedia information of the conference against a preset text information library to determine a second matching degree;

the second determining subunit 1129 is configured to determine that the recognition result is the text information, in the text information library, whose second matching degree is greater than a preset second threshold;

the fourth matching subunit 1130 is configured to match the text information, in the text information library, whose second matching degree is greater than the preset second threshold against the first text information in the preset multimedia content, to obtain the feature matching result.
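
The text branch follows the same pattern as the face branch, and the stages handled by subunits 1128 to 1130 can be sketched in Python as follows. Text block detection and extraction (subunits 1126 and 1127) are assumed to have already produced strings. Using `difflib.SequenceMatcher` from the standard library as the "second matching degree" is an assumption made for illustration, as are the library entries and the threshold.

```python
from difflib import SequenceMatcher
from typing import List

# Hypothetical preset text information library.
TEXT_LIBRARY = ["quarterly budget", "release schedule", "hiring plan"]
SECOND_THRESHOLD = 0.8  # assumed preset second threshold


def second_matching_degree(text_block: str, entry: str) -> float:
    """Similarity ratio in [0, 1] between an extracted text block and a library entry."""
    return SequenceMatcher(None, text_block.lower(), entry.lower()).ratio()


def recognize_text(text_blocks: List[str]) -> List[str]:
    """Recognition result: library entries matched above the second threshold."""
    results: List[str] = []
    for block in text_blocks:
        for entry in TEXT_LIBRARY:
            if entry not in results and second_matching_degree(block, entry) > SECOND_THRESHOLD:
                results.append(entry)
    return results


def feature_matching_result(recognized: List[str], first_text_information: str) -> float:
    """1.0 if the text the user follows is among the recognized entries, else 0.0."""
    return 1.0 if first_text_information in recognized else 0.0
```

Matching against a library before comparing with the followed text lets a slightly misread block (for example, one wrong character from text extraction) still resolve to the intended library entry.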

It should be noted that FIG. 16 shows only the structure based on the embodiment of FIG. 14. FIG. 16 may of course also be based on the embodiment of FIG. 12 or FIG. 13; the present invention illustrates just one of these options.

FIG. 17 is a schematic structural diagram of Embodiment 7 of the conference terminal provided by an embodiment of the present invention. On the basis of the embodiment shown in FIG. 12, FIG. 13, or FIG. 14, as shown in FIG. 17, if the first multimedia information that the user pays attention to includes first text information, and the text content that the user pays attention to is the first text information, then the second determining unit 112 specifically includes a third detecting subunit 1131, a third determining subunit 1132, and a fifth matching subunit 1133.

The third detecting subunit 1131 is configured to detect the multimedia information of the conference, determine the position and size of each text block region in the multimedia information of the conference, and obtain the text blocks in the multimedia information of the conference;

the third determining subunit 1132 is configured to determine the recognition result according to geometric features of the text blocks;

the fifth matching subunit 1133 is configured to match the recognition result against the first text information in the preset multimedia content to obtain the feature matching result.

It should be noted that FIG. 17 shows only the structure based on the embodiment of FIG. 14. FIG. 17 may of course also be based on the embodiment of FIG. 12 or FIG. 13; the present invention illustrates just one of these options.

FIG. 18 is a schematic structural diagram of Embodiment 8 of the conference terminal provided by an embodiment of the present invention. On the basis of any of the embodiments shown in FIG. 12 to FIG. 17, as shown in FIG. 18, if the first multimedia information includes first data information that is related to the first text information and is of a type other than the text type and the participant face type in the multimedia information of the conference, then the second determining subunit 1129 includes a fourth detecting subunit 1134, a fourth determining subunit 1135, a fifth determining subunit 1136, and a sixth matching subunit 1137.

The fourth detecting subunit 1134 is configured to detect the multimedia information of the conference and determine the other data information in the multimedia information of the conference, that is, data information of types other than the text type and the participant face type;

the fourth determining subunit 1135 is configured to determine, according to the first text information and the other data information, the degree of correlation between the other data information and the first text information;

the fifth determining subunit 1136 is configured to determine that the recognition result is the data information, among the other data information, whose degree of correlation is greater than a preset third threshold;

the sixth matching subunit 1137 is configured to match the data information, among the other data information, whose degree of correlation is greater than the preset third threshold against the first data information, to obtain the feature matching result.
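
The correlation filtering performed by subunits 1135 and 1136 can be sketched in Python under strong simplifying assumptions. The embodiment does not say how non-text data (for example, a chart) is compared with the first text information, so this sketch assumes each data item carries a short text label and uses Jaccard word overlap as a stand-in for the degree of correlation; the threshold value is likewise hypothetical.

```python
from typing import List

THIRD_THRESHOLD = 0.3  # assumed preset third threshold


def correlation(data_label: str, first_text: str) -> float:
    """Jaccard overlap of words, used here as a stand-in degree of correlation."""
    a = set(data_label.lower().split())
    b = set(first_text.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_items(other_data_labels: List[str], first_text: str) -> List[str]:
    """Recognition result: data items whose correlation exceeds the third threshold."""
    return [label for label in other_data_labels
            if correlation(label, first_text) > THIRD_THRESHOLD]
```

The items returned by `related_items` would then be matched against the first data information to obtain the feature matching result, in the same way as in the face and text branches.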

It should be noted that FIG. 18 shows only the structure based on the embodiment of FIG. 14. FIG. 18 may of course also be based on any of the embodiments shown in FIG. 12, FIG. 13, FIG. 16, or FIG. 17; the present invention illustrates just one of these options.

FIG. 19 is a schematic structural diagram of Embodiment 9 of the conference terminal provided by an embodiment of the present invention. As shown in FIG. 19, the conference terminal may include a processor 20 (for example, a CPU), a memory 21, and at least one communication bus 22. The communication bus 22 is used to implement communication connections between the components. The memory 21 may include high-speed RAM and may also include non-volatile memory (NVM), for example at least one disk memory; various programs may be stored in the memory 21 for performing various processing functions and implementing the method steps of this embodiment.

In this embodiment of the present invention, the conference terminal further includes a receiver 23 and a transmitter 24.

The receiver 23 is configured to receive multimedia information of a conference sent by a remote device; the multimedia information of the conference includes at least one of voice information, image information, and text information; the conference terminal is a terminal that has joined the conference;

the processor 20 is configured to determine, according to the multimedia information of the conference and preset multimedia content, whether the conference requires the user to participate, and to determine the participation status of the user; the preset multimedia content includes at least one of voice content that the user pays attention to, image content that the user pays attention to, associated information of the image content that the user pays attention to, and text content that the user pays attention to; the participation status of the user includes an absent state or a user participation conflict state;

the transmitter 24 is configured to send participation reminder information to the user when the processor 20 determines that the conference requires the user to participate and the participation status of the user is the absent state; the participation reminder information is used to prompt the user to participate in the conference.

Further, the processor 20 is specifically configured to determine, according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference, and, if the preset conference information exists in the multimedia information of the conference, to determine that the conference requires the user to participate; the preset conference information is information indicating that the user is required to participate in the conference.

Further, the processor 20 is also configured to determine, when the preset conference information does not exist in the multimedia information of the conference, whether first multimedia information that the user pays attention to exists in the multimedia information of the conference; when the first multimedia information exists in the multimedia information of the conference, to determine the user attention degree of the first multimedia information according to the first multimedia information and a first mapping relationship; and to judge whether the user attention degree of the first multimedia information is greater than a preset user attention threshold; if so, to determine that the conference requires the user to participate; if not, to determine that the conference does not require the user to participate. The first mapping relationship is the correspondence between different content in the preset multimedia content and user attention degrees.

Still further, the processor 20 is specifically configured to detect and recognize the multimedia information of the conference to obtain a recognition result, perform feature matching between the recognition result and the preset multimedia content to obtain a feature matching result, and judge whether the feature matching result is greater than a preset matching threshold; if so, to determine that the first multimedia information exists in the multimedia information of the conference; if not, to determine that the first multimedia information does not exist in the multimedia information of the conference.
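
The order of checks carried out by the processor 20, together with the condition under which the transmitter 24 sends a reminder, can be summarized in a short Python sketch. The substring test for preset conference information, the example marker, and the `attention_check` callback are hypothetical; the callback stands for the recognition, feature matching, and attention threshold steps described above.

```python
from typing import Callable, Iterable


def conference_requires_user(multimedia_info: str,
                             preset_conference_info: Iterable[str],
                             attention_check: Callable[[str], bool]) -> bool:
    """Preset conference information takes priority; otherwise fall back to attention."""
    text = multimedia_info.lower()
    # Step 1: any preset conference information present means the user is required.
    if any(marker.lower() in text for marker in preset_conference_info):
        return True
    # Step 2: otherwise decide from the first multimedia information and its attention degree.
    return attention_check(multimedia_info)


def needs_reminder(required: bool, participation_status: str) -> bool:
    """A reminder is sent only when the user is required and in the absent state."""
    return required and participation_status == "absent"
```

For example, with a preset marker such as the user's name, a recognized utterance containing that name makes the conference require the user regardless of what `attention_check` would return.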

Optionally, the processor 20 is further configured to determine the form of the participation reminder information when the conference requires the user to participate and the participation status of the user is the absent state; the form of the participation reminder information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form. The transmitter 24 is then specifically configured to send the participation reminder information to the user according to the form of the participation reminder information.

Optionally, the processor 20 may be further configured to determine a feedback policy according to whether the conference requires the user to participate and the participation status of the user; the feedback policy indicates the type of feedback data that the conference terminal sends to the remote device. The transmitter 24 is then further configured to send the feedback data to the remote device according to the feedback policy. The feedback data includes at least one of video content, audio content, and text content for interacting with the remote device. The feedback data is feedback content preset by the user, or data generated according to the feedback policy and the feedback content preset by the user.

Optionally, the processor 20 is specifically configured to determine, when the conference requires the user to participate and the participation status of the user is the absent state, that the feedback policy is to send first feedback data to the remote device; the first feedback data indicates to the remote device that the conference terminal is notifying the user to join the conference.

Optionally, the processor 20 is specifically configured to determine, when the conference requires the user to participate and the participation status of the user is the absent state, that the feedback policy is to send second feedback data to the remote device; the second feedback data shows the remote device conference content that the user has preset and that is related to the conference.

Optionally, the processor 20 is specifically configured to determine, when the conference does not require the user to participate and the participation status of the user is the absent state, that the feedback policy is to send third feedback data to the remote device; the third feedback data indicates to the remote device that the user is participating in the conference.

Optionally, the processor 20 is specifically configured to determine, when the conference requires the user to participate and the participation status of the user is the user participation conflict state, that the feedback policy is to send fourth feedback data to the remote device and to record the current conference content of the conference; the fourth feedback data indicates to the remote device that the participation status of the user is the user participation conflict state.

As one possible implementation of this embodiment of the present invention, optionally, if the first multimedia information that the user pays attention to includes face information of a first participant in the conference, and the associated information of the image content that the user pays attention to in the preset multimedia content is identity information of the first participant, then the processor 20 is specifically configured to: detect the multimedia information of the conference and determine the face position and face size, in the multimedia information, of each participant attending the conference; perform feature extraction on the face position and face size of the participant in the multimedia information to obtain the face features of the participant; match the face features of each participant against a preset face information library to determine a first matching degree; determine that the recognition result is the identity information, in the face information library, of the participants whose first matching degree is greater than a preset first threshold; and match that identity information against the identity information of the first participant in the preset multimedia content to obtain the feature matching result.

As another possible implementation of this embodiment of the present invention, optionally, if the first multimedia information that the user pays attention to includes first text information, and the text content that the user pays attention to is the first text information, then the processor 20 is specifically configured to: detect the multimedia information of the conference and determine the position and size of each text block region in the multimedia information of the conference; obtain the text blocks in the multimedia information of the conference according to the position and size of the text block regions; match the text blocks in the multimedia information of the conference against a preset text information library to determine a second matching degree; determine that the recognition result is the text information, in the text information library, whose second matching degree is greater than a preset second threshold; and match that text information against the first text information in the preset multimedia content to obtain the feature matching result.

As a third possible implementation of this embodiment of the present invention, optionally, if the first multimedia information that the user pays attention to includes first text information, and the text content that the user pays attention to is the first text information, then the processor 20 is specifically configured to: detect the multimedia information of the conference, determine the position and size of each text block region in the multimedia information of the conference, and obtain the text blocks in the multimedia information of the conference; determine the recognition result according to geometric features of the text blocks; and match the recognition result against the first text information in the preset multimedia content to obtain the feature matching result.

As a fourth possible implementation of this embodiment of the present invention, further, on the basis of the third possible implementation or the fourth possible implementation above, if the first multimedia information includes first data information that is related to the first text information and is of a type other than the text type and the participant face type in the multimedia information of the conference, then the processor 20 is specifically configured to: detect the multimedia information of the conference and determine the other data information in the multimedia information of the conference, that is, data information of types other than the text type and the participant face type; determine, according to the first text information and the other data information, the degree of correlation between the other data information and the first text information; determine that the recognition result is the data information, among the other data information, whose degree of correlation is greater than a preset third threshold; and match that data information against the first data information to obtain the feature matching result.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

A collaboration method for an intelligent conference, characterized by comprising:

receiving, by a conference terminal, multimedia information of a conference sent by a remote device, wherein the multimedia information of the conference includes at least one of voice information, image information, and text information, and the conference terminal is a terminal that has joined the conference;

determining, by the conference terminal, whether the conference requires the user to participate according to the multimedia information of the conference and preset multimedia content, wherein the preset multimedia content includes at least one of voice content that the user pays attention to, image content that the user pays attention to, associated information of the image content that the user pays attention to, and text content that the user pays attention to;

determining, by the conference terminal, a participation status of the user, wherein the participation status of the user includes an absent state or a user participation conflict state; and

if the conference terminal determines that the conference requires the user to participate and the participation status of the user is the absent state, sending, by the conference terminal, participation reminder information to the user, wherein the participation reminder information is used to prompt the user to participate in the conference.

The method according to claim 1, wherein the conference terminal determines whether the conference requires the user to participate according to the multimedia information of the conference and the preset multimedia content, including:

Determining, by the conference terminal, whether preset conference information exists in the multimedia information of the conference according to the multimedia information of the conference and the preset multimedia content; the preset conference information is information used to indicate that the user needs to participate in the conference;

If the preset conference information exists in the multimedia information of the conference, the conference terminal determines that the conference requires the user to participate.

The method according to claim 2, wherein the conference terminal determines, according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference, including:

If the conference terminal determines that the preset conference information does not exist in the multimedia information of the conference, the conference terminal determines whether the first multimedia information that is of interest to the user exists in the multimedia information of the conference;

When the first multimedia information exists in the multimedia information of the conference, the conference terminal determines the user attention degree of the first multimedia information according to the first multimedia information and a first mapping relationship; the first mapping relationship is a correspondence between different content in the preset multimedia content and user attention degrees;

The conference terminal determines whether the user attention degree of the first multimedia information is greater than a preset user attention degree threshold; if yes, the conference terminal determines that the conference requires the user to participate; if not, the conference terminal determines that the conference does not require the user to participate.
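As an illustration of the attention-degree check above, the following sketch models the first mapping relationship as a dictionary from preset content to a numeric attention degree. The function name `needs_user`, the numeric scale, and the default of 0.0 for unmapped content are assumptions, not details taken from the claims.

```python
from typing import Dict, Optional


def needs_user(matched_content: Optional[str],
               attention_by_content: Dict[str, float],
               threshold: float) -> bool:
    # No content of interest was found in the conference multimedia information.
    if matched_content is None:
        return False
    # Look up the attention degree through the (assumed) mapping.
    attention = attention_by_content.get(matched_content, 0.0)
    # The claim requires "greater than", so equality does not qualify.
    return attention > threshold
```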

The method according to claim 3, wherein the conference terminal determines, according to the multimedia information of the conference and the preset multimedia content, whether the first multimedia information of interest to the user exists in the multimedia information of the conference, including:

The conference terminal detects and identifies the multimedia information of the conference, obtains a recognition result, and performs feature matching on the recognition result and the preset multimedia content to obtain a feature matching result;

Determining, by the conference terminal, whether the feature matching result is greater than a preset matching threshold;

If yes, the conference terminal determines that the first multimedia information exists in the multimedia information of the conference;

If not, the conference terminal determines that the first multimedia information does not exist in the multimedia information of the conference.

The method according to any one of claims 1 to 4, wherein the method further comprises:

If the conference terminal determines that the conference requires the user to participate, and the participation status of the user is the absent state, the conference terminal determines the form of the participation reminding information; wherein the form of the participation reminding information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form;

The conference terminal sends the participation reminding information to the user, including:

The conference terminal sends the participation reminding information to the user according to the form of the participation reminding information.

The method according to any one of claims 1 to 5, wherein the method further comprises:

Determining, by the conference terminal, a feedback policy according to whether the conference requires the user to participate and the participation status of the user; the feedback policy is used to indicate the type of feedback data that the conference terminal sends to the remote device;

The conference terminal sends the feedback data to the remote device according to the feedback policy; the feedback data includes at least one of: video content for interacting with the remote device, audio content for interacting with the remote device, and text content for interacting with the remote device.

The method according to claim 6, wherein the conference terminal determines a feedback policy according to whether the conference requires the user to participate in the conference and the participation status of the user, including:

If the conference terminal determines that the conference requires the user to participate, and the participation status of the user is the absent state, the feedback policy determined by the conference terminal is to send first feedback data to the remote device; the first feedback data is used to indicate to the remote device that the conference terminal is notifying the user to join the conference.

If the conference terminal determines that the conference requires the user to participate, and the participation status of the user is the absent state, the feedback policy determined by the conference terminal is to send second feedback data to the remote device; the second feedback data is used to show the remote device conference content that is preset by the user and related to the conference.

If the conference terminal determines that the conference does not require the user to participate, and the participation status of the user is the absent state, the feedback policy determined by the conference terminal is to send third feedback data to the remote device; the third feedback data is used to indicate to the remote device that the user is participating.

If the conference terminal determines that the conference requires the user to participate, and the participation status of the user is the user participation conflict state, the feedback policy determined by the conference terminal is to send fourth feedback data to the remote device and to record the current conference content of the conference; the fourth feedback data is used to indicate to the remote device that the participation status of the user is the user participation conflict state.
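The four feedback cases above can be summarized as a small decision function. This is a sketch under stated assumptions: the names `Feedback`, `Policy`, and `choose_policy` are hypothetical, and the `has_preset_content` flag used to choose between the first and second feedback data is an assumption, since the claims present those two as alternatives under the same condition. The combination "not required, conflict state" is not covered by the claims and is mapped to no feedback here.

```python
from enum import Enum
from typing import NamedTuple, Optional


class State(Enum):
    ABSENT = 1
    CONFLICT = 2


class Feedback(Enum):
    FIRST = "terminal is notifying the user to join"
    SECOND = "show conference content preset by the user"
    THIRD = "indicate that the user is participating"
    FOURTH = "user is in the participation conflict state"


class Policy(NamedTuple):
    feedback: Optional[Feedback]
    record_conference: bool


def choose_policy(requires_user: bool, state: State,
                  has_preset_content: bool = False) -> Policy:
    if state is State.ABSENT:
        if not requires_user:
            return Policy(Feedback.THIRD, False)
        # Assumed tie-break between the two alternatives in the claims.
        return Policy(Feedback.SECOND if has_preset_content else Feedback.FIRST, False)
    if requires_user:
        # Conflict state: send fourth feedback data and record the conference.
        return Policy(Feedback.FOURTH, True)
    return Policy(None, False)  # case not specified in the claims
```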

The method according to any one of claims 3 to 10, wherein if the first multimedia information of interest to the user includes face information of a first participant in the conference, the association information of the image content of interest to the user in the preset multimedia content is identity information of the first participant; and the conference terminal detecting and identifying the multimedia information of the conference to obtain a recognition result, and performing feature matching on the recognition result and the preset multimedia content to obtain a feature matching result, specifically includes:

The conference terminal detects multimedia information of the conference, and determines a face position and a face size of the participant in the multimedia information;

The conference terminal performs feature extraction on the face position and the face size of the participant in the multimedia information to obtain a face feature of the participant;

The conference terminal matches each participant's face feature with a preset face information database to determine a first matching degree;

The conference terminal determines that the recognition result is the identity information of the participant whose first matching degree is greater than a preset first threshold in the face information database;

The conference terminal matches the identity information of the participant whose first matching degree is greater than the preset first threshold in the face information database with the identity information of the first participant in the preset multimedia content, to obtain the feature matching result.
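The face-based steps above can be sketched as follows, assuming that face detection and feature extraction have already produced one numeric feature vector per participant. Cosine similarity is used as the "first matching degree" purely for illustration; the claims do not name a similarity measure, and the function names and the dictionary-based face information database are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def identify(face_feature: Sequence[float],
             face_db: Dict[str, Sequence[float]],
             first_threshold: float) -> Optional[str]:
    # Return the best-matching identity whose matching degree exceeds
    # the first threshold, or None if no database entry qualifies.
    best_id, best_score = None, first_threshold
    for identity, reference in face_db.items():
        score = cosine(face_feature, reference)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id


def first_participant_present(face_features: List[Sequence[float]],
                              face_db: Dict[str, Sequence[float]],
                              first_threshold: float,
                              first_participant: str) -> bool:
    # Feature matching result: does any recognized identity equal the
    # identity of the first participant stored in the preset content?
    return any(identify(f, face_db, first_threshold) == first_participant
               for f in face_features)
```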

The method according to any one of claims 3 to 10, wherein if the first multimedia information of interest to the user includes first text information, the text content of interest to the user is the first text information; and the conference terminal detecting and identifying the multimedia information of the conference to obtain a recognition result, and performing feature matching on the recognition result and the preset multimedia content to obtain a feature matching result, specifically includes:

The conference terminal detects multimedia information of the conference, and determines a location and a size of a text block area in the multimedia information of the conference;

The conference terminal obtains a text block in the multimedia information of the conference according to the location and the size of the text block area in the multimedia information of the conference;

The conference terminal matches a text block in the multimedia information of the conference with a preset text information library, and determines a second matching degree;

The conference terminal determines that the recognition result is the text information in the text information library whose second matching degree is greater than a preset second threshold;

The conference terminal matches the text information in the text information library whose second matching degree is greater than the preset second threshold with the first text information in the preset multimedia content, to obtain the feature matching result.
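A minimal sketch of the text-library matching above, assuming the text blocks have already been extracted from the conference multimedia information as strings. The standard-library `difflib.SequenceMatcher` ratio stands in for the "second matching degree"; the claims do not specify how that degree is computed, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(a: str, b: str) -> float:
    # Case-insensitive similarity in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def recognize(text_block: str, text_library: Iterable[str],
              second_threshold: float) -> List[str]:
    # Recognition result: library entries whose matching degree with the
    # text block is greater than the second threshold.
    return [entry for entry in text_library
            if similarity(text_block, entry) > second_threshold]


def text_of_interest_present(text_blocks: Iterable[str],
                             text_library: List[str],
                             second_threshold: float,
                             first_text: str) -> bool:
    # Feature matching result: is the first text information among the
    # recognized library entries for any text block?
    return any(first_text in recognize(block, text_library, second_threshold)
               for block in text_blocks)
```

Matching against a library first, rather than directly against the first text information, tolerates recognition noise such as a misread character in the extracted block.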

The method according to any one of claims 3 to 10, wherein if the first multimedia information of interest to the user includes first text information, the text content of interest to the user is the first text information; and the conference terminal detecting and identifying the multimedia information of the conference to obtain a recognition result, and performing feature matching on the recognition result and the preset multimedia content to obtain a feature matching result, specifically includes:

The conference terminal detects the multimedia information of the conference, determines the location and size of the text block area in the multimedia information of the conference, and obtains a text block in the multimedia information of the conference;

Determining, by the conference terminal, the recognition result according to a geometric feature of the text block;

The conference terminal matches the recognition result with the first text information in the preset multimedia content to obtain the feature matching result.

The method according to claim 12 or 13, wherein if the first multimedia information further includes first data information that is of a type other than the text type and the participant face type and that is related to the first text information, the conference terminal detecting and identifying the multimedia information of the conference to obtain a recognition result, and performing feature matching on the recognition result and the preset multimedia content to obtain a feature matching result, further includes:

The conference terminal detects the multimedia information of the conference, and determines other data information in the multimedia information of the conference, except the text type and the participant face type;

Determining, by the conference terminal, the degree of relevance of the other data information and the first text information according to the first text information and the other data information;

The conference terminal determines that the recognition result is the data information, among the other data information, whose degree of relevance is greater than a preset third threshold;

The conference terminal matches the data information, among the other data information, whose degree of relevance is greater than the preset third threshold with the first data information, to obtain the feature matching result.
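The relevance filtering above might look like the following sketch. The claims do not define how the degree of relevance is computed; Jaccard overlap of word sets is used here only as an assumed placeholder, the other data information is modeled as short text descriptions, and all function names are hypothetical.

```python
from typing import Iterable, List


def relevance(first_text: str, data_item: str) -> float:
    # Assumed relevance measure: Jaccard overlap of lower-cased word sets.
    a = set(first_text.lower().split())
    b = set(data_item.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_data(first_text: str, other_data: Iterable[str],
                 third_threshold: float) -> List[str]:
    # Recognition result: other data whose relevance to the first text
    # information is greater than the third threshold.
    return [d for d in other_data if relevance(first_text, d) > third_threshold]


def first_data_present(first_text: str, other_data: Iterable[str],
                       third_threshold: float, first_data: str) -> bool:
    # Feature matching result: is the first data information among the
    # sufficiently relevant items?
    return first_data in related_data(first_text, other_data, third_threshold)
```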

The method according to any one of claims 6 to 14, wherein the feedback data is feedback content preset by the user, or data generated according to the feedback policy and the feedback content preset by the user.

A conference terminal, comprising:

a receiving module, configured to receive multimedia information of a conference sent by a remote device, where the multimedia information of the conference includes at least one of voice information, image information, and text information, and the conference terminal is a terminal that has joined the conference;

a first determining module, configured to determine, according to the multimedia information of the conference and preset multimedia content, whether the conference requires the user to participate; wherein the preset multimedia content includes at least one of: voice content of interest to the user, image content of interest to the user, association information of the image content of interest to the user, and text content of interest to the user;

a second determining module, configured to determine a participating state of the user; the participating state of the user includes an absent state or a user participating conflict state;

a sending module, configured to send participation reminding information to the user when the first determining module determines that the conference requires the user to participate and the second determining module determines that the participation state of the user is the absent state; the participation reminding information is used to prompt the user to participate in the conference.

The conference terminal according to claim 16, wherein the first determining module comprises:

a first determining unit, configured to determine, according to the multimedia information of the conference and the preset multimedia content, whether preset conference information exists in the multimedia information of the conference, and, if the preset conference information exists in the multimedia information of the conference, determine that the conference requires the user to participate; the preset conference information is information used to indicate that the user needs to participate in the conference.

The conference terminal according to claim 17, wherein the first determining module further comprises:

a second determining unit, configured to: when the first determining unit determines that the preset conference information does not exist in the multimedia information of the conference, determine whether first multimedia information of interest to the user exists in the multimedia information of the conference; and when determining that the first multimedia information exists in the multimedia information of the conference, determine the user attention degree of the first multimedia information according to the first multimedia information and a first mapping relationship; the first mapping relationship is a correspondence between different content in the preset multimedia content and user attention degrees;

a determining unit, configured to determine whether the user attention degree of the first multimedia information is greater than a preset user attention degree threshold; if yes, determine that the conference requires the user to participate; if not, determine that the conference does not require the user to participate.

The conference terminal according to claim 18, wherein the second determining unit is specifically configured to: detect and identify the multimedia information of the conference to obtain a recognition result; perform feature matching on the recognition result and the preset multimedia content to obtain a feature matching result; and determine whether the feature matching result is greater than a preset matching threshold; if yes, determine that the first multimedia information exists in the multimedia information of the conference; if not, determine that the first multimedia information does not exist in the multimedia information of the conference.

The conference terminal according to any one of claims 16 to 19, wherein the conference terminal further comprises:

a third determining module, configured to determine the form of the participation reminding information when the first determining module determines that the conference requires the user to participate and the second determining module determines that the participation state of the user is the absent state; wherein the form of the participation reminding information includes at least one of an interface form, an image form, a video form, an audio form, and an instant message form;

The sending module is specifically configured to send the participation reminding information to the user according to the form of the participation reminding information.

The conference terminal according to any one of claims 16 to 20, wherein the conference terminal further comprises: a fourth determining module;

The fourth determining module is configured to determine a feedback policy according to whether the conference requires the user to participate and the participation status of the user; the feedback policy is used to indicate the type of feedback data that the conference terminal sends to the remote device;

The sending module is further configured to send the feedback data to the remote device according to the feedback policy; the feedback data includes at least one of: video content for interacting with the remote device, audio content for interacting with the remote device, and text content for interacting with the remote device.

The conference terminal according to claim 21, wherein the fourth determining module is specifically configured to: if the conference requires the user to participate, and the participation status of the user is the absent state, determine that the feedback policy is to send first feedback data to the remote device; the first feedback data is used to indicate to the remote device that the conference terminal is notifying the user to join the conference.

The conference terminal according to claim 21, wherein the fourth determining module is specifically configured to: if the conference requires the user to participate, and the participation status of the user is the absent state, determine that the feedback policy is to send second feedback data to the remote device; the second feedback data is used to show the remote device conference content that is preset by the user and related to the conference.

The conference terminal according to claim 21, wherein the fourth determining module is specifically configured to: if the conference does not require the user to participate, and the participation status of the user is the absent state, determine that the feedback policy is to send third feedback data to the remote device; the third feedback data is used to indicate to the remote device that the user is participating.

The conference terminal according to claim 21, wherein the fourth determining module is specifically configured to: if the conference requires the user to participate, and the participation status of the user is the user participation conflict state, determine that the feedback policy is to send fourth feedback data to the remote device and to record the current conference content of the conference; the fourth feedback data is used to indicate to the remote device that the participation status of the user is the user participation conflict state.

The conference terminal according to any one of claims 18 to 25, wherein if the first multimedia information of interest to the user includes face information of a first participant in the conference, the association information of the image content of interest to the user in the preset multimedia content is identity information of the first participant; the second determining unit specifically includes:

a first detecting subunit, configured to detect multimedia information of the conference, and determine a face position and a face size of the participant in the multimedia information;

a feature extraction sub-unit, configured to perform feature extraction on a face position and a face size of the participant in the multimedia information, to obtain a face feature of the participant;

a first matching sub-unit, configured to match a face feature of each participant with a preset face information database to determine a first matching degree;

a first determining subunit, configured to determine that the recognition result is the identity information of the participant whose first matching degree is greater than a preset first threshold in the face information database;

a second matching subunit, configured to match the identity information of the participant whose first matching degree is greater than the preset first threshold in the face information database with the identity information of the first participant in the preset multimedia content, to obtain the feature matching result.

The conference terminal according to any one of claims 18 to 25, wherein if the first multimedia information of interest to the user includes first text information, the text content of interest to the user is the first text information, and the second determining unit specifically includes:

a second detecting subunit, configured to detect multimedia information of the conference, and determine a location and a size of a text block area in the multimedia information of the conference;

an obtaining subunit, configured to obtain a text block in the multimedia information of the conference according to the location and the size of the text block area in the multimedia information of the conference;

a third matching subunit, configured to match a text block in the multimedia information of the conference with a preset text information library, and determine a second matching degree;

a second determining subunit, configured to determine that the recognition result is the text information in the text information library whose second matching degree is greater than a preset second threshold;

a fourth matching subunit, configured to match the text information in the text information library whose second matching degree is greater than the preset second threshold with the first text information in the preset multimedia content, to obtain the feature matching result.

a third detecting subunit, configured to detect multimedia information of the conference, determine a location and a size of a text block area in the multimedia information of the conference, and obtain a text block in the multimedia information of the conference;

a third determining subunit, configured to determine the recognition result according to a geometric feature of the text block;

a fifth matching subunit, configured to match the recognition result with the first text information in the preset multimedia content to obtain the feature matching result.

The conference terminal according to claim 27 or 28, wherein if the first multimedia information further includes first data information that is, in the multimedia information of the conference, of a type other than the text type and the participant face type and that is related to the first text information, the second determining unit further includes:

a fourth detecting subunit, configured to detect multimedia information of the conference, and determine other data information in the multimedia information of the conference, except the text type and the participant face type;

a fourth determining subunit, configured to determine, according to the first text information and the other data information, a degree of correlation between the other data information and the first text information;

a fifth determining subunit, configured to determine that the recognition result is the data information, among the other data information, whose degree of correlation is greater than a preset third threshold;

a sixth matching subunit, configured to match the data information, among the other data information, whose degree of correlation is greater than the preset third threshold with the first data information, to obtain the feature matching result.

The conference terminal according to any one of claims 21 to 29, wherein the feedback data is feedback content preset by the user, or data generated according to the feedback policy and the feedback content preset by the user.