WO2021128846A1

WO2021128846A1 - Electronic file control method and apparatus, and computer device and storage medium

Info

Publication number: WO2021128846A1
Application number: PCT/CN2020/105759
Authority: WO
Inventors: 卢宁; 徐国强
Original assignee: OneConnect Smart Technology Co Ltd
Current assignee: OneConnect Smart Technology Co Ltd
Priority date: 2019-12-23
Filing date: 2020-07-30
Publication date: 2021-07-01
Anticipated expiration: 2022-06-23
Also published as: CN111191207A

Abstract

Provided are an electronic file control method and apparatus, and a computer device and a storage medium. The method comprises: capturing a real-time video stream (S201); then acquiring, from the real-time video stream, an image set with a facial image of the current user included therein, and acquiring an action frame set from the real-time video stream in a preset manner (S202); then, comparing the facial image in the image set with a preset facial image, and obtaining a permission verification result (S203); before each instance of control over an electronic file, quickly verifying the security of controlling the electronic file, and when the permission verification result indicates that verification is passed, performing action recognition on the action frame set by means of a trained AU detection network, and obtaining a target action (S204); acquiring, from a preset instruction set, an instruction corresponding to the target action, and taking the instruction as a target instruction (S205); and executing the target instruction on the electronic file (S206). By means of simultaneously performing permission verification and action confirmation on an acquired frame image, an electronic file is rapidly controlled, and the efficiency of controlling the electronic file is improved.

Description

Electronic file control method, device, computer equipment and storage medium

This application claims priority to Chinese patent application No. 201911339761.9, entitled "Electronic file control method, device, computer equipment and storage medium", filed with the Chinese Patent Office on December 23, 2019, the entire contents of which are incorporated herein by reference.

Technical field

This application relates to the field of information security, and in particular to an electronic file control method, device, computer device, and storage medium.

Background technique

With the development of Internet technology and the spread of smart terminals, people can handle more and more work and everyday matters through smart devices. Documents that once had to be processed offline are increasingly handled on smart devices as well.

At present, people mainly operate electronic files through touch screens or through physical buttons on smart devices.

Technical problem

In the course of realizing this application, the inventors recognized that existing solutions have at least the following problems. Operating files in this way is cumbersome, which makes the control of electronic files inefficient. In addition, some fields involve electronic files with high security requirements, such as electronic contracts, financial statements, and private user data. To ensure security, handling these files usually requires permission verification. Performing that verification through the touch screen (entering a password or an unlock pattern) or through physical buttons on the smart device may leak the unlock password, so security cannot be guaranteed. Finding a way to control electronic files efficiently on smart devices has therefore become an urgent problem.

Technical solutions

The embodiments of this application provide an electronic file control method, device, computer device, and storage medium, so as to improve the efficiency with which electronic files are currently controlled.

To solve the above technical problem, an embodiment of this application provides an electronic file control method, including:

collecting a real-time video stream;

obtaining, from the real-time video stream, an image set containing facial images of the current user, and obtaining an action frame set from the real-time video stream in a preset manner;

comparing the facial images in the image set with preset face images to obtain a permission verification result;

if the permission verification result is a pass, performing action recognition on the action frame set through a trained AU detection network to obtain a target action;

obtaining, from a preset instruction set, the instruction corresponding to the target action as a target instruction;

executing the target instruction on the electronic file.

To solve the above technical problem, an embodiment of this application further provides an electronic file control device, including:

a data collection module, configured to collect a real-time video stream;

an image acquisition module, configured to obtain, from the real-time video stream, an image set containing facial images of the current user, and to obtain an action frame set from the real-time video stream in a preset manner;

a permission verification module, configured to compare the facial images in the image set with preset face images to obtain a permission verification result;

an action detection module, configured to, if the permission verification result is a pass, perform action recognition on the action frame set through a trained AU detection network to obtain a target action;

an instruction determination module, configured to obtain, from a preset instruction set, the instruction corresponding to the target action as a target instruction;

a file control module, configured to execute the target instruction on the electronic file.

To solve the above technical problem, an embodiment of this application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor, when executing the computer-readable instructions, implements the steps of the following electronic file control method:

collecting a real-time video stream;

To solve the above technical problem, an embodiment of this application further provides a computer-readable storage medium storing computer-readable instructions that, when executed by a processor, implement the steps of the following electronic file control method:

collecting a real-time video stream;

Beneficial effect

The electronic file control method, device, computer device, and storage medium provided by the embodiments of this application work as follows. On the one hand, a real-time video stream is collected, an image set containing facial images of the current user is obtained from it, and an action frame set is obtained from it in a preset manner. The facial images in the image set are then compared with preset face images to obtain a permission verification result, so that the security of the operation is verified quickly before each control of the electronic file. On the other hand, when the permission verification result is a pass, a trained AU detection network performs action recognition on the action frame set to obtain a target action, the instruction corresponding to the target action is determined as the target instruction, and the target instruction is executed on the electronic file. Performing permission verification and action confirmation simultaneously on the acquired frame images enables fast control of electronic files and improves the efficiency of electronic file control.

Description of the drawings

To explain the technical solutions of the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a diagram of an exemplary system architecture to which this application can be applied;

FIG. 2 is a flowchart of an embodiment of the electronic file control method of this application;

FIG. 3 is a schematic structural diagram of an embodiment of the electronic file control device according to this application;

FIG. 4 is a schematic structural diagram of an embodiment of the computer device according to this application.

The best mode of the present invention

Referring to FIG. 2, FIG. 2 shows an electronic file control method provided by an embodiment of this application. The method is described below taking its application to the server in FIG. 1 as an example:

S201: Collect a real-time video stream.

Specifically, when an operation on an electronic file with relatively high security requirements is detected, the camera device is started automatically to collect data, and a real-time video stream of the current user is obtained.

For example, in a specific embodiment, a request to view an electronic contract file is detected. The camera of the electronic device is then started, the electronic device prompts the user to perform the action for opening the electronic contract file, and a real-time video stream of a preset duration is recorded.

S202: Obtain, from the real-time video stream, an image set containing facial images of the current user, and obtain an action frame set from the real-time video stream in a preset manner.

Specifically, an image set containing facial images of the current user is obtained from the real-time video stream, so that the facial images in the image set can later be used to determine whether the user has permission to operate the electronic contract. At the same time, an action video frame set is obtained from the real-time video stream in the preset manner.

The image set contains one or more facial images of the current user. It may be obtained by extracting multiple video frame images from the real-time video stream and performing face detection on each video frame image. Each video frame image that contains a complete face is taken as a facial image of the current user, which yields the image set containing facial images of the current user.

The action frame set is a set of multiple video frame images arranged in time sequence.

S203: Compare the facial images in the image set with preset face images to obtain a permission verification result.

Specifically, in this embodiment, face images of persons who have permission to operate the electronic file are stored in the terminal in advance. After the image set is obtained in step S202, the facial images in the image set are compared with the face images preset in the terminal to determine whether a facial image in the image set is one of the preset face images. When a facial image in the image set successfully matches any face image preset in the terminal, the current user is determined to have permission to operate the electronic file, and the permission verification result is a pass.

Conversely, if the facial images in the image set fail to match every face image preset in the terminal, the current user does not have permission to operate the electronic file. In this case, the operation is rejected, the current image set is saved, and a related log is generated.

Face matching methods include, but are not limited to, face matching algorithms based on a Gabor engine, local feature analysis (Local Face Analysis), methods based on geometric features, and face-specific subspace (FSS) algorithms.
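The "match against any preset face" rule of step S203 can be illustrated with a minimal sketch. It assumes each face has already been reduced to a numeric feature vector and uses cosine similarity with a fixed threshold; the feature representation, the similarity measure, the threshold value, and the names `cosine_similarity` and `verify_permission` are illustrative assumptions, not the matching algorithms listed above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_permission(
    face_features: Sequence[Sequence[float]],
    preset_features: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value, not taken from the source
) -> bool:
    """Pass if any captured face matches any preset (authorized) face."""
    return any(
        cosine_similarity(face, preset) >= threshold
        for face in face_features
        for preset in preset_features
    )
```

A failed check (`False`) corresponds to the branch in which the operation is rejected, the image set is saved, and a log entry is written.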

S204: If the permission verification result is a pass, perform action recognition on the action frame set through a trained AU detection network to obtain a target action.

Specifically, once the permission verification result is a pass, the current user is confirmed to have permission to operate the electronic file. The trained AU detection network then performs action recognition on the action frame set obtained in step S202 to obtain the target action corresponding to the action frame set.

The AU detection network is a convolutional neural network model used for AU detection. In this embodiment, a pre-trained AU detection network performs action recognition on the action frame set, which helps speed up action recognition and keeps the operation of the electronic file smooth.

AU is short for facial action unit (Facial Action Unit). An AU is an expression unit of a person's face that is used to convey that person's expression.

A convolutional neural network (Convolutional Neural Network, CNN) is a feed-forward neural network whose artificial neurons respond to surrounding units within a limited coverage area, which allows fast and efficient image processing. In this embodiment, a pre-trained convolutional neural network can quickly identify the AUs contained in the base images.

It should be noted that when the permission verification result is a failure, the method returns to step S201, collects video again, and repeats the verification. When the number of consecutive failed verifications reaches a preset number, the file is locked. The preset number can be set according to actual needs.
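The retry-and-lock behavior on failed verification can be sketched as a small counter. The class name `VerificationGuard`, the return labels, and the default limit of 3 are hypothetical; the source only states that the preset number is configurable. Resetting the counter on success is inferred from the word "consecutive".

```python
class VerificationGuard:
    """Tracks consecutive failed verifications and locks the file at a limit."""

    def __init__(self, max_failures: int = 3) -> None:  # limit is an assumed default
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.locked = False

    def record(self, passed: bool) -> str:
        """Return 'proceed' (go on to S204), 'retry' (back to S201), or 'locked'."""
        if self.locked:
            return "locked"
        if passed:
            # A success breaks the run of consecutive failures.
            self.consecutive_failures = 0
            return "proceed"
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.max_failures:
            self.locked = True
            return "locked"
        return "retry"
```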

S205: Obtain, from a preset instruction set, the instruction corresponding to the target action as the target instruction.

Specifically, the terminal is preconfigured with a preset instruction set and a preset number of AU actions, and each preset AU action corresponds to one preset instruction in the preset instruction set. After the target action is obtained, the preset instruction corresponding to the target action is obtained and used as the target instruction.

In this embodiment, the preset AU actions are actions used to operate the electronic file, for example shaking the head to the left, shaking the head to the right, or blinking. Each preset AU action corresponds to one preset electronic file operation instruction. For example, shaking the head to the left corresponds to the instruction to turn the contract page to the left, shaking the head to the right corresponds to the instruction to turn the page to the right, and blinking corresponds to the pause instruction. The mapping can be set according to actual needs and is not limited here.
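The one-to-one mapping from preset AU actions to instructions amounts to a table lookup. The sketch below uses the three example actions from the text; the string labels and the name `target_instruction` are hypothetical, and returning `None` for an unrecognized action is an assumption, since the source does not say what happens in that case.

```python
from typing import Dict, Optional

# Example mapping taken from the description; the labels themselves are illustrative.
INSTRUCTION_SET: Dict[str, str] = {
    "shake_head_left": "page_left",
    "shake_head_right": "page_right",
    "blink": "pause",
}


def target_instruction(action: str) -> Optional[str]:
    """Return the preset instruction for a recognized action, or None if unmapped."""
    return INSTRUCTION_SET.get(action)
```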

S206: Execute the target instruction on the electronic file.

Specifically, the target instruction is executed on the electronic file so that the electronic file performs the corresponding operation according to the target instruction.

In this embodiment, a real-time video stream is collected, an image set containing facial images of the current user is obtained from it, and an action frame set is obtained from it in a preset manner. The facial images in the image set are compared with preset face images to obtain a permission verification result, so that the security of the operation is verified quickly before each control of the electronic file. When the permission verification result is a pass, a trained AU detection network performs action recognition on the action frame set to obtain a target action, the instruction corresponding to the target action is determined as the target instruction, and the target instruction is executed on the electronic file. Performing permission verification and action confirmation simultaneously on the acquired frame images enables fast control of electronic files and improves the efficiency of electronic file control.
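The control flow of steps S202 to S206 can be summarized in a short sketch in which every processing stage is passed in as a callable. All function and parameter names are hypothetical placeholders; in particular, `recognize_action` stands in for the trained AU detection network, which is not implemented here.

```python
from typing import Any, Callable, Mapping, Optional, Sequence


def control_electronic_file(
    frames: Sequence[Any],
    extract_faces: Callable[[Sequence[Any]], Sequence[Any]],
    extract_action_frames: Callable[[Sequence[Any]], Sequence[Any]],
    verify: Callable[[Sequence[Any]], bool],
    recognize_action: Callable[[Sequence[Any]], str],
    instruction_set: Mapping[str, str],
    execute: Callable[[str], None],
) -> Optional[str]:
    """Run one pass of S202-S206 on frames captured in S201."""
    face_images = extract_faces(frames)            # S202: image set
    action_frames = extract_action_frames(frames)  # S202: action frame set
    if not verify(face_images):                    # S203: permission check
        return None                                # caller may retry from S201
    action = recognize_action(action_frames)       # S204: only after a pass
    instruction = instruction_set.get(action)      # S205: look up instruction
    if instruction is not None:
        execute(instruction)                       # S206: act on the file
    return instruction
```

Both sets are derived from the same captured frames, and action recognition runs only after verification passes, which matches the ordering described above.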

FIG. 3 shows a functional block diagram of an electronic file control device that corresponds one-to-one to the electronic file control method of the above embodiment. As shown in FIG. 3, the electronic file control device includes a data collection module 31, an image acquisition module 32, a permission verification module 33, an action detection module 34, an instruction determination module 35, and a file control module 36. The functional modules are described in detail as follows:

the data collection module 31 is configured to collect a real-time video stream;

the image acquisition module 32 is configured to obtain, from the real-time video stream, an image set containing facial images of the current user, and to obtain an action frame set from the real-time video stream in a preset manner;

the permission verification module 33 is configured to compare the facial images in the image set with preset face images to obtain a permission verification result;

the action detection module 34 is configured to, if the permission verification result is a pass, perform action recognition on the action frame set through a trained AU detection network to obtain a target action;

the instruction determination module 35 is configured to obtain, from a preset instruction set, the instruction corresponding to the target action as a target instruction;

the file control module 36 is configured to execute the target instruction on the electronic file.

To solve the above technical problem, an embodiment of this application further provides a computer device. Referring to FIG. 4, FIG. 4 is a block diagram of the basic structure of the computer device of this embodiment.

The computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are communicatively connected to one another through a system bus. Note that the figure shows only the computer device 4 with the memory 41, the processor 42, and the network interface 43. It should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a digital signal processor (Digital Signal Processor, DSP), an embedded device, and the like.

The computer device may be a computing device such as a desktop computer, a notebook computer, a palmtop computer, or a cloud server. The computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, a voice control device, or the like.

This application further provides another embodiment, namely a computer-readable storage medium, which may be non-volatile or volatile. The computer-readable storage medium stores computer-readable instructions for interface display, and these instructions can be executed by at least one processor so that the at least one processor performs the steps of the electronic file control method described above.

Embodiments of the present invention

Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the technical field of this application. The terms used in the specification are intended only to describe specific embodiments and are not intended to limit this application. The terms "including" and "having" and any variations of them in the specification, the claims, and the above description of the drawings are intended to cover non-exclusive inclusion. The terms "first", "second", and so on in the specification, the claims, or the above drawings are used to distinguish different objects, not to describe a specific order.

A reference herein to an "embodiment" means that a specific feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of this application. Occurrences of the phrase at various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.

The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.

Referring to FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.

A user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104 in order to receive or send messages and the like.

The terminal devices 101, 102, and 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.

The server 105 may be a server that provides various services, for example a background server that supports the pages displayed on the terminal devices 101, 102, and 103.

It should be noted that the electronic file control method provided in the embodiments of this application is executed by the server, and accordingly the electronic file control device is arranged in the server.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers as required by the implementation. The terminal devices 101, 102, and 103 in the embodiments of this application may correspond to application systems in actual production.

In some optional implementations of this embodiment, in step S202, obtaining an image set containing the facial image of the current user from the real-time video stream includes:

Obtaining frame images from the real-time video stream at a preset time interval to obtain a frame image set containing a preset number of frame images;

Performing face detection on the frame images in the frame image set using a face detection algorithm to obtain a detection result;

Taking each frame image that the detection result indicates contains complete facial features as a facial image, to obtain an image set containing at least one facial image.

Specifically, the terminal extracts video frames from the received real-time video stream at a preset time interval to obtain a frame image set containing multiple frame images, and then performs face detection on each frame image in the set using face detection technology. In this embodiment, frame images are obtained from the real-time video stream in order to verify the permission to operate the electronic file, so face detection mainly checks whether a frame image includes a clear and complete face image. The detection result therefore has two cases: the frame contains complete facial features, or it does not. Frame images whose detection result indicates complete facial features are taken as facial images of the current user; there may be one such facial image or several.

For example, in a specific implementation, the preset time interval is the time corresponding to 8 consecutive video frames. After receiving the real-time video stream sent by the client, the terminal divides the obtained stream of 128 consecutive video frames into video frame sets of 8 frames each and takes the last video frame image of each set, obtaining 128/8 = 16 video frame images in total. These 16 frame images form the frame image set.
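The sampling in this example can be sketched as follows. The function name `sample_frames` and the use of plain integers as stand-ins for video frames are assumptions made for illustration; the text only specifies that the last frame of each group of 8 is kept.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_frames(frames: Sequence[T], group_size: int = 8) -> List[T]:
    """Keep the last frame of every complete group of `group_size` frames."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Indices group_size-1, 2*group_size-1, ... are the last frames of each
    # complete group; an incomplete trailing group contributes nothing.
    return [frames[i] for i in range(group_size - 1, len(frames), group_size)]


# Integers stand in for the 128 consecutive video frames of the example.
stream = list(range(128))
sampled = sample_frames(stream, group_size=8)
```

With 128 input frames and groups of 8, `sampled` holds 16 frames, matching the 128/8 = 16 figure in the example.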

Here, face detection technology is mainly used to detect whether the facial features, contours, and important facial curves in a frame image are clear and complete. Specific implementations include, but are not limited to: the Adaboost algorithm, feature-based recognition algorithms, recognition algorithms using neural networks, and eigenface algorithms based on principal component analysis (PCA).

It should be noted that at least one facial image of the current user must be obtained. If no facial image of the current user is obtained, acquisition of the current user's state is considered abnormal; in that case, corresponding prompt information is sent to the display interface of the electronic device, and the real-time video stream is acquired again.
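The filtering step and the "at least one facial image" requirement can be sketched as below. The detector is passed in as a callable because the text leaves the face detection algorithm open; `has_complete_face` and `NoFaceFoundError` are hypothetical names, and raising an exception is only one possible way to signal that the caller should show a prompt and re-acquire the stream.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


class NoFaceFoundError(RuntimeError):
    """Signals that no sampled frame contained a complete face."""


def collect_facial_images(
    frames: Sequence[Frame],
    has_complete_face: Callable[[Frame], bool],
) -> List[Frame]:
    """Keep every frame that the detector reports as a complete face."""
    faces = [frame for frame in frames if has_complete_face(frame)]
    if not faces:
        # Hypothetical handling: the caller is expected to display a prompt
        # and capture the real-time video stream again.
        raise NoFaceFoundError("no complete face found; re-acquire the stream")
    return faces
```

Any detector with a boolean result (for example one of the algorithm families listed above) could be plugged in as `has_complete_face`.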

In this embodiment, frame images are obtained from the real-time video stream at a preset time interval to obtain a frame image set containing a preset number of frame images; a face detection algorithm is then applied to the frame images in the set to obtain a detection result; and each frame image that the detection result indicates contains complete facial features is taken as a facial image, yielding an image set containing at least one facial image. This enables fast permission authentication from the captured video stream while avoiding the risk of leakage and cracking that traditional password verification may carry, improving the security and efficiency of permission verification.

In some optional implementations of this embodiment, in step S202, acquiring the action frame set from the real-time video stream in a preset manner includes:

Sorting the frame images in the frame image set by the order in which each frame image appears in the real-time video stream, to obtain a sorted frame image sequence;

Annotating each frame image in the frame image sequence to obtain the action frame set.

Specifically, the terminal extracts video frames from the received real-time video stream at a preset time interval to obtain a frame image set containing multiple frame images, then sorts and annotates the frame images according to their temporal order, obtaining an action frame set with annotation information.

Here, annotation means assigning a sequence identifier to each frame image, so that the association between images can later be determined from the annotations. An annotation may be a number, a letter, or a combination of numbers and letters, and can be set according to actual needs; no limitation is imposed here.
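Sorting and annotating can be sketched as follows, assuming each sampled frame arrives paired with its position in the stream and that the annotation is a plain integer sequence number (the text also allows letters or mixed labels). `ActionFrame` and `build_action_frames` are hypothetical names.

```python
from typing import List, NamedTuple, Sequence, Tuple


class ActionFrame(NamedTuple):
    label: int      # sequence identifier assigned by annotation
    image: object   # the frame image itself (any representation)


def build_action_frames(frames: Sequence[Tuple[int, object]]) -> List[ActionFrame]:
    """Sort (stream_position, image) pairs by position, then number them."""
    ordered = sorted(frames, key=lambda item: item[0])
    return [ActionFrame(label=i, image=image) for i, (_, image) in enumerate(ordered)]
```

Consecutive labels make it straightforward to pair each frame with its successor in later steps.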

In this embodiment, frame images are obtained from the real-time video stream at a preset time interval to obtain a frame image set containing a preset number of frame images; the frame images are then sorted by the order in which they appear in the real-time video stream to obtain a sorted frame image sequence, and each frame image in the sequence is annotated to obtain the action frame set. The annotated action frame set can subsequently be used to determine what changes between frame images, and that changed content is used for action recognition, which can improve both the accuracy and the efficiency of action recognition.

In some optional implementations of this embodiment, in step S204, performing action recognition on the action frame set through the trained AU detection network to obtain the target action, if the permission verification result is a pass, includes:

If the permission verification result is a pass, inputting each frame image in the acquired action frame set into the trained AU detection network;

Calculating, in the order of the action frame annotations, the pixel difference between adjacent frames in sequence to obtain the difference content between adjacent frames;

Performing feature extraction on the difference content in sequence through the convolutional layer of the trained AU detection network to obtain the corresponding feature change content;

Inputting each feature change content into the AU action recognition layer, and classifying the feature change content with the AU action recognition layer to obtain the target action.

Specifically, when permission verification passes, each frame image in the acquired action frame set is fed as input data into the trained AU detection network. The difference content between each frame image and the next is determined according to the frame annotations; the AU detection network then extracts feature information from the difference content and passes it to the AU action recognition layer for action recognition.

It should be noted that, in this embodiment, action recognition is performed on the feature change content. Compared with the traditional approach of recognizing frame images directly, this greatly reduces the number of features and can effectively improve recognition efficiency. At the same time, the changed content better reflects the course of an action, which helps improve the accuracy of action recognition.

The difference content between the previous frame image and the next frame image can be determined by methods including, but not limited to: optical flow detection, the frame difference method, edge detection, and motion vector detection.

Preferably, this embodiment uses the frame difference method to determine the difference content.
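A minimal sketch of the frame difference method on grayscale frames is given below, using nested lists of 0 to 255 values so that it runs without third-party libraries. The `threshold` parameter is an assumption added for illustration (a common refinement that suppresses small changes); with the default of 0 the result is the plain absolute grayscale difference.

```python
from typing import List

Gray = List[List[int]]  # 2-D grid of grayscale values in 0..255


def frame_difference(prev: Gray, curr: Gray, threshold: int = 0) -> Gray:
    """Absolute per-pixel grayscale difference between two frames.

    Differences not exceeding `threshold` are set to 0.
    """
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same shape")
    diff: Gray = []
    for row_prev, row_curr in zip(prev, curr):
        row = []
        for p, c in zip(row_prev, row_curr):
            d = abs(c - p)
            row.append(d if d > threshold else 0)
        diff.append(row)
    return diff


def adjacent_differences(frames: List[Gray], threshold: int = 0) -> List[Gray]:
    """Difference content for each pair of adjacent frames, in label order."""
    return [frame_difference(a, b, threshold) for a, b in zip(frames, frames[1:])]
```

For a sequence of k annotated frames this yields k - 1 difference images, which are the inputs to feature extraction in the steps above.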

For the specific process of classifying the feature change content through the AU action recognition layer to obtain the target action, refer to the description of the subsequent embodiments; to avoid repetition, it is not detailed here.

In this embodiment, when the permission verification result is a pass, each frame image in the acquired action frame set is input into the trained AU detection network and the difference content between adjacent frames is obtained. Feature extraction is then performed on the difference content in sequence through the convolutional layer of the trained AU detection network to obtain the corresponding feature change content, and each feature change content is input into the AU action recognition layer, which classifies it to obtain the target action. This helps improve the efficiency and accuracy of action recognition.

In some optional implementations of this embodiment, inputting each feature change content into the AU action recognition layer, and classifying the feature change content with the AU action recognition layer to obtain the target action, includes:

Using the n AU classifiers of the AU action recognition layer to perform similarity calculation on the feature change content, obtaining for each AU classifier the probability that the feature change content belongs to the action category corresponding to that classifier, for a total of n probabilities, where each AU classifier corresponds to one action category;

Selecting, from the n probabilities, the action category with the highest probability as the target action corresponding to the feature change content.

Specifically, the AU detection network model includes, but is not limited to, an input layer, a convolutional layer, and an AU action recognition layer. The AU action recognition layer contains n trained AU classifiers. A similarity calculation is performed between each AU classifier and the feature change content to obtain the probability that the feature change content belongs to the action category corresponding to that classifier, giving n probabilities in total. From these n probabilities, the action category with the highest probability is selected as the target action corresponding to the feature change content.

The action categories corresponding to the AU classifiers can be trained according to actual needs, for example page up, page down, automatic page turning, zoom in, and pause. The number n of AU classifiers can also be set as needed and is not specifically limited here; for example, n may be set to 14.

AU classifier implementations include, but are not limited to: logistic regression (LR), support vector machine (SVM), cross entropy, and softmax regression.

Preferably, the embodiments of this application use softmax regression to implement classification across the multiple AU classifiers.
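The selection of the most probable action category can be sketched with a softmax over n raw classifier scores followed by an argmax. This assumes each AU classifier yields one real-valued score for the feature change content; how those scores are computed by the trained network is outside this sketch, and the category names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_target_action(
    scores: Sequence[float], categories: Sequence[str]
) -> Tuple[str, float]:
    """Return the action category with the highest probability."""
    if not scores or len(scores) != len(categories):
        raise ValueError("need one score per action category")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return categories[best], probs[best]
```

Because softmax is monotonic, the argmax over probabilities equals the argmax over raw scores; the probabilities are still useful if a minimum confidence is to be enforced, which the text does not require.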

In this embodiment, the n AU classifiers of the AU action recognition layer perform similarity calculation on the feature change content to obtain the probability that the feature change content belongs to the action category corresponding to each AU classifier, and the action category with the highest probability is selected as the target action corresponding to the feature change content, which improves the recognition accuracy of the target action.

It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.

Further, the image acquisition module includes:

A frame image selection unit, configured to obtain frame images from the real-time video stream at a preset time interval to obtain a frame image set containing a preset number of frame images;

A face detection unit, configured to perform face detection on the frame images in the frame image set using a face detection algorithm to obtain a detection result;

A facial image determination unit, configured to take each frame image that the detection result indicates contains complete facial features as a facial image, to obtain an image set containing at least one facial image.

Further, the image acquisition module also includes:

A frame image selection unit, configured to obtain frame images from the real-time video stream at a preset time interval to obtain a frame image set containing a preset number of frame images;

An image sorting unit, configured to sort the frame images in the frame image set by the order in which each frame image appears in the real-time video stream, to obtain a sorted frame image sequence;

An image annotation unit, configured to annotate each frame image in the frame image sequence to obtain the action frame set.

Further, the action detection module includes:

A data input unit, configured to input each frame image in the acquired action frame set into the trained AU detection network if the permission verification result is a pass;

A difference content acquisition unit, configured to calculate, in the order of the action frame annotations, the pixel difference between adjacent frames in sequence to obtain the difference content between adjacent frames;

A difference feature extraction unit, configured to perform feature extraction on the difference content in sequence through the convolutional layer of the trained AU detection network to obtain the corresponding feature change content;

An action recognition unit, configured to input each feature change content into the AU action recognition layer and classify the feature change content with the AU action recognition layer to obtain the target action.

Further, the difference content acquisition unit includes:

A frame difference calculation subunit, configured to calculate the grayscale difference between adjacent frames using the frame difference method to obtain the difference content.

Further, the action recognition unit includes:

A probability calculation subunit, configured to use the n AU classifiers of the AU action recognition layer to perform similarity calculation on the feature change content, obtaining for each AU classifier the probability that the feature change content belongs to the action category corresponding to that classifier, for a total of n probabilities, where each AU classifier corresponds to one action category;

A target action determination subunit, configured to select, from the n probabilities, the action category with the highest probability as the target action corresponding to the feature change content.

For the specific limitations of the electronic file control device, refer to the limitations of the electronic file control method above; they are not repeated here. Each module in the above electronic file control device may be implemented in whole or in part by software, hardware, or a combination of the two. The modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.

The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or D interface display memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disc, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or internal memory of the computer device 4. In other embodiments, the memory 41 may be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device 4. The memory 41 may also include both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is generally used to store the operating system and the application software installed on the computer device 4, such as the computer-readable instructions for electronic file control. The memory 41 can also be used to temporarily store data that has been output or is to be output.

In some embodiments, the processor 42 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 42 is generally used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to run the computer-readable instructions stored in the memory 41 or to process data, for example to run the computer-readable instructions for electronic file control.

The network interface 43 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 4 and other electronic devices.

From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of this application.

Obviously, the embodiments described above are only some of the embodiments of this application, not all of them. The drawings show preferred embodiments of this application but do not limit its patent scope. This application can be implemented in many different forms; these embodiments are provided so that the disclosure of this application is understood more thoroughly and comprehensively. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific implementations or make equivalent replacements of some of their technical features. Any equivalent structure made using the contents of the specification and drawings of this application, and applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims

An electronic file control method, wherein the electronic file control method includes:

Collecting a real-time video stream;

Obtaining, from the real-time video stream, an image set containing the facial image of the current user, and obtaining an action frame set from the real-time video stream in a preset manner;

Comparing the facial images in the image set with the preset facial image to obtain a permission verification result;

If the permission verification result is a pass, performing action recognition on the action frame set through the trained AU detection network to obtain the target action;

Obtaining the instruction corresponding to the target action from the preset instruction set as the target instruction;

Executing the target instruction on the electronic file.

2. The method for controlling electronic files according to claim 1, wherein said acquiring from said real-time video stream an image set containing the facial image of the current user comprises:

Obtaining frame images from the real-time video stream according to a preset time interval to obtain a frame image set containing a preset number of frame images;

Using a face detection algorithm to perform face detection on the frame images in the frame image set to obtain a detection result;

In the detection result, each frame image containing complete facial features is used as one facial image, and an image set containing at least one facial image is obtained.

3. The method for controlling electronic files according to claim 1, wherein said acquiring a set of action frames from said real-time video stream according to a preset method comprises:

Sorting the frame images according to the sequence of appearance of each frame image in the frame image set in the real-time video stream to obtain a sorted frame image sequence;

Annotating each frame image in the frame image sequence to obtain the action frame set.

The electronic file control method according to claim 3, wherein, if the permission verification result is a pass, performing action recognition on the action frame set through the trained AU detection network to obtain the target action comprises:

If the permission verification result is that the verification is passed, input each frame image in the acquired action frame set into the trained AU detection network;

Calculate the pixel difference between adjacent frames in sequence according to the order of action frame labeling, and obtain the difference content between adjacent frames;

Through the convolutional layer of the trained AU detection network, feature extraction is performed on the difference content in sequence to obtain corresponding feature change content;

Input each feature change content to the AU action recognition layer, and classify and recognize the feature change content according to the AU action recognition layer to obtain a target action.

5. The electronic document control method according to claim 4, wherein the step of sequentially calculating the pixel difference between adjacent frames according to the order in which the action frames are marked to obtain the difference content between the adjacent frames comprises:

The frame difference method is used to calculate the gray scale difference between adjacent frames to obtain the difference content.

The electronic file control method according to claim 4, wherein the AU action recognition layer includes n preset AU classifiers, where n is a positive integer greater than 1, and classifying the feature change content with the AU action recognition layer to obtain the target action comprises:

Using the n AU classifiers of the AU action recognition layer to perform similarity calculation on the feature change content, obtaining for each AU classifier the probability that the feature change content belongs to the action category corresponding to that classifier, for a total of n probabilities, where each AU classifier corresponds to one action category;

From the n probabilities, the action category with the highest probability is selected as the target action corresponding to the feature change content.

An electronic file control device, wherein the electronic file control device includes:

Data collection module, used to collect real-time video stream;

An image acquisition module, configured to acquire an image collection containing facial images of the current user from the real-time video stream, and acquire an action frame collection from the real-time video stream according to a preset method;

The authorization verification module is used to compare the facial image in the image collection with the preset facial image to obtain the authorization verification result;

An action detection module, configured to perform action recognition on the set of action frames through the trained AU detection network if the permission verification result is a pass, to obtain a target action;

The instruction determining module is used to obtain the instruction corresponding to the target action from a preset instruction set as the target instruction;

The file control module is used to execute the target instruction on the electronic file.

8. The electronic document control device according to claim 7, wherein the image acquisition module comprises:

The frame image selection unit is configured to obtain frame images from the real-time video stream according to a preset time interval to obtain a frame image set containing a preset number of frame images;

The face detection unit is configured to use a face detection algorithm to perform face detection on the frame images in the frame image set to obtain a detection result;

The facial image determining unit is configured to use, in the detection result, each of the frame images containing complete facial features as one of the facial images, to obtain an image set containing at least one of the facial images.

A computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following electronic file control method when executing the computer-readable instructions:

Collect real-time video stream;

The target instruction is executed on the electronic file.

10. The computer device according to claim 9, wherein said acquiring, from said real-time video stream, an image set containing a facial image of the current user comprises:

11. The computer device according to claim 9, wherein said acquiring an action frame set from said real-time video stream in a preset manner comprises:

The computer device according to claim 11, wherein, if the permission verification result is a pass, performing action recognition on the action frame set through the trained AU detection network to obtain the target action comprises:

Calculate the pixel difference between adjacent frames in sequence according to the order of the action frame labeling, and obtain the difference content between adjacent frames;

Through the convolutional layer of the trained AU detection network, feature extraction is performed on the difference content in sequence to obtain the corresponding feature change content;

Input each feature change content to an AU action recognition layer, and classify and recognize the feature change content according to the AU action recognition layer to obtain a target action.

13. The computer device according to claim 12, wherein said sequentially calculating the pixel difference between adjacent frames in the order of the action frame annotations to obtain the difference content between adjacent frames comprises:

The computer device according to claim 12, wherein the AU action recognition layer includes n preset AU classifiers, where n is a positive integer greater than 1, and classifying the feature change content with the AU action recognition layer to obtain the target action comprises:

A computer-readable storage medium, the computer-readable storage medium stores computer-readable instructions, wherein when the computer-readable instructions are executed by a processor, the following electronic file control method is implemented:

Collect real-time video stream;

The target instruction is executed on the electronic file.

16. The computer-readable storage medium according to claim 15, wherein said acquiring, from said real-time video stream, an image set containing a facial image of a current user comprises:

17. The computer-readable storage medium according to claim 15, wherein said acquiring a set of action frames from said real-time video stream in a preset manner comprises:

The computer-readable storage medium according to claim 17, wherein, if the permission verification result is a pass, performing action recognition on the action frame set through the trained AU detection network to obtain the target action comprises:

19. The computer-readable storage medium according to claim 18, wherein said sequentially calculating the pixel difference between adjacent frames in the order of the action frame annotations to obtain the difference content between adjacent frames comprises:

The computer-readable storage medium according to claim 18, wherein the AU action recognition layer includes n preset AU classifiers, where n is a positive integer greater than 1, and classifying the feature change content with the AU action recognition layer to obtain the target action comprises: