
CN106358006A - Video correction method and video correction device - Google Patents

Video correction method and video correction device

Info

Publication number
CN106358006A
CN106358006A (application number CN201610030971.XA; granted as CN106358006B)
Authority
CN
China
Prior art keywords
image
coordinate system
video
processing
collected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610030971.XA
Other languages
Chinese (zh)
Other versions
CN106358006B (en)
Inventor
杨铀
万凯
刘琼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201610030971.XA
Publication of CN106358006A
Application granted
Publication of CN106358006B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/128 Adjusting depth or disparity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/327 Calibration thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a video correction method and device. The method includes: collecting a first image and a second image of a user, where the first image is captured while the user looks at the camera area and the second image is captured while the user looks at the screen area; extracting a coordinate system from each image and computing the affine matrix that maps the coordinate system of the first image to that of the second image; and applying an affine transformation with this matrix to the captured initial video to generate a corrected target video. The invention addresses the problem that prior-art video correction methods for achieving eye contact between chat users are complex and computationally expensive.

Description

Video correction method and device

Technical field

The present invention relates to the field of computers, and in particular to a video correction method and device.

Background art

With the development of Internet technology, people in different places can communicate almost as if face to face using only a camera and a computer. However, when chat software (such as WeChat, MiTalk, or MSN) is used for video chat, the distance between the camera and the centre of the screen deprives the users of eye contact. For example, a laptop camera is usually mounted above the screen; when user USER1 chats with user USER2, USER1 usually looks at the centre of the screen rather than at the camera, so USER2's display shows USER1 looking downwards. Eye contact is an indispensable part of human communication, and its absence makes video chat feel less natural than an ordinary face-to-face conversation.

In the prior art, the video image is usually corrected by mirror reflection or virtual-viewpoint rendering to achieve eye contact between chat users. However, these methods are complicated to implement and require expensive equipment, so they are unsuitable for everyday video chat scenarios and difficult to popularize.

No effective solution has yet been proposed for the above problem in the prior art, namely that video correction methods for achieving eye contact between chat users are complex and resource-intensive.

Summary of the invention

Embodiments of the present invention provide a video correction method and device to solve, at least, the technical problem in the prior art that video correction methods for achieving eye contact between chat users are complex and computationally expensive.

According to one aspect of the embodiments of the present invention, a video correction method is provided. The method includes: collecting a first image and a second image of a user, where the first image is captured while the user looks at the camera area and the second image is captured while the user looks at the screen area; extracting a coordinate system from the first image and a coordinate system from the second image, and computing the affine matrix from the coordinate system of the first image to the coordinate system of the second image; and applying an affine transformation with this matrix to the captured initial video to generate a corrected target video.

According to another aspect of the embodiments of the present invention, a video correction device is also provided, including: a collection unit configured to collect a first image and a second image of a user, where the first image is captured while the user looks at the camera area and the second image is captured while the user looks at the screen area; an extraction unit configured to extract a coordinate system from each image and compute the affine matrix from the coordinate system of the first image to that of the second image; and a correction unit configured to apply an affine transformation with this matrix to the captured initial video to generate a corrected target video.

In the embodiments of the present invention, a first image and a second image of the user are collected, the first captured while the user looks at the camera area and the second while the user looks at the screen area; the coordinate systems of the two images are extracted and the affine matrix from the first to the second is computed; and the captured initial video is transformed with this matrix to generate a corrected target video. This solves the technical problem that prior-art video correction methods for achieving eye contact between chat users are complex and computationally expensive.

Brief description of the drawings

The accompanying drawings described here are provided for a further understanding of the present invention and constitute a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the invention and do not constitute an improper limitation of it. In the drawings:

Fig. 1 is a flow chart of a video correction method according to an embodiment of the present invention; and

Fig. 2 is a schematic structural diagram of a video correction device according to an embodiment of the present invention.

Detailed description

To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

It should be noted that the terms "first" and "second" in the description, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can be practised in orders other than those illustrated or described. Furthermore, the terms "comprising" and "having", and any variations of them, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

Embodiment 1

According to an embodiment of the present invention, an embodiment of a video correction method is provided. It should be noted that the steps shown in the flow chart of the accompanying drawings may be executed in a computer system such as a set of computer-executable instructions, and that although a logical order is shown in the flow chart, in some cases the steps shown or described may be performed in a different order.

Fig. 1 is a flow chart of a video correction method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:

Step S12: collect a first image and a second image of the user, where the first image is captured while the user looks at the camera area and the second image is captured while the user looks at the screen area.

Specifically, in step S12, a camera device may be used to capture two images of the user looking at different areas; the device may be a camera mounted above a computer screen. In this embodiment, the user sits still in front of the screen: the first image is captured while the user looks up at the camera area (preferably the lens), and the second image is captured while the user looks down at the screen area (preferably the centre of the screen). Note that in the first image the user's eyes are directed at the camera above the screen, and in the second image at the centre of the screen.

Step S14: extract the coordinate system of the first image and the coordinate system of the second image, and compute the affine matrix from the coordinate system of the first image to that of the second image.

Specifically, in step S14, a computer terminal may acquire the two images and then extract the coordinate system of the first image and that of the second image. It should be noted that the first image uses a coordinate system with the camera as its origin, while the second image uses a coordinate system with the centre of the screen as its origin; the computer terminal can compute the affine matrix between these two coordinate systems using an existing calculation algorithm.
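The patent leaves the "existing calculation algorithm" for the affine matrix unspecified. As an illustration only, a 2x3 affine matrix mapping landmarks of the first image onto the corresponding landmarks of the second image can be estimated by least squares; the point sets below are made-up stand-ins for extracted facial feature coordinates, not values from the patent.

```python
import numpy as np

def estimate_affine(src_pts, dst_pts):
    """Least-squares 2x3 affine matrix mapping src_pts -> dst_pts.

    src_pts, dst_pts: (N, 2) arrays of corresponding points (N >= 3),
    e.g. eye/face landmarks from the two captured images.
    """
    src = np.asarray(src_pts, dtype=float)
    dst = np.asarray(dst_pts, dtype=float)
    n = src.shape[0]
    # Homogeneous source coordinates: each row is (x, y, 1).
    A = np.hstack([src, np.ones((n, 1))])
    # Solve A @ M.T ~= dst for the 2x3 matrix M.
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T  # shape (2, 3)

# A pure vertical shift of 5 pixels (camera above screen centre) should be
# recovered exactly from three point pairs.
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
dst = src + np.array([0.0, 5.0])
M = estimate_affine(src, dst)
```

With more than three (noisy) landmark pairs, the same call returns the best-fitting matrix in the least-squares sense.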

Step S16: apply an affine transformation with the affine matrix to the captured initial video to generate a corrected target video.

Specifically, in step S16, while the user is in a video call, the scheme acquires the user's initial video, extracts the facial feature points and contour of each frame, and applies the affine transformation to them to generate the corrected target video; preferably, the corrected result is pasted back onto the facial region of the initial video. Note that because the user's video has been corrected, the effect of mutual eye contact can be achieved when several users are in a video chat.
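The patent does not name a warping routine for applying the matrix to each frame. A minimal nearest-neighbour inverse-mapping warp (a stand-in for a library routine such as OpenCV's warpAffine) sketches the per-frame transformation on a toy single-channel frame:

```python
import numpy as np

def warp_affine(frame, M):
    """Nearest-neighbour affine warp of a 2-D frame by the 2x3 matrix M.

    For every output pixel, invert M to find its source pixel; pixels that
    map outside the frame are left black.
    """
    h, w = frame.shape
    A, t = M[:, :2], M[:, 2]
    Ainv = np.linalg.inv(A)
    ys, xs = np.mgrid[0:h, 0:w]
    dst = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    src = (dst - t) @ Ainv.T               # inverse mapping
    sx = np.rint(src[:, 0]).astype(int)
    sy = np.rint(src[:, 1]).astype(int)
    out = np.zeros_like(frame)
    ok = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out[ys.ravel()[ok], xs.ravel()[ok]] = frame[sy[ok], sx[ok]]
    return out

# M moves (x, y) -> (x, y + 2): the content shifts down by two rows.
frame = np.arange(16.0).reshape(4, 4)
M = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 2.0]])
warped = warp_affine(frame, M)
```

In the scheme, the same call would be made with the matrix estimated in step S14, once per frame of the initial video.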

By collecting the first and second images of the user (the first captured while the user looks at the camera area and the second while the user looks at the screen area), extracting the coordinate systems of the two images, computing the affine matrix from the first to the second, and applying the affine transformation to the captured initial video to generate a corrected target video, this embodiment solves the technical problem that prior-art video correction methods for achieving eye contact between chat users are complex and resource-intensive.

Optionally, in step S14, extracting the coordinate system of the first image and the coordinate system of the second image may include:

Step S141: apply Hartley transform processing to the first image and the second image.

Step S142: perform stereo matching on the two Hartley-transformed images to obtain a disparity map between the first image and the second image.

It should be noted here that the Hartley transform algorithm used in this application is a fully symmetric orthogonal trigonometric transform over the real domain, with better symmetry and computational efficiency than the Fourier transform.

Step S143: extract the coordinate system of the first image and the coordinate system of the second image from the disparity map.

Specifically, in steps S141 to S143, the first and second images may be rectified with Hartley's rectification method so that the epipolar lines of the two images are horizontal (or vertical); a stereo matching algorithm is then used to obtain a disparity map, which is smoothed by filtering. Finally, the positions of the eye feature points in the two images are extracted, and the coordinate systems of the first and second images are extracted from the disparity map.
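The patent does not fix a particular stereo matcher. Assuming the pair has already been rectified so that epipolar lines are horizontal, a brute-force sum-of-absolute-differences block matcher illustrates how the disparity map of steps S141 and S142 could be produced; a real pipeline would use a library matcher and follow it with the smoothing filter the text mentions.

```python
import numpy as np

def block_match_disparity(left, right, patch=3, max_disp=4):
    """Brute-force horizontal block matching on rectified rows.

    After rectification the epipolar lines are horizontal, so for each
    pixel we search along the same row of the other image for the patch
    with the smallest sum of absolute differences (SAD).
    """
    h, w = left.shape
    r = patch // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r, w - r):
            ref = left[y - r:y + r + 1, x - r:x + r + 1]
            best, best_d = None, 0
            for d in range(0, min(max_disp, x - r) + 1):
                cand = right[y - r:y + r + 1, x - d - r:x - d + r + 1]
                cost = np.abs(ref - cand).sum()
                if best is None or cost < best:
                    best, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Right image is the left image shifted 2 px to the left, so the true
# horizontal disparity is 2 wherever the patch fits.
rng = np.random.default_rng(0)
left = rng.random((8, 12))
right = np.zeros_like(left)
right[:, :-2] = left[:, 2:]
disp = block_match_disparity(left, right)
```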

It should be noted that this scheme may also obtain the intrinsic parameters of the camera through a camera calibration method (for example, a triangulation-based calibration). The intrinsic matrix contains the focal length and optical centre of the camera, from which the disparity map can be converted to depth values before step S143 is executed.
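Converting the disparity map to depth with the calibrated focal length is the standard pinhole relation Z = f * B / d. The intrinsic matrix and baseline below are hypothetical calibration results chosen for illustration, not values from the patent.

```python
import numpy as np

# Hypothetical intrinsic matrix: fx, fy are focal lengths in pixels and
# (cx, cy) = (320, 240) is the optical centre (principal point).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
fx = K[0, 0]
baseline = 0.06  # assumed distance between the two viewpoints, in metres

def disparity_to_depth(disp, fx, baseline):
    """Convert a disparity map (pixels) to depth (units of the baseline).

    Zero or negative disparities are returned as infinite depth instead of
    dividing by zero.
    """
    disp = np.asarray(disp, dtype=float)
    safe = np.maximum(disp, 1e-12)
    return np.where(disp > 0, fx * baseline / safe, np.inf)

depth = disparity_to_depth([[8.0, 0.0], [16.0, 4.0]], fx, baseline)
```

Larger disparities map to smaller depths, as expected for nearby points.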

Optionally, before step S142 (performing stereo matching on the two Hartley-transformed images to obtain the disparity map between the first and second images), the method provided in this embodiment may further include:

Step S140: rotate the first image and the second image clockwise by a preset angle.

Specifically, because the stereo matching algorithm computes disparity in the horizontal direction, the two images can be rotated clockwise by a preset angle (for example, 90°), after which the left view (looking at the centre of the screen) and the right view (looking at the camera) are matched with the stereo matching algorithm to obtain a vertical disparity map.
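The rotation trick of step S140 can be checked on a toy image: rotating both views 90° clockwise turns the vertical offset between them into a horizontal one that a row-wise matcher can measure. The single-pixel "feature" below is an illustrative stand-in for a matched image patch.

```python
import numpy as np

img = np.zeros((6, 6))
img[2, 3] = 1.0                     # a feature at (row 2, col 3)
shifted = np.roll(img, 1, axis=0)   # the same feature one pixel lower

# Clockwise rotation is np.rot90 with k=-1.
rot_a = np.rot90(img, k=-1)
rot_b = np.rot90(shifted, k=-1)

ya, xa = np.argwhere(rot_a)[0]
yb, xb = np.argwhere(rot_b)[0]
# After rotation the two features lie on the same row, so a horizontal
# search finds them; their column offset is the original vertical offset.
vertical_disparity = xa - xb
```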

Optionally, after the corrected target video is generated in step S16, the method provided in this embodiment may further include:

Step S17: de-overlap the facial contour in the target video to generate a corrected target facial contour.

Specifically, in step S17, the scheme may optimize the facial contour in the target video to eliminate overlaps and errors in it.

Optionally, step S17 (de-overlapping the facial contour in the target video to generate a corrected target facial contour) may include:

Step S171: extract a plurality of feature points from the facial contour in the target video.

Step S172: compute the pixel density difference between each feature point and the first image.

Step S173: take the feature point with the smallest pixel density difference as a feature point of the corrected target facial contour.

Specifically, in steps S171 to S173, 20 to 30 feature points may be extracted from the facial contour of each frame. An N×N block is taken around each feature point, where N is an odd number, generally greater than 3 and less than 15; different values trade computation for quality (the larger the value, the greater the computation and the better the optimization, and vice versa), and N can be chosen according to the situation, typically 5 or 7. Among the N×N points, the point whose pixel density after correction differs least from the original image is found and taken as a point of the corrected facial contour. After M iterations of this calculation, where M is a positive integer greater than 1 with the same trade-off between computation and quality and is typically chosen as 3 or 4, the optimized facial contour is obtained.
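Steps S171 to S173 can be sketched as an iterative window search around one contour point. "Pixel density" is interpreted here as plain pixel intensity, which is an assumption; the parameters n and iters correspond to the N and M of the text.

```python
import numpy as np

def refine_point(corrected, original, pt, n=5, iters=3):
    """Snap one warped contour point to the best match against the source.

    Within an n x n window (n odd) around the current point, pick the pixel
    where the corrected image differs least from the original image, and
    repeat for `iters` rounds, as in steps S171 to S173.
    """
    r = n // 2
    h, w = corrected.shape
    y, x = pt
    for _ in range(iters):
        y0, y1 = max(y - r, 0), min(y + r + 1, h)
        x0, x1 = max(x - r, 0), min(x + r + 1, w)
        window = np.abs(corrected[y0:y1, x0:x1] - original[y0:y1, x0:x1])
        dy, dx = np.unravel_index(np.argmin(window), window.shape)
        y, x = y0 + dy, x0 + dx
    return y, x

# Toy data: the corrected frame agrees with the original only at (4, 4),
# so a point started at (2, 2) should converge there.
original = np.zeros((9, 9))
corrected = np.ones((9, 9))
corrected[4, 4] = 0.0
pt = refine_point(corrected, original, (2, 2))
```

In the full scheme the same search would run for each of the 20 to 30 contour points of every frame.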

In summary, this scheme proposes obtaining scene depth with a single ordinary webcam, and proposes filling holes in the depth map using image information. Finally, the scheme achieves eye contact between chat participants without degrading the quality of the original video chat.

Embodiment 2

The present application also provides a video correction device, which may be used to carry out the above video correction method. As shown in Fig. 2, the device may include: a collection unit 20 configured to collect a first image and a second image of the user, where the first image is captured while the user looks at the camera area and the second image is captured while the user looks at the screen area; an extraction unit 22 configured to extract the coordinate system of the first image and the coordinate system of the second image and compute the affine matrix from the former to the latter; and a correction unit 24 configured to apply an affine transformation with the affine matrix to the captured initial video to generate a corrected target video.

By collecting a first image and a second image of the user (the first captured while the user looks at the camera area and the second while the user looks at the screen area), extracting the coordinate systems of the two images, computing the affine matrix from the first to the second, and applying the affine transformation to the captured initial video to generate a corrected target video, this embodiment solves the technical problem that prior-art video correction methods for achieving eye contact between chat users are complex and resource-intensive.

Optionally, the extraction unit may include: a first processing module for applying Hartley transform processing to the first and second images; a second processing module for performing stereo matching on the two Hartley-transformed images to obtain the disparity map between the first and second images; and a first extraction module for extracting the coordinate systems of the first and second images from the disparity map.

Optionally, the device provided in this embodiment further includes a rotation unit configured to rotate the first image and the second image clockwise by a preset angle.

Optionally, the device provided in this embodiment further includes a processing unit configured to de-overlap the facial contour in the target video and generate a corrected target facial contour.

Optionally, the processing unit may include: a second extraction module for extracting a plurality of feature points from the facial contour in the target video; a calculation module for computing the pixel density difference between each feature point and the first image; and a third processing module for determining the feature point with the smallest pixel density difference as a feature point of the corrected target facial contour.

The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.

In the above embodiments of the present invention, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the relevant descriptions of other embodiments.

In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division into units may be a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Moreover, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may exist physically separately, or two or more of them may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, or the part contributing to the prior art, or all or part of the solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

The above are only preferred embodiments of the present invention. It should be pointed out that a person of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also fall within the protection scope of the present invention.

Claims (10)

1. A video correction method, comprising:

collecting a first image and a second image of a user, wherein the first image is an image collected while the user looks at a camera-device region, and the second image is an image collected while the user looks at a screen region;

extracting a coordinate system in the first image and a coordinate system in the second image, and calculating an affine matrix from the coordinate system of the first image to the coordinate system of the second image; and

performing affine transformation processing on a collected initial video by using the affine matrix, to generate a corrected target video.

2. The method according to claim 1, wherein the step of extracting the coordinate system in the first image and the coordinate system in the second image comprises:

performing Hartley transform processing on the first image and the second image;

performing stereo matching processing on the two Hartley-transformed images to obtain a disparity map between the first image and the second image; and

extracting, from the disparity map, the coordinate system in the first image and the coordinate system in the second image.

3. The method according to claim 2, wherein the step of performing stereo matching processing on the two Hartley-transformed images to obtain the disparity map between the first image and the second image comprises:

performing the stereo matching processing directly on the two Hartley-transformed images to obtain a vertical disparity map between the first image and the second image; or

after rotating the two Hartley-transformed images clockwise by a preset angle, performing the stereo matching processing on the two images to obtain a horizontal disparity map between the first image and the second image.

4. The method according to claim 1, wherein, after the corrected target video is generated, the method further comprises:

performing de-overlapping processing on a facial contour in the target video to generate a corrected target facial contour.

5. The method according to claim 4, wherein the step of performing de-overlapping processing on the facial contour in the target video to generate the corrected target facial contour comprises:

extracting a plurality of feature points from the facial contour in the target video;

calculating a pixel-density difference between each feature point and the first image; and

determining the feature point with the smallest pixel-density difference as a feature point of the corrected target facial contour.

6. A video correction device, comprising:

a collection unit, configured to collect a first image and a second image of a user, wherein the first image is an image collected while the user looks at a camera-device region, and the second image is an image collected while the user looks at a screen region;

an extraction unit, configured to extract a coordinate system in the first image and a coordinate system in the second image, and to calculate an affine matrix from the coordinate system of the first image to the coordinate system of the second image; and

a correction unit, configured to perform affine transformation processing on a collected initial video by using the affine matrix, to generate a corrected target video.

7. The device according to claim 6, wherein the extraction unit comprises:

a first processing module, configured to perform Hartley transform processing on the first image and the second image;

a second processing module, configured to perform stereo matching processing on the two Hartley-transformed images to obtain a disparity map between the first image and the second image; and

a first extraction module, configured to extract, from the disparity map, the coordinate system in the first image and the coordinate system in the second image.

8. The device according to claim 7, further comprising:

a rotation unit, configured to rotate the first image and the second image clockwise by a preset angle, respectively.

9. The device according to claim 6, further comprising:

a processing unit, configured to perform de-overlapping processing on a facial contour in the target video to generate a corrected target facial contour.

10. The device according to claim 9, wherein the processing unit comprises:

a second extraction module, configured to extract a plurality of feature points from the facial contour in the target video;

a calculation module, configured to calculate a pixel-density difference between each feature point and the first image; and

a third processing module, configured to determine the feature point with the smallest pixel-density difference as a feature point of the corrected target facial contour.
CN201610030971.XA 2016-01-15 2016-01-15 Video correction method and device Active CN106358006B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610030971.XA CN106358006B (en) 2016-01-15 2016-01-15 Video correction method and device

Publications (2)

Publication Number Publication Date
CN106358006A true CN106358006A (en) 2017-01-25
CN106358006B CN106358006B (en) 2019-08-06

Family

ID=57843069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610030971.XA Active CN106358006B (en) Video correction method and device

Country Status (1)

Country Link
CN (1) CN106358006B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112712474A (zh) * 2020-12-16 2021-04-27 Hangzhou Xiaobanxiong Technology Co., Ltd. Perspective correction method and system for video stream dynamic image
CN112819509A (zh) * 2021-01-18 2021-05-18 Shanghai Ctrip Business Co., Ltd. Method, system, electronic device and storage medium for automatically screening advertisement pictures
CN113128255A (zh) * 2019-12-30 2021-07-16 Shanghai Yitu Network Technology Co., Ltd. Living body detection method, device, chip, and computer-readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1678084A (zh) * 2003-11-27 2005-10-05 Sony Corporation Image processing apparatus and method
CN102483854A (zh) * 2009-09-11 2012-05-30 Koninklijke Philips Electronics N.V. Image processing system
CN103345619A (zh) * 2013-06-26 2013-10-09 Shanghai Yongchang Information Technology Co., Ltd. Self-adaptive correction method for natural eye contact in video chat
US20140002586A1 (en) * 2012-07-02 2014-01-02 Bby Solutions, Inc. Gaze direction adjustment for video calls and meetings

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BEN YIP: "Face and eye rectification in video conference using affine transform", IEEE International Conference on Image Processing, 2005 *

Also Published As

Publication number Publication date
CN106358006B (en) 2019-08-06


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant