
CN116958203B - Image processing method, device, electronic device and storage medium - Google Patents


Info

Publication number
CN116958203B
CN116958203B
Authority
CN
China
Prior art keywords
frame
predicted
image
mask data
mutation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310960167.1A
Other languages
Chinese (zh)
Other versions
CN116958203A (en)
Inventor
严洪泽
Current Assignee
Hangzhou Zhicun Computing Technology Co ltd
Original Assignee
Hangzhou Zhicun Computing Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Zhicun Computing Technology Co ltd filed Critical Hangzhou Zhicun Computing Technology Co ltd
Priority to CN202310960167.1A priority Critical patent/CN116958203B/en
Publication of CN116958203A publication Critical patent/CN116958203A/en
Application granted granted Critical
Publication of CN116958203B publication Critical patent/CN116958203B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The embodiment of the invention discloses an image processing method, an image processing apparatus, an electronic device and a storage medium, wherein the method comprises the following steps: calculating an initialization optical flow according to a target reference image frame; determining a predicted optical flow graph and weighted fusion mask data according to the initialization optical flow; calculating abrupt-change mask data according to the predicted optical flow graph; and calculating predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data. The technical scheme of the embodiment of the invention shortens the time cost of image processing and improves the rendering effect for local abrupt-change regions of the image.

Description

Image processing method and device, electronic equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of image generation, in particular to an image processing method, an image processing device, electronic equipment and a storage medium.
Background
With the successive iterations of mobile devices and virtual reality devices, users' frame-rate requirements for device output content keep rising, which in turn places higher demands on the image rendering capability of the application processor chip.
At present, the rendering capability of the application processor chip cannot meet users' high-frame-rate display requirements for rendered content, so a frame interpolation chip needs to be introduced to compensate for the insufficient rendering performance of the application processor in the electronic device.
Fig. 1 is a timing diagram of an image processing method in the prior art. As shown in Fig. 1, the first line is the process in which the graphics processing unit of the application processor finishes rendering and starts outputting rendered frames through the mobile industry processor interface. Because the high-frame-rate video output to the display screen is 90 fps, the application processor, which outputs 45 fps low-frame-rate video, must complete the output of each rendered frame image within about 11 ms. The second line is the synchronization process between the output module of the application processor and the receiving module of the frame interpolation chip, where the synchronization signal delay tdelay_sync_img_aux is about 2 ms. The third line is the operation of the prior-art frame interpolation chip: only after receiving all the data of the next frame image can it output the first line of data of the predicted frame image, so the delay tdelay_if from receiving the first line of data of the next frame image to outputting the first line of data of the predicted frame image is about 13 ms. The fourth line is the process in which the display screen displays the high-frame-rate video, which may include the synchronization delay between the frame interpolation chip and the display screen, the display screen data buffering, the display screen pixel flip delay, and so on, about 3.5 ms in total. It is thus easy to see that the frame interpolation computation alone introduces a 13 ms processing delay, i.e., the delay of outputting the first line of data of the predicted frame image is 13 ms.
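The per-frame timing budget in the paragraph above follows directly from the two frame rates; the following is a small arithmetic check, not part of the patent text:

```python
# Frame-timing budget implied by the figures above: the display runs at
# 90 fps while the application processor renders at 45 fps, so every
# second displayed frame must come from the frame interpolation chip.
display_fps = 90
render_fps = 45

frame_budget_ms = 1000 / display_fps    # ~11.1 ms to output each displayed frame
render_interval_ms = 1000 / render_fps  # ~22.2 ms between rendered frames

# One interpolated frame is needed between each pair of rendered frames.
interpolated_per_rendered = display_fps // render_fps - 1
```

The 11 ms figure quoted in the text is this per-displayed-frame budget, rounded down.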
In the course of realizing the invention, the inventor found that the prior art is inefficient in image processing and suffers from image processing delay. In addition, when the output content of the virtual device switches scenes or the content of the operation interface changes abruptly, the output content may be abnormal.
Disclosure of Invention
The embodiment of the invention provides an image processing method, an image processing apparatus, an electronic device and a storage medium, which shorten the time cost of image processing and improve the rendering effect for local abrupt-change regions of an image.
According to an aspect of the present invention, there is provided an image processing method including:
Calculating an initialization optical flow according to a target reference image frame;
determining a predicted optical flow graph and weighted fusion mask data according to the initialization optical flow;
calculating abrupt-change mask data according to the predicted optical flow graph;
and calculating predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data.
According to another aspect of the present invention, there is provided an image processing apparatus including:
An initialization optical flow calculation module for calculating an initialization optical flow according to the target reference image frame;
a predicted optical flow graph and weighted fusion mask data determining module, used for determining a predicted optical flow graph and weighted fusion mask data according to the initialization optical flow;
an abrupt-change mask data calculation module, used for calculating abrupt-change mask data according to the predicted optical flow graph;
and a predicted image frame data calculation module, used for calculating predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data.
According to another aspect of the present invention, there is provided an electronic apparatus including:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image processing method according to any one of the embodiments of the present invention.
According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute the image processing method according to any one of the embodiments of the present invention.
According to the technical scheme provided by the embodiment of the invention, the initialization optical flow is calculated according to the target reference image frame, the predicted optical flow graph and the weighted fusion mask data are determined according to the initialization optical flow, the abrupt-change mask data are calculated according to the predicted optical flow graph, and finally the predicted image frame data are calculated according to the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data. This solves the problems of poor processing effect for local abrupt-change regions of an image and large image processing delay, shortens the time cost of image processing, and improves the rendering effect for local abrupt-change regions of the image.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present invention; a person skilled in the art may obtain other drawings from these drawings without inventive effort.
FIG. 1 is a timing diagram of an image processing method in the prior art;
FIG. 2 is a flowchart of an image processing method according to the first embodiment of the present invention;
FIG. 3 is a schematic diagram of the hardware module links of an image processing chip according to the first embodiment of the present invention;
FIG. 4 is a timing chart of image processing performed by an image processing chip according to the first embodiment of the present invention;
FIG. 5 is a flowchart of an image processing method according to the second embodiment of the present invention;
FIG. 6 is a flowchart of initialization optical flow calculation according to the second embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a frame interpolation auxiliary network model according to the second embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a frame interpolation backbone network model according to the second embodiment of the present invention;
FIG. 9 is a flowchart of obtaining abrupt-change mask data according to the second embodiment of the present invention;
FIG. 10 is a schematic diagram of an image processing data path according to the second embodiment of the present invention;
FIG. 11 is a flowchart of an image processing method according to the second embodiment of the present invention;
FIG. 12 is a timing chart of an image processing method according to the second embodiment of the present invention;
FIG. 13 is a schematic diagram of another image processing data path according to the second embodiment of the present invention;
FIG. 14 is a flowchart of another image processing method according to the second embodiment of the present invention;
FIG. 15 is a timing chart of another image processing method according to the second embodiment of the present invention;
FIG. 16 is a schematic diagram of a local display interface of a virtual reality device according to the second embodiment of the present invention;
FIG. 17 is a schematic diagram of a local display interface of another virtual reality device according to the second embodiment of the present invention;
FIG. 18 is a flowchart of an image processing method including abrupt-change detection according to the second embodiment of the present invention;
FIG. 19 is an effect diagram of an abrupt-change region detection image processing method according to the second embodiment of the present invention;
FIG. 20 is a schematic diagram of an image processing apparatus according to the third embodiment of the present invention;
FIG. 21 is a schematic structural diagram of an electronic device according to the fourth embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present invention. The embodiment is applicable to performing frame interpolation on low-frame-rate video containing abrupt-change regions so as to improve image rendering performance. The method may be performed by an image processing apparatus, which may be implemented in software and/or hardware and is generally integrated in an electronic device; the electronic device may be a terminal device or a server device, and the embodiment of the present invention does not limit its specific device type. Accordingly, as shown in Fig. 2, the method includes the following operations:
s110, calculating an initialization optical flow according to the target reference image frame.
The target reference image frame may be an image frame near the predicted frame in the low-frame-rate video that has a certain association with the predicted frame. The predicted frame is the target frame data in the low-frame-rate video on which the frame interpolation processing is to be performed. The initialization optical flow may be optical flow data obtained by applying an optical flow method to the target reference image frames.
In the embodiment of the present invention, the target reference image frames may be the two frame images before the predicted frame in the low-frame-rate video, the two frame images after it, or the frame immediately before and the frame immediately after it, etc.; the embodiment of the present invention does not limit the specific frame positions of the target reference image frames. After the target reference image frames are determined, the initialization optical flow may be calculated for them, so that the initialization optical flow represents the magnitude of object motion between the target reference image frames.
S120, determining a predicted optical flow graph and weighted fusion mask data according to the initialization optical flow.
Wherein the predicted optical flow graph may be an image representing the movement of target pixels between two consecutive frame images. The weighted fusion mask data may be the weight coefficients of a weighted fusion mask.
Correspondingly, an optical flow method can be used to perform motion estimation on the initialization optical flow to obtain the predicted optical flow graph and the weighted fusion mask data. The optical flow method may also be called motion estimation, i.e., calculating the motion amplitude of a target pixel between two frames of image data. The optical flow graph may include motion information of target pixels in the image, such as the x and y displacement of a target pixel between the two frames. Illustratively, the optical flow graph may include two channels: a horizontal-movement optical flow channel reflecting the horizontal shift of a pixel, i.e., the x-direction displacement, and a vertical-movement optical flow channel reflecting the vertical shift of a pixel, i.e., the y-direction displacement.
Optionally, the optical flow method may be a unidirectional optical flow estimation method: first calculate the optical flow from the previous frame to the next frame, then take half of the optical flow vector and perform optical flow mapping with the previous frame as the reference frame to obtain the predicted frame, which is called forward estimation; alternatively, the next frame may be used as the reference frame, taking half of the optical flow vector and negating it, which is called backward estimation. The optical flow method may also be a bidirectional optical flow estimation method, in which a first optical flow vector from the predicted frame to the previous frame and a second optical flow vector from the predicted frame to the next frame are estimated, with the previous frame and the next frame as the respective reference frames; the previous frame combined with the first optical flow vector is used for forward-estimation mapping to obtain a forward predicted frame, the next frame combined with the second optical flow vector is used for backward-estimation mapping to obtain a backward predicted frame, and the forward and backward predicted frames are then weighted and fused to obtain the video frame to be inserted, i.e., the predicted frame.
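As an illustrative sketch only, not the patented implementation, the bidirectional estimation and weighted fusion described above can be expressed in NumPy for the special case of a constant translational motion; `warp_constant` and every other name here are invented for the example:

```python
import numpy as np

def warp_constant(frame, dx):
    # Toy backward mapping for a constant horizontal flow of dx pixels:
    # np.roll stands in for a real per-pixel warp (illustration only).
    return np.roll(frame, dx, axis=1)

# Two reference frames: a bright bar moving 4 px to the right per frame.
prev_frame = np.zeros((8, 16))
prev_frame[:, 2:4] = 1.0
next_frame = np.roll(prev_frame, 4, axis=1)

# Bidirectional estimation at t = 0.5: the predicted frame maps to the
# previous frame with half the motion (+2 px) and to the next frame with
# the negated half (-2 px).
forward_pred = warp_constant(prev_frame, +2)   # forward-estimation mapping
backward_pred = warp_constant(next_frame, -2)  # backward-estimation mapping

# Weighted fusion of the two predictions (a uniform 0.5 mask here; in the
# method the mask is produced by a network, not fixed).
mask = np.full(prev_frame.shape, 0.5)
predicted = mask * forward_pred + (1.0 - mask) * backward_pred
```

For this rigid motion both warps agree, so the fused frame places the bar exactly halfway between its positions in the two reference frames.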
S130, calculating abrupt-change mask data according to the predicted optical flow graph.
Wherein the abrupt-change mask data may be the weight coefficients of an abrupt-change mask.
Correspondingly, the predicted optical flow graph obtained in the above step can be used to calculate the abrupt-change mask data, so that abrupt-change detection can be performed on frame images containing abrupt-change regions with the mask data, thereby improving the processing effect for local abrupt-change regions of the image.
S140, calculating predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data.
Wherein the predicted image frame data may be the target image data with which the frame interpolation operation is performed at a specific interpolation position.
In the embodiment of the invention, the predicted image frame data can be calculated from the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data, so that the predicted image frame data are inserted at the interpolation positions of the original low-frame-rate video to obtain high-frame-rate video.
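One plausible way the three quantities could be combined is sketched below. This is a hedged illustration, since the text at this point does not give the exact composition formula, and every name is invented for the example:

```python
import numpy as np

def compose_predicted_frame(forward_pred, backward_pred,
                            fusion_mask, abrupt_mask, fallback):
    # Illustrative composition (an assumption, not the patented formula):
    # blend the two warped predictions with the weighted fusion mask, then,
    # wherever the abrupt-change mask flags an unreliable region, fall back
    # to pixels copied directly from a reference frame.
    fused = fusion_mask * forward_pred + (1.0 - fusion_mask) * backward_pred
    return np.where(abrupt_mask > 0, fallback, fused)

h, w = 4, 4
forward_pred = np.full((h, w), 0.2)
backward_pred = np.full((h, w), 0.6)
fusion_mask = np.full((h, w), 0.5)
abrupt_mask = np.zeros((h, w))
abrupt_mask[0, 0] = 1.0                 # one pixel flagged as abrupt change
fallback = np.full((h, w), 0.9)         # e.g. the previous reference frame

out = compose_predicted_frame(forward_pred, backward_pred,
                              fusion_mask, abrupt_mask, fallback)
```

The flagged pixel takes the fallback value while every other pixel takes the weighted blend, which is the role the two masks play in S140.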
Fig. 3 is a schematic diagram of the hardware module links of an image processing chip according to the first embodiment of the present invention. In a specific example, as shown in Fig. 3, the hardware modules of the image processing chip may include a video acquisition unit, a motion estimation unit, an interpolated frame prediction unit, and a frame interpolation chip control unit. The video acquisition unit may be used to acquire the low-frame-rate video frame sequence to be interpolated; the motion estimation unit may be used to perform motion estimation on the previous and next video frames in the low-frame-rate sequence to obtain the optical flow vector Ft→0 (t=0.5) from the predicted frame to the previous frame and the optical flow vector Ft→1 (t=0.5) from the predicted frame to the next frame; the interpolated frame prediction unit may map the previous and next frames with these optical flow vectors to obtain a forward predicted frame and a backward predicted frame respectively, obtain the predicted frame after weighted fusion, and then insert the obtained predicted frame image between the previous and next frames of the original low-frame-rate video to obtain a high-frame-rate video frame sequence; the frame interpolation chip control unit may be used to obtain control signals set by the application processor, including but not limited to whether to enable the frame interpolation function, the scene-cut judgment threshold, and so on.
Fig. 4 is a timing chart of image processing performed by an image processing chip according to the first embodiment of the present invention. In a specific example, as shown in Fig. 4, if frame interpolation is not enabled, the application processor may send the image data and control signals of the low-frame-rate video to the frame interpolation chip, and the frame interpolation chip passes the low-frame-rate video through to the display screen in transparent transmission mode; if the frame interpolation chip is in the frame interpolation working mode, it may receive the low-frame-rate video sent by the application processor, obtain high-frame-rate video through frame interpolation processing, and send it to the display screen.
According to the technical scheme provided by the embodiment of the invention, the initialization optical flow is calculated according to the target reference image frame, the predicted optical flow graph and the weighted fusion mask data are determined according to the initialization optical flow, the abrupt-change mask data are calculated according to the predicted optical flow graph, and finally the predicted image frame data are calculated according to the predicted optical flow graph, the weighted fusion mask data and the abrupt-change mask data. This solves the problems of poor processing effect for local abrupt-change regions of an image and large image processing delay, shortens the time cost of image processing, and improves the rendering effect for local abrupt-change regions of the image.
Example 2
Fig. 5 is a flowchart of an image processing method according to the second embodiment of the present invention. This embodiment is refined on the basis of the embodiment above and gives specific optional implementations of calculating the initialization optical flow, the abrupt-change mask data, and the predicted image frame data. Accordingly, as shown in Fig. 5, the method of this embodiment may include:
s210, calculating an initialization optical flow according to the target reference image frame.
S220, determining a predicted optical flow graph and weighted fusion mask data according to the initialization optical flow.
In an optional embodiment of the present invention, the target reference image frames may include the previous frame image and the next frame image of the predicted frame. Calculating the initialization optical flow according to the target reference image frames may include: inputting the previous frame image and the next frame image of the predicted frame to an application processor chip, so that the application processor chip calculates the initialization optical flow. Determining the predicted optical flow graph and the weighted fusion mask data according to the initialization optical flow may include: inputting the previous frame image of the predicted frame, the next frame image of the predicted frame and the initialization optical flow into a frame interpolation backbone network model in the frame interpolation chip, so that the predicted optical flow graphs and the weighted fusion mask data of the previous and next frame images of the predicted frame are calculated through the frame interpolation backbone network model.
Wherein the previous frame image of the predicted frame may be the image data at the frame position immediately before the predicted frame, and the next frame image of the predicted frame may be the image data at the frame position immediately after it. The application processor chip may be a processing chip for calculating the initialization optical flow. The frame interpolation chip may be an image processing chip for real-time frame interpolation. The frame interpolation backbone network model may be a network model for computing the predicted frame.
In the embodiment of the invention, the previous frame image and the next frame image of the predicted frame can be fed into the application processor chip as its input data, and the initialization optical flow is obtained through its calculation. Optionally, either a conventional optical flow calculation method or a deep-learning optical flow calculation method may be deployed on the application processor chip. For example, in order to reduce the memory footprint and the computation load of the application processor, the previous frame of the predicted frame may be downsampled and buffered in the memory of the application processor while the application processor renders the next frame. When the application processor has finished rendering the next frame, it may calculate the initialization optical flow using the DIS (Dense Inverse Search) method. After the initialization optical flow is obtained, it can be fed, as input data, into the frame interpolation backbone network model of the frame interpolation chip, and the model calculates the predicted optical flow graphs and the weighted fusion mask data of the previous and next frame images of the predicted frame: the backward search optical flow graph Ft→0 from the predicted frame to the previous frame, the forward search optical flow graph Ft→1 from the predicted frame to the next frame, and the weighted fusion mask data.
The frame interpolation backbone network model of the frame interpolation chip can be implemented by a conventional method or by a deep-learning network; the embodiment of the invention does not limit the specific implementation.
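The memory-saving downsampling mentioned above can be illustrated with a minimal NumPy sketch; the 2x2 box filter and the function name are assumptions for the example, since the text does not specify a downsampling method:

```python
import numpy as np

def downsample2x(frame):
    # 2x2 box-filter downsample: average each non-overlapping 2x2 block,
    # quartering the memory needed to buffer the cached previous frame.
    h, w = frame.shape
    return frame[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

frame = np.arange(16, dtype=float).reshape(4, 4)
small = downsample2x(frame)   # shape (2, 2)
```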
Fig. 6 is a flowchart of initialization optical flow calculation according to the second embodiment of the present invention. As shown in Fig. 6, the previous frame image and the next frame image of the predicted frame may be sent to the application processor as its input data, and multi-layer convolution and upsampling calculations may be performed by the NPU (Neural-network Processing Unit) and GPU (Graphics Processing Unit) in the application processor, so as to obtain the initialization optical flow graphs Ft→0 and Ft→1.
In an optional embodiment of the present invention, the target reference image frames may include the two frame images preceding the predicted frame. Calculating the initialization optical flow according to the target reference image frames may include: inputting the two frame images preceding the predicted frame into a frame interpolation auxiliary network model in the frame interpolation chip, so that the initialization optical flow is calculated through the frame interpolation auxiliary network model. Determining the predicted optical flow graph and the weighted fusion mask data according to the initialization optical flow may include: inputting the previous frame image of the predicted frame, the next frame image of the predicted frame and the initialization optical flow into a frame interpolation backbone network model in the frame interpolation chip, so that the predicted optical flow graphs and the weighted fusion mask data of the previous and next frame images of the predicted frame are calculated through the frame interpolation backbone network model.
Wherein the two frame images preceding the predicted frame may be the low-frame-rate video image data at the two frame positions before the predicted frame. The frame interpolation auxiliary network model may be a network model for calculating the initialization optical flow.
In the embodiment of the invention, by introducing a frame interpolation auxiliary network into the frame interpolation chip, the auxiliary network and the backbone network can time-share the computing power of the frame interpolation chip. The auxiliary network can make full use of the computing power of the frame interpolation chip in every cycle, save the computing power and data transmission bandwidth of the application processor, and reduce hardware design units such as the PCIe (Peripheral Component Interconnect Express, a high-speed serial interface) output interface of the application processor and the PCIe receiving interface of the frame interpolation chip.
When the frame interpolation auxiliary network starts to calculate, the image data of the frame following the predicted frame is not yet available, so the two frame images preceding the predicted frame can be sent into the auxiliary network as input data to calculate the initialization optical flow. After the initialization optical flow is obtained, the previous frame image of the predicted frame, the next frame image of the predicted frame and the initialization optical flow can be fed as input data into the frame interpolation backbone network model of the frame interpolation chip, which calculates the backward search optical flow graph Ft→0 from the predicted frame to the previous frame, the forward search optical flow graph Ft→1 from the predicted frame to the next frame, and the weighted fusion mask data. The backbone network model can be implemented by a conventional method or by a deep-learning network; the embodiment of the invention does not limit the specific implementation.
Fig. 7 is a schematic structural diagram of the frame interpolation auxiliary network model. As shown in Fig. 7, the two frame images preceding the predicted frame can be sent into the auxiliary network as its input data, and multi-layer convolution and upsampling computation is performed in the auxiliary network model to obtain the initialization optical flow graphs Ft→0 and Ft→1.
Fig. 8 is a schematic structural diagram of a frame inserting backbone network model according to the second embodiment of the present invention. As shown in fig. 8, the previous frame image, the next frame image and the initialized optical flow of the predicted frame may be sent in as input data of the frame inserting backbone network model, and multi-layer convolution and up-sampling computation may be performed in the frame inserting backbone network model to obtain the backward search optical flow graph Ft->0, the forward search optical flow graph Ft->1 and the weighted fusion mask data.
S230, aligning a previous frame image and a next frame image of the predicted frame according to the predicted optical flow graph to obtain a forward predicted frame and a backward predicted frame respectively.
The forward predicted frame may be an image obtained by performing alignment processing on the previous frame image of the predicted frame using the predicted optical flow graph. The backward predicted frame may be an image obtained by performing alignment processing on the next frame image of the predicted frame using the predicted optical flow graph.
Correspondingly, backward mapping alignment processing can be performed respectively on the previous frame image and the next frame image of the predicted frame according to the predicted optical flow graph output by the frame inserting backbone network model in the above steps. For example, the backward search optical flow graph Ft->0 pointing from the predicted frame to the previous frame acts on the backward mapping of the previous frame to obtain the forward predicted frame, and the forward search optical flow graph Ft->1 pointing from the predicted frame to the next frame acts on the backward mapping of the next frame to obtain the backward predicted frame.
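A minimal NumPy sketch of the backward mapping alignment described above, using nearest-neighbor sampling for brevity (real implementations typically use bilinear interpolation); the function and variable names are illustrative assumptions, not names from the patent.

```python
import numpy as np

def backward_warp(image, flow):
    """Backward mapping: each output pixel (y, x) samples the source
    image at (y + flow[y, x, 1], x + flow[y, x, 0]), clamped to bounds.
    Applying Ft->0 to the previous frame yields the forward predicted
    frame; applying Ft->1 to the next frame yields the backward one."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return image[sy, sx]
```

With a zero flow graph the warp is the identity; a constant horizontal flow shifts the sampled content, which is the "spatial shift of pixels according to the optical flow graph" performed by the backward mapping operator described later.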
S240, calculating the absolute value of the pixel difference between the forward predicted frame and the backward predicted frame aligned to the predicted frame time.
S250, calculating the mutation mask data according to the absolute value of the pixel difference.
The predicted frame time may be the time at which the predicted frame needs to be inserted. The absolute value of the pixel difference may be used to represent the degree of difference between the forward predicted frame and the backward predicted frame aligned to the predicted frame time.
In the embodiment of the invention, the absolute value of the pixel difference between the two images can be obtained by calculation using the forward predicted frame and the backward predicted frame aligned to the predicted frame time, and multi-layer convolution and binarization calculation can then be performed on the absolute value of the pixel difference to obtain the mutation mask data.
Fig. 9 is a flowchart for obtaining mutation mask data according to a second embodiment of the present invention. As shown in fig. 9, the predicted optical flow graph, the previous frame image of the predicted frame and the next frame image of the predicted frame may be sent to an alignment module for alignment processing; further, the absolute value of the pixel difference between the two images may be obtained by calculation using the forward predicted frame and the backward predicted frame aligned to the predicted frame time, and multi-layer convolution and binarization calculation may be performed on the absolute value of the pixel difference to obtain the mutation mask data.
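A sketch of steps S240 and S250, with a plain fixed threshold standing in for the multi-layer convolution described above; the threshold value is an illustrative assumption. Following the convention used later in the text, the mask is 0 in the mutation region and 1 in the non-mutation region.

```python
import numpy as np

def mutation_mask(forward_pred, backward_pred, threshold=25.0):
    """S240: absolute pixel difference of the two predictions aligned to
    the predicted frame time; S250: binarize it into mutation mask data.
    A fixed threshold stands in for the multi-layer convolution."""
    diff = np.abs(forward_pred.astype(np.float32)
                  - backward_pred.astype(np.float32))
    if diff.ndim == 3:            # collapse color channels to one map
        diff = diff.mean(axis=-1)
    # 0 marks the mutation region, 1 the non-mutation region
    return (diff <= threshold).astype(np.float32)
```

Pixels where the two aligned predictions disagree strongly (a scene cut, a popped-up interface) end up in the mutation region; consistent pixels stay in the non-mutation region.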
S260, calculating predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data.
In an alternative embodiment of the present invention, the calculating of predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data may include: determining a mutation region and a non-mutation region according to the mutation mask data; setting a first region value for the mutation region and a second region value for the non-mutation region; and calculating the predicted image frame data according to the mutation region of the mutation mask data, the predicted optical flow graph and the weighted fusion mask data.
The mutation region may be the pixel positions where the image content changes abruptly when the virtual reality device performs video scene switching processing. The non-mutation region may be the pixel positions where the image content does not change abruptly during such processing. The first region value may be a value used to represent the mutation region, and the second region value a value used to represent the non-mutation region.
In the embodiment of the present invention, the mutation region may be determined according to the mutation mask data obtained in the above steps and set to the first region value, and the non-mutation region may be determined from the mutation mask data and set to the second region value, wherein the first region value may be 0 and the second region value may be 1. Further, the mutation mask data may be used to process the weighted fusion mask data, the backward search optical flow graph Ft->0 and the forward search optical flow graph Ft->1 to obtain the predicted image frame data.
In an alternative embodiment of the present invention, the calculating of the predicted image frame data according to the mutation region of the mutation mask data, the predicted optical flow graph and the weighted fusion mask data may include: determining a mutation region of the predicted optical flow graph according to the mutation region of the mutation mask data; suppressing backward mapping processing on the optical flow values of the mutation region of the predicted optical flow graph; and setting the weighted fusion mask data of the mutation region to the first region value so as to preserve the image content of the previous frame image of the predicted frame.
Correspondingly, the mutation region of the predicted optical flow graph can be determined according to the mutation region of the mutation mask data; that is, the backward search optical flow graph Ft->0 and the forward search optical flow graph Ft->1 are set to 0 in the mutation region. The optical flow values of the mutation region of the predicted optical flow graph then undergo no frame inserting backward mapping processing, so the original image content of the previous and next frame images of the predicted frame can be preserved. Further, the weighted fusion mask data of the mutation region are set to 0, preserving the image content of the previous frame image of the predicted frame.
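A sketch of this suppression step, assuming the mask convention stated above (0 in the mutation region, 1 elsewhere); multiplying by the mask zeroes both the optical flow and the fusion weights inside the mutation region, so the warp degenerates to a copy and the previous frame's content survives there. Names and array shapes are illustrative.

```python
import numpy as np

def suppress_mutation_region(flow_t0, flow_t1, fusion_mask, mut_mask):
    """Set Ft->0, Ft->1 and the weighted fusion mask to 0 inside the
    mutation region (mut_mask == 0 there), leaving other pixels intact."""
    m = mut_mask.astype(np.float32)
    return (flow_t0 * m[..., None],   # (H, W, 2) backward search flow
            flow_t1 * m[..., None],   # (H, W, 2) forward search flow
            fusion_mask * m)          # (H, W) fusion weights
```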
In an optional embodiment of the invention, after calculating the predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data, the method may further comprise: determining a frame inserting position of the predicted image frame data in the original low-frame-rate video; and inserting the predicted image frame data into the original low-frame-rate video according to the frame inserting position.
The original low-frame-rate video may be a delayed low-frame-rate video that needs to undergo frame inserting processing. The frame inserting position may be the frame position in the original low-frame-rate video at which a predicted frame needs to be inserted.
Correspondingly, after the predicted image frame data is obtained by calculation according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data, the frame inserting position of the predicted image frame data in the original low-frame-rate video is determined and the predicted image frame data is inserted into the original low-frame-rate video; the high-frame-rate video can be obtained after frame inserting is completed.
Fig. 10 is a schematic diagram of an image processing data path according to the second embodiment of the present invention. In a specific example, as shown in fig. 10, the image processing data path may include electronic devices such as an application processor, a frame inserting chip and a display screen. The application processor can send image data of the low-frame-rate video to the frame inserting chip through an MIPI (Mobile Industry Processor Interface) bus, send control signals to the frame inserting chip through an SPI (Serial Peripheral Interface, a synchronous serial communication interface) interface, and send initialized optical flow data to the frame inserting chip through a PCIE interface; the frame inserting chip can perform frame inserting processing on the received low-frame-rate video and send the high-frame-rate video image data after frame inserting to the display screen through the MIPI bus.
Fig. 11 is a flowchart of an image processing method according to the second embodiment of the present invention. In a specific example, as shown in fig. 11, the previous frame image and the next frame image of the predicted frame may be sent to the application processor chip as its input data, and when the video acquisition unit of the frame inserting chip receives the next frame image, the motion estimation unit of the frame inserting chip may be started. When the video acquisition unit of the frame inserting chip has received 0.2 of the next frame image, the motion estimation unit of the frame inserting chip can output the first line of optical flow data and weighted fusion mask data.
The motion estimation unit of the frame inserting chip can read the previous frame image, the next frame image data and the cached initialization optical flow from the cache space of the chip and continuously feed them in data stream mode into the frame inserting backbone network model. When 0.2 of the next frame image has arrived, the first line of data of each of the backward search optical flow graph Ft->0, the forward search optical flow graph Ft->1 and the weighted fusion mask data can be obtained by calculation and continuously fed in data stream mode into the frame inserting prediction unit of the frame inserting chip. In the buffer space of the frame inserting chip, the previous frame and next frame images exchange storage space in ping-pong mode.
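The ping-pong exchange mentioned above can be sketched as two buffer slots that swap roles on every write, so one slot is filled with the incoming frame while the other is read; this is an illustrative sketch, not the chip's actual buffer management.

```python
# Minimal ping-pong buffer sketch: two slots alternate between the
# "being written" and "being read" roles on each new frame.
class PingPongBuffer:
    def __init__(self):
        self.slots = [None, None]
        self.write_idx = 0              # slot currently being written

    def write(self, frame):
        self.slots[self.write_idx] = frame
        self.write_idx ^= 1             # swap read/write roles

    def read(self):
        # the slot not currently being written holds the last full frame
        return self.slots[self.write_idx ^ 1]
```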
The frame inserting prediction unit of the frame inserting chip can map the previous frame and the next frame respectively using the optical flow vectors pointing from the predicted frame to the previous frame and the next frame, obtain a forward predicted frame and a backward predicted frame, obtain the predicted frame through weighted fusion processing, then insert the predicted frame between the previous frame and the next frame of the original low-frame-rate video, and finally output a high-frame-rate video frame sequence. The frame inserting prediction unit can be composed of 2 backward mapping operators and one weighted fusion operator; the backward mapping operator may spatially shift pixels in the image according to the optical flow graph. Alternatively, the weighted fusion operator may be expressed as: img1 = (1 - mask_flow1) * img0_warp1 + mask_flow1 * img2_warp1, where img1 may be the predicted frame; mask_flow1 may be the weighted fusion coefficient matrix; img0_warp1 may be the backward-mapped image of the previous frame; and img2_warp1 may be the backward-mapped image of the next frame.
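The weighted fusion operator quoted above can be written directly in a few lines of NumPy; the helper name and the assumption of an (H, W) coefficient matrix with (H, W) or (H, W, C) images are illustrative.

```python
import numpy as np

def weighted_fusion(img0_warp, img2_warp, mask_flow):
    """img1 = (1 - mask_flow) * img0_warp + mask_flow * img2_warp.
    mask_flow == 0 keeps the previous-frame mapping at that pixel,
    mask_flow == 1 keeps the next-frame mapping."""
    m = mask_flow[..., None] if img0_warp.ndim == 3 else mask_flow
    return (1.0 - m) * img0_warp + m * img2_warp
```

Setting the fusion coefficients to 0 inside the mutation region, as described elsewhere in the text, makes this operator output the previous frame's content there unchanged.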
The frame inserting chip control unit of the frame inserting chip can be used to obtain control signals set by the application processor, which may include, but are not limited to, whether to enable the frame inserting function, a scene jump judgment threshold, and the like. When the previous frame and the next frame do not meet the frame inserting condition, the frame inserting chip control unit can close the frame inserting function and output the previous frame as the predicted frame. Timing control can output the predicted frame calculation result at the predicted frame time and output the current frame at the current frame time, realizing high-frame-rate video output on the video timing.
Fig. 12 is a timing chart of an image processing method according to the second embodiment of the present invention. In a specific example, as shown in fig. 12, the application processor takes about 11ms to output a whole frame image of the next frame, the initialized optical flow graph at only 1/36 of full resolution takes about 0.2ms, and the time spent by the application processor transmitting the optical flow graph is basically negligible. The initialization optical flow calculation module runs on the application processor at 320 x 320 resolution, with a total delay for calculation and data transmission synchronization of approximately 3ms, denoted Tdelay_calcFlow_sync, as shown in fig. 12. The first line is the timing of the application processor GPU rendering the low-frame-rate video; the GPU finishes rendering a frame image within 11ms, and the signal synchronization between the application processor GPU and the application processor MIPI takes about 1ms, i.e. Tdelay_gpu_MIPI_sync. The second line is the timing of the application processor GPU outputting the rendered frame through MIPI; the GPU completes output of the rendered frame image within 11ms. The third line is the initialization optical flow calculation module, which calculates the initialized optical flow using the downsampled previous and next frame images; on the application processor the total delay of calculation and data transmission synchronization is about 3ms, i.e. Tdelay_calcFlow_sync. The fourth line is the MIPI output timing of the application processor GPU after synchronization between the MIPI output module and the MIPI receiving module of the frame inserting chip, where the MIPI synchronization signal delay, Tdelay_sync_img_aux, is about 2ms. The fifth line is the timing of the low-delay real-time frame inserting chip outputting the first line and the last line of predicted frame data after receiving the initialized optical flow and the low-frame-rate video; the initialized optical flow transmission and synchronization delay, Tdelay_flow_init_pcie, is about 1ms. Because a low-delay frame inserting backbone network model with a small receptive field is adopted, the delay Tdelay_IF between the frame inserting chip MIPI receiving the first line of image data of the next frame and outputting the first line of image data of the predicted frame is about 2ms. The sixth line is the timing of the display screen displaying the high-frame-rate video; the delay includes the synchronization delay between the frame inserting chip and the display screen, display screen data buffering, display screen pixel flip delay, and the like, collectively referred to as Tdelay_disp, about 3.5ms.
Since the delays introduced by Tdelay_sync_img_aux, Tdelay_disp and the insertion of the predicted frame are fixed, calculation shows that the delay of the technical solution implemented by the present invention can be reduced from 13ms to 3ms (Tdelay_flow_init_pcie and Tdelay_IF add up to 3ms), which is 77% less than the prior art.
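The 77% figure quoted above follows directly from the timing values given in the text; a one-line check:

```python
# Latency reduction from the timing figures quoted in the text:
# prior art 13ms vs Tdelay_flow_init_pcie (1ms) + Tdelay_IF (2ms).
prior, proposed = 13.0, 1.0 + 2.0        # milliseconds
reduction = (prior - proposed) / prior
print(f"{reduction:.0%}")                # prints 77%
```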
Fig. 13 is a schematic diagram of another image processing data path according to the second embodiment of the present invention. In a specific example, as shown in fig. 13, the image processing data path may include electronic devices such as an application processor, a frame inserting chip and a display screen. The application processor can send image data of the low-frame-rate video to the frame inserting chip through the MIPI bus and send control signals to the frame inserting chip through the SPI interface; the frame inserting chip can perform frame inserting processing on the received low-frame-rate video and send the high-frame-rate video image data after frame inserting to the display screen through the MIPI bus.
Fig. 14 is a flowchart of another image processing method provided in the second embodiment of the present invention. In a specific example, as shown in fig. 14, since the frame inserting auxiliary network model and the frame inserting backbone network model of the frame inserting chip are time-division multiplexed, the calculating unit of the frame inserting chip, whose motion estimation unit is in a normally open state, runs different calculation modules in different time periods. When the frame inserting chip finishes the calculation of the current predicted frame, the frame inserting auxiliary network model can be started, and it can calculate the initialized optical flow required by the frame inserting backbone network using the two frame images preceding the predicted frame; the initialized optical flow produced by the frame inserting auxiliary network can be a one-way optical flow estimation or a two-way optical flow estimation.
When the video acquisition unit of the frame inserting chip receives the next frame image, the frame inserting backbone network model is started. When the video acquisition unit of the frame inserting chip has received 0.2 of the next frame image, the frame inserting backbone network model can output the first line of optical flow data and weighted fusion mask data. The motion estimation unit of the frame inserting chip can read the previous frame image, the next frame image data and the cached initialization optical flow from the cache space of the chip and continuously feed them in data stream mode into the frame inserting backbone network model; when 0.2 of the next frame image has arrived, the first line of data of each of the backward search optical flow graph Ft->0, the forward search optical flow graph Ft->1 and the weighted fusion mask data can be obtained by calculation and continuously fed in data stream mode into the frame inserting prediction unit of the frame inserting chip. In the buffer space of the frame inserting chip, the image data of frames Img N-2, Img N-1 and Img N of the low-frame-rate video are exchanged in ping-pong mode. When the storage space of the next frame is full and a new low-frame-rate video image reaches the frame inserting chip, the image data of the new frame is written into the storage space of the original frame Img N-2. At this time, the new frame serves as the next frame for the next round of calculation, and the next frame of the current predicted frame serves as the previous frame for the next round of calculation.
The frame inserting prediction unit of the frame inserting chip can map the previous frame and the next frame respectively using the optical flow vectors pointing from the predicted frame to the previous frame and the next frame, obtain a forward predicted frame and a backward predicted frame, obtain the predicted frame through weighted fusion processing, then insert the predicted frame between the previous frame and the next frame of the original low-frame-rate video, and finally output a high-frame-rate video frame sequence. The frame inserting prediction unit can be composed of 2 backward mapping operators and one weighted fusion operator; the backward mapping operator may spatially shift pixels in the image according to the optical flow graph. Alternatively, the weighted fusion operator may be expressed as: img2 = (1 - mask_flow2) * img0_warp2 + mask_flow2 * img2_warp2, where img2 may be the predicted frame; mask_flow2 may be the weighted fusion coefficient matrix; img0_warp2 may be the backward-mapped image of the previous frame; and img2_warp2 may be the backward-mapped image of the next frame.
The frame inserting chip control unit of the frame inserting chip can be used to obtain control signals set by the application processor, which may include, but are not limited to, whether to enable the frame inserting function, a scene jump judgment threshold, and the like. When the previous frame and the next frame do not meet the frame inserting condition, the frame inserting chip control unit can close the frame inserting function and output the previous frame as the predicted frame. Timing control can output the predicted frame calculation result at the predicted frame time and output the current frame at the current frame time, realizing high-frame-rate video output on the video timing.
Fig. 15 is a timing diagram of another image processing method according to the second embodiment of the present invention. In a specific example, as shown in fig. 15, the first line is the timing of the application processor GPU rendering the low-frame-rate video, where the GPU can complete rendering of a frame image within 11ms. The second line is the timing of the application processor GPU outputting the rendered frame through MIPI; the GPU can finish output of the rendered frame image within 11ms, and the MIPI synchronization signal delay Tdelay_sync_img_aux between the MIPI output module of the application processor and the MIPI receiving module of the frame inserting chip is about 2ms. The third line is the timing of the frame inserting auxiliary network model of the frame inserting chip calculating the initialized optical flow using the two frame images before the predicted frame time; the frame inserting auxiliary network model can complete its calculation within 11ms without affecting the calculation and output timing of the predicted frame. The fourth line is the calculation timing of the frame inserting backbone network model of the frame inserting chip; the frame inserting calculation can be completed using the initialized optical flow and the previous and next frame images of the predicted frame.
Because a low-delay frame inserting backbone network model with a small receptive field is adopted, the delay Tdelay_IF between the frame inserting chip MIPI receiving the first line of image data of the next frame and outputting the first line of image data of the predicted frame is about 2ms. The fifth line is the timing of the display screen displaying the high-frame-rate video; the delay may include the synchronization delay between the frame inserting chip and the display screen, display screen data buffering, display screen pixel flip delay, and the like, collectively referred to as Tdelay_disp, about 3.5ms.
Since the delays introduced by Tdelay_sync_img_aux, Tdelay_disp and the insertion of the predicted frame are fixed, calculation shows that the delay of the technical scheme implemented by the invention can be reduced from 13ms to 2ms, which is 85% less than the prior art.
Fig. 16 is a schematic diagram of a local display interface of a virtual reality device provided in the second embodiment of the present invention, and fig. 17 is a schematic diagram of a local display interface of another virtual reality device provided in the second embodiment of the present invention. As shown in fig. 16 and fig. 17, the application program display interface in a virtual reality device differs greatly from traditional frame inserting chip application scenarios such as mobile phones and televisions. On mobile phones and televisions, a video application playing a movie may occupy the entire display; in a virtual reality device, the video application runs in cinema mode, i.e. the movie picture only occupies part of the display screen. Therefore, in specific scenes, the virtual reality device exhibits local abrupt changes of picture content, which may include but are not limited to: video content scene cuts in cinema mode, abrupt changes of subtitle area content between adjacent frames, flashing of the progress bar and content prompts in the cinema mode picture, popping up and closing of application program interfaces on the virtual reality device desktop, abrupt content changes or large movements between adjacent frames caused by application program frame loss, and a fast-moving handle in the virtual reality device.
Fig. 18 is a flowchart of an image processing method including mutation detection. As shown in fig. 18, taking the virtual reality device as a VR (Virtual Reality) device as an example, the virtual reality device continuously outputs an original low-frame-rate video containing a mutation region. The two frame images preceding the predicted frame of the original low-frame-rate video are sent into the frame inserting auxiliary network as its input data, and multi-layer convolution and up-sampling computation is performed in the frame inserting auxiliary network model to obtain an initialized optical flow graph Ft->0 and an initialized optical flow graph Ft->1. Then, the previous frame image, the next frame image and the initialized optical flow of the predicted frame are used as input data of the frame inserting backbone network model, multi-layer convolution and up-sampling calculation are performed in the frame inserting backbone network model, and the backward search optical flow graph Ft->0, the forward search optical flow graph Ft->1 and the weighted fusion mask data can be obtained. Further, the backward search optical flow graph Ft->0 pointing from the predicted frame to the previous frame acts on the previous frame to obtain a forward predicted frame, and the forward search optical flow graph Ft->1 pointing from the predicted frame to the next frame acts on the next frame to obtain a backward predicted frame. The absolute value of the pixel difference between the two images is then obtained by calculation using the forward predicted frame and the backward predicted frame aligned to the predicted frame time, and multi-layer convolution and binarization calculation is performed on the absolute value of the pixel difference to obtain the mutation mask data.
The mutation region fusion module can set the mutation region in the mutation mask data to 0 and the non-mutation region to 1, and then multiply the mutation mask data with the frame inserting bidirectional optical flow and the frame inserting weighted fusion mask data respectively; that is, the backward search optical flow graph Ft->0 and the forward search optical flow graph Ft->1 are set to 0 in the mutation region and undergo no frame inserting backward mapping, so that the original image content of the previous and next frame images of the predicted frame is preserved, and the frame inserting weighted fusion mask data of the mutation region are all set to 0, preserving the original image content of the previous frame. Finally, the frame inserting bidirectional optical flow and the weighted fusion mask data corrected in the mutation region can be sent into the frame inserting prediction unit to calculate the predicted frame, and the frame inserting chip embeds the predicted frame into the original low-frame-rate video; the high-frame-rate video can be obtained through continuous output.
Fig. 19 is an effect diagram of an image processing method with mutation region detection according to the second embodiment of the present invention. As shown in fig. 19, when a scene cut occurs in the movie content, an abnormality appears in the predicted frame output by the image processing method without mutation detection, while the image processing method with mutation detection can complete the frame inserting processing well: the content of the previous frame is output in the mutation region, ensuring the continuity of the picture content.
According to the technical scheme of this embodiment, the initialized optical flow can be calculated according to the target reference image frame, and the predicted optical flow graph and the weighted fusion mask data determined according to the initialized optical flow. The previous frame image and the next frame image of the predicted frame are aligned according to the predicted optical flow graph to obtain a forward predicted frame and a backward predicted frame respectively, the absolute value of the pixel difference between the forward predicted frame and the backward predicted frame aligned to the predicted frame time is calculated, the mutation mask data are calculated according to the absolute value of the pixel difference, and the predicted image frame data are calculated according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data. This solves the problems of poor processing effect for local mutation regions of images and high image processing delay, shortens the time cost of image processing, and improves the rendering effect for local mutation regions of images.
Example III
Fig. 20 is a schematic diagram of an image processing apparatus according to a third embodiment of the present invention. As shown in fig. 20, the apparatus includes: an initialization optical flow calculation module 310, a predicted optical flow graph and weighted fusion mask data determination module 320, a mutation mask data calculation module 330, and a predicted image frame data calculation module 340, wherein:
an initialization optical flow calculation module 310, configured to: calculate an initialized optical flow according to the target reference image frame;
a predicted optical flow graph and weighted fusion mask data determination module 320, configured to: determine a predicted optical flow graph and weighted fusion mask data according to the initialized optical flow;
a mutation mask data calculation module 330, configured to: calculate mutation mask data according to the predicted optical flow graph;
a predicted image frame data calculation module 340, configured to: calculate predicted image frame data according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data.
According to the technical scheme provided by the embodiment of the invention, the initialized optical flow is calculated according to the target reference image frame, the predicted optical flow graph and the weighted fusion mask data are determined according to the initialized optical flow, the mutation mask data are calculated according to the predicted optical flow graph, and finally the predicted image frame data are calculated according to the predicted optical flow graph, the weighted fusion mask data and the mutation mask data, so that the problems of poor processing effect for local mutation regions of images and large image processing delay are solved, the time cost of image processing is shortened, and the rendering effect for local mutation regions of images is improved.
Optionally, the target reference image frame includes a previous frame image and a subsequent frame image of the predicted frame; the optical flow calculation module 310 is initialized, specifically for: inputting a previous frame image and a next frame image of the predicted frame to an application processor chip to calculate the initialization optical flow by the application processor chip; the prediction light flow graph and weighted fusion mask data determining module 320 is specifically configured to: and inputting a previous frame image of the predicted frame, a next frame image of the predicted frame and the initialized optical flow into an interpolation frame backbone network model in an interpolation frame chip so as to calculate predicted optical flow diagrams and weighted fusion mask data of the previous frame image and the next frame image of the predicted frame through the interpolation frame backbone network model.
Optionally, the target reference image frames include the two frame images preceding the predicted frame. The initialization optical flow calculation module 310 is specifically configured to input the two preceding frame images into a frame-interpolation auxiliary network model in the frame-interpolation chip, so that the auxiliary network model calculates the initialization optical flow. The predicted optical flow map and weighted fusion mask data determination module 320 is specifically configured to input the previous frame image of the predicted frame, the next frame image of the predicted frame and the initialization optical flow into the frame-interpolation backbone network model in the frame-interpolation chip, so that the backbone network model calculates the predicted optical flow maps and the weighted fusion mask data of the previous frame image and the next frame image of the predicted frame.
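The idea of initializing the flow from the two frames preceding the predicted frame can be illustrated with a deliberately crude stand-in for the auxiliary network: estimate the dominant motion between the two history frames by brute-force integer shift search, then extrapolate it one step forward as a constant flow field. The shift search, the constant-velocity assumption, and the `max_shift` bound are all assumptions of this sketch, not part of the patent's method:

```python
import numpy as np

def estimate_global_shift(a, b, max_shift=2):
    """Brute-force search for the integer (dy, dx) that best aligns b to a.
    A learned auxiliary network would produce a dense, sub-pixel flow; this
    illustrative stand-in finds a single global translation."""
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(b, dy, axis=0), dx, axis=1)
            err = np.abs(a.astype(np.int32) - shifted.astype(np.int32)).mean()
            if err < best_err:
                best_err, best = err, (dy, dx)
    return best

def init_flow_from_history(frame_t_minus_2, frame_t_minus_1):
    """Extrapolate the motion between the two history frames forward one
    step (constant-velocity assumption) as the initialization flow."""
    dy, dx = estimate_global_shift(frame_t_minus_1, frame_t_minus_2)
    h, w = frame_t_minus_1.shape
    flow = np.zeros((h, w, 2), dtype=np.float32)
    flow[..., 0], flow[..., 1] = dx, dy  # channel 0: x, channel 1: y
    return flow
```

For example, a single bright pixel that moved one step down and right between the two history frames yields an initialization flow of (1, 1) everywhere.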
Optionally, the abrupt-change mask data calculation module 330 is specifically configured to: align the previous frame image and the next frame image of the predicted frame according to the predicted optical flow map to obtain a forward predicted frame and a backward predicted frame, respectively; calculate the absolute value of the pixel difference between the forward predicted frame and the backward predicted frame, both aligned to the predicted-frame time; and calculate the abrupt-change mask data according to that absolute difference.
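The abrupt-change (mutation) mask computation just described can be sketched as follows. Nearest-neighbour warping and the `threshold` value of 30 are assumptions made for brevity; real interpolation hardware would use bilinear sampling and a tuned threshold:

```python
import numpy as np

def backward_warp(image, flow):
    """Sample `image` at positions displaced by `flow`, aligning it to the
    predicted-frame time (nearest-neighbour sampling for brevity)."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return image[sy, sx]

def abrupt_change_mask(prev_frame, next_frame, flow_prev_to_t, flow_next_to_t,
                       threshold=30):
    """Align both reference frames to the predicted-frame time, then flag
    pixels whose absolute difference exceeds `threshold` as belonging to
    the abrupt-change region (mask value 1)."""
    forward_pred = backward_warp(prev_frame, flow_prev_to_t)    # prev -> t
    backward_pred = backward_warp(next_frame, flow_next_to_t)   # next -> t
    diff = np.abs(forward_pred.astype(np.int32) - backward_pred.astype(np.int32))
    return (diff > threshold).astype(np.uint8)
```

A pixel that changes sharply between the two aligned predictions (e.g., an object appearing in only one frame) exceeds the threshold and is marked 1; consistent regions stay 0.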
Optionally, the predicted image frame data calculation module 340 is specifically configured to: determine an abrupt-change region and a non-abrupt-change region according to the abrupt-change mask data; set a first region value for the abrupt-change region and a second region value for the non-abrupt-change region; and calculate the predicted image frame data according to the abrupt-change region of the abrupt-change mask data, the predicted optical flow map and the weighted fusion mask data.
Optionally, the predicted image frame data calculation module 340 is further configured to: determine the abrupt-change region of the predicted optical flow map according to the abrupt-change region of the abrupt-change mask data; suppress backward mapping processing for the optical flow values in the abrupt-change region of the predicted optical flow map; and set the weighted fusion mask data of the abrupt-change region to the first region value, so that the image content of the previous frame image of the predicted frame is preserved.
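The fusion step described here can be sketched as a weighted blend with a per-pixel override. The convention that the first region value is 1.0 (weight fully on the previous frame) is an assumption of this sketch; the patent only states that a first region value marks the abrupt-change region:

```python
import numpy as np

FIRST_REGION_VALUE = 1.0  # assumed: full weight on the previous frame

def fuse_with_abrupt_override(warped_prev, warped_next, fusion_mask, abrupt_mask):
    """Weighted fusion pred = m*warped_prev + (1-m)*warped_next, except that
    inside the abrupt-change region the mask is forced to FIRST_REGION_VALUE,
    so the previous frame's content is kept verbatim there and the unreliable
    backward-mapped flow values in that region have no effect."""
    m = np.where(abrupt_mask == 1, FIRST_REGION_VALUE, fusion_mask)
    blended = m * warped_prev + (1.0 - m) * warped_next
    return blended.astype(warped_prev.dtype)
```

With a uniform 0.5 learned mask, non-abrupt pixels get the midpoint of the two warped frames, while abrupt pixels copy the warped previous frame exactly.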
Optionally, the image processing apparatus further includes a frame insertion module, specifically configured to: determine a frame insertion position of the predicted image frame data in an original low-frame-rate video; and insert the predicted image frame data into the original low-frame-rate video according to the frame insertion position.
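For the common case of doubling the frame rate, the insertion positions fall between each pair of consecutive original frames, which a minimal sketch can express as an interleave (the list-of-frames representation is an assumption; a real pipeline would stream frames):

```python
def interleave_predicted_frames(original_frames, predicted_frames):
    """Insert each predicted frame between the pair of original frames it
    was interpolated from, doubling the effective frame rate."""
    assert len(predicted_frames) == len(original_frames) - 1
    out = []
    for i, frame in enumerate(original_frames[:-1]):
        out.append(frame)          # original frame at its position
        out.append(predicted_frames[i])  # interpolated frame after it
    out.append(original_frames[-1])
    return out
```

For example, original frames A, C, E with predicted frames B, D produce the sequence A, B, C, D, E.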
The image processing apparatus can execute the image processing method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For technical details not described in detail in this embodiment, refer to the image processing method provided by any embodiment of the present invention.
Example IV
Fig. 21 shows a schematic structural diagram of an electronic device 10 that may be used to implement an embodiment of the present invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit the implementations of the invention described and/or claimed herein.
As shown in Fig. 21, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a read-only memory (ROM) 12 and a random access memory (RAM) 13. The memory stores a computer program executable by the at least one processor, and the processor 11 can perform various appropriate actions and processes according to the computer program stored in the ROM 12 or the computer program loaded from the storage unit 18 into the RAM 13. The RAM 13 may also store various programs and data required for the operation of the electronic device 10. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the processor 11 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, microcontroller, and the like. The processor 11 performs the methods and processes described above, for example the image processing method.
In some embodiments, the image processing method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as the storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the image processing method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the image processing method in any other suitable way (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may be implemented in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer with a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the defects of high management difficulty and weak service scalability found in traditional physical hosts and VPS services.

Claims (7)

1. An image processing method, comprising:
calculating an initialization optical flow according to a target reference image frame;
determining a predicted optical flow map and weighted fusion mask data according to the initialization optical flow;
calculating abrupt-change mask data according to the predicted optical flow map;
calculating predicted image frame data according to the predicted optical flow map, the weighted fusion mask data and the abrupt-change mask data;
wherein the calculating abrupt-change mask data according to the predicted optical flow map comprises: aligning a previous frame image and a next frame image of a predicted frame according to the predicted optical flow map to obtain a forward predicted frame and a backward predicted frame, respectively; calculating an absolute value of a pixel difference between the forward predicted frame and the backward predicted frame, both aligned to the predicted frame time; and calculating the abrupt-change mask data according to the absolute value of the pixel difference;
wherein the calculating predicted image frame data according to the predicted optical flow map, the weighted fusion mask data and the abrupt-change mask data comprises: determining an abrupt-change region and a non-abrupt-change region according to the abrupt-change mask data; setting a first region value for the abrupt-change region and a second region value for the non-abrupt-change region; and calculating the predicted image frame data according to the abrupt-change region of the abrupt-change mask data, the predicted optical flow map and the weighted fusion mask data;
wherein the calculating the predicted image frame data according to the abrupt-change region of the abrupt-change mask data, the predicted optical flow map and the weighted fusion mask data comprises: determining an abrupt-change region of the predicted optical flow map according to the abrupt-change region of the abrupt-change mask data; suppressing backward mapping processing for optical flow values in the abrupt-change region of the predicted optical flow map; and setting the weighted fusion mask data of the abrupt-change region to the first region value, so as to preserve the image content of the previous frame image of the predicted frame.
2. The method of claim 1, wherein the target reference image frame comprises a previous frame image and a next frame image of a predicted frame;
the calculating an initialization optical flow according to the target reference image frame comprises:
inputting the previous frame image and the next frame image of the predicted frame into an application processor chip, so that the application processor chip calculates the initialization optical flow;
the determining a predicted optical flow map and weighted fusion mask data according to the initialization optical flow comprises:
inputting the previous frame image of the predicted frame, the next frame image of the predicted frame and the initialization optical flow into a frame-interpolation backbone network model in a frame-interpolation chip, so that the frame-interpolation backbone network model calculates the predicted optical flow maps and the weighted fusion mask data of the previous frame image and the next frame image of the predicted frame.
3. The method of claim 1, wherein the target reference image frame comprises the two frame images preceding a predicted frame;
the calculating an initialization optical flow according to the target reference image frame comprises:
inputting the two frame images preceding the predicted frame into a frame-interpolation auxiliary network model in a frame-interpolation chip, so that the frame-interpolation auxiliary network model calculates the initialization optical flow;
the determining a predicted optical flow map and weighted fusion mask data according to the initialization optical flow comprises:
inputting the previous frame image of the predicted frame, the next frame image of the predicted frame and the initialization optical flow into a frame-interpolation backbone network model in the frame-interpolation chip, so that the frame-interpolation backbone network model calculates the predicted optical flow maps and the weighted fusion mask data of the previous frame image and the next frame image of the predicted frame.
4. The method of claim 1, further comprising, after the calculating predicted image frame data according to the predicted optical flow map, the weighted fusion mask data and the abrupt-change mask data:
determining a frame insertion position of the predicted image frame data in an original low-frame-rate video;
inserting the predicted image frame data into the original low-frame-rate video according to the frame insertion position.
5. An image processing apparatus, comprising:
an initialization optical flow calculation module, configured to calculate an initialization optical flow according to a target reference image frame;
a predicted optical flow map and weighted fusion mask data determination module, configured to determine a predicted optical flow map and weighted fusion mask data according to the initialization optical flow;
an abrupt-change mask data calculation module, configured to calculate abrupt-change mask data according to the predicted optical flow map;
a predicted image frame data calculation module, configured to calculate predicted image frame data according to the predicted optical flow map, the weighted fusion mask data and the abrupt-change mask data;
wherein the abrupt-change mask data calculation module is specifically configured to: align a previous frame image and a next frame image of a predicted frame according to the predicted optical flow map to obtain a forward predicted frame and a backward predicted frame, respectively; calculate an absolute value of a pixel difference between the forward predicted frame and the backward predicted frame, both aligned to the predicted frame time; and calculate the abrupt-change mask data according to the absolute value of the pixel difference;
wherein the predicted image frame data calculation module is specifically configured to: determine an abrupt-change region and a non-abrupt-change region according to the abrupt-change mask data; set a first region value for the abrupt-change region and a second region value for the non-abrupt-change region; and calculate the predicted image frame data according to the abrupt-change region of the abrupt-change mask data, the predicted optical flow map and the weighted fusion mask data;
wherein the predicted image frame data calculation module is further configured to: determine an abrupt-change region of the predicted optical flow map according to the abrupt-change region of the abrupt-change mask data; suppress backward mapping processing for optical flow values in the abrupt-change region of the predicted optical flow map; and set the weighted fusion mask data of the abrupt-change region to the first region value, so as to preserve the image content of the previous frame image of the predicted frame.
6. An electronic device, the electronic device comprising:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the image processing method of any one of claims 1-4.
7. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, the computer instructions being configured to cause a processor to implement the image processing method of any one of claims 1-4 when executed.
CN202310960167.1A 2023-08-01 2023-08-01 Image processing method, device, electronic device and storage medium Active CN116958203B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310960167.1A CN116958203B (en) 2023-08-01 2023-08-01 Image processing method, device, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310960167.1A CN116958203B (en) 2023-08-01 2023-08-01 Image processing method, device, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN116958203A CN116958203A (en) 2023-10-27
CN116958203B true CN116958203B (en) 2024-09-13

Family

ID=88456424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310960167.1A Active CN116958203B (en) 2023-08-01 2023-08-01 Image processing method, device, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN116958203B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119011744B (en) * 2024-10-21 2024-12-17 湖南马栏山天择微链科技有限公司 A video synchronization method based on image multi-dimensional feature value comparison

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233055A (en) * 2020-10-15 2021-01-15 北京达佳互联信息技术有限公司 Video mark removing method and video mark removing device
CN116309755A (en) * 2023-03-29 2023-06-23 标贝(北京)科技有限公司 Image registration method and surface normal vector reconstruction method, system and electronic device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114071223B (en) * 2020-07-30 2024-10-29 武汉Tcl集团工业研究院有限公司 Optical flow-based video plug-in frame generation method, storage medium and terminal equipment
CN112261390B (en) * 2020-08-20 2022-02-11 深圳市豪恩汽车电子装备股份有限公司 Vehicle-mounted camera equipment and image optimization device and method thereof
CN112200757B (en) * 2020-09-29 2024-08-02 北京灵汐科技有限公司 Image processing method, device, computer equipment and storage medium
WO2022141333A1 (en) * 2020-12-31 2022-07-07 华为技术有限公司 Image processing method and apparatus
CN113077385A (en) * 2021-03-30 2021-07-06 上海大学 Video super-resolution method and system based on countermeasure generation network and edge enhancement
CN114119990B (en) * 2021-09-29 2023-10-27 北京百度网讯科技有限公司 Method, device and computer program product for image feature point matching
CN113822910A (en) * 2021-09-30 2021-12-21 上海商汤临港智能科技有限公司 Multi-target tracking method and device, electronic equipment and storage medium
CN117635486A (en) * 2022-08-31 2024-03-01 荣耀终端有限公司 Image processing methods, devices, equipment and storage media

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112233055A (en) * 2020-10-15 2021-01-15 北京达佳互联信息技术有限公司 Video mark removing method and video mark removing device
CN116309755A (en) * 2023-03-29 2023-06-23 标贝(北京)科技有限公司 Image registration method and surface normal vector reconstruction method, system and electronic device

Also Published As

Publication number Publication date
CN116958203A (en) 2023-10-27

Similar Documents

Publication Publication Date Title
US12279070B2 (en) Method for frame interpolation and related products
US10984583B2 (en) Reconstructing views of real world 3D scenes
CN111405316A (en) Frame insertion method, electronic device and readable storage medium
JP4564564B2 (en) Moving picture reproducing apparatus, moving picture reproducing method, and moving picture reproducing program
CN108810281A (en) Lost frame compensation method, lost frame compensation device, storage medium and terminal
EP3958207A2 (en) Method and apparatus for video frame interpolation, and electronic device
CN103929648B (en) Motion estimation method and device in frame rate up conversion
CN116958203B (en) Image processing method, device, electronic device and storage medium
CN115334335A (en) Video frame insertion method and device
CN115706810B (en) Video frame adjustment method, device, electronic equipment and storage medium
CN114745545B (en) Video frame inserting method, device, equipment and medium
CN117667000A (en) Image display method, device, electronic equipment and medium
US12395646B2 (en) Method, apparatus, electronic device, storage media and program product for video coding
CN113706673A (en) Cloud rendering framework platform applied to virtual augmented reality technology
EP4589530A1 (en) Binocular image generation method and apparatus, electronic device and storage medium
CN116051380B (en) A video super-resolution processing method and electronic equipment
CN116668843A (en) A shooting state switching method, device, electronic equipment and storage medium
CN117425048A (en) Video playing method, device, equipment and storage medium
WO2024082933A1 (en) Video processing method and apparatus, and electronic device and storage medium
CN118283208A (en) A method and system for real-time rendering interpolation on a mobile terminal based on a splash algorithm
CN116437028A (en) A video display method and system
CN115953432A (en) Image-based motion prediction method and apparatus, electronic device, and storage medium
CN117812382B (en) Video data processing method, device, equipment and storage medium
CN119854473B (en) A Binocular Image Display Method and Virtual Reality Device Based on Motion Compensation
CN117478814A (en) Frame inserting method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 213-175, 2nd Floor, Building 1, No. 180 Kecheng Street, Qiaosi Street, Linping District, Hangzhou City, Zhejiang Province, 311101

Applicant after: Hangzhou Zhicun Computing Technology Co.,Ltd.

Address before: 1707, 17th floor, shining building, No. 35, Xueyuan Road, Haidian District, Beijing 100083

Applicant before: BEIJING WITINMEM TECHNOLOGY Co.,Ltd.

Country or region before: China

GR01 Patent grant