
CN106686472B - A method and system for generating high frame rate video based on deep learning - Google Patents


Info

Publication number
CN106686472B
CN106686472B (application CN201611241691.XA; published as CN106686472A)
Authority
CN
China
Prior art keywords
frame
video
neural networks
convolutional neural
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611241691.XA
Other languages
Chinese (zh)
Other versions
CN106686472A (en)
Inventor
Wang Xinggang (王兴刚)
Luo Hao (罗浩)
Jiang Yujing (姜玉静)
Liu Wenyu (刘文予)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201611241691.XA priority Critical patent/CN106686472B/en
Publication of CN106686472A publication Critical patent/CN106686472A/en
Application granted granted Critical
Publication of CN106686472B publication Critical patent/CN106686472B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/83Generation or processing of protective or descriptive data associated with content; Content structuring
    • H04N21/845Structuring of content, e.g. decomposing content into time segments
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/587Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal sub-sampling or interpolation, e.g. decimation or subsequent interpolation of pictures in a video sequence
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/01Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0127Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level by changing the field or frame frequency of the incoming video signal, e.g. frame rate converter

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for generating high frame-rate video based on deep learning, comprising: generating a training sample set from one or more original high frame-rate video clips; training a dual-channel convolutional neural network model on the multiple video-frame sets in the training sample set to obtain an optimized dual-channel convolutional neural network, the dual-channel convolutional neural network model being a convolutional neural network formed by fusing two convolutional channels; and, using the optimized dual-channel convolutional neural network, generating an inserted frame from any two adjacent video frames of a low frame-rate video, thereby producing a video with a frame rate higher than that of the low frame-rate video. The method is end-to-end throughout and requires no post-processing of the video frames; the frame-rate conversion quality is good, the synthesized video is smooth, and the method is robust to problems such as jitter during video capture and video scene changes.

Description

A method and system for generating high frame-rate video based on deep learning
Technical field
The invention belongs to the field of computer vision, and more particularly relates to a method and system for generating high frame-rate video based on deep learning.
Background art
As technology develops, people can obtain video ever more conveniently. Owing to hardware limitations, however, most video is captured by non-professional equipment, and its frame rate is generally only 24-30 fps. High frame-rate video is smoother and delivers a better visual experience. If users directly upload high frame-rate video to the web, traffic consumption rises and so does their cost. If they upload low frame-rate video instead, dropped frames are unavoidable during transmission because of network conditions, and the larger the video, the more likely this is, so the video quality at the remote end cannot be effectively guaranteed, which greatly harms the user experience. A reasonable processing scheme is therefore needed at the remote end to post-process uploaded video, so that its quality meets users' needs and further improves their experience.
Summary of the invention
Aiming at the above defects or improvement needs of the prior art, the present invention provides a method for generating high frame-rate video based on deep learning. Its object is to convert low frame-rate video into high frame-rate video, thereby solving the technical problem that frames dropped from low frame-rate video during network transmission degrade video quality and harm the user experience.
To achieve the above object, according to one aspect of the present invention, a method for generating high frame-rate video based on deep learning is provided, comprising the following steps:
(1) generating a training sample set from one or more original high frame-rate video clips, wherein the training sample set contains multiple video-frame sets, each video-frame set comprises two training frames and one control frame, the two training frames are two video frames separated by one or more frames in a high frame-rate video clip, and the control frame is any one of the frames in the interval between the two training frames; the frame rate of the high frame-rate video clip is above a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model on the multiple video-frame sets in the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames of a video-frame set and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video-frame set;
(3) using the optimized dual-channel convolutional neural network, generating an inserted frame from any two adjacent video frames of a low frame-rate video, thereby producing a video with a frame rate higher than that of the low frame-rate video.
In one embodiment of the invention, each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes convolution, F_{i-1} is the output of layer i-1, Z_i(Y) is the output after the convolution at layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.
In one embodiment of the invention, within each convolutional channel a ReLU activation layer follows each of the first k-1 convolutional layers to preserve the sparsity of the network, described mathematically as:
F_i(Y) = max(0, Z_i).
In one embodiment of the invention, the feature response maps obtained from the two video frames after the last convolutional layer are fused by adding the values at corresponding positions.
In one embodiment of the invention, a Sigmoid activation layer follows the feature response map produced by the fusion operation to map the pixel values of the picture into the range 0-1, described mathematically as:
F(Y) = 1 / (1 + e^{-Z(Y)}).
In one embodiment of the invention, the convolution kernel parameters are initialized from a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration epochs, where m is a preset value.
In one embodiment of the invention, training the dual-channel convolutional neural network model by regression of the predicted frame against the control frame of the video-frame set specifically comprises:
using the error between the predicted frame and the control frame to train the dual-channel convolutional neural network by error backpropagation, with least-squares error as the optimization objective, described mathematically as:
L = (1/n) * sum_{i=1}^{n} ||Y_i - Y*_i||^2
where i denotes the i-th sample picture, n is the size of the training set, Y_i is the video frame predicted by the network, and Y*_i is the ground-truth value of the corresponding video frame.
In one embodiment of the invention, k is 3; the first convolutional layer has 64 convolution kernels of size 9×9, a stride of 1 pixel, and a padding of 4, the padding being the number of rings of zeros added around the feature map; the second convolutional layer has 32 convolution kernels of size 1×1, a stride of 1 pixel, and a padding of 0; the third convolutional layer has 3 convolution kernels of size 5×5, a stride of 1, and a padding of 2.
According to another aspect of the invention, a system for generating high frame-rate video based on deep learning is also provided, comprising a training-sample-set generation module, a dual-channel convolutional neural network optimization module, and a high frame-rate video generation module, wherein:
the training-sample-set generation module generates a training sample set from one or more high frame-rate video clips; the training sample set contains multiple video-frame sets, each comprising two training frames and one control frame; the two training frames are two video frames separated by one or more frames in a high frame-rate video clip, and the control frame is any one of the frames in the interval between the two training frames; the frame rate of the high frame-rate video clip is above a set frame-rate threshold;
the dual-channel convolutional neural network optimization module trains the dual-channel convolutional neural network model on the multiple video-frame sets in the training sample set to obtain the optimized dual-channel convolutional neural network; the dual-channel convolutional neural network model is a convolutional neural network fusing two channels: the two channels respectively receive the two video frames of a video-frame set and convolve them separately, the model fuses the results of the two channels' convolutions and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video-frame set;
the high frame-rate video generation module uses the optimized dual-channel convolutional neural network to generate an inserted frame from any two adjacent video frames of a low frame-rate video, thereby producing a video with a frame rate higher than that of the low frame-rate video.
In one embodiment of the invention, each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes convolution, F_{i-1} is the output of layer i-1, Z_i(Y) is the output after the convolution at layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.
In general, compared with the prior art, the technical scheme conceived by the present invention achieves the following technical effects:
(1) The feature extraction and frame prediction of the invention are obtained through supervised learning on training samples, without manual intervention, and can better fit spatial-difference information in large-scale data scenarios;
(2) The whole process of the invention is end-to-end: it uses the self-learning capability of convolutional neural networks to learn the model parameters, and is concise and efficient, overcoming the time-consuming, labor-intensive, and often ineffective character of traditional video frame-rate conversion.
Brief description of the drawings
Fig. 1 is a flowchart of the deep-learning-based video frame-rate conversion method of the invention, where F_i denotes the output of layer i; Y_{t-1}, Y_t, and Y_{t+1} denote three consecutive video frames; Y_t serves as the ground truth for computing the error; and Prediction denotes the video frame predicted by the network.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to illustrate the invention and are not intended to limit it. In addition, the technical features involved in the various embodiments described below may be combined with one another as long as they do not conflict.
The technical terms of the invention are first explained and illustrated below:
Convolutional neural network (CNN): a neural network that can be used for tasks such as image classification and regression. Its particularity lies in two aspects: on the one hand, the connections between its neurons are not fully connected; on the other hand, the weights of connections between certain neurons within the same layer are shared. Such a network usually consists of convolutional layers, pooling layers, and fully connected layers. The convolutional and pooling layers extract hierarchical features of the image, and the fully connected layers classify or regress on the extracted features. The network parameters comprise the convolution kernels and the parameters and biases of the fully connected layers, and they can be learned from data by the backpropagation algorithm.
Backpropagation algorithm (BP): a common method for training artificial neural networks, used in combination with an optimization method such as gradient descent. The method computes the gradient of the loss function with respect to all weights in the network; this gradient is fed back to the optimization method, which uses it to update the weights so as to minimize the loss function. The algorithm mainly comprises two phases: forward propagation of the excitation, and backpropagation of the error together with the weight update.
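As a concrete illustration of these two phases, the following minimal sketch is an assumption for illustration only (it uses PyTorch autograd on a single scalar weight; it is not part of the patent). It runs a forward pass to compute the loss, backpropagates the gradient, and applies one gradient-descent update:

    import torch

    w = torch.tensor(2.0, requires_grad=True)          # a single weight
    x, target, lr = torch.tensor(3.0), torch.tensor(5.0), 0.1
    loss = (w * x - target) ** 2                       # forward pass: loss = (wx - y)^2 = 1.0
    loss.backward()                                    # backward pass: dloss/dw = 2x(wx - y) = 6.0
    with torch.no_grad():
        w -= lr * w.grad                               # weight update: w = 2.0 - 0.1 * 6.0 = 1.4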
With the arrival of the big-data era, the scale of video databases keeps growing, and a solution to this problem becomes ever more urgent. Deep neural networks can analyze data in a way that well approximates how the human brain works, and in recent years deep learning has been applied successfully across the fields of computer vision; video frame-rate conversion, however, has seen no substantial research, and traditional frame-rate conversion methods are complex and costly in time and labor. The present invention therefore proposes a deep-learning-based video frame-rate conversion method. The whole process is end-to-end, simple and efficient, and strongly robust to problems such as video jitter and scene changes.
As shown in Fig. 1, the deep-learning-based video frame-rate conversion method of the present invention may comprise the following steps:
(1) Generate a training sample set from one or more original high frame-rate video clips. The training sample set contains multiple video-frame sets, each comprising two training frames and one control frame; the two training frames are two video frames separated by one or more frames in a high frame-rate video clip, and the control frame is any one of the frames in the interval between the two training frames. The frame rate of the high frame-rate video clip is above a set frame-rate threshold.
Specifically, video-frame sets can be extracted from the high frame-rate video clips in a certain proportion to obtain the training sample set.
The training sample set is composed of multiple video-frame sets, each containing two training frames and one control frame. The control frame is chosen as the middle frame between the two training frames, or the frame closest to the middle. In the common case, three consecutive frames are taken: the middle frame is the control frame and the other two are the training frames. If the frame rate is high enough, two frames separated by several frames can also be taken as the training frames (how many depends on the frame rate; the gap must not be too large), and any one frame in the interval can be chosen as the control frame. For example, if the training video's frame rate is 60 and the video has N frames, then, sampling with a one-frame gap, a frame is taken at random from frames 2 to N-1 as the ground truth (the control frame), and its two neighboring frames are fed into the network as the training sample (the two training frames). Samples can likewise be built with a multi-frame gap, which suits video of lower frame rate, i.e., converting lower frame-rate video into high frame-rate video. A sketch of this sampling procedure follows.
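The sketch below is a hypothetical Python rendering of this sampling step (the frame container, the helper name make_training_triplets, and the gap parameter are illustrative assumptions, not the patent's code). It also applies the divide-by-255 normalization described further below:

    import random
    import numpy as np

    def make_training_triplets(frames, gap=1):
        """Build (training frame A, training frame B, control frame) triplets.

        frames: list of H x W x 3 uint8 arrays from one high frame-rate clip.
        gap=1 takes three consecutive frames (the middle one is the control
        frame); gap>1 spaces the training frames further apart and picks a
        random in-between frame as the control frame.
        """
        normalize = lambda f: f.astype(np.float32) / 255.0   # map pixels to [0, 1]
        triplets = []
        for t in range(len(frames) - gap - 1):
            frame_a = frames[t]
            frame_b = frames[t + gap + 1]
            control = frames[random.randint(t + 1, t + gap)]  # any frame in the interval
            triplets.append((normalize(frame_a), normalize(frame_b), normalize(control)))
        return triplets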
(2) Train a dual-channel convolutional neural network model on the multiple video-frame sets in the training sample set to obtain an optimized dual-channel convolutional neural network. The dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels: the two channels respectively receive the two video frames of a video-frame set and convolve them separately; the model fuses the convolution results of the two channels and outputs a predicted frame; and the model is trained by regression of the predicted frame against the control frame of the video-frame set.
A dual-channel convolutional neural network must first be designed and implemented. Specifically:
The dual-channel convolutional neural network model established here is a convolutional neural network fusing two convolutional channels. Each channel contains k convolutional layers, k > 0 and preferably 3, which convolve the two video-frame pictures (the training frames) separately. The first convolutional layer has 64 convolution kernels of size 9×9, a stride of 1 pixel, and a padding of 4, the padding being the number of rings of zeros added around the feature map. The second convolutional layer has 32 convolution kernels of size 1×1, a stride of 1 pixel, and a padding of 0. The third convolutional layer has 3 convolution kernels of size 5×5, a stride of 1, and a padding of 2. The convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input picture is layer 0), * denotes convolution, F_{i-1} is the output of layer i-1, Z_i(Y) is the output after the convolution at layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i;
Of the three convolutional layers, the first and second are each followed by a ReLU activation layer to preserve the sparsity of the network, described mathematically as:
F_i(Y) = max(0, Z_i).
The feature response maps obtained from the two video-frame pictures after the third convolutional layer are fused by adding the values at corresponding positions.
After this fusion operation, the resulting feature response map is followed by a Sigmoid activation layer to map the pixel values of the picture into the range 0-1, described mathematically as:
F(Y) = 1 / (1 + e^{-Z(Y)}).
The complete architecture is sketched below.
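The following is a hypothetical PyTorch rendering of the architecture just described (the framework, the NCHW tensor layout, and the class names are assumptions; the layer sizes 64 of 9×9, 32 of 1×1, and 3 of 5×5, the ReLU placement, the additive fusion, and the final Sigmoid follow the text above):

    import torch
    import torch.nn as nn

    class ChannelCNN(nn.Module):
        """One convolutional channel: 3 conv layers, ReLU after the first two."""
        def __init__(self):
            super().__init__()
            self.conv1 = nn.Conv2d(3, 64, kernel_size=9, stride=1, padding=4)
            self.conv2 = nn.Conv2d(64, 32, kernel_size=1, stride=1, padding=0)
            self.conv3 = nn.Conv2d(32, 3, kernel_size=5, stride=1, padding=2)

        def forward(self, y):
            f1 = torch.relu(self.conv1(y))    # F_1 = max(0, Z_1)
            f2 = torch.relu(self.conv2(f1))   # F_2 = max(0, Z_2)
            return self.conv3(f2)             # Z_3: no ReLU after the last layer

    class DualChannelCNN(nn.Module):
        """Two channels fused by element-wise addition, then a Sigmoid."""
        def __init__(self):
            super().__init__()
            self.channel_a = ChannelCNN()
            self.channel_b = ChannelCNN()

        def forward(self, frame_prev, frame_next):
            fused = self.channel_a(frame_prev) + self.channel_b(frame_next)
            return torch.sigmoid(fused)       # map pixel values into (0, 1)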
Before training the dual-channel convolutional neural network, each pixel value in the video frames must be divided by 255 for normalization, so that normalized pixel values lie between 0 and 1.
Also, before training, the parameters of the convolutional neural network must be initialized: the convolution kernel parameters are initialized from a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration epochs, where m is a preset value. For example, with the preferred value m = 2, the learning rate is 1e-6 during the first m iteration epochs and 1e-7 after the m-th epoch, remaining constant thereafter.
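A sketch of this initialization and learning-rate schedule, reusing the hypothetical DualChannelCNN defined above (plain SGD is an assumption; the patent specifies only error backpropagation):

    import torch.nn as nn
    import torch.optim as optim

    def init_weights(module):
        # convolution kernels ~ N(0, 1), biases = 0, as specified above
        if isinstance(module, nn.Conv2d):
            nn.init.normal_(module.weight, mean=0.0, std=1.0)
            nn.init.zeros_(module.bias)

    m = 2                                                # the preferred value of m
    model = DualChannelCNN()
    model.apply(init_weights)
    optimizer = optim.SGD(model.parameters(), lr=1e-6)   # base learning rate 1e-6
    # divide the learning rate by 10 once, after epoch m, then keep it constant
    scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[m], gamma=0.1)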
Specifically, the error between the network's prediction and the control frame can be used to train the dual-channel convolutional neural network by error backpropagation. The least-squares error is used as the optimization objective, described mathematically as:
L = (1/n) * sum_{i=1}^{n} ||Y_i - Y*_i||^2
where i denotes the i-th sample picture, n is the size of the training set, Y_i is the video frame predicted by the network, and Y*_i is the ground-truth value of the corresponding video frame.
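Putting the pieces together, one epoch of training with the least-squares objective might look as follows (a sketch assuming the triplets above have been batched into N x 3 x H x W tensors by a hypothetical loader):

    import torch.nn.functional as F

    def train_epoch(model, loader, optimizer, scheduler):
        model.train()
        for frame_a, frame_b, control in loader:
            optimizer.zero_grad()
            prediction = model(frame_a, frame_b)      # fused, sigmoid-mapped output
            loss = F.mse_loss(prediction, control)    # least-squares error vs. control frame
            loss.backward()                           # error backpropagation
            optimizer.step()
        scheduler.step()                              # advance the learning-rate schedule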
(3) Using the optimized dual-channel convolutional neural network, generate an inserted frame from any two adjacent video frames of a low frame-rate video, thereby producing a video with a frame rate higher than that of the low frame-rate video.
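A sketch of this interpolation step, again with the hypothetical model above: every adjacent pair of low frame-rate frames yields one inserted frame, roughly doubling the frame rate (interleaving a single frame per pair is an assumption for illustration; the patent allows an inserted frame for any two adjacent frames):

    import torch

    @torch.no_grad()
    def double_frame_rate(model, frames):
        """frames: list of 1 x 3 x H x W tensors in [0, 1]. Returns interleaved frames."""
        model.eval()
        output = [frames[0]]
        for prev_frame, next_frame in zip(frames, frames[1:]):
            inserted = model(prev_frame, next_frame)  # predicted in-between frame
            output.extend([inserted, next_frame])
        return output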
It will be readily understood by those skilled in the art that the foregoing describes only preferred embodiments of the present invention and does not limit it; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (3)

1. A method for generating high frame-rate video based on deep learning, characterized in that the method comprises the following steps:
(1) generating a training sample set from one or more original high frame-rate video clips, the training sample set containing multiple video-frame sets, each video-frame set comprising two training frames and one control frame, the two training frames being two video frames separated by one or more frames in a high frame-rate video clip, and the control frame being any one of the frames in the interval between the two training frames; the frame rate of the high frame-rate video clip being above a set frame-rate threshold;
(2) training a dual-channel convolutional neural network model on the multiple video-frame sets in the training sample set to obtain an optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network formed by fusing two convolutional channels, the two convolutional channels respectively receive the two video frames of a video-frame set and convolve them separately, the model fuses the convolution results of the two channels and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video-frame set; wherein,
each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes convolution, F_{i-1} is the output of layer i-1, Z_i(Y) is the output after the convolution at layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i;
within each convolutional channel, a ReLU activation layer follows each of the first k-1 convolutional layers to preserve the sparsity of the network, described mathematically as:
F_i(Y) = max(0, Z_i);
the convolution kernel parameters are initialized from a Gaussian distribution with mean 0 and standard deviation 1, the biases are initialized to 0, and the base learning rate is initialized to 1e-6; the base learning rate is reduced by a factor of 10 after m iteration epochs, where m is a preset value;
k is 3; the first convolutional layer has 64 convolution kernels of size 9×9, a stride of 1 pixel, and a padding of 4, the padding being the number of rings of zeros added around the feature map; the second convolutional layer has 32 convolution kernels of size 1×1, a stride of 1 pixel, and a padding of 0; the third convolutional layer has 3 convolution kernels of size 5×5, a stride of 1, and a padding of 2;
a Sigmoid activation layer follows the feature response map produced by the fusion operation to map the pixel values of the picture into the range 0-1, described mathematically as:
F(Y) = 1 / (1 + e^{-Z(Y)});
training the dual-channel convolutional neural network model by regression of the predicted frame against the control frame of the video-frame set specifically comprises:
using the error between the predicted frame and the control frame to train the dual-channel convolutional neural network by error backpropagation, with least-squares error as the optimization objective, described mathematically as:
L = (1/n) * sum_{i=1}^{n} ||Y_i - Y*_i||^2
where i denotes the i-th sample picture, n is the size of the training set, Y_i is the video frame predicted by the network, and Y*_i is the ground-truth value of the corresponding video frame;
(3) using the optimized dual-channel convolutional neural network, generating an inserted frame from any two adjacent video frames of a low frame-rate video, thereby producing a video with a frame rate higher than that of the low frame-rate video.
2. The method for generating high frame-rate video based on deep learning according to claim 1, characterized in that the feature response maps obtained from the two video frames after the last convolutional layer are fused by adding the values at corresponding positions.
3. A system for generating high frame-rate video based on deep learning, characterized by comprising a training-sample-set generation module, a dual-channel convolutional neural network optimization module, and a high frame-rate video generation module, wherein:
the training-sample-set generation module is configured to generate a training sample set from one or more high frame-rate video clips, the training sample set containing multiple video-frame sets, each video-frame set comprising two training frames and one control frame, the two training frames being two video frames separated by one or more frames in a high frame-rate video clip, and the control frame being any one of the frames in the interval between the two training frames; the frame rate of the high frame-rate video clip being above a set frame-rate threshold;
the dual-channel convolutional neural network optimization module is configured to train the dual-channel convolutional neural network model on the multiple video-frame sets in the training sample set to obtain the optimized dual-channel convolutional neural network; wherein the dual-channel convolutional neural network model is a convolutional neural network fusing two channels, the two channels respectively receive the two video frames of a video-frame set and convolve them separately, the model fuses the results of the two channels' convolutions and outputs a predicted frame, and the model is trained by regression of the predicted frame against the control frame of the video-frame set;
the high frame-rate video generation module is configured to use the optimized dual-channel convolutional neural network to generate an inserted frame from any two adjacent video frames of a low frame-rate video, thereby producing a video with a frame rate higher than that of the low frame-rate video;
each convolutional channel in the dual-channel convolutional neural network model contains k convolutional layers, where k > 0, and each convolutional layer is described mathematically as:
Z_i(Y) = W_i * F_{i-1}(Y) + B_i
where i denotes the layer index (the input video frame is layer 0), * denotes convolution, F_{i-1} is the output of layer i-1, Z_i(Y) is the output after the convolution at layer i, W_i is the convolution kernel parameter of layer i, and B_i is the bias parameter of layer i.
CN201611241691.XA 2016-12-29 2016-12-29 A method and system for generating high frame rate video based on deep learning Active CN106686472B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611241691.XA CN106686472B (en) 2016-12-29 2016-12-29 A method and system for generating high frame rate video based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611241691.XA CN106686472B (en) 2016-12-29 2016-12-29 A method and system for generating high frame rate video based on deep learning

Publications (2)

Publication Number Publication Date
CN106686472A CN106686472A (en) 2017-05-17
CN106686472B (en) 2019-04-26

Family

ID=58872327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611241691.XA Active CN106686472B (en) 2016-12-29 2016-12-29 A method and system for generating high frame rate video based on deep learning

Country Status (1)

Country Link
CN (1) CN106686472B (en)

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9756375B2 (en) 2015-01-22 2017-09-05 Microsoft Technology Licensing, Llc Predictive server-side rendering of scenes
CN107481209B (en) * 2017-08-21 2020-04-21 北京航空航天大学 An image or video quality enhancement method based on convolutional neural network
CN107613299A (en) * 2017-09-29 2018-01-19 杭州电子科技大学 A Method of Using Generative Networks to Improve Frame Rate Upconversion
CN107886081B (en) * 2017-11-23 2021-02-02 武汉理工大学 Two-way U-Net deep neural network intelligent classification identification method for dangerous behaviors in mines
CN108111860B (en) * 2018-01-11 2020-04-14 安徽优思天成智能科技有限公司 Lost frame prediction and recovery method of video sequence based on deep residual network
CN108322685B (en) * 2018-01-12 2020-09-25 广州华多网络科技有限公司 Video frame insertion method, storage medium and terminal
CN108600655A (en) * 2018-04-12 2018-09-28 视缘(上海)智能科技有限公司 A kind of video image synthetic method and device
CN108600762B (en) * 2018-04-23 2020-05-15 中国科学技术大学 Progressive video frame generation method combining motion compensation and neural network algorithm
CN108830812B (en) * 2018-06-12 2021-08-31 福建帝视信息科技有限公司 Video high frame rate reproduction method based on grid structure deep learning
CN108810551B (en) * 2018-06-20 2021-01-12 Oppo(重庆)智能科技有限公司 Video frame prediction method, terminal and computer storage medium
CN108961236B (en) * 2018-06-29 2021-02-26 国信优易数据股份有限公司 Circuit board defect detection method and device
CN110780664A (en) * 2018-07-25 2020-02-11 格力电器(武汉)有限公司 Robot control method and device and sweeping robot
CN109379550B (en) * 2018-09-12 2020-04-17 上海交通大学 Convolutional neural network-based video frame rate up-conversion method and system
CN109068174B (en) * 2018-09-12 2019-12-27 上海交通大学 Video frame rate up-conversion method and system based on cyclic convolution neural network
CN109120936A (en) * 2018-09-27 2019-01-01 贺禄元 A kind of coding/decoding method and device of video image
US10924525B2 (en) 2018-10-01 2021-02-16 Microsoft Technology Licensing, Llc Inducing higher input latency in multiplayer programs
CN109360436B (en) * 2018-11-02 2021-01-08 Oppo广东移动通信有限公司 Video generation method, terminal and storage medium
CN110163061B (en) * 2018-11-14 2023-04-07 腾讯科技(深圳)有限公司 Method, apparatus, device and computer readable medium for extracting video fingerprint
CN111371983A (en) * 2018-12-26 2020-07-03 清华大学 A kind of video online stabilization method and system
CN109922372B (en) * 2019-02-26 2021-10-12 深圳市商汤科技有限公司 Video data processing method and device, electronic equipment and storage medium
JP7201073B2 (en) * 2019-04-01 2023-01-10 株式会社デンソー Information processing equipment
CN110636221A (en) * 2019-09-23 2019-12-31 天津天地人和企业管理咨询有限公司 System and method for super frame rate of sensor based on FPGA
CN112584158B (en) * 2019-09-30 2021-10-15 复旦大学 Video quality enhancement method and system
CN114730372A (en) * 2019-11-27 2022-07-08 Oppo广东移动通信有限公司 Method and apparatus for stylizing video, and storage medium
CN113630621B (en) 2020-05-08 2022-07-19 腾讯科技(深圳)有限公司 Video processing method, related device and storage medium
RU2747965C1 (en) * 2020-10-05 2021-05-18 Самсунг Электроникс Ко., Лтд. Frc occlusion processing with deep learning
US11889227B2 (en) 2020-10-05 2024-01-30 Samsung Electronics Co., Ltd. Occlusion processing for frame rate conversion using deep learning
CN113516050A (en) * 2021-05-19 2021-10-19 江苏奥易克斯汽车电子科技股份有限公司 Method and device for scene change detection based on deep learning
CN113420771B (en) * 2021-06-30 2024-04-19 扬州明晟新能源科技有限公司 Colored glass detection method based on feature fusion

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202285412U (en) * 2011-09-02 2012-06-27 深圳市华美特科技有限公司 Low frame rate transmission or motion image twinkling elimination system
CN104102919A (en) * 2014-07-14 2014-10-15 同济大学 Image classification method capable of effectively preventing convolutional neural network from being overfit
CN105787510A (en) * 2016-02-26 2016-07-20 华东理工大学 System and method for realizing subway scene classification based on deep learning
CN106022237A (en) * 2016-05-13 2016-10-12 电子科技大学 Pedestrian detection method based on end-to-end convolutional neural network
CN106228124A (en) * 2016-07-17 2016-12-14 西安电子科技大学 SAR image object detection method based on convolutional neural networks

Also Published As

Publication number Publication date
CN106686472A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
CN106686472B (en) A method and system for generating high frame rate video based on deep learning
CN104217214B (en) RGB‑D Human Behavior Recognition Method Based on Configurable Convolutional Neural Network
Zhao et al. Real-time and light-weighted unsupervised video object segmentation network
Feng et al. SGANVO: Unsupervised deep visual odometry and depth estimation with stacked generative adversarial networks
CN110110624B (en) Human body behavior recognition method based on DenseNet and frame difference method characteristic input
CN116012950B (en) Skeleton action recognition method based on multi-heart space-time attention pattern convolution network
CN110572696A (en) A Video Generation Method Combining Variational Autoencoders and Generative Adversarial Networks
CN108830252A (en) A kind of convolutional neural networks human motion recognition method of amalgamation of global space-time characteristic
CN108805083A (en) The video behavior detection method of single phase
CN107862376A (en) A kind of human body image action identification method based on double-current neutral net
CN111310707A (en) Skeleton-based method and system for recognizing attention network actions
CN110175951A (en) Video Style Transfer method based on time domain consistency constraint
Tan et al. Bidirectional long short-term memory with temporal dense sampling for human action recognition
CN109993820B (en) Automatic animation video generation method and device
CN110059598A (en) The Activity recognition method of the long time-histories speed network integration based on posture artis
CN112767519B (en) A controllable expression generation method combined with style transfer
CN113128424A (en) Attention mechanism-based graph convolution neural network action identification method
CN110110686A (en) Based on the human motion recognition methods for losing double-current convolutional neural networks more
CN113570036B (en) Hardware Accelerator Architecture Supporting Dynamic Neural Network Sparse Models
CN119963954A (en) A method for fusion of infrared and visible light images based on diffusion model
CN119206865A (en) A lightweight behavior recognition method based on skeleton data
CN111160170B (en) Self-learning human behavior recognition and anomaly detection method
Huo et al. 3D skeleton aware driver behavior recognition framework for autonomous driving system
CN116109509A (en) Real-time low-illumination image enhancement method and system based on pixel-by-pixel gamma correction
WO2020001046A1 (en) Video prediction method based on adaptive hierarchical kinematic modeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant