CN120876919A - Image processing system, image processing method and training system - Google Patents
Image processing system, image processing method and training system
- Publication number
- CN120876919A (application number CN202410538378.0A)
- Authority
- CN
- China
- Prior art keywords
- module
- image
- tensor
- image processing
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/765—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Multimedia (AREA)
- Medical Informatics (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
An image processing system, an image processing method, and a training system are provided. The image processing method includes: receiving an image by a preprocessing module in an image processing module and downsampling the image to obtain a downsampled tensor; processing the downsampled tensor by a neural network module in the image processing module based on a plurality of first parameters to generate an output tensor; upsampling the output tensor by an upsampling module in the image processing module to generate an upsampled tensor of the same size as the image; and performing element-wise addition on the upsampled tensor and the image by an addition module to obtain an output image.
Description
Technical Field
The present invention relates to image processing technology, and more particularly, to an image processing technology using a neural network.
Background
Because of the performance limitations of real-time image processing chips, many products do not enable artificial-intelligence-model processing (e.g., CNN-based processing) when receiving 4K or 8K film input.
Disclosure of Invention
In view of the above, some embodiments of the invention provide an image processing system, an image processing method and a training system to improve the prior art.
Some embodiments of the present invention provide an image processing system including an image processing module and an addition module. The image processing module includes a preprocessing module configured to receive an image and downsample the image to obtain a downsampled tensor, a neural network module configured to process the downsampled tensor and generate an output tensor based on a plurality of first parameters, and an upsampling module configured to upsample the output tensor to generate an upsampled tensor of the same size as the image. The addition module is configured to perform element-wise addition on the upsampled tensor and the image to obtain an output image.
Some embodiments of the present invention provide an image processing method including receiving an image by a preprocessing module in an image processing module and downsampling the image to obtain a downsampled tensor, processing the downsampled tensor and generating an output tensor based on a plurality of first parameters by a neural network module in the image processing module, upsampling the output tensor by an upsampling module in the image processing module to generate an upsampled tensor having the same size as the image, and performing element-wise addition on the upsampled tensor and the image by an addition module to obtain an output image.
Some embodiments of the present invention provide a training system comprising a processing module and an image processing module to be trained. The image processing module to be trained comprises a preprocessing module, a neural network module, an up-sampling module, and an addition module. The preprocessing module is configured to receive a training input image and downsample the training input image to obtain a downsampled tensor; the neural network module is configured to process the downsampled tensor and generate an output tensor based on a plurality of first training parameters; the up-sampling module is configured to upsample the output tensor to generate an upsampled tensor of the same size as the training input image; and the addition module is configured to perform element-wise addition on the upsampled tensor and the training input image to obtain a training output image. The processing module is configured to train the image processing module to be trained with a plurality of training images in a training set and a plurality of target images corresponding to the training images, to obtain a trained parameter value of each of a plurality of image processing training parameters of the image processing module to be trained, wherein the plurality of image processing training parameters comprise the plurality of first training parameters.
Some embodiments of the invention provide a training system comprising a processing module and a neural network module to be trained. The neural network module to be trained comprises a cropping module and a classification neural network module. The cropping module is configured to receive a training input image and crop a plurality of cropped images from the training input image. The classification neural network module comprises a plurality of training parameters and is configured to generate an image quality classification corresponding to the training input image based on the plurality of cropped images. The processing module is configured to train the neural network module to be trained using a plurality of training images in a training set and an image quality classification label of each training image, to obtain a trained parameter value of each training parameter.
Based on the foregoing, some embodiments of the present invention provide an image processing system, an image processing method, and a training system. In the image processing system, the image is first downsampled by the preprocessing module, processed at low resolution, and then added back to the original image, so that the input size of the neural network module can be reduced. Reducing the input size of the neural network module reduces the amount of computation, the buffering required for the computation, and the energy consumption of the neural network module during operation, so that under the same computing resources the neural network module can be designed as a deeper neural network to obtain a larger field of view. In addition, with the parameter values obtained through the training system, the desired image processing effect can be achieved quickly by the neural network.
Drawings
FIG. 1 is a block diagram of an image processing system according to some embodiments of the invention.
FIG. 2 is a timing diagram illustrating operation of an image processing system according to some embodiments of the invention.
Fig. 3A, 3B, and 3C are schematic diagrams illustrating a pixel de-shuffling operation according to some embodiments of the present invention.
Fig. 4 is a block diagram of an upsampling module according to some embodiments of the present invention.
Fig. 5 is a block diagram of an image processing system according to some embodiments of the invention.
Fig. 6 is a block diagram of an image quality detection module according to some embodiments of the present invention.
FIG. 7 is a schematic diagram illustrating clipping according to some embodiments of the invention.
Fig. 8 is a block diagram of a neural network module, according to some embodiments of the invention.
Fig. 9 is a diagram illustrating a residual network layer according to some embodiments of the invention.
FIG. 10 is a block diagram of a training system according to some embodiments of the invention.
FIG. 11 is a block diagram of a training system according to some embodiments of the invention.
Fig. 12 is a block diagram of an electronic device system according to some embodiments of the invention.
Fig. 13 is a flowchart illustrating an image processing method according to some embodiments of the invention.
Fig. 14 is a flowchart illustrating an image processing method according to some embodiments of the invention.
Fig. 15 is a flowchart illustrating an image processing method according to some embodiments of the invention.
Fig. 16 is a flowchart illustrating an image processing method according to some embodiments of the invention.
Fig. 17 is a flowchart illustrating an image processing method according to some embodiments of the invention.
Symbol description
100, 500: Image processing system
101: Image processing module
1011, 1003: Preprocessing module
1012, 1004: Neural network module
1013, 1005: Up-sampling module
102, 503: Memory module
103, 1006: Addition module
104, 701: Image
201: Current frame timing
202: Up-sampling output timing
203: Memory module timing
300, 300': Tensor
300-1 to 300-C, 301-1 to 316-1, 301-2 to 316-2, 301-(W+1) to 316-(W+1), 301-N to 316-N: Elements
W, H, C, N, M, P, r: Positive integers
301 to 316: Channel elements
401: Amplification module
402: Convolution module
501: Image quality detection module
502: Loading module
601, 1103: Cropping module
602, 1104: Classification neural network module
603: Mapping module
701-1 to 701-M: Cropped images
702-1 to 702-M: Positions
801 to 80P, 900: Residual network layer
901: Network layer
902: Path
903: Summing module
1000, 1100: Training system
1001, 1101: Processing module
1002: Image processing module to be trained
1007, 1105: Training input image
1102: Neural network module to be trained
1200: Electronic device
1201: Processing unit
1202: Memory
1203: Non-volatile memory
S1301 to S1304, S1401, S1501 to S1502, S1601 to S1602, S1701 to S1703: Steps
Detailed Description
The foregoing and other technical content, features and technical effects of the present invention will be apparent from the following detailed description of the embodiments with reference to the accompanying drawings. Any modifications and variations which do not affect the technical effects and objectives achieved by the present invention should still fall within the scope of the disclosure. The same reference numbers will be used throughout the drawings to refer to the same or like elements. The term "connected" as referred to in the following embodiments may refer to any direct or indirect, wired or wireless connection means. The terms "first" or "second" and the like, as used herein, are used to distinguish or refer to the same or similar element or structure and do not necessarily imply a sequential order of such elements on the system. It is to be understood that in some cases or configurations, ordinal terms are used interchangeably without affecting the practice of the present invention.
FIG. 1 is a block diagram of an image processing system according to some embodiments of the invention. Referring to fig. 1, an image processing system 100 includes an image processing module 101, a memory module 102, and an addition module 103. The image processing module 101 and the memory module 102 each receive a copy of the image 104. The image processing module 101 is configured to process the copy of the image 104, while the memory module 102 temporarily stores its copy of the image 104 during that processing. The memory module 102 is, for example, a static random-access memory (SRAM) or a dynamic random-access memory (DRAM).
The image processing module 101 includes a preprocessing module 1011, a neural network module 1012, and an upsampling module 1013. The preprocessing module 1011 is configured to receive the copy of the image 104 and downsample it to generate a downsampled tensor of the image 104. The neural network module 1012 includes a neural network and a plurality of parameters, where the parameters include a plurality of weights of the neural network. For convenience of description, the plurality of parameters of the neural network module 1012 are referred to as first parameters. The neural network module 1012 is configured to process the received tensor and generate an output tensor based on the plurality of first parameters. The architecture of the neural network module 1012 is described further below.
The upsampling module 1013 is configured to upsample the output tensor to generate an upsampled tensor of the same size as the image 104. The addition module 103 is configured to perform element-wise addition on the two received tensors.
The following describes in detail how the image processing method and the modules of the image processing system 100 cooperate with each other according to some embodiments of the present invention.
Fig. 13 is a flowchart illustrating an image processing method according to some embodiments of the invention. Referring to fig. 1 and 13, the image processing method includes steps S1301 to S1304. In step S1301, the preprocessing module 1011 in the image processing module 101 receives the copy of the image 104 and downsamples it to generate a downsampled tensor. In step S1302, the downsampled tensor is processed by the neural network module 1012 in the image processing module 101 based on the plurality of first parameters of the neural network module 1012 to generate an output tensor. In step S1303, the output tensor of the neural network module 1012 is upsampled by the upsampling module 1013 in the image processing module 101 to generate an upsampled tensor of the same size as the image 104. In step S1304, the addition module 103 performs element-wise addition on the upsampled tensor and the copy of the image 104 stored in the memory module 102 to obtain an output image.
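Steps S1301 to S1304 can be sketched in a few lines of NumPy. This is a minimal sketch only, not the patent's implementation: `downsample` and `upsample` here are hypothetical placeholders (average pooling and nearest-neighbour repetition), and the network is passed in as any shape-preserving function.

```python
import numpy as np

def downsample(x, r):
    # Placeholder: r x r average pooling. The patent's preferred method
    # (pixel de-shuffling) would slot in here the same way.
    H, W, C = x.shape
    return x.reshape(H // r, r, W // r, r, C).mean(axis=(1, 3))

def upsample(x, r):
    # Placeholder: nearest-neighbour repetition back to the original size.
    return np.repeat(np.repeat(x, r, axis=0), r, axis=1)

def process_image(image, network, r=4):
    down = downsample(image, r)   # S1301: preprocessing module
    out = network(down)           # S1302: neural network module (shape-preserving)
    up = upsample(out, r)         # S1303: up-sampling module
    return image + up             # S1304: addition module (element-wise add)

# With an all-zero "network", the residual path dominates: output equals input.
img = np.arange(8 * 8 * 3, dtype=float).reshape(8, 8, 3)
out = process_image(img, np.zeros_like, r=4)
```

The element-wise addition at the end is what makes the network's job easier: it only has to predict a correction to the image, not the image itself.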
In some embodiments of the present invention, the image 104 is a high resolution image. For example, the image 104 is a 4K or 8K image.
FIG. 2 is a timing diagram illustrating operation of an image processing system according to some embodiments of the invention. Referring to fig. 1, fig. 2, and fig. 13, in some embodiments of the invention, the image processing system 100 is configured to process frames in a film; that is, the image 104 is a frame of the film. As shown in fig. 2, frames of the film are loaded into the image processing system 100 for processing based on the current frame timing 201. Since the image processing module 101 requires time to process each frame, the upsampling module 1013 outputs the upsampled tensor (as shown by the up-sampling output timing 202) only some time after the current frame has been loaded as the image 104. At that point, as shown by the memory module timing 203 in fig. 2, the memory module 102 outputs the stored copy of the image 104 to the addition module 103.
In the foregoing embodiment, the image 104 is downsampled by the preprocessing module 1011, processed at low resolution, and then added back to the original image, so that the input size of the neural network module 1012 can be reduced. Reducing the input size of the neural network module 1012 reduces the amount of computation, the buffering required for the computation, and the power consumption of the neural network module 1012 during operation, so that under the same computing resources the neural network module 1012 can be designed as a deeper neural network to obtain a larger field of view. Even though the memory module 102 is needed to store the original image 104 in this embodiment, the overall resources used are still lower than when processing the image 104 directly.
Fig. 3A, 3B, and 3C are schematic diagrams illustrating a pixel de-shuffling operation according to some embodiments of the present invention. Referring to fig. 3A, 3B and 3C, the tensor 300 is a 3-axis tensor of shape (H×r, W×r, C), where r=4 and H, W, and C are positive integers. That is, tensor 300 has H×r elements on axis 0, W×r elements on axis 1, and C elements on axis 2. The 2nd axis of tensor 300 is also referred to as its channel axis. Pixel de-shuffling of the tensor 300 combines, for each of the elements 300-1 to 300-C on the channel axis, the elements spaced r apart on axes 0 and 1 into a new channel element based on a reduction ratio r, converting the tensor 300 into a 3-axis tensor of shape (H, W, C×r²).
The case of reduction ratio r=4 is described below. Referring to fig. 3B and 3C, the tensor 300' is one element of the tensor 300 on its channel axis. In fig. 3B and 3C, on axes 0 and 1 of tensor 300', the elements 30k-1 to 30k-N are spaced r apart, where k=1, 2, …, 16 and N=H×W. The elements 30k-1 to 30k-N are thus grouped into a new channel element 30k (as shown in fig. 3C), for k=1, 2, …, 16. It should be noted that, since pixel de-shuffling does not actually change the element content of the tensor 300, when the tensor 300 is an image, de-shuffling it based on the reduction ratio r retains its pixel information, where the pixel information of an image is the information contained in each pixel, such as its RGB values.
It is further worth noting that performing pixel shuffling on a 3-axis tensor based on the magnification r is the inverse of the pixel de-shuffling described above.
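The de-shuffle/shuffle pair can be written as a pair of reshape-and-transpose operations; the sketch below is one possible NumPy formulation (function names are ours, not the patent's), and the final assertion-style comments illustrate the two properties noted above: each new channel collects elements spaced r apart, and shuffling is the exact inverse.

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Pixel de-shuffling: (H*r, W*r, C) -> (H, W, C*r*r), keeping all pixel info."""
    Hr, Wr, C = x.shape
    H, W = Hr // r, Wr // r
    x = x.reshape(H, r, W, r, C).transpose(0, 2, 4, 1, 3)  # (H, W, C, r, r)
    return x.reshape(H, W, C * r * r)

def pixel_shuffle(x, r):
    """Pixel shuffling, the exact inverse: (H, W, C*r*r) -> (H*r, W*r, C)."""
    H, W, Crr = x.shape
    C = Crr // (r * r)
    x = x.reshape(H, W, C, r, r).transpose(0, 3, 1, 4, 2)  # (H, r, W, r, C)
    return x.reshape(H * r, W * r, C)

x = np.arange(8 * 8 * 2).reshape(8, 8, 2)   # small (H*r, W*r, C) example, r = 4
y = pixel_unshuffle(x, 4)                    # shape (2, 2, 32)
# New channel 0 holds the elements x[0::4, 0::4, 0], i.e. elements spaced
# r apart on axes 0 and 1; pixel_shuffle(y, 4) restores x exactly.
```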
Fig. 14 is a flowchart illustrating an image processing method according to some embodiments of the invention. Referring to fig. 1, 13 and 14, in some embodiments of the invention, step S1301 includes step S1401. In step S1401, the preprocessing module 1011 performs pixel de-shuffling on the copy of the image 104 based on a reduction ratio (e.g., the reduction ratio r) to generate the downsampled tensor, wherein the downsampled tensor retains the pixel information of the image 104.
In this embodiment, downsampling the image 104 with pixel de-shuffling reduces the input size of the neural network module 1012 without losing pixel information, so the neural network module 1012 still receives the complete pixel information of the image 104. Taking the reduction ratio r=4 as an example, if the image 104 is an 8K image (of size 7680×4320), pixel de-shuffling based on reduction ratio 4 yields a spatial size of 1920×1080 with each channel expanded into 16 channels, so a neural network with a smaller input size can be used.
Of course, the preprocessing module 1011 may also downsample the copy of the image 104 using other downsampling methods, such as discarding elements (decimation) or using a pooling layer together with a convolution layer, to generate the downsampled tensor.
In some embodiments of the present invention, the output tensor of the neural network module 1012 has the same size as the downsampled tensor generated by the preprocessing module 1011. The upsampling module 1013 is configured to perform pixel shuffling on the output tensor based on a magnification to upsample it, wherein the magnification is the same as the reduction ratio of the preprocessing module 1011.
Fig. 4 is a block diagram of an upsampling module according to some embodiments of the present invention. Fig. 15 is a flowchart illustrating an image processing method according to some embodiments of the invention. Referring to fig. 4, 13 and 15, in this embodiment, the upsampling module 1013 includes an amplification module 401 and a convolution module 402. The amplification module 401 is configured to amplify the output tensor to generate an amplified output tensor. The convolution module 402 includes at least one convolution layer and has a plurality of parameters, which include the weights of the convolution layer. For ease of description, the plurality of parameters of the convolution module 402 are referred to as second parameters. The convolution module 402 is configured to process the amplified output tensor based on the plurality of second parameters to generate the upsampled tensor. The amplification module 401 may amplify the output tensor using any amplification method, for example interpolation or zero-padding, and the invention is not limited thereto.
In this embodiment, the step S1303 includes steps S1501 and S1502. In step S1501, the output tensor is amplified by the amplification module 401 to generate an amplified output tensor. In step S1502, the amplified output tensor is processed by the convolution module 402 based on the aforementioned plurality of second parameters to generate an upsampled tensor.
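Steps S1501 and S1502 can be sketched as follows. This is a hedged stand-in, not the patent's convolution module: amplification is done by nearest-neighbour repetition, and the "convolution" is reduced to a 1×1 channel-mixing convolution whose weight matrix plays the role of the second parameters.

```python
import numpy as np

def amplify(x, r):
    # Amplification module 401 stand-in: nearest-neighbour interpolation.
    # (Zero insertion would be the 0-padding variant mentioned above.)
    return np.repeat(np.repeat(x, r, axis=0), r, axis=1)

def conv_1x1(x, weights):
    # Convolution module 402 stand-in: a 1x1 convolution mixing channels;
    # `weights` has shape (C_in, C_out) and models the "second parameters".
    return x @ weights

def upsample_module(output_tensor, weights, r):
    amplified = amplify(output_tensor, r)       # S1501
    return conv_1x1(amplified, weights)         # S1502

t = np.ones((4, 4, 8))               # hypothetical network output tensor
w = np.full((8, 3), 0.125)           # hypothetical second-parameter values
up = upsample_module(t, w, r=2)      # shape (8, 8, 3)
```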
Fig. 5 is a block diagram of an image processing system according to some embodiments of the invention. Fig. 16 is a flowchart illustrating an image processing method according to some embodiments of the invention. Referring to fig. 5, compared with the image processing system 100, the image processing system 500 further includes an image quality detection module 501, a loading module 502, and a memory module 503. In this embodiment, the image 104 is a frame of a film. The image quality detection module 501 is configured to receive a copy of the image 104 and, based on it, generate an index corresponding to the image quality classification of the film to which the image 104 belongs. The memory module 503 may employ, for example, double data rate synchronous dynamic random access memory (DDR SDRAM) to speed up access.
The image quality classification of a film may take into account the compression rate and the image quality of the film content. For example, the image quality classifications are listed in Table 1 below, including 8K high bit rate, 8K low bit rate, and so on. Each image quality classification corresponds to an index; e.g., 8K high bit rate corresponds to index 0, 8K low bit rate to index 1, and so on.
Table 1
The loading module 502 is configured to obtain, from the memory module 503, a plurality of image processing parameter values corresponding to the image quality classification based on the index of the image quality classification of the film to which the image 104 belongs, and to load the obtained image processing parameter values into the image processing module 101.
The image processing parameter values include parameters required by the image processing module 101 during operation. For example, when the upsampling module 1013 adopts the architecture shown in the embodiment of fig. 4, the image processing parameter values include the parameter values of the first parameters of the neural network module 1012 and the second parameters of the convolution module 402.
Please refer to fig. 5 and fig. 16 at the same time. In this embodiment, the image processing method includes steps S1601 to S1602. In step S1601, an index corresponding to the image quality classification of the film to which the image 104 belongs is generated by the image quality detection module 501 based on the copy of the image 104. In step S1602, the loading module 502 obtains, from the memory module 503, a plurality of image processing parameter values corresponding to the image quality classification based on the index, and loads them into the image processing module 101.
It should be noted that the index of the image quality classification of the film to which the image 104 belongs is provided so that the loading module 502 can quickly find the corresponding image processing parameter values in the memory module 503. The indices may be assigned arbitrarily and are not limited to the assignment in Table 1; for example, the value 0 could instead be used as the index for the 2K low bit rate class.
In the foregoing embodiment, since the parameters of the image processing module 101 are switched based on the image quality classification of the film to which the image 104 belongs, different parameters can be adopted for different kinds of films to produce a better processing effect. Different processing effects can also be produced for different kinds of films based on demand.
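The parameter-switching mechanism amounts to an index-keyed lookup table. A minimal sketch, with an entirely hypothetical parameter bank (the labels mirror Table 1, but the parameter values and structure are placeholders):

```python
import numpy as np

# Hypothetical parameter bank: each index maps to the trained
# image-processing parameter values for one image quality class.
PARAM_BANK = {
    0: {"label": "8K high bit rate", "first_params": np.zeros(4)},
    1: {"label": "8K low bit rate",  "first_params": np.ones(4)},
}

def load_parameters(index, bank=PARAM_BANK):
    # Loading module 502 stand-in: fetch the parameter set for the
    # detected image quality class by its index.
    return bank[index]["first_params"]

params = load_parameters(1)   # parameters for "8K low bit rate"
```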
Fig. 6 is a block diagram of an image quality detection module according to some embodiments of the present invention. FIG. 7 is a schematic diagram illustrating cropping according to some embodiments of the invention. Referring to fig. 6, the image quality detection module 501 includes a cropping module 601, a classification neural network module 602, and a mapping module 603. The cropping module 601 is configured to receive the copy of the image 104 and crop a plurality of cropped images from it. The cropping positions may be a plurality of fixed positions or a plurality of random positions.
Referring to fig. 7, in some embodiments of the present invention, the copy of the image 104 is an image 701, and the cropping module 601 crops the cropped images 701-1 to 701-M from the image 701 based on the fixed positions 702-1 to 702-M, wherein the size of the image 701 is 3840×2160×3, the size of each cropped image 701-1 to 701-M is 240×240×3, and M is a positive integer. Of course, the cropping module 601 may crop the cropped images from the image 701 based on other fixed positions, or randomly select a fixed number of positions from the positions 702-1 to 702-M at which to crop.
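Cropping fixed-size patches from a frame is plain array slicing. In this sketch the top-left positions are hypothetical examples, not the grid of figure 7:

```python
import numpy as np

def crop_patches(image, positions, size=240):
    # Cropping module 601 stand-in: extract size x size patches at the
    # given (row, col) top-left positions.
    return [image[r:r + size, c:c + size] for (r, c) in positions]

frame = np.zeros((2160, 3840, 3))          # a 3840x2160x3 frame, as in fig. 7
pts = [(0, 0), (960, 1800), (1920, 3600)]  # M = 3 hypothetical fixed positions
patches = crop_patches(frame, pts)         # three 240x240x3 cropped images
```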
The classification neural network module 602 is configured to receive the plurality of cropped images and, based on them, generate the image quality classification of the film to which the image 104 belongs. In some embodiments of the present invention, the classification neural network module 602 includes a convolution layer, a fully-connected layer, and a normalized exponential function layer (softmax layer): the convolution layer extracts the features of the plurality of cropped images, the fully-connected layer integrates these features to generate a plurality of outputs, and the softmax layer receives the outputs of the fully-connected layer and outputs a probability for each image quality classification of the film to which the image 104 belongs. For example, with the image quality classifications of Table 1, the softmax layer is set to have 6 outputs: the 1st output is the probability that the film is 8K high bit rate, the 2nd output is the probability that the film is 8K low bit rate, and so on.
The mapping module 603 is configured to generate the index based on the image quality classification. For example, the mapping module 603 selects the image quality classification with the highest probability based on the outputs of the softmax layer and outputs the corresponding index. If the mapping module 603 determines from the softmax outputs that the most probable image quality classification is 8K high bit rate, it generates the index 0.
Fig. 17 is a flowchart illustrating an image processing method according to some embodiments of the invention. Please refer to fig. 6, fig. 7 and fig. 17. In this embodiment, step S1601 includes steps S1701 to S1703. In step S1701, the cropping module 601 receives the copy of the image 104 and crops a plurality of cropped images from it. In step S1702, the classification neural network module 602 generates the image quality classification of the film to which the image 104 belongs based on the plurality of cropped images. In step S1703, the mapping module 603 generates the index based on the image quality classification.
Fig. 8 is a block diagram of a neural network module, according to some embodiments of the invention. Fig. 9 is a diagram illustrating a residual network layer according to some embodiments of the invention. Referring to fig. 8 and 9, the neural network module 1012 includes serial residual network layers 801 to 80P, where P is a positive integer. The residual network layers 801 to 80P are configured to receive the downsampled tensor generated by the preprocessing module 1011 and to generate the output tensor after processing it. The architecture of each of the residual network layers 801 to 80P is shown as residual network layer 900. The residual network layer 900 comprises a network layer 901, a summing module 903, and a path 902 that connects the input of the network layer 901 directly to the summing module 903. The network layer 901 comprises a neural network. It should be noted that the neural networks of the network layers 901 of the residual network layers 801 to 80P may be the same or different, and the present invention is not limited thereto. In this embodiment, step S1302 includes processing the downsampled tensor through the residual network layers 801 to 80P to generate the output tensor.
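The residual structure of figure 9 is simply y = x + f(x), and the stack of figure 8 applies that P times. A minimal sketch, with trivial stand-in functions in place of the network layers 901:

```python
import numpy as np

def residual_layer(x, f):
    # Residual network layer 900: network layer 901 computes f(x), and
    # path 902 carries x directly into summing module 903.
    return x + f(x)

def residual_stack(x, layers):
    # Serial residual network layers 801 to 80P; each layer's function
    # may be the same or different.
    for f in layers:
        x = residual_layer(x, f)
    return x

x = np.array([1.0, 2.0])
y = residual_stack(x, [lambda t: 2.0 * t] * 2)  # (x + 2x) applied twice
```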
FIG. 10 is a block diagram of a training system according to some embodiments of the invention. Referring to fig. 10, the training system 1000 includes a processing module 1001 and an image processing module 1002 to be trained. The image processing module 1002 to be trained includes a preprocessing module 1003, a neural network module 1004, an upsampling module 1005, and an addition module 1006. The preprocessing module 1003 is configured to receive a copy of the training input image 1007 and downsample it to obtain a downsampled tensor. The neural network module 1004 includes a plurality of first training parameters and is configured to process the downsampled tensor and generate an output tensor based on them. The upsampling module 1005 is configured to upsample the output tensor to generate an upsampled tensor of the same size as the training input image, and the addition module 1006 is configured to perform element-wise addition on the upsampled tensor and the training input image to obtain the training output image. The preprocessing module 1003, the neural network module 1004, the upsampling module 1005, and the addition module 1006 can be implemented in the same ways as the preprocessing module 1011, the neural network module 1012, the upsampling module 1013, and the addition module 103, respectively; for their embodiments, refer to the examples given for the latter modules.
The processing module 1001 is configured to input each of a plurality of training images in a training set, as a training input image 1007, to the image processing module 1002 to be trained, so as to train it using the training images and a plurality of target images corresponding to those training images. After training is completed, the processing module 1001 obtains a trained parameter value for each of a plurality of image processing training parameters of the image processing module 1002 to be trained, where the image processing training parameters include the first training parameters.
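As a hedged illustration of this training procedure (the patent does not specify a loss function or optimizer here), the following NumPy sketch trains a single-parameter stand-in for the image processing module by gradient descent on an assumed mean-squared-error loss between training outputs and target images:

```python
import numpy as np

# Toy version of the training performed by processing module 1001, assuming
# an MSE loss. The "image processing module" is reduced to a single scalar
# gain w, playing the role of its only training parameter.
rng = np.random.default_rng(0)
train_x = rng.standard_normal((16, 64))  # stand-ins for training images
train_y = 1.5 * train_x                  # corresponding target images

w, lr = 0.0, 0.05
for _ in range(200):
    pred = w * train_x                            # forward pass
    grad = 2.0 * np.mean((pred - train_y) * train_x)  # dMSE/dw
    w -= lr * grad                                # gradient descent update

assert abs(w - 1.5) < 1e-3  # trained parameter value recovered
```

In the real system the single gain `w` is replaced by the full set of image processing training parameters, but the loop structure — forward pass, loss against the target image, parameter update — is the same.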
In some embodiments of the invention, the user collects multiple training sets for different image quality classifications (e.g., the image quality classifications described in Table 1) and for the different effects (e.g., noise reduction, sharpening, or detail enhancement) that the image processing system 100 is to produce after processing the image 104. The training system 1000 trains the image processing module 1002 to be trained on these training sets to obtain trained parameter values for the different sets of image processing training parameters. The training system 1000 further stores the different sets of trained parameter values in the memory module 503 according to image quality classification, so that the loading module 502 can retrieve them based on the index corresponding to the image quality classification of the film to which the image 104 belongs.
In some embodiments of the invention, the upsampling module 1005 is configured to perform pixel shuffling on the output tensor generated by the neural network module 1004 based on a magnification factor, so as to upsample the output tensor.
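The pixel-shuffling operation mentioned above can be sketched as follows. This follows the common deep-learning convention for pixel shuffle and is an illustration, not the patent's exact implementation:

```python
import numpy as np

def pixel_shuffle(x, r):
    # Rearranges a (C*r*r, H, W) tensor into (C, H*r, W*r): each group of
    # r*r channels becomes one r-by-r spatial block, so upsampling trades
    # channel depth for spatial resolution without discarding values.
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)      # (C, i, j, H, W)
    x = x.transpose(0, 3, 1, 4, 2)    # (C, H, i, W, j)
    return x.reshape(c, h * r, w * r)

t = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)  # an output tensor
up = pixel_shuffle(t, 2)                                # magnification r = 2
assert up.shape == (1, 6, 6)
assert up[0, 1, 1] == t[3, 0, 0]  # channel 3 fills the (1,1) sub-position
```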
In some embodiments of the invention, the upsampling module 1005, like the upsampling module 1013 shown in fig. 4, includes an amplifying module and a convolution module. The amplifying module of the upsampling module 1005 is configured to amplify the output tensor of the neural network module 1004 to generate an amplified output tensor. The convolution module of the upsampling module 1005 includes at least one convolution layer and has a plurality of parameters, which include the weights of the convolution layer; for ease of description, these parameters are referred to as the second training parameters. The convolution module of the upsampling module 1005 is configured to process the amplified output tensor based on the second training parameters to generate the upsampled tensor. The implementations of the amplifying module and the convolution module of the upsampling module 1005 are the same as those of the amplifying module 401 and the convolution module 402, respectively; reference may therefore be made to the examples related to the amplifying module 401 and the convolution module 402.
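A minimal sketch of this amplify-then-convolve upsampling path follows. Both the nearest-neighbor enlargement for the amplifying module and the depthwise 3x3 kernel for the convolution module are assumptions for illustration; the patent's modules 401 and 402 are not detailed in this excerpt:

```python
import numpy as np

def amplify_nearest(x, r):
    # Hypothetical amplifying step: nearest-neighbor enlargement by factor r.
    return np.repeat(np.repeat(x, r, axis=1), r, axis=2)

def conv3x3(x, k):
    # Minimal depthwise 3x3 convolution with zero padding; the kernel k
    # stands in for the convolution module's second training parameters.
    c, h, w = x.shape
    pad = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * pad[:, i:i + h, j:j + w]
    return out

x = np.ones((1, 4, 4))                                # network output tensor
amplified = amplify_nearest(x, 2)                     # amplified output tensor
up = conv3x3(amplified, np.full((3, 3), 1.0 / 9.0))   # upsampled tensor
assert up.shape == (1, 8, 8)
assert abs(up[0, 4, 4] - 1.0) < 1e-9  # averaging kernel over a flat interior
```

The convolution after enlargement is what lets training smooth the blocky nearest-neighbor output, since its weights are learned as part of the second training parameters.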
FIG. 11 is a block diagram of a training system according to some embodiments of the invention. Referring to fig. 11, the training system 1100 includes a processing module 1101 and a neural network module 1102 to be trained, where the neural network module 1102 to be trained includes a cropping module 1103 and a classification neural network module 1104. The cropping module 1103 is configured to receive the training input image 1105 and crop a plurality of cropped images from a copy of the training input image 1105. The classification neural network module 1104 includes a plurality of training parameters and is configured to generate an image quality classification corresponding to the training input image 1105 based on the cropped images. The implementations of the cropping module 1103 and the classification neural network module 1104 are the same as those of the cropping module 601 and the classification neural network module 602, respectively; reference may therefore be made to the examples related to the cropping module 601 and the classification neural network module 602.
The processing module 1101 is configured to train the neural network module 1102 to be trained using a plurality of training images in a training set and an image quality classification label of each training image, so as to obtain a trained parameter value for each training parameter.
In some embodiments of the invention, the trained parameter values obtained by the processing module 1101 are loaded into the classification neural network module 602, so that the classification neural network module 602 can generate the image quality classification of the film to which the image 104 belongs.
It is noted that the processing module 1001 and the processing module 1101 may each be a general-purpose processor, including a central processing unit (CPU), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic device.
Fig. 12 is a block diagram of an electronic device according to some embodiments of the invention. As shown in fig. 12, at the hardware level, the electronic device 1200 includes a processing unit 1201, a memory 1202, and a nonvolatile memory 1203. The memory 1202 is, for example, a random-access memory (RAM). The nonvolatile memory 1203 is, for example, at least one disk storage device. Of course, the electronic device 1200 may also include hardware required for other functions.
The memory 1202 and the nonvolatile memory 1203 are used to store programs, which may include program code comprising computer operating instructions, and to provide instructions and data to the processing unit 1201. The processing unit 1201 reads the corresponding computer program from the nonvolatile memory 1203 into the memory 1202 and then runs it, forming the image processing system 100 or 500 at the logical level.
The processing unit 1201 may be an integrated circuit chip having signal processing capabilities. In implementation, each method and step disclosed in the foregoing embodiments may be carried out by hardware integrated logic circuits in the processing unit 1201 or by instructions in the form of software. The processing unit 1201 may be a general-purpose processor, including a central processing unit, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array, or other programmable logic device, and may implement or perform the methods and steps disclosed in the foregoing embodiments.
The present disclosure also provides a computer-readable storage medium storing at least one instruction that, when executed by the processing unit 1201 of the electronic device 1200, can cause the processing unit 1201 of the electronic device 1200 to perform the methods and steps disclosed in the foregoing embodiments.
Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media (transmission media) such as modulated data signals and carrier waves.
The foregoing embodiments provide an image processing system, an image processing method, and a training system. In the image processing system, the image is first downsampled by the preprocessing module, processed at low resolution, and then added back to the original image, which reduces the input size of the neural network module. Reducing the input size reduces the amount of computation, the buffering required for computation, and the energy consumption of the neural network module during operation, so that under the same computational resources the neural network module can be designed with a deeper structure to obtain a larger receptive field. In addition, with the parameter values obtained through training by the training system, the neural network can quickly achieve the desired image processing effect.
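The downsample-process-upsample-add pipeline summarized above can be sketched end to end. The pixel unshuffle/shuffle pair and the zero placeholder network are illustrative assumptions, chosen so that the structural property of the pipeline (the network learns a residual that is added to the original image) is easy to verify:

```python
import numpy as np

def pixel_unshuffle(img, r):
    # Preprocessing module: (C, H, W) -> (C*r*r, H/r, W/r); every pixel
    # value is kept, only rearranged into channels.
    c, h, w = img.shape
    x = img.reshape(c, h // r, r, w // r, r)
    return x.transpose(0, 2, 4, 1, 3).reshape(c * r * r, h // r, w // r)

def pixel_shuffle(x, r):
    # Upsampling module: the inverse rearrangement back to (C, H, W).
    c = x.shape[0] // (r * r)
    h, w = x.shape[1], x.shape[2]
    x = x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2)
    return x.reshape(c, h * r, w * r)

def image_processing_module(img, r, network):
    down = pixel_unshuffle(img, r)   # downsampled tensor
    out = network(down)              # neural network module (placeholder)
    up = pixel_shuffle(out, r)       # upsampled tensor, same size as img
    return up + img                  # addition module: element-wise add

img = np.random.default_rng(1).standard_normal((3, 8, 8))
result = image_processing_module(img, 2, lambda t: np.zeros_like(t))
assert result.shape == img.shape
assert np.allclose(result, img)  # a zero network leaves the image unchanged
```

With a zero network the pipeline is the identity, which shows why the residual formulation is cheap to train: the network only has to produce the correction to the image, not the image itself.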
Although the present invention has been described with reference to the above embodiments, it is not limited thereto; persons skilled in the art may make modifications and variations without departing from the spirit and scope of the invention.
Claims (10)
1. An image processing system, comprising:
an image processing module including a preprocessing module configured to receive an image and downsample the image to obtain a downsampled tensor, a neural network module configured to process the downsampled tensor based on a plurality of first parameters and generate an output tensor, and an upsampling module configured to upsample the output tensor to generate an upsampled tensor of the same size as the image; and
an addition module configured to perform an element-wise addition on the upsampled tensor and the image to obtain an output image.
2. The image processing system of claim 1, wherein the preprocessing module performs pixel de-shuffling on the image based on a downscaling factor to downsample the image, wherein the downsampled tensor retains the pixel information of the image.
3. The image processing system of claim 1, wherein the upsampling module is configured to perform pixel shuffling on the output tensor based on a magnification factor to upsample the output tensor.
4. The image processing system of claim 1, wherein the upsampling module comprises:
an amplifying module configured to amplify the output tensor to generate an amplified output tensor; and
a convolution module, comprising at least one convolution layer, configured to process the amplified output tensor based on a plurality of second parameters to generate the upsampled tensor.
5. The image processing system of claim 1, wherein the image is a high resolution image.
6. The image processing system of claim 1, wherein the image processing system comprises an image quality detection module and a loading module, wherein the image quality detection module is configured to generate, based on the image, an index corresponding to an image quality classification of a film to which the image belongs, and the loading module is configured to obtain a plurality of image processing parameter values corresponding to the image quality classification from a memory module based on the index and load the plurality of image processing parameter values into the image processing module.
7. The image processing system of claim 6, wherein the image quality detection module comprises a cropping module, a classification neural network module, and a mapping module, wherein the cropping module is configured to receive the image and crop a plurality of cropped images from the image, the classification neural network module is configured to generate the image quality classification of the film to which the image belongs based on the plurality of cropped images, and the mapping module is configured to generate the index based on the image quality classification.
8. The image processing system of claim 1, wherein the neural network module comprises a plurality of residual network layers in series, the plurality of residual network layers configured to receive the downsampled tensor and to generate the output tensor.
9. An image processing method, comprising:
(a) receiving an image by a preprocessing module in an image processing module, and downsampling the image to obtain a downsampled tensor;
(b) processing the downsampled tensor by a neural network module in the image processing module based on a plurality of first parameters and generating an output tensor;
(c) upsampling the output tensor by an upsampling module in the image processing module to generate an upsampled tensor of the same size as the image; and
(d) performing an element-wise addition on the upsampled tensor and the image by an addition module to obtain an output image.
10. A training system, comprising a processing module and an image processing module to be trained, wherein the image processing module to be trained comprises:
a preprocessing module configured to receive a training input image and downsample the training input image to obtain a downsampled tensor;
a neural network module configured to process the downsampled tensor based on a plurality of first training parameters and generate an output tensor;
an upsampling module configured to upsample the output tensor to generate an upsampled tensor of the same size as the training input image; and
an addition module configured to perform an element-wise addition on the upsampled tensor and the training input image to obtain a training output image;
wherein the processing module is configured to train the image processing module to be trained using a plurality of training images in a training set and a plurality of target images corresponding to the plurality of training images, to obtain a trained parameter value for each of a plurality of image processing training parameters of the image processing module to be trained, and the plurality of image processing training parameters include the first training parameters.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410538378.0A CN120876919A (en) | 2024-04-30 | 2024-04-30 | Image processing system, image processing method and training system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN120876919A true CN120876919A (en) | 2025-10-31 |
Family
ID=97459832
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |