Background
When an image simultaneously contains a high-brightness area irradiated by a strong light source (sunlight, a lamp, reflected light, etc.) and an area of relatively low brightness such as shadow or backlight, the image output by the camera may show the bright area turned white by overexposure and the dark area turned black by underexposure. That is, the camera's ability to render both the brightest and darker areas of the same scene is limited; this span is the "dynamic range".
Because a remote sensing sensor images a very large area, overexposed and dark regions can coexist in a remote sensing image, seriously degrading image quality. One effective approach to the sensor's limited dynamic range is high dynamic range imaging combined with deep-learning-based image restoration. For this reason, it is necessary to provide a dataset of paired low-quality and high-quality remote sensing images in a high dynamic range application scene for a deep-learning-based image enhancement model. Unlike other remote sensing data, high dynamic range remote sensing data here refers to remote sensing image data that is captured in a simulated high dynamic range imaging environment and needs to be enhanced, the image data containing dark and overexposed regions of varying degrees.
Image enhancement techniques based on deep learning require a large amount of data to train the model, but acquiring real paired images is difficult and incurs high time and labor costs. In addition, real datasets suffer from the following two problems:
1. The shooting scenes of real data are limited. For extreme illuminance images, scene diversity is constrained by the requirement that no displacement exist between an extreme illuminance image and its corresponding normal illumination image during imaging, so the imaging device and the scene must remain relatively static. For images containing shadow regions caused by the solar altitude angle, scene diversity is limited not only by the above factor but also by the need to select scenes with large changes in topography. Under the combined influence of solar altitude and topography, it is difficult to obtain, in the same scene, both a remote sensing image containing an extreme illuminance region and a reference image without one. This greatly increases the cost of capturing image data and limits the choice of scenes.
2. Captured image data is shaped by the characteristics of the imaging device and is difficult to transfer well to other devices. Different imaging devices have different hardware characteristics, and their data carry different noise, so an algorithm model learned from images shot by one device rarely achieves consistent performance on images shot by another; that is, the model cannot adaptively generalize to the currently deployed imaging device, and optimal performance cannot be obtained. Since a deep-learning neural network model requires a large amount of training data to guarantee performance, the simulated data must be made as close as possible to extreme illumination images shot by the real imaging device.
Current methods for synthesizing simulated data typically handle only a single task; for example, the original image is converted entirely into a dark image or entirely into an overexposed image. However, a remote sensing image can reach hundreds of millions of pixels and can contain dark and overexposed regions at the same time. If only purely dark or purely overexposed images are used as the training set, the range of remote sensing imaging under real illumination conditions cannot be covered, and the model's capacity to process high dynamic range data is insufficient.
To generate synthetic data that more closely approaches real remote sensing imaging conditions, a method is urgently needed that can generate synthetic data containing normal, dark and overexposed regions in one scene while keeping pixel values in the transition areas smooth and natural.
Disclosure of Invention
Aiming at the defects and shortcomings of the prior art, the invention creatively provides a smooth-mask-guided high dynamic range remote sensing image data synthesis method and system. It addresses the defect that existing methods generate synthetic data only for a single extreme illumination condition and cannot handle the situation, common in actual large-scale remote sensing images, where shadow, overexposed and normal areas coexist.
For the image enhancement task on large-scale remote sensing images, the invention provides a high dynamic range training data synthesis scheme that better matches the illumination conditions of real data, improves dataset quality, and makes the image enhancement model better suited to processing real high dynamic range remote sensing imaging data.
In order to achieve the above purpose, the invention is realized by adopting the following technical scheme.
A method for synthesizing high dynamic range remote sensing image data guided by a smooth mask comprises the following steps:
step 1: and generating basic data, namely generating a corresponding overexposure chart and a corresponding darkness chart through the input original image.
Wherein, for normal remote sensing image data, an overexposed image is generated by multiplying by a brightness coefficient parameter.
Also for normal remote sensing image data, the reverse of the image signal processing pipeline is simulated using a full-element physical-quantity noise model and parameters obtained by sensor calibration: the RGB remote sensing image is inverted into a RAW image, its brightness is reduced, and noise is added with the noise model to synthesize low-light noisy data.
The parameter calibration of the sensor can be obtained by shooting a flat field frame and an offset frame.
In addition, the normal image data can be subjected to noise adding operation by using calibrated sensor parameters.
Step 2: and generating a random seed matrix, and upsampling the random seed matrix to a preset size to obtain a smooth mask.
The size of the preset up-sampling result is the size of input data of other subsequent computer vision tasks.
Step 3: and remapping the generated mask result to the value range space of [0,1], and carrying out weighted fusion on the images by using the mapped mask to generate a final high dynamic range remote sensing image synthesis result.
Meanwhile, the invention provides a smooth-mask-guided high dynamic range remote sensing image data generation system, which comprises an overexposure data generation module, a dim light data generation module, a smooth mask generation module and a basic data fusion module. The outputs of the overexposure data generation module, the dim light data generation module and the smooth mask generation module are connected to the input of the basic data fusion module.
Advantageous effects
The invention effectively remedies the defect that existing data synthesis methods generate synthetic remote sensing data only for a single illumination scene, provides a high dynamic range remote sensing image data synthesis scheme closer to the real situation for large-scale remote sensing image enhancement tasks, and facilitates training an image enhancement model on real high dynamic range remote sensing data.
Detailed Description
The invention will be described in further detail with reference to the drawings and examples.
Embodiment: as shown in FIG. 1, a method for synthesizing high dynamic range remote sensing image data guided by a smooth mask includes the following steps:
Step S10: generating basic data, namely generating a corresponding overexposed image and a corresponding dark image from the input original image.
Step S10 further comprises two steps:
Step S11: and generating an overexposed image by multiplying the brightness coefficient parameters aiming at the normal remote sensing image data.
Specifically, overexposed images are synthesized using random parameters. In the present embodiment, two parameters, a brightness coefficient $\alpha$ and an offset $\beta$, are used to generate the overexposure map. The generation of overexposure data is shown in the following formula:

$$I_{over} = \alpha \cdot I + \beta \qquad (1)$$

where $I_{over}$ is the generated overexposure map; $I$ is the original image after cropping; $\alpha$ is the luminance coefficient representing the magnification of the image pixel values; $\beta$ is an offset whose default value is 0.
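As a minimal sketch of the overexposure synthesis of formula (1) (NumPy; the function and variable names, the parameter range, and the [0, 1] normalization are illustrative assumptions, not from the original text):

```python
import numpy as np

def synthesize_overexposed(img, alpha, beta=0.0):
    """Scale pixel values by brightness coefficient alpha and add offset beta.

    img is assumed to be a float array normalized to [0, 1]; the result is
    clipped back to the valid range to mimic sensor saturation.
    """
    return np.clip(alpha * img + beta, 0.0, 1.0)

# Example: a mid-gray patch amplified by a random brightness coefficient.
rng = np.random.default_rng(0)
alpha = rng.uniform(1.5, 3.0)   # random brightness coefficient
patch = np.full((4, 4), 0.5)    # cropped original image (toy data)
over = synthesize_overexposed(patch, alpha)
```

With alpha > 1 and beta = 0, every output pixel is at least as bright as the input, saturating toward white as alpha grows.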
Step S12: for normal remote sensing image data, the reverse process of an image signal processing module is simulated through a full-element physical quantity noise model and parameters obtained by calibrating a sensor, an RGB remote sensing image is reversely processed into a RAW image, the brightness is reduced, and noise is added by using the noise model to synthesize low-band noise data.
Specifically, this embodiment adopts a noise calibration and modeling scheme based on a physical model. For noise model establishment: during image capture with a camera, the RAW value $D$ obtained is proportional to the number of photons $I$ incident on the sensor, and the proportionality is jointly determined by the camera's analog and digital gains.

Let the gain of the imaging system be $K$. Considering the various noise sources, this process is expressed as:

$$D = K \cdot I + N \qquad (2)$$

where $N$ represents the sum of all noise arising during shooting.

The invention considers photon shot noise $N_p$ in the imaging process, readout noise $N_{read}$, row noise $N_r$ (the dominant form of sensor banding noise), and quantization noise $N_q$ generated during digital-to-analog conversion. Shot noise $N_p$ is noise at the photon level, while $N_{read}$, $N_r$ and $N_q$ are regarded as noise at the digital signal level. Thus, the noise is expressed as:

$$N = K \cdot N_p + N_{read} + N_r + N_q \qquad (3)$$
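Sampling from this kind of physical noise model can be sketched as follows (NumPy; note this sketch uses a Gaussian for the readout term as a simplification, whereas the embodiment fits a Tukey Lambda distribution, and all parameter values are illustrative):

```python
import numpy as np

def sample_noisy_raw(photons, K, sigma_read, sigma_row, q_step, rng):
    """Sample a noisy RAW frame following D = K*(I + N_p) + N_read + N_r + N_q.

    Shot noise is modeled at the photon level via a Poisson draw; readout,
    row and quantization noise act on the digital signal. Parameter names
    and the Gaussian readout term are illustrative simplifications.
    """
    shot = rng.poisson(photons).astype(np.float64)        # I + N_p (photon level)
    d = K * shot
    d += rng.normal(0.0, sigma_read, size=photons.shape)  # N_read (digital level)
    rows = rng.normal(0.0, sigma_row, size=(photons.shape[0], 1))
    d += rows                                             # N_r, constant per row
    d += rng.uniform(-q_step / 2, q_step / 2, size=photons.shape)  # N_q
    return d

rng = np.random.default_rng(1)
photons = np.full((8, 8), 100.0)   # toy photon counts
raw = sample_noisy_raw(photons, K=2.0, sigma_read=1.0,
                       sigma_row=0.5, q_step=1.0, rng=rng)
```

Since all noise terms are zero-mean, the expected RAW value is K times the photon count, here about 200.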
Further, for noise parameter calibration: to accurately model the different sensors of multiple remote sensing photographing devices, calibration must be performed per sensor. For each sensor, two special calibration frames must be captured to complete parameter calibration: a flat-field frame and an offset frame.
The imaging system gain $K$ reflects the relationship between the gray value of the captured image and the number of photons captured through the lens. To estimate $K$, the method captures, at a fixed ISO, a series of flat-field frames of different exposure times with the sensor in question.
A flat-field frame is a picture taken with illumination striking the sensor uniformly. In actual shooting, white paper attached to a wall uniformly irradiated by natural light (or a DC light source) serves as the target; the camera is fixed on a tripod and remotely controlled from a mobile phone to adjust the exposure time and shoot, so as to avoid shaking the lens. In addition, to reduce the influence of illumination non-uniformity, the focus distance of the lens is set to infinity.
An offset frame is a picture taken in complete darkness. It can be used to estimate the sensor noise parameters that are independent of lighting conditions, i.e., the parameters corresponding to $N_{read}$, $N_r$ and $N_q$. During acquisition, the lens must be capped and the shooting done in a dark room.
Further, under the noise model assumption of the method, the variance $\mathrm{Var}(D)$ of a flat-field frame RAW image $D$ is linearly related to its theoretical true value $\mathbb{E}(D)$, and the proportionality coefficient is $K$. Thus, a linear fit of $\mathrm{Var}(D)$ against $\mathbb{E}(D)$ is performed, and the slope is the system gain $K$.
In actual shooting, the paper surface and the illumination cannot be guaranteed to be perfectly uniform. To obtain a more accurate variance, the effect of the non-uniformity is reduced by shooting two identical frames at each exposure time and differencing them, and only the central 200x200 area of each flat-field frame is used. Considering the nature of the variance, the residual map obtained by subtracting the two frames must be divided by $\sqrt{2}$ to ensure that the correct variance is obtained. To accurately obtain the theoretical true value, the average of the two frames is computed and the median of its pixels is taken as the true value.
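A minimal sketch of this gain calibration on simulated flat-field pairs (NumPy; the photon counts and the true gain are toy values, and the flat fields are simulated as pure shot noise):

```python
import numpy as np

rng = np.random.default_rng(2)
K_true = 2.0

# Simulated flat-field pairs at several exposure levels (photon counts).
means, variances = [], []
for photons in [50, 100, 200, 400]:
    f1 = K_true * rng.poisson(photons, size=(200, 200))
    f2 = K_true * rng.poisson(photons, size=(200, 200))
    # Differencing two identical frames cancels fixed illumination
    # non-uniformity; the variance of the residual is twice the per-frame
    # variance, hence the division by 2 (equivalently, residual / sqrt(2)).
    variances.append(np.var(f1 - f2) / 2.0)
    # Median of the two-frame average serves as the theoretical true value.
    means.append(np.median((f1 + f2) / 2.0))

# Slope of the Var(D)-versus-E(D) line is the system gain K.
K_est = np.polyfit(means, variances, 1)[0]
```

For Poisson shot noise, Var(D) = K^2 * photons and E(D) = K * photons, so the fitted slope recovers K.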
To estimate the readout noise (including $N_{read}$ and $N_r$), offset frames are shot at the minimum exposure time at ISO values from 50 to 6400. To estimate the parameters of the row noise, the mean of each row of the offset frame is computed; fitting a zero-mean Gaussian to all row means of the same frame by maximum likelihood estimation yields the standard deviation of the row noise. Since, apart from the row noise of a given row, the remaining noise is zero-mean, the pixel mean of each row can be used to estimate the row noise in the RAW data. After the row noise is calibrated, this component is subtracted before the other noise parameters are calibrated.
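The row-noise estimation step can be sketched as follows (NumPy; the frame size and noise levels are fabricated toy values):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma_row_true = 0.5

# Simulated offset (bias) frame: per-row offsets (row noise) plus
# zero-mean pixel-level readout noise.
row_offsets = rng.normal(0.0, sigma_row_true, size=(512, 1))
bias = row_offsets + rng.normal(0.0, 1.0, size=(512, 1024))

# Because the remaining noise is zero-mean, the mean of each row estimates
# that row's noise value; the MLE of a zero-mean Gaussian's standard
# deviation is the root mean square of those row means.
row_means = bias.mean(axis=1)
sigma_row_est = np.sqrt(np.mean(row_means ** 2))

# The calibrated row component is subtracted before calibrating the rest.
residual = bias - row_means[:, None]
```

Subtracting each row's mean leaves a frame whose rows are zero-mean, ready for calibrating the remaining readout noise.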
Further, the remaining readout noise $N_{read}$ is calibrated. To estimate the readout noise more accurately, a Tukey Lambda distribution is used to fit the noise distribution and magnitude. The Tukey Lambda distribution is a family of distributions that can approximate a range of common distributions; the parameter controlling the distribution category is the shape parameter $\lambda$, and changing $\lambda$ determines the specific shape of the distribution.
After the shape parameter is determined, the standard deviation of the readout noise is determined by maximum likelihood estimation, so that the probability distribution model fits the distribution of the sample points. At this point, for a given ISO value, the corresponding system gain $K$ and the other noise parameters can all be determined.
Further, to determine the noise parameters at an arbitrary ISO, the joint distribution of the system gain and the noise parameters must be estimated. Analysis shows that ISO is accurately proportional to the system gain $K$; therefore, taking ISO = 800 as the reference, the system gain at ISO = $iso$ is obtained as:

$$K_{iso} = \frac{iso}{800} \cdot K_{800} \qquad (4)$$
For the other noise parameters, analysis shows that the shape parameter $\lambda$ of the Tukey Lambda distribution is essentially unaffected by ISO; therefore, only the relationship of the row noise standard deviation $\sigma_r$ and the readout noise standard deviation $\sigma_{read}$ to the system gain $K$ needs to be considered. Taking the logarithms of $K$, $\sigma_r$ and $\sigma_{read}$, a linear least-squares fit is performed in log scale, and a Gaussian distribution whose mean is the least-squares result is used to estimate and sample the parameters $\sigma_r$ and $\sigma_{read}$.
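The log-scale least-squares fit and sampling can be sketched as follows (NumPy; the gain/noise pairs and their power-law ground truth are fabricated toy data, not calibrated sensor values):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical calibrated (K, sigma_read) pairs at several ISO settings,
# generated from a toy power law with small multiplicative scatter.
K = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
sigma_read = 0.3 * K ** 0.9 * np.exp(rng.normal(0.0, 0.05, size=K.shape))

# Linear least squares in log-log space: log(sigma_read) = a*log(K) + b.
a, b = np.polyfit(np.log(K), np.log(sigma_read), 1)

def sample_sigma_read(K_query, scatter, rng):
    """Sample sigma_read for a gain K_query from a Gaussian (in log space)
    centred on the least-squares fit, then map back via exp."""
    mean_log = a * np.log(K_query) + b
    return np.exp(rng.normal(mean_log, scatter))
```

Sampling in log space guarantees positive noise standard deviations while reproducing the fitted trend.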
Step S20: generating a random 4x4 seed matrix, up-sampling the seed matrix to a preset size, and normalizing the generated result to obtain a smooth mask;
Specifically, in the up-sampling process, a bicubic interpolation method is adopted to perform interpolation operation, and the method is used for performing cubic spline interpolation on pixels in the field of 4x4 pixels when an image is amplified.
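A minimal sketch of the mask generation, assuming SciPy's cubic-spline `zoom` as the upsampler (the 4x4 seed size follows the text; min-max normalization is one plausible reading of the normalization step):

```python
import numpy as np
from scipy.ndimage import zoom

def generate_smooth_mask(target_h, target_w, rng):
    """Upsample a random 4x4 seed matrix to the target size with cubic
    spline interpolation (order=3), then normalize the result into [0, 1]."""
    seed = rng.random((4, 4))
    mask = zoom(seed, (target_h / 4, target_w / 4), order=3)
    mask = (mask - mask.min()) / (mask.max() - mask.min())
    return mask

rng = np.random.default_rng(5)
mask = generate_smooth_mask(256, 256, rng)
```

Because the seed is tiny and the interpolation is smooth, the resulting mask varies gently across the full image, which is what later guides the gradual transitions between dark, normal and overexposed regions.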
Step S30: remapping the generated mask result to the value range space of [0,1], and carrying out weighted fusion on the image by using the mapped mask to generate a final high dynamic range remote sensing image synthesis result;
Specifically, adjacent pixel values of the generated smooth mask are not strictly continuous. If the mask were used directly to weight and blend the images, the pixel differences between the overexposed, dark and original images would be too large, producing visible layering and preventing a natural result. Therefore, to synthesize a more natural high dynamic range image, two segments of quadratic functions are introduced, and the smooth mask values are remapped before being used as the final weights.
The specific synthesis method is as follows: when the mask intensity is 0.5, the corresponding pixel of the synthesized image uses the original image; when the mask intensity is greater than 0.5, the overexposed image is mixed with the original image, and in particular, when the mask intensity is 1, the overexposed image is used entirely; when the mask intensity is less than 0.5, the dark image is mixed with the original image, and in particular, when the mask intensity is 0, the dark image is used entirely. To reduce the pixel-value gap between the extreme illumination image and the original image, the proportion of the extreme illumination image is moderately increased by mapping the weight through a quadratic function when mixing. This process can be expressed as:

$$w = \begin{cases} 1 - (2 - 2m)^2, & m \ge 0.5 \\ 1 - (2m)^2, & m < 0.5 \end{cases} \qquad I_{final} = \begin{cases} w\,I_{over} + (1 - w)\,I, & m \ge 0.5 \\ w\,I_{dark} + (1 - w)\,I, & m < 0.5 \end{cases}$$

where $m$ is the mask value, $w$ is the mapped weight, $I$ is the unprocessed original remote sensing image, $I_{over}$ is the generated overexposed image, $I_{dark}$ is the generated dark image, and $I_{final}$ is the final blended result.
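The guided fusion can be sketched as follows (NumPy; the concave quadratic remapping below is one plausible realization of the two-segment mapping described in the text: it keeps the original image where the mask is 0.5 and boosts the share of the extreme-illumination image as the mask approaches 0 or 1):

```python
import numpy as np

def fuse_hdr(mask, img_orig, img_over, img_dark):
    """Blend original/overexposed/dark images guided by a smooth mask.

    mask values >= 0.5 mix toward the overexposed image (fully at 1.0);
    values < 0.5 mix toward the dark image (fully at 0.0); a value of
    exactly 0.5 keeps the original pixel.
    """
    w_over = np.where(mask >= 0.5, 1.0 - (2.0 - 2.0 * mask) ** 2, 0.0)
    w_dark = np.where(mask < 0.5, 1.0 - (2.0 * mask) ** 2, 0.0)
    w_orig = 1.0 - w_over - w_dark
    return w_over * img_over + w_dark * img_dark + w_orig * img_orig

# Toy example: three pixels at the mask extremes and midpoint.
mask = np.array([[0.0, 0.5, 1.0]])
orig = np.full((1, 3), 0.5)
over = np.full((1, 3), 0.9)
dark = np.full((1, 3), 0.1)
out = fuse_hdr(mask, orig, over, dark)  # -> [[0.1, 0.5, 0.9]]
```

At the mask endpoints the output is exactly the dark or overexposed image, and at 0.5 it is exactly the original, so the three weights always sum to one.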
FIG. 2 is a system diagram of the apparatus of the present invention. The smooth mask guided high dynamic range remote sensing image data generation system comprises an overexposure data generation module M10, a dim light data generation module M20, a smooth mask generation module M30 and a basic data fusion module M40.
Wherein:
The overexposure data generating module M10 is configured to generate overexposed image data from the input normal remote sensing image data.
The dim light data generation module M20 is configured to generate dim light image data from the input normal remote sensing image data.
The smoothing mask generation module M30 is configured to generate a smoothing mask for guiding image fusion.
The basic data fusion module M40 is configured to remap the smooth mask and guide the fusion of the normal, dark and overexposed images into final high dynamic range remote sensing image data.
The connection relation between the modules is as follows:
the outputs of the overexposure data generation module M10, the dim light data generation module M20 and the smooth mask generation module M30 are connected to the input of the basic data fusion module M40.
The method solves the problem that existing data synthesis methods generate synthetic remote sensing data only for a single illumination scene, provides a high dynamic range remote sensing image data synthesis scheme closer to the real situation for large-scale remote sensing image enhancement tasks, and facilitates training an image enhancement model on real high dynamic range remote sensing data.