CN119006305A

CN119006305A - Method and system for synthesizing high dynamic range remote sensing image data guided by smooth mask

Info

Publication number: CN119006305A
Application number: CN202411430013.2A
Authority: CN
Inventors: 付莹; 王嘉楠; 李和松; 聂婧; 刘乾坤
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2024-10-14
Filing date: 2024-10-14
Publication date: 2024-11-22

Abstract

The present invention proposes a smoothing mask-guided high dynamic range remote sensing image data synthesis method and system, which belongs to the field of image information processing technology. For normal remote sensing image data, the present invention generates an overexposed image by multiplying the brightness coefficient parameter. For normal remote sensing image data, the inverse process of image signal processing is simulated through the full-element physical quantity noise model and the parameters obtained by sensor calibration, and the RGB remote sensing image is inversely processed into a RAW image, and then the brightness is lowered and the noise model is used to add noise to synthesize low light band noise data. This method improves the quality of the data set, makes the image enhancement model more suitable for processing real high dynamic range remote sensing imaging data, and provides a high dynamic range remote sensing image data synthesis solution that is closer to the actual situation for large-scale remote sensing image enhancement tasks.

Description

Method and system for synthesizing high dynamic range remote sensing image data guided by smooth mask

Technical Field

The invention relates to a remote sensing image data synthesis method and system, in particular to a high dynamic range remote sensing image data synthesis method and system guided by a smooth mask, and belongs to the technical field of image information processing.

Background

When a high-brightness area irradiated by a strong light source (sunlight, lamp, reflective light, etc.) and an area with relatively low brightness such as shadow, backlight, etc. exist in an image at the same time, the image output by the camera can appear that the bright area becomes white due to overexposure and the dark area becomes black due to underexposure. That is, the camera's behavior in the same scene for the brightest and darker areas is limited, which is the "dynamic range".

Because of the characteristic of large sensing range of the sensor, overexposure and darkness areas can exist in the remote sensing image, so that the image quality is seriously affected. To address the problem of limited dynamic range of the sensor, one effective approach is to use high dynamic range imaging techniques that aid in image restoration by deep learning. For this reason, it is necessary to provide a data set composed of paired low-quality-high-quality remote sensing images in a high dynamic range application scene for a depth learning-based image enhancement model. Unlike other remote sensing data, the high dynamic range remote sensing data refers to remote sensing image data which is photographed in an analog high dynamic range imaging environment provided for applying the high dynamic range technology and needs to be enhanced, wherein the remote sensing image data comprises dark light and overexposure areas with different degrees.

Image enhancement techniques based on deep learning require a large amount of data to train the model, but the acquisition of real pair images is difficult, requiring high time and labor costs. In addition, the real dataset has the following two problems:

1. The shooting scene of real data is limited. For extreme illuminance images, the diversity of shooting scenes is limited by ensuring that no displacement exists between the extreme illuminance images and the corresponding normal illumination images during imaging, so that the imaging equipment and the shooting scenes are required to be ensured to be relatively static; for an image containing a shadow area caused by a solar altitude angle, shooting scene diversity is limited not only by the above factors, but also by a scene with a large change in topography that needs to be selected during imaging. Due to the common influence of the solar altitude and the topography, it is difficult to obtain a remote sensing image containing an extreme illuminance region and a contrast image not containing the extreme illuminance region in the same scene. This greatly increases the cost of capturing image data and limits the selection of a captured scene.

2. The photographed image data is affected by the characteristics of the imaging device, and it is difficult to migrate well to other imaging devices. The characteristics of hardware of different imaging devices are different, and shot data also has different noise, so that when an algorithm model learned from image data shot by one imaging device is applied to an image shot by another imaging device, consistent performance and effect are difficult to obtain, namely, the algorithm model cannot be adaptively generalized aiming at the currently deployed imaging device, and optimal performance cannot be obtained. The neural network model based on deep learning requires a large amount of data to train to ensure the performance, so that it is required to make the simulation data as close as possible to the extreme illumination image shot by the real imaging device.

Current methods of synthesizing simulated data typically only handle a single task. For example, the original image is entirely converted into a dark-light image or an overexposed image. However, for remote sensing image data, the size of the remote sensing image data can reach hundreds of millions of pixels, the image of the remote sensing image data can simultaneously contain dark light and overexposed areas, and if only a pure dark light-overexposed image is used as a training set, the remote sensing image imaging range under the real illumination condition can not be closed, so that the processing capacity of the model for high dynamic range data is insufficient.

In order to generate synthetic data more closely approaching to the real remote sensing imaging situation, a method capable of generating synthetic data of a scene with normal, dim light and overexposure at the same time and enabling pixel values of a transition area to be smooth and natural is urgently needed.

Disclosure of Invention

Aiming at the defects and shortcomings existing in the prior art, the invention creatively provides a high dynamic range remote sensing image data synthesis method and system guided by using a smooth mask, which aims at solving the defect that the existing method only aims at the single extreme illumination condition to generate synthesized data and cannot be applied to the condition that shadows, overexposure, normal areas and the like exist in an actual large-scale remote sensing image at the same time.

The invention provides a high dynamic range training data synthesis scheme which is more suitable for the illumination condition of real data aiming at the image enhancement task of a large-scale remote sensing image, improves the quality of a data set, and enables an image enhancement model to be more suitable for processing the real high dynamic range remote sensing imaging data.

In order to achieve the above purpose, the invention is realized by adopting the following technical scheme.

A method for synthesizing high dynamic range remote sensing image data guided by a smooth mask comprises the following steps:

step 1: and generating basic data, namely generating a corresponding overexposure chart and a corresponding darkness chart through the input original image.

Wherein, for normal remote sensing image data, an overexposed image is generated by multiplying the brightness coefficient parameter.

And for normal remote sensing image data, simulating the reverse process of image signal processing through a full-element physical quantity noise model and parameters obtained by calibrating a sensor, reversely processing an RGB remote sensing image into a RAW image, reducing the brightness, and using the noise model to add noise to synthesize low-band noise data.

The parameter calibration of the sensor can be obtained by shooting a flat field frame and an offset frame.

In addition, the normal image data can be subjected to noise adding operation by using calibrated sensor parameters.

Step 2: and generating a random seed matrix, and upsampling the random seed matrix to a preset size to obtain a smooth mask.

The size of the preset up-sampling result is the size of input data of other subsequent computer vision tasks.

Step 3: and remapping the generated mask result to the value range space of [0,1], and carrying out weighted fusion on the images by using the mapped mask to generate a final high dynamic range remote sensing image synthesis result.

Meanwhile, the invention provides a smooth mask guided high dynamic range remote sensing image data generation system which comprises an overexposure data generation module, a dim light data generation module, a smooth mask generation module and a basic data fusion module. The output ends of the overexposure data generation module, the darkness data generation module and the smooth mask generation module are connected with the input end of the basic data fusion module.

Advantageous effects

The invention effectively solves the defect that the existing data synthesis method only aims at a single illumination scene to generate synthesized remote sensing data, provides a high dynamic range remote sensing image data synthesis scheme which is closer to the real situation for a large-scale remote sensing image enhancement task, and is beneficial to training an image enhancement model of the high dynamic range real remote sensing data.

Drawings

FIG. 1 is a general flow chart of the method of the present invention;

fig. 2 is a schematic diagram of the system of the present invention.

Detailed Description

The invention will be described in further detail with reference to the drawings and examples.

Embodiment as shown in fig. 1, a method for synthesizing high dynamic range remote sensing image data guided by a smooth mask includes the following steps:

step S10: and generating basic data, namely generating a corresponding overexposure chart and a corresponding darkness chart through the input original image.

Step S10 further comprises two steps:

Step S11: and generating an overexposed image by multiplying the brightness coefficient parameters aiming at the normal remote sensing image data.

Specifically, overexposed images are synthesized using random parameters. In the present embodiment, use is made ofAnd (3) withGenerating an overexposure map by two parameters, wherein the generation of overexposure data is shown in the following formula:

（1）

Wherein, Is the generated overexposure map; is the original image after cutting; is a luminance coefficient representing the magnification of the image pixel value; Is an offset, and the default value is 0.

Step S12: for normal remote sensing image data, the reverse process of an image signal processing module is simulated through a full-element physical quantity noise model and parameters obtained by calibrating a sensor, an RGB remote sensing image is reversely processed into a RAW image, the brightness is reduced, and noise is added by using the noise model to synthesize low-band noise data.

Specifically, in this embodiment, a noise calibration and modeling scheme based on a physical model is adopted. For noise model establishment, in the process of shooting an image by using a camera, RAW data obtained by shooting is obtainedProportional to the number of photons incident on the sensorAnd the ratio is affected by the combination of the analog and digital gains of the camera.

Let the gain of the imaging system beConsidering a wide variety of noise, this process is expressed as:

（2）

Wherein, Representing the sum of all noise during the shooting.

The invention considers photon shot noise in the imaging processRead-out noiseIn the form of line noiseDominant sensor banding noise and quantization noise generated during digital-to-analog conversion. Shot noiseFor the noise at the photon level,，AndIs considered as noise at the digital signal level. Thus, the noise is expressed as:

（3）

Further, for noise parameter calibration, in order to accurately model different sensors of a plurality of different remote sensing photographing devices, the noise parameter calibration needs to be performed for the different sensors. For each sensor, two special calibration frames are required to be shot to complete parameter calibration, including a flat field frame and an offset frame.

Imaging system gainReflecting the relationship between the gray value of the captured image and the number of photons captured by the lens. To estimateIn the case of a fixed ISO, the method captures a series of flat field frames of different exposure times for the known sensor.

Wherein, flat field frames are pictures taken with illumination uniformly striking the sensor. In actual photographing, white paper is attached to a wall surface uniformly irradiated with natural light (or a direct current light source) as a photographing target by fixing a camera using a tripod, and a mobile phone is remotely controlled to adjust exposure time and perform photographing so as to avoid shaking of a lens. In addition, in order to reduce the influence of the unevenness of illumination, the focusing distance of the lens is adjusted to infinity.

Framing is a picture taken under full darkness conditions, which can be used to estimate other noise parameters of the sensor independent of lighting conditions, i.eAndCorresponding parameters. In the actual acquisition process, the lens needs to be covered in a dark room for shooting.

Further, according to the noise model assumption of the method, a flat field frame RAW imageVariance of (2)And theoretical true valueIs a linear relationship between the variances of (a) and the proportionality coefficient is. Thus, forVariance sum of (2)Performing linear fitting, wherein the slope is the system gain。

In the actual shooting process, the paper surface and the illumination cannot be ensured to completely meet the uniform condition, so as to obtain more accurate varianceThe effect of the non-uniformity is reduced by photographing the same two frames at each exposure time and differencing the two frames, and only the middle 200x200 area of each Ping Chang frame is truncated. Considering the nature of the variance, the residual map obtained by subtracting two frames needs to be divided byTo ensure that the correct variance is obtained. In order to accurately obtain a theoretical true value, the average value of two frames is calculated, and then the median of the pixels is taken as the true value.

To estimate readout noise (includingAnd) Using minimum exposure time to shoot offset frames with ISO of 50-6400. In order to estimate the relevant parameters of the line noise, the mean value of each line of the offset frame can be calculated, and the maximum likelihood estimation is used for fitting the Gaussian distribution of the zero mean value by using all the line mean values of the same frame, so that the standard deviation of the line noise can be obtained. Since the remaining noise is zero-mean except for the line noise for a certain line, the pixel mean of each line can be used to estimate the line noise in the RAW data. After the calibration of the line noise is completed, the component needs to be subtracted, and then other noise parameters are calibrated continuously.

Further, the remaining readout noise is performedAnd noise parameter calibration. To be able to estimate the readout noise more accurately, a Tukey Lambda distribution is used to fit the noise distribution and size. The Tukey Lambda distribution is a family of distributions that can be used to fit a range of common distributions. Wherein the parameters controlling the distribution category are shape parametersBy changingIs used to determine the specific shape of the distribution.

After the shape parameters are determined, standard deviation parameters of readout noise are determined by adopting a mode of maximum likelihood estimation, so that a probability distribution model can be fitted to the distribution of sample points. To this end, the corresponding system gain can be determined for the determined ISO valueAs well as other noise parameters.

Further, in order to determine various noise parameters at any ISO, it is necessary to further estimate the joint distribution of the system gain and the noise parameters. Analysis can find ISO and system gainIn an accurate and proportional relationship, therefore, with ISO=800 as a reference, ISO=can be obtainedTime-dependent system gain：（4）

For other noise parameters, the shape parameters of Tukey Lambda distribution are found through analysisIs substantially unaffected by ISO and therefore can take into account only the line noise standard deviationAnd the standard deviation of readout noiseSystem gainIs a relationship of (2); gain to system，AndTaking logarithms respectively, performing linear least square fitting under logarithmic scale, and using a Gaussian distribution with least square result as mean value to obtain parametersAndEstimation and sampling are performed.

Step S20: generating a random 4x4 seed matrix, up-sampling the seed matrix to a preset size, and normalizing the generated result to obtain a smooth mask;

Specifically, in the up-sampling process, a bicubic interpolation method is adopted to perform interpolation operation, and the method is used for performing cubic spline interpolation on pixels in the field of 4x4 pixels when an image is amplified.

Step S30: remapping the generated mask result to the value range space of [0,1], and carrying out weighted fusion on the image by using the mapped mask to generate a final high dynamic range remote sensing image synthesis result;

specifically, for the generated smooth mask, since adjacent pixel values cannot actually be continuous in a strict sense, if the mask is directly used for weighting and synthesizing the image, the difference between the pixels of the overexposure, the darkness and the original image is too large, so that more obvious image layering occurs and a more natural image cannot be generated, therefore, in order to synthesize a more natural high dynamic range image, two sections of quadratic functions are introduced, and the value of the smooth mask is mapped and then used as the final weight.

The specific synthesis method is as follows: when the mask intensity is 0.5, the corresponding pixels of the synthesized image adopt original images, when the mask intensity is more than 0.5, the overexposed image is mixed with the original images, and particularly, when the mask intensity is 1, the overexposed image is completely used; when the mask intensity is less than 0.5, the dark-light image is mixed with the original image, particularly when the mask intensity is 0, the dark-light image is completely used, and in order to reduce the pixel value difference between the extreme illumination condition and the original image, the proportion of the extreme illumination image is properly increased through quadratic function mapping weight when the images are mixed. This process can be expressed as: Wherein, Is the value of the mask that is to be used,For the weights to be mapped to,As an unprocessed raw remote sensing image,In order to generate an overexposed image,In order to generate a dark-light image,Is a final graph of the mixing results.

FIG. 2 is a system diagram of the apparatus of the present invention. The smooth mask guided high dynamic range remote sensing image data generation system comprises an overexposure data generation module M10, a dim light data generation module M20, a smooth mask generation module M30 and a basic data fusion module M40.

Wherein:

The overexposure data generating module M10 is configured to generate overexposed image data from the input normal remote sensing image data.

The dim light data generation module M20 is configured to generate dim light image data from the input normal remote sensing image data.

The smoothing mask generation module M30 is configured to generate a smoothing mask for guiding image fusion.

The basic data fusion module M40 is configured to remap the smooth mask and guide the fusion of the normal, dark and overexposed images into final high dynamic range remote sensing image data.

The connection relation between the modules is as follows:

the output ends of the overexposure data generation module M10, the darkness data generation module M20 and the smooth mask generation module M30 are connected with the input end of the basic data fusion module M40.

The method can solve the problem that the existing data synthesis method only aims at a single illumination scene to generate synthesized remote sensing data, provides a high dynamic range remote sensing image data synthesis scheme which is closer to the real situation for a large-scale remote sensing image enhancement task, and is beneficial to training an image enhancement model of the high dynamic range real remote sensing data.

Claims

1. A smooth mask-guided high dynamic range remote sensing image data synthesis method, characterized in that it includes the following steps: Step 1: basic data generation, generating corresponding over-exposure images and dark light images through the input original image; wherein, for normal remote sensing image data, the over-exposure image is generated by multiplying the brightness coefficient parameter; the parameter calibration of the sensor is obtained by shooting a flat field frame and a bias frame; for normal remote sensing image data, the inverse process of image signal processing is simulated through the full-element physical quantity noise model and the parameters obtained by sensor calibration, the RGB remote sensing image is inversely processed into a RAW image, and then the brightness is lowered and the noise model is used to add noise to synthesize low light noise data; Step 2: generating a random seed matrix, upsampling it to a preset size, and obtaining a smooth mask; wherein the preset upsampling result size is the input data size of subsequent other computer vision tasks; Step 3: remapping the generated mask result to the value domain space of [0,1], and using the mapped mask to weightedly fuse the image to generate the final high dynamic range remote sensing image synthesis result.

2. The smooth mask-guided high dynamic range remote sensing image data synthesis method according to claim 1, characterized in that, in step 1, the overexposed image is synthesized using random parameters as guidance;

use and The two parameters generate an overexposure map, and the generation of overexposure data is shown in the following formula: in, is the generated overexposed image; is the original image after cutting; is the brightness coefficient, which represents the scaling factor of the image pixel value; is the offset, the default value is 0.

3. The method for synthesizing high dynamic range remote sensing image data guided by a smooth mask as claimed in claim 1, characterized in that in step 1, for establishing a noise model, in the process of using a camera to shoot an image, the RAW data obtained by shooting Proportional to the number of photons incident on the sensor , and the ratio is affected by the camera analog gain and digital gain; let the imaging system gain be , considering various noises, the process can be expressed as: in, Represents the sum of all noise during the capture process; considers the photon shot noise during the imaging process , read noise , with line noise The main sensor stripe noise and quantization noise generated during digital-to-analog conversion ; Shot noise is the noise at the photon level, , as well as Considered as noise at the digital signal level, the noise is expressed as: Noise parameter calibration is performed for different sensors. For each sensor, two special calibration frames are taken to complete parameter calibration, including flat field frame and bias frame. Imaging system gain Reflects the relationship between the grayscale value of the captured image and the number of photons captured by the lens; The value of the flat-field frame is taken with different exposure times for a known sensor under a fixed ISO. According to the noise model assumption, the flat-field frame RAW image Variance Theoretical truth value There is a linear relationship between the variances of ;right The variance and Perform a linear fit, and the slope is the system gain ; After determining the shape parameters, the standard deviation parameters of the readout noise are determined by maximum likelihood estimation, so that the probability distribution model can fit the distribution of the sample points; at this point, the corresponding system gain can be calculated for the determined ISO value And other noise parameters; To determine the various noise parameters at any ISO, it is necessary to further estimate the joint distribution of system gain and noise parameters, ISO and system gain It is an accurate positive proportional relationship. Taking ISO=800 as the benchmark, we get ISO= The corresponding system gain is : For other noise parameters, only the row noise standard deviation is considered and readout noise standard deviation With system gain Relationship between system gain , and Take the logarithm respectively, perform linear least squares fitting on the logarithmic scale, and use a Gaussian distribution with the least squares result as the mean to calibrate the parameters and Perform estimation and sampling.

4. The smooth mask guided high dynamic range remote sensing image data synthesis method as claimed in claim 3, characterized in that the flat field frame is a picture taken when the light is uniformly shining on the sensor;

The bias frame is a picture taken in complete darkness and is used to estimate other noise parameters of the sensor that are independent of the lighting conditions, namely and The corresponding parameters.

5. The method for synthesizing high dynamic range remote sensing image data guided by a smoothing mask as described in claim 3 is characterized in that, in order to estimate the relevant parameters of the readout noise, a bias frame with an ISO of 50 to 6400 is shot using a minimum exposure time; in order to estimate the relevant parameters of the row noise, the mean of each row of the bias frame is calculated, and the mean of all rows of the same frame is used to fit the Gaussian distribution of zero mean using maximum likelihood estimation to obtain the standard deviation of the row noise; in the RAW data, the row noise is estimated using the pixel mean of each row; after completing the calibration of the row noise, the calibrated row noise needs to be subtracted, and then other noise parameters are calibrated.

6. The smoothing mask guided high dynamic range remote sensing image data synthesis method as claimed in claim 3, characterized in that the Tukey Lambda distribution is used to fit the distribution and size of the noise; the Tukey Lambda distribution is a distribution family used to fit a series of common distributions, wherein the parameter controlling the distribution category is the shape parameter , by changing The specific shape of the distribution can be determined in a way that

7. The smoothing mask guided high dynamic range remote sensing image data synthesis method as claimed in claim 3, characterized in that to obtain the variance , by taking two identical frames at each exposure time and subtracting the two frames to reduce the impact of non-uniformity, and only intercepting the middle 200x200 area of each flat field frame; the residual map obtained by subtracting the two frames needs to be divided by , to ensure the correct variance; average the two frames and take the median of the pixels as the true value.

8. The smoothing mask guided high dynamic range remote sensing image data synthesis method as described in claim 1 is characterized in that in step 2, a random 4x4 seed matrix is generated, which is upsampled to a preset size, and the generated result is normalized to obtain a smoothing mask; during the upsampling process, a bicubic interpolation method is used for interpolation operation, which is used to perform cubic spline interpolation on pixels within a 4x4 pixel neighborhood when the image is enlarged.

9. The method for synthesizing high dynamic range remote sensing image data guided by a smooth mask as claimed in claim 1 is characterized in that, in step 3, two quadratic functions are introduced, and the values of the smooth mask are mapped as the final weights; the specific synthesis method is: when the mask strength is 0.5, the corresponding pixels of the synthesized image use the original image, when the mask strength is greater than 0.5, the overexposed image is mixed with the original image, and when the mask strength is 1, the overexposed image is used completely; when the mask strength is less than 0.5, the dark light image is mixed with the original image, and when the mask strength is 0, the dark light image is used completely. In order to reduce the pixel value difference between the extreme illumination condition and the original image, when mixing the images, the proportion of the extreme illumination image is appropriately increased by mapping the weights through the quadratic function; the process is expressed as: in, is the value of the mask, is the weight after mapping, is the original remote sensing image that has not been processed. To generate an overexposed image, To generate a dark light image, This is the final mixed result image.

10. A smoothing mask guided high dynamic range remote sensing image data synthesis system, characterized in that it includes an overexposure data generation module (M10), a dark light data generation module (M20), a smoothing mask generation module (M30) and a basic data fusion module (M40); wherein: the overexposure data generation module (M10) is used to generate overexposure image data through input normal remote sensing image data; the dark light data generation module (M20) is used to generate dark light image data through input normal remote sensing image data; the smoothing mask generation module (M30) is used to generate a smoothing mask for guiding image fusion; the basic data fusion module (M40) is used to remap the smoothing mask and guide the fusion of normal, dark light and overexposed images into final high dynamic range remote sensing image data; the connection relationship between the above modules is: the output ends of the overexposure data generation module (M10), the dark light data generation module (M20) and the smoothing mask generation module (M30) are all connected to the input end of the basic data fusion module (M40).