
CN116051687A - Synthetic image processing method, apparatus, computer device, and storage medium - Google Patents


Info

Publication number
CN116051687A
Authority
CN
China
Prior art keywords
image
foreground
loss
illumination
background
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310090153.9A
Other languages
Chinese (zh)
Inventor
李肯立
赵杏林
谭光华
朱宁波
段明星
唐卓
刘楚波
李克勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University
Original Assignee
Hunan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University filed Critical Hunan University
Priority to CN202310090153.9A
Publication of CN116051687A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 Two-dimensional [2D] image generation
    • G06T 11/60 Creating or editing images; Combining images with text
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA], or self-organising maps [SOM]; Blind source separation
    • G06V 10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

The present application relates to a composite image processing method, apparatus, computer device, storage medium, and computer program product. The method includes: acquiring a composite image, a mask of the foreground image, and a mask of the background image; a first generator of an image processing model extracts foreground light-and-shadow features from the composite image and the foreground mask and generates a first composite image; a second generator extracts background light-and-shadow features from the composite image and the background mask and generates a second composite image. Because the foreground and background light-and-shadow features are constrained to be similar, the first and second composite images share similar lighting characteristics. Fusing them yields a harmonized target composite image, and light and shadow are generated according to the lighting features of the foreground and background images, producing a harmonized target composite image with consistent light and shadow.

Figure 202310090153


Description

Composite image processing method, apparatus, computer device, and storage medium

Technical Field

The present application relates to the field of computer vision, and in particular to a composite image processing method, apparatus, computer device, storage medium, and computer program product.

Background Art

As a common image-editing operation, image compositing serves applications in entertainment, art, and commerce, and can augment datasets for downstream vision tasks. For example, a user can replace the background of a selfie and rely on compositing techniques to make the resulting image look realistic. Compositing is also used in automated advertising, helping advertisers insert products into background scenes. In the compositing process, a foreground object is first cut out of one image using segmentation or matting and then pasted onto another image, producing a composite image. In such a composite, however, the foreground and background may have been captured under different conditions (e.g., weather, season, time of day, camera settings), so they exhibit different lighting characteristics, which makes the composite look disharmonious.

To resolve this disharmony, the appearance of the composited foreground must be adjusted to be compatible with the background. Traditional methods generally apply a color transformation to the foreground to match low-level color statistics between foreground and background. Lacking high-level information, these methods limit harmonization performance, and the statistics are mostly computed by hand, which is time-consuming. Recently, deep learning has achieved great success in computer vision tasks. With large network architectures and vast numbers of trainable parameters, deep learning algorithms optimize networks on large amounts of labeled data, show excellent performance in many fields, and have also been applied to image harmonization.

Current deep models mainly adopt a CNN-based encoder-decoder structure: an encoder captures the contextual information of the composite image, and a decoder reconstructs the harmonized composite. However, because CNNs carry an inherent local inductive bias, the harmonization results are often unsatisfactory.

Summary of the Invention

In view of the above technical problems, it is necessary to provide a composite image processing method, apparatus, computer device, computer-readable storage medium, and computer program product capable of harmonizing composite images.

In a first aspect, the present application provides a composite image processing method. The method includes:

acquiring a composite image, a mask of the foreground image, and a mask of the background image;

inputting the composite image and the mask of the foreground image into a first generator of an image processing model, extracting foreground light-and-shadow features from the mask of the foreground image, and obtaining a first composite image from the foreground light-and-shadow features and the composite image;

inputting the composite image and the mask of the background image into a second generator of the image processing model, extracting background light-and-shadow features from the mask of the background image, and obtaining a second composite image from the background light-and-shadow features and the composite image, wherein the foreground light-and-shadow features extracted by the first generator of the image processing model are similar to the background light-and-shadow features extracted by the second generator; and

fusing the first composite image and the second composite image to obtain a target composite image.
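The fusion step above can be sketched as follows. The mask-weighted blending rule and the toy arrays are illustrative assumptions, since the claim does not specify how the two generator outputs are combined:

```python
import numpy as np

def fuse_composites(first_img, second_img, fg_mask):
    """Fuse the two generator outputs into the target composite image.

    Hypothetical fusion rule (not fixed by the patent text): keep the
    first (foreground-conditioned) generator's pixels inside the
    foreground mask and the second (background-conditioned) generator's
    pixels elsewhere.
    """
    m = fg_mask.astype(float)
    return m * first_img + (1.0 - m) * second_img

# Toy 2x2 single-channel example.
first = np.full((2, 2), 0.8)   # first generator's output
second = np.full((2, 2), 0.2)  # second generator's output
mask = np.array([[1, 0], [0, 1]])  # binary foreground mask
fused = fuse_composites(first, second, mask)
print(fused)
```

Because both inputs already carry similar light-and-shadow features, a simple mask-weighted blend like this would not reintroduce lighting seams at the foreground boundary.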

In one embodiment, the method of training the image processing model includes:

acquiring an image dataset comprising multiple groups of data, each group consisting of two triplets: one triplet is an original composite image, a mask of the foreground image, and a mask of the background image; the other triplet is a real image, a real foreground illumination image, and a real background illumination image;

inputting the original composite image and the mask of the foreground image from the image dataset into the first generator of the image processing model, extracting foreground light-and-shadow features from the mask of the foreground image, and obtaining a third composite image from the foreground light-and-shadow features and the original composite image;

inputting the original composite image and the mask of the background image into the second generator of the image processing model, extracting background light-and-shadow features from the mask of the background image, and obtaining a fourth composite image from the background light-and-shadow features and the original composite image;

fusing the third composite image and the fourth composite image to obtain a harmonized image;

calculating a model loss from the harmonized image, the real image, the real foreground illumination image, and the real background illumination image; and

adjusting the parameters of the first generator and the second generator under the constraint of the model loss to obtain a trained image processing model, where the constraint includes: the foreground light-and-shadow features extracted by the first generator of the image processing model are similar to the background light-and-shadow features extracted by the second generator.
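The similarity constraint between the two generators' light-and-shadow features can be expressed as an auxiliary loss term. The mean-squared distance used below is an assumed metric; the text only requires that the two feature sets be similar:

```python
import numpy as np

def feature_similarity_loss(fg_features, bg_features):
    """Penalty pushing the first generator's foreground light-and-shadow
    features toward the second generator's background features.

    The patent states only that the two feature sets are constrained
    to be similar; the mean-squared distance is an illustrative
    assumption, not the disclosed metric.
    """
    return float(np.mean((fg_features - bg_features) ** 2))

# Identical feature vectors give zero loss; mismatched ones do not.
fg = np.array([0.2, 0.5, 0.9])
bg = np.array([0.2, 0.5, 0.9])
print(feature_similarity_loss(fg, bg))
```

Minimizing this term during training drives both generators toward a shared illumination representation, which is what makes the fused result harmonized.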

In one embodiment, the model loss includes an adversarial loss and an illumination loss.

Calculating the model loss from the harmonized image, the real image, the real foreground illumination image, and the real background illumination image includes:

inputting the harmonized image and the real image into a discriminator of the image processing model, the discriminator judging the authenticity of the harmonized image and the real image;

calculating the adversarial loss from the discriminator's output; and

calculating the illumination-feature loss from the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, and the difference between the real background illumination image and the background illumination image of the harmonized image.
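A minimal sketch of the illumination-feature loss described above, assuming an L1 (mean absolute) distance between the predicted and ground-truth illumination images; the patent names the two differences but not the exact distance function:

```python
import numpy as np

def illumination_loss(pred_fg, real_fg, pred_bg, real_bg):
    """Sum of the foreground and background illumination discrepancies.

    Each term is the mean absolute difference between a predicted
    illumination image and its ground truth; L1 is an assumed choice.
    """
    fg_term = np.mean(np.abs(pred_fg - real_fg))
    bg_term = np.mean(np.abs(pred_bg - real_bg))
    return float(fg_term + bg_term)

# Toy example: foreground off by 1 everywhere, background perfect.
pred_fg = np.zeros((2, 2))
real_fg = np.ones((2, 2))
pred_bg = np.zeros((2, 2))
real_bg = np.zeros((2, 2))
loss = illumination_loss(pred_fg, real_fg, pred_bg, real_bg)
print(loss)
```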

In one embodiment, the model loss further includes a non-illumination-feature loss.

The method further includes: calculating the non-illumination-feature loss from the non-illumination features of the same foreground image under different illumination; the goal of the non-illumination-feature loss function is to minimize the difference between non-illumination features under different illumination conditions.
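The non-illumination-feature loss can be sketched as the discrepancy between illumination-invariant features of the same foreground under two lighting conditions. The mean absolute difference below is an assumed metric, chosen only for illustration:

```python
import numpy as np

def non_illumination_loss(feats_light_a, feats_light_b):
    """Loss over illumination-invariant ("non-illumination") features.

    The same foreground rendered under two illumination conditions
    should yield nearly identical non-illumination features, so the
    loss is their mean absolute difference (an assumed metric).
    """
    return float(np.mean(np.abs(feats_light_a - feats_light_b)))

# Identical feature vectors from two lightings give zero loss.
feats_a = np.array([1.0, 2.0, 3.0])
feats_b = np.array([1.0, 2.0, 3.0])
loss0 = non_illumination_loss(feats_a, feats_b)
print(loss0)
```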

In one embodiment, the model loss further includes a perceptual loss.

The method further includes:

extracting a feature map of the harmonized image with a pre-trained neural network model;

extracting a feature map of the real image corresponding to the harmonized image with the pre-trained neural network model; and

determining the perceptual loss from the feature map of the harmonized image and the feature map of the corresponding real image; the goal of the perceptual loss function is to minimize the difference between these two feature maps.
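A sketch of the perceptual loss: feature maps of the harmonized and real images are compared under a fixed, pre-trained feature extractor. The `extractor` stand-in below (channel-wise mean pooling) is purely illustrative; in practice the pre-trained network mentioned above (e.g., VGG-style activations) would play this role:

```python
import numpy as np

def perceptual_loss(harmonized_img, real_img, feature_extractor):
    """Mean-squared difference between feature maps of the two images.

    `feature_extractor` stands in for the frozen pre-trained network;
    any callable mapping an image to a feature array works here.
    """
    f_h = feature_extractor(harmonized_img)
    f_r = feature_extractor(real_img)
    return float(np.mean((f_h - f_r) ** 2))

# Stand-in extractor: per-channel spatial mean (illustrative only).
extractor = lambda img: img.mean(axis=(0, 1))
harmonized = np.ones((4, 4, 3))
real = np.ones((4, 4, 3))
loss = perceptual_loss(harmonized, real, extractor)
print(loss)
```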

In one embodiment, the model loss is a weighted sum of the illumination-feature loss, the non-illumination-feature loss, the perceptual loss, the adversarial loss, and the absolute-deviation (L1) loss.
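The weighted combination can be sketched directly. The loss values and unit weights below are placeholders, since the text does not disclose the actual weighting coefficients:

```python
def total_model_loss(losses, weights):
    """Weighted sum of the individual loss terms.

    `losses` and `weights` are dicts keyed by term name; the weight
    values are hyperparameters not disclosed in the patent text.
    """
    return sum(weights[name] * losses[name] for name in losses)

# Placeholder per-term values for illustration only.
losses = {"illumination": 0.5, "non_illumination": 0.2,
          "perceptual": 0.1, "adversarial": 0.3, "l1": 0.4}
weights = {"illumination": 1.0, "non_illumination": 1.0,
           "perceptual": 1.0, "adversarial": 1.0, "l1": 1.0}
total = total_model_loss(losses, weights)
print(total)
```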

In a second aspect, the present application further provides a composite image processing apparatus. The apparatus includes:

a composite image acquisition module, configured to acquire a composite image, a mask of the foreground image, and a mask of the background image;

a first composite image acquisition module, configured to input the composite image and the mask of the foreground image into the first generator of the image processing model, extract foreground light-and-shadow features from the mask of the foreground image, and obtain a first composite image from the foreground light-and-shadow features and the composite image;

a second composite image acquisition module, configured to input the composite image and the mask of the background image into the second generator of the image processing model, extract background light-and-shadow features from the mask of the background image, and obtain a second composite image from the background light-and-shadow features and the composite image, wherein the foreground light-and-shadow features extracted by the first generator of the image processing model are similar to the background light-and-shadow features extracted by the second generator; and

a target composite image acquisition module, configured to fuse the first composite image and the second composite image to obtain a target composite image.

In a third aspect, the present application further provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:

acquiring a composite image, a mask of the foreground image, and a mask of the background image;

inputting the composite image and the mask of the foreground image into a first generator of an image processing model, extracting foreground light-and-shadow features from the mask of the foreground image, and obtaining a first composite image from the foreground light-and-shadow features and the composite image;

inputting the composite image and the mask of the background image into a second generator of the image processing model, extracting background light-and-shadow features from the mask of the background image, and obtaining a second composite image from the background light-and-shadow features and the composite image, wherein the foreground light-and-shadow features extracted by the first generator of the image processing model are similar to the background light-and-shadow features extracted by the second generator; and

fusing the first composite image and the second composite image to obtain a target composite image.

In a fourth aspect, the present application further provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the following steps when executed by a processor:

acquiring a composite image, a mask of the foreground image, and a mask of the background image;

inputting the composite image and the mask of the foreground image into a first generator of an image processing model, extracting foreground light-and-shadow features from the mask of the foreground image, and obtaining a first composite image from the foreground light-and-shadow features and the composite image;

inputting the composite image and the mask of the background image into a second generator of the image processing model, extracting background light-and-shadow features from the mask of the background image, and obtaining a second composite image from the background light-and-shadow features and the composite image, wherein the foreground light-and-shadow features extracted by the first generator of the image processing model are similar to the background light-and-shadow features extracted by the second generator; and

fusing the first composite image and the second composite image to obtain a target composite image.

In a fifth aspect, the present application further provides a computer program product comprising a computer program which, when executed by a processor, implements the following steps:

acquiring a composite image, a mask of the foreground image, and a mask of the background image;

inputting the composite image and the mask of the foreground image into a first generator of an image processing model, extracting foreground light-and-shadow features from the mask of the foreground image, and obtaining a first composite image from the foreground light-and-shadow features and the composite image;

inputting the composite image and the mask of the background image into a second generator of the image processing model, extracting background light-and-shadow features from the mask of the background image, and obtaining a second composite image from the background light-and-shadow features and the composite image, wherein the foreground light-and-shadow features extracted by the first generator of the image processing model are similar to the background light-and-shadow features extracted by the second generator; and

fusing the first composite image and the second composite image to obtain a target composite image.

With the above composite image processing method, apparatus, computer device, storage medium, and computer program product, a composite image, a mask of the foreground image, and a mask of the background image are acquired. The first generator of the image processing model extracts foreground light-and-shadow features from the composite image and the foreground mask and generates a first composite image; the second generator extracts background light-and-shadow features from the composite image and the background mask and generates a second composite image. Because the foreground and background light-and-shadow features are similar, the first and second composite images share similar lighting characteristics, so the target composite image obtained by fusing them is harmonized; light and shadow are further generated on the composite image according to the lighting features of the foreground and background images, yielding a harmonized target composite image with consistent light and shadow.

Brief Description of the Drawings

FIG. 1 is a diagram of the application environment of a composite image processing method in one embodiment;

FIG. 2 is a schematic flowchart of a composite image processing method in one embodiment;

FIG. 3 is a schematic flowchart of image processing model training in one embodiment;

FIG. 4 is a diagram of the network structure of the image processing model in one embodiment;

FIG. 5 is a schematic flowchart of determining the perceptual loss in one embodiment;

FIG. 6 is a schematic flowchart of a composite image processing method in another embodiment;

FIG. 7 is a schematic flowchart of a composite image processing method in one embodiment;

FIG. 8 is a structural block diagram of a composite image processing apparatus in one embodiment;

FIG. 9 is a diagram of the internal structure of a computer device in one embodiment.

Detailed Description

To make the purpose, technical solutions, and advantages of the present application clearer, the application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present application and do not limit it.

The composite image processing method provided in the embodiments of the present application can be applied in the environment shown in FIG. 1, where a terminal 102 communicates with a server 104 over a network. A data storage system stores the data that the server 104 needs to process; it can be integrated on the server 104 or placed on the cloud or another network server. The terminal 102 uploads a composite image. The server 104 obtains the composite image, the mask of the foreground image, and the mask of the background image over the network; inputs the composite image and the foreground mask into the first generator of the image processing model, extracts foreground light-and-shadow features from the foreground mask, and obtains a first composite image from those features and the composite image; inputs the composite image and the background mask into the second generator, extracts background light-and-shadow features from the background mask, and obtains a second composite image from those features and the composite image, the foreground light-and-shadow features extracted by the first generator being similar to the background light-and-shadow features extracted by the second generator; and then fuses the first and second composite images to obtain the target composite image. The terminal 102 may be, but is not limited to, a personal computer, laptop, smartphone, tablet, IoT device, or portable wearable device; IoT devices include smart TVs and smart in-vehicle devices, and portable wearable devices include smart watches, smart bracelets, and head-mounted devices. The server 104 may be implemented as an independent server or as a cluster of multiple servers.

In one embodiment, as shown in FIG. 2, a composite image processing method is provided, comprising the following steps:

Step 202: acquire a composite image, a mask of the foreground image, and a mask of the background image.

A composite image is formed by extracting a target object from a foreground image and pasting it onto another, background image. More broadly, a composite image grafts multiple visual elements from different images onto a single image.

The masks of the foreground and background images can be obtained by a mask operation on the composite image. A mask operation recomputes the value of each pixel in the image with a mask kernel: the kernel describes how strongly neighboring pixels influence the new pixel value, and the pixels are averaged using the weights in the kernel. The foreground mask gives the pixel values of the foreground region of the composite image, and the background mask gives the pixel values of the background region.
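The kernel-weighted averaging described above can be sketched as a small convolution-style loop. The 3x3 zero-padded neighborhood and the uniform kernel below are illustrative assumptions, not the patent's specific kernel:

```python
import numpy as np

def mask_kernel_average(img, kernel):
    """Recompute each pixel as the kernel-weighted average of its 3x3
    neighbourhood (zero-padded at the border), sketching the mask
    operation described above. The kernel is normalized so the result
    is a weighted average.
    """
    k = kernel / kernel.sum()
    h, w = img.shape
    padded = np.pad(img.astype(float), 1)  # zero padding at the border
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * k)
    return out

# On a constant image, interior pixels keep their value (up to rounding).
img = np.full((4, 4), 5.0)
kernel = np.ones((3, 3))  # uniform weights, assumed for illustration
smoothed = mask_kernel_average(img, kernel)
print(smoothed[1, 1])  # ~5.0
```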

Step 204: input the composite image and the mask of the foreground image into the first generator of the image processing model, extract foreground light-and-shadow features from the mask of the foreground image, and obtain a first composite image from the foreground light-and-shadow features and the composite image.

The image processing model consists of a first generator and a second generator. The first generator consists of an encoder and a decoder.

Specifically, the composite image and the foreground mask are input into the encoder of the first generator, a transformer module extracts foreground light-and-shadow features from the foreground mask, and the decoder of the first generator outputs the first composite image based on those features and the composite image.

Step 206: input the composite image and the mask of the background image into the second generator of the image processing model, extract background light-and-shadow features from the mask of the background image, and obtain a second composite image from the background light-and-shadow features and the composite image, wherein the foreground light-and-shadow features extracted by the first generator are similar to the background light-and-shadow features extracted by the second generator.

The second generator is mainly used for light-and-shadow generation. During training, the image processing model is optimized so that the foreground and background light-and-shadow features it extracts are similar; therefore, in the trained model, the foreground features extracted by the first generator are similar to the background features extracted by the second generator.

Specifically, the composite image and the background mask are input into the encoder of the second generator, which extracts background light-and-shadow features from the background mask, and the decoder of the second generator outputs the second composite image based on those features and the composite image. Because the foreground and background light-and-shadow features are similar, the generated first and second composite images have similar lighting characteristics.

步骤208,将第一合成图像和第二合成图像融合,得到目标合成图像。Step 208: fuse the first composite image and the second composite image to obtain a target composite image.

具体地，基于第一合成图像和第二合成图像，对前景图像和背景图像进行了融合，并根据前景图像的光影特征和背景图像的光影特征在合成图像上进行光影生成，得到了目标合成图像。由于第一合成图像和第二合成图像的光影特征是相似的，因此，融合第一合成图像和第二合成图像得到的目标合成图像是和谐化的。Specifically, the foreground image and the background image are fused based on the first composite image and the second composite image, and light and shadow are generated on the composite image according to the light and shadow features of the foreground image and of the background image, yielding the target composite image. Since the light and shadow features of the first composite image and the second composite image are similar, the target composite image obtained by fusing them is harmonized.
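The patent text does not fix the exact fusion operator for combining the two generator outputs. A common and minimal choice, shown here purely as an assumed sketch, is a mask-weighted blend: foreground pixels are taken from the first composite image and background pixels from the second. Images and the mask are represented as nested lists of values for simplicity; the function name `fuse` is hypothetical.

```python
def fuse(first, second, fg_mask):
    """Hypothetical mask-weighted blend of the two generator outputs.

    `first`/`second` are H x W nested lists of pixel values; `fg_mask` is 1
    inside the foreground region and 0 elsewhere. The patent only states that
    the two composites are fused, so this operator is an assumption.
    """
    h, w = len(first), len(first[0])
    return [[fg_mask[i][j] * first[i][j] + (1 - fg_mask[i][j]) * second[i][j]
             for j in range(w)]
            for i in range(h)]
```

With a hard binary mask this reduces to a cut-and-paste of the foreground over the background; a soft (fractional) mask would blend the two outputs near the boundary.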

在本实施例中，通过合成图像、前景图像的掩码和背景图像的掩码，提取了前景光影特征和背景光影特征，根据前景光影特征形成第一合成图像，根据背景光影特征形成第二合成图像。基于前景光影特征和背景光影特征相似，第一合成图像和第二合成图像具有相似的光影特征，由第一合成图像和第二合成图像融合得到的目标合成图像是和谐化的，并根据前景图像的光影特征和背景图像的光影特征在合成图像上进行光影生成，得到具有光影的和谐化目标合成图像。In this embodiment, the foreground and background light and shadow features are extracted from the composite image, the mask of the foreground image, and the mask of the background image; a first composite image is formed from the foreground light and shadow features, and a second composite image is formed from the background light and shadow features. Because the foreground and background light and shadow features are similar, the first and second composite images have similar light and shadow features, and the target composite image obtained by fusing them is harmonized; light and shadow are further generated on the composite image according to the light and shadow features of the foreground and background images, yielding a harmonized target composite image with light and shadow.

在一个实施例中,如图3所示,训练图像处理模型的合成图像处理方法包括:In one embodiment, as shown in FIG3 , a synthetic image processing method for training an image processing model includes:

步骤302,获取图像数据集,图像数据集包括多组数据,每组数据由两个三元组数据组成,其中一个三元组为原始合成图像、前景图像的掩码和背景图像的掩码,另一个三元组为真实图像、真实前景光照图像和真实背景光照图像。Step 302, obtaining an image data set, the image data set includes multiple groups of data, each group of data consists of two triplet data, one triplet is the original synthetic image, the mask of the foreground image and the mask of the background image, and the other triplet is the real image, the real foreground illumination image and the real background illumination image.

其中，图像数据集为公开的大规模合成图像和谐化数据集IH Dataset。图像数据集包括多组数据，数据格式为六元组形式，即每组数据由两个三元组组成。其中一个三元组作为输入数据，包括原始合成图像、前景图像的掩码和背景图像的掩码；另一个三元组为真实图像、真实前景光照图像和真实背景光照图像。The image dataset is the publicly available large-scale synthetic image harmonization dataset, the IH Dataset. It contains multiple groups of data in six-tuple form, i.e., each group consists of two triplets. One triplet serves as input data and includes the original composite image, the mask of the foreground image, and the mask of the background image; the other triplet consists of the real image, the real foreground illumination image, and the real background illumination image.
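The six-tuple sample layout described above can be sketched as a small record type. The field names below are illustrative assumptions, not identifiers from the IH Dataset itself:

```python
from typing import Any, NamedTuple


class HarmonySample(NamedTuple):
    """One six-tuple from the dataset: an input triplet plus a supervision
    triplet. Field names are hypothetical; the patent only fixes the contents."""
    # input triplet
    composite: Any       # original composite image
    fg_mask: Any         # mask of the foreground image
    bg_mask: Any         # mask of the background image
    # supervision triplet
    real: Any            # real (ground-truth) image
    real_fg_light: Any   # real foreground illumination image
    real_bg_light: Any   # real background illumination image
```

A loader would then yield one `HarmonySample` per group, feeding the first three fields to the generators and keeping the last three for the loss terms.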

步骤304,将图像数据集中原始合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和原始合成图像得到第三合成图像。Step 304, input the masks of the original synthetic image and the foreground image in the image data set to the first generator of the image processing model, extract the foreground light and shadow features from the mask of the foreground image, and obtain the third synthetic image based on the foreground light and shadow features and the original synthetic image.

步骤306,将原始合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第四合成图像。Step 306, input the original composite image and the mask of the background image to the second generator of the image processing model, extract the background light and shadow features from the mask of the background image, and obtain a fourth composite image based on the background light and shadow features and the composite image.

具体地,通过原始合成图像、前景图像的掩码和背景图像的掩码,提取了前景光影特征和背景光影特征,根据前景光影特征形成第三合成图像,根据背景光影特征形成第四合成图像。Specifically, the foreground light and shadow features and the background light and shadow features are extracted through the original composite image, the mask of the foreground image and the mask of the background image, and a third composite image is formed according to the foreground light and shadow features, and a fourth composite image is formed according to the background light and shadow features.

步骤308,将第三合成图像和第四合成图像融合,得到和谐处理图像。Step 308: merge the third synthesized image and the fourth synthesized image to obtain a harmonized image.

步骤310,根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像,计算模型损失。Step 310, calculating the model loss based on the harmonized image, the real image, the real foreground illumination image and the real background illumination image.

其中，模型损失可以包括光照特征损失（L_light）、非光照特征损失（L_nonlight）、感知损失（L_perceptual）、对抗性损失（L_adv）以及绝对值偏差函数（L_1）。The model loss may include the illumination feature loss (L_light), the non-illumination feature loss (L_nonlight), the perceptual loss (L_perceptual), the adversarial loss (L_adv), and the absolute value deviation function (L_1).

具体地,根据和谐处理图像,确定和谐处理图像的前景光照图像和和谐处理图像的背景光照图像。根据和谐处理图像、真实图像、和谐处理图像的前景光照图像、和谐处理图像的背景光照图像、真实前景光照图像和真实背景光照图像,计算模型损失。Specifically, according to the harmoniously processed image, a foreground illumination image of the harmoniously processed image and a background illumination image of the harmoniously processed image are determined. According to the harmoniously processed image, the real image, the foreground illumination image of the harmoniously processed image, the background illumination image of the harmoniously processed image, the real foreground illumination image and the real background illumination image, the model loss is calculated.

步骤312,基于模型损失的约束对第一生成器和第二生成器进行参数调整,得到训练好的图像处理模型;损失函数的约束包括:图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似。Step 312, adjusting the parameters of the first generator and the second generator based on the constraints of the model loss to obtain a trained image processing model; the constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

具体地，图像数据集按8:1:1进行划分，80%用作训练集，10%用作验证集，10%用作测试集，训练集和测试集的前景对象之间没有交叉。获取图像数据集中的训练集，根据训练集中的真实图像和融合处理得到的和谐处理图像，确定模型损失，根据模型损失对生成器进行普通参数调整。获取图像数据集中的验证集，根据验证集中的数据，得到经过融合处理的和谐处理图像，根据验证集中的真实图像和融合处理得到的和谐处理图像，确定模型损失，根据模型损失对生成器进行超参数调整，得到多个训练好的模型。获取图像数据集中的测试集，通过测试集的数据对多个训练好的模型进行评估，将评分最高的模型作为图像处理模型。Specifically, the image dataset is divided in an 8:1:1 ratio: 80% is used as the training set, 10% as the validation set, and 10% as the test set, with no overlap between the foreground objects of the training set and the test set. The training set is obtained, the model loss is determined from the real images in the training set and the harmonized images produced by fusion, and the ordinary parameters of the generators are adjusted according to the model loss. The validation set is obtained, harmonized images are produced by fusion from the validation data, the model loss is determined from the real images in the validation set and these harmonized images, and the hyperparameters of the generators are adjusted according to the model loss to obtain multiple trained models. The test set is obtained, the trained models are evaluated on the test data, and the model with the highest score is taken as the image processing model.
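The 8:1:1 split described above can be sketched as follows. The patent fixes only the ratio, not the shuffling scheme or the mechanism that keeps foreground objects disjoint between training and test sets, so this is a minimal deterministic version:

```python
def split_8_1_1(samples):
    """Split a list of samples into train/val/test in an 8:1:1 ratio.

    Deterministic sketch: the patent specifies the ratio only. In practice
    samples would first be grouped by foreground object so the train and
    test foregrounds do not overlap (not modeled here).
    """
    n = len(samples)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test
```

Training-set losses drive ordinary parameter updates, validation-set losses drive hyperparameter selection, and the test set ranks the resulting candidate models.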

在本实施例中,通过图像数据集得到和谐处理图像,根据和谐处理图像、真实前景光照图像和真实背景光照图像,确定模型损失,根据模型损失对第一生成器和第二生成器进行参数调整,得到训练好的图像处理模型。In this embodiment, a harmoniously processed image is obtained through an image data set, and a model loss is determined based on the harmoniously processed image, a true foreground illumination image, and a true background illumination image. Parameters of the first generator and the second generator are adjusted based on the model loss to obtain a trained image processing model.

在一个实施例中,模型损失包括:对抗性损失和光照损失;根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像,计算模型损失,包括:将和谐处理图像和真实图像输入图像处理模型的判别器,判别器用于判断和谐处理图像和真实图像的真实性;根据判别器的输出,计算对抗性损失;根据真实前景光照图像与和谐处理图像的前景光照图像的差异,以及真实背景光照图像与和谐化处理图像的背景光照图像的差异,计算光照特征损失。In one embodiment, the model loss includes: adversarial loss and illumination loss; the model loss is calculated based on the harmonized image, the real image, the real foreground illumination image and the real background illumination image, including: inputting the harmonized image and the real image into the discriminator of the image processing model, the discriminator is used to judge the authenticity of the harmonized image and the real image; according to the output of the discriminator, the adversarial loss is calculated; according to the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, and the difference between the real background illumination image and the background illumination image of the harmonized image, the illumination feature loss is calculated.

具体地，对抗性损失L_adv是对抗结构中生成器和判别器之间的对抗损失。生成器努力生成逼真的图像，判别器努力判断图像是真的还是假的。如图4所示，生成器生成和谐处理图像后，将和谐处理图像和真实图像输入至图像处理模型的判别器中，判别器对和谐处理图像和真实图像的真实性进行判断，输出和谐处理图像和真实图像“真实”的概率，根据该输出计算对抗性损失。Specifically, the adversarial loss L_adv is the adversarial loss between the generator and the discriminator in the adversarial structure. The generator strives to generate realistic images, while the discriminator strives to judge whether an image is real or fake. As shown in FIG. 4, after the generator produces the harmonized image, the harmonized image and the real image are input into the discriminator of the image processing model. The discriminator judges their authenticity and outputs the probability that each image is "real"; the adversarial loss is calculated from this output.

对抗性损失为 The adversarial loss is

L_adv = -[log D(Pic^real) + log(1 - D(Pic))]

其中，D(i)是图像i“真实”的概率，Pic表示目标合成图像，Pic^real表示真实图像。Where D(i) is the probability that image i is "real", Pic denotes the target composite image, and Pic^real denotes the real image.
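A numeric sketch of this adversarial term, under the assumption of the standard GAN sign convention (the patent gives the loss only informally): the discriminator's "real" probabilities for a real image and a generated image are combined into a single scalar to be minimized.

```python
import math


def adversarial_loss(d_real, d_fake):
    """-[log D(real) + log(1 - D(fake))], the standard GAN discriminator
    objective written as a loss to minimize. `d_real` and `d_fake` are the
    discriminator's "real" probabilities in (0, 1); the exact sign convention
    is an assumption, as the patent does not spell it out.
    """
    return -(math.log(d_real) + math.log(1.0 - d_fake))
```

The loss is near zero when the discriminator is confident (real scored near 1, fake near 0) and grows as it is fooled.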

具体地，光照特征损失L_light用于表示和谐处理图像的光照和对应的真实图像的光照之间的逐元素光照损失。Specifically, the illumination feature loss L_light represents the element-wise illumination loss between the illumination of the harmonized image and that of the corresponding real image.

L_light = ||Pic^h_obj - Pic_obj||_2^2 + ||Pic^h_background - Pic_background||_2^2

其中，Pic^h_obj表示目标合成图像的前景光照图像，Pic^h_background表示目标合成图像的背景光照图像，Pic_obj表示真实图像的前景光照图像，Pic_background表示真实图像的背景光照图像，||·||_2^2为欧几里得范数的平方，期望光照特征损失尽可能的小。Where Pic^h_obj denotes the foreground illumination image of the target composite image, Pic^h_background denotes the background illumination image of the target composite image, Pic_obj denotes the foreground illumination image of the real image, Pic_background denotes the background illumination image of the real image, and ||·||_2^2 is the squared Euclidean norm; the illumination feature loss is expected to be as small as possible.
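The illumination term above can be sketched numerically. Illumination images are flattened to plain lists here, a simplification of the patent's formula; the function and argument names are illustrative:

```python
def illumination_loss(pred_fg, real_fg, pred_bg, real_bg):
    """L_light: squared Euclidean distance between predicted and real
    foreground illumination, plus the same term for the background.
    Inputs are flat lists of pixel values (a simplifying assumption)."""
    def sq_norm(a, b):
        # squared Euclidean norm of the element-wise difference
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return sq_norm(pred_fg, real_fg) + sq_norm(pred_bg, real_bg)
```

Driving this sum toward zero pushes both the foreground and background illumination of the harmonized output toward the real illumination images.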

在本实施例中,通过和谐处理图像的前景光照图像、真实图像的前景光照图像、和谐处理图像的背景光照图像和真实图像的背景光照图像,计算对抗性损失和光照特征损失。对抗性损失是在对抗结构上生成器和鉴别器之间的对抗损失。光照特征损失表征了和谐处理图像的光照和真实图像的光照之间的光照损失情况,我们期望光照损失函数尽可能的小,以使得目标合成图像的光照和真实图像的光照之间的光照损失更小。In this embodiment, the adversarial loss and the illumination feature loss are calculated by using the foreground illumination image of the harmonized image, the foreground illumination image of the real image, the background illumination image of the harmonized image, and the background illumination image of the real image. The adversarial loss is the adversarial loss between the generator and the discriminator in the adversarial structure. The illumination feature loss characterizes the illumination loss between the illumination of the harmonized image and the illumination of the real image. We expect the illumination loss function to be as small as possible so that the illumination loss between the illumination of the target synthetic image and the illumination of the real image is smaller.

在一个实施例中,模型损失还包括:非光照特征损失;In one embodiment, the model loss further includes: non-illumination feature loss;

合成图像处理方法还包括:根据相同前景图像在不同光照下的非照明特征,计算得到非光照特征损失;其中,非光照特征损失函数的目标是使不同照明条件下的非照明特征的差异最小。The synthetic image processing method also includes: calculating the non-illumination feature loss according to the non-illumination features of the same foreground image under different illumination conditions; wherein the goal of the non-illumination feature loss function is to minimize the difference in non-illumination features under different illumination conditions.

具体地，非光照特征损失L_nonlight基于Retinex理论，引入了强制非光照特征匹配，以提高对象重光照图像的准确性，期望同一物体在不同光照条件下具有相同的非光照（即反射率）特征。Specifically, the non-illumination feature loss L_nonlight is based on Retinex theory and introduces forced non-illumination feature matching to improve the accuracy of object relighting, expecting the same object to have the same non-illumination (i.e., reflectance) features under different lighting conditions.

L_nonlight = (1 / N_nonlight) ||F^1_nonlight - F^2_nonlight||_2^2

其中，F^1_nonlight和F^2_nonlight是相同前景对象在不同光照条件下的非光照特征，N_nonlight是非光照特征F_nonlight中的元素数，期望非光照特征损失函数尽可能的小。Where F^1_nonlight and F^2_nonlight are the non-illumination features of the same foreground object under different lighting conditions, and N_nonlight is the number of elements in the non-illumination feature F_nonlight; the non-illumination feature loss function is expected to be as small as possible.
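As a numeric sketch of this term, with the two feature maps flattened to equal-length lists (an assumed representation):

```python
def nonlight_loss(f1, f2):
    """L_nonlight: mean squared difference between the non-illumination
    (reflectance) features of the same foreground object under two lighting
    conditions. Division by the element count mirrors the N_nonlight factor."""
    n = len(f1)
    return sum((a - b) ** 2 for a, b in zip(f1, f2)) / n
```

Minimizing this pushes the two feature vectors toward equality, i.e., toward lighting-invariant reflectance features for the same object.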

在本实施例中,通过相同前景对象在不同光照条件下的非光照特征和非光照特征中的元素数,给出了非光照损失函数的表达式。期望非光照特征损失函数越小,表示同一物体在不同光照条件下具有越接近的非光照(即反射率)特征,在本实施例中,希望非光照特征损失函数尽可能的小,以使得同一物体在不同光照条件下具有越接近的非光照(即反射率)特征。In this embodiment, the expression of the non-illumination loss function is given by the non-illumination features of the same foreground object under different illumination conditions and the number of elements in the non-illumination features. It is expected that the smaller the non-illumination feature loss function is, the closer the non-illumination (i.e., reflectivity) features of the same object under different illumination conditions are. In this embodiment, it is hoped that the non-illumination feature loss function is as small as possible so that the same object has closer non-illumination (i.e., reflectivity) features under different illumination conditions.

在一个实施例中,模型损失还包括:感知损失;如图5所示,合成图像处理方法还包括:In one embodiment, the model loss further includes: perceptual loss; as shown in FIG5 , the synthetic image processing method further includes:

步骤502，利用预先训练好的神经网络模型提取和谐处理图像的特征图。Step 502: extract the feature map of the harmonized image using a pre-trained neural network model.

具体地，神经网络模型可以为VGG16，神经网络模型VGG16是在ImageNet上预先训练好的。通过预先训练好的神经网络模型提取和谐处理图像的特征图。Specifically, the neural network model may be VGG16, pre-trained on ImageNet. The feature map of the harmonized image is extracted with this pre-trained neural network model.

步骤504,利用预先训练好的神经网络模型提取和谐处理图像对应的真实图像的特征图。Step 504: Use the pre-trained neural network model to extract a feature map of the real image corresponding to the harmonized image.

具体地,确定和谐处理图像对应的真实图像,通过预先训练好的神经网络模型提取真实图像的特征图。Specifically, a real image corresponding to the harmonized processed image is determined, and a feature map of the real image is extracted through a pre-trained neural network model.

步骤506,根据和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图,确定感知损失;感知损失函数的目标是使和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图的差异最小。Step 506, determining the perceptual loss based on the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image; the goal of the perceptual loss function is to minimize the difference between the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image.

感知损失函数L_perceptual用于表示通过神经网络模型提取的和谐处理图像与真实图像间的语义差异。The perceptual loss function L_perceptual represents the semantic difference between the harmonized image and the real image, measured on features extracted by the neural network model.

L_perceptual = MSE(V_I(pic_R), V_I(pic^real))

其中，MSE(A,B)是计算A、B的均方误差，用于误差分析。V_I表示用预先训练好的模型提取的特征图，pic_R指输出的和谐处理图像，pic^real指真实图像。感知损失函数表征和谐处理图像和真实图像的接近程度，感知损失函数越小，输出的和谐处理图像越接近真实图像。MSE(A, B) is the mean squared error of A and B, used for error analysis. V_I denotes the feature map extracted with the pre-trained model, pic_R denotes the output harmonized image, and pic^real denotes the real image. The perceptual loss characterizes how close the harmonized image is to the real image; the smaller the perceptual loss, the closer the output harmonized image is to the real image.
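The perceptual term can be sketched by composing any feature extractor with an MSE. Here `feat_extract` stands in for the pre-trained VGG16 (an assumption; any callable returning a flat feature list works for the sketch):

```python
def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def perceptual_loss(feat_extract, harmonized, real):
    """L_perceptual = MSE(V_I(pic_R), V_I(pic_real)): compare feature maps of
    the harmonized image and the real image. `feat_extract` plays the role of
    the ImageNet-pretrained VGG16 from the text."""
    return mse(feat_extract(harmonized), feat_extract(real))
```

With a deep extractor, low values mean the two images are semantically close even if they differ pixel by pixel.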

在本实施例中，通过神经网络模型、目标合成图像和真实图像，确定感知损失函数，感知损失函数表征目标合成图像和真实图像的接近程度。我们期望感知损失函数尽可能的小，以使输出的和谐处理图像尽可能的接近真实图像。In this embodiment, the perceptual loss function is determined from the neural network model, the target composite image, and the real image; it characterizes how close the target composite image is to the real image. We expect the perceptual loss function to be as small as possible so that the output harmonized image is as close to the real image as possible.

在一个实施例中,模型损失为光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和。In one embodiment, the model loss is a weighted sum of illumination feature loss, non-illumination feature loss, perceptual loss, adversarial loss, and absolute value deviation loss.

绝对值偏差损失即L1范数损失,期望将和谐处理图像与真实图像的绝对差值的总和最小化。The absolute value deviation loss, also known as the L1 norm loss, aims to minimize the sum of the absolute differences between the harmonized image and the true image.

L_1 = ||Pic - Pic^real||_1

其中，Pic表示和谐处理图像，Pic^real表示真实图像。Where Pic denotes the harmonized image and Pic^real denotes the real image.

具体地，在实际训练的过程中可以根据模型训练的目的，对光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失设置不同的权重。当设置相同的权重时，模型损失可表示为L_total = L_light + L_nonlight + L_perceptual + L_adv + L_1。其中，L_light为光照特征损失，L_nonlight为非光照特征损失，L_perceptual为感知损失，L_adv为对抗性损失，L_1为绝对值偏差损失，即经典的L1范数损失。Specifically, during actual training, different weights can be set for the illumination feature loss, the non-illumination feature loss, the perceptual loss, the adversarial loss, and the absolute value deviation loss according to the purpose of model training. With equal weights, the model loss can be expressed as L_total = L_light + L_nonlight + L_perceptual + L_adv + L_1, where L_light is the illumination feature loss, L_nonlight is the non-illumination feature loss, L_perceptual is the perceptual loss, L_adv is the adversarial loss, and L_1 is the absolute value deviation loss, i.e., the classic L1-norm loss.
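The weighted combination above can be sketched as a single function; equal weights recover the L_total formula from the text, and the `weights` parameter is the hypothetical knob for the per-purpose weighting the paragraph mentions:

```python
def total_loss(l_light, l_nonlight, l_perceptual, l_adv, l_1,
               weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """L_total as a weighted sum of the five loss terms.

    The default equal weights reproduce
    L_total = L_light + L_nonlight + L_perceptual + L_adv + L_1;
    non-uniform `weights` model the training-purpose-dependent weighting."""
    terms = (l_light, l_nonlight, l_perceptual, l_adv, l_1)
    return sum(w * t for w, t in zip(weights, terms))
```

During training, this scalar is what backpropagation minimizes to adjust the parameters of the first and second generators.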

在训练图像处理模型的过程中,利用图像数据集中原始合成图像、前景图像的掩码和背景图像的掩码得到和谐处理图像,根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像,计算光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失,得到光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和,确定图像处理模型损失,根据图像处理模型损失对第一生成器和第二生成器进行参数调整,得到训练好的图像处理模型。In the process of training the image processing model, the original synthetic image, the mask of the foreground image and the mask of the background image in the image dataset are used to obtain the harmoniously processed image. According to the harmoniously processed image, the real image, the real foreground illumination image and the real background illumination image, the illumination feature loss, the non-illumination feature loss, the perceptual loss, the adversarial loss and the absolute value deviation loss are calculated to obtain the weighted sum of the illumination feature loss, the non-illumination feature loss, the perceptual loss, the adversarial loss and the absolute value deviation loss, and the image processing model loss is determined. According to the image processing model loss, the parameters of the first generator and the second generator are adjusted to obtain the trained image processing model.

在本实施例中,通过光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失,计算得到模型损失。In this embodiment, the model loss is calculated through illumination feature loss, non-illumination feature loss, perception loss, adversarial loss and absolute value deviation loss.

在一个实施例中,如图6所示,提供一种合成图像处理方法,包括以下步骤:In one embodiment, as shown in FIG6 , a synthetic image processing method is provided, comprising the following steps:

步骤602,获取图像数据集,图像数据集包括多组数据,每组数据由两个三元组数据组成,其中一个三元组为原始合成图像、前景图像的掩码和背景图像的掩码,另一个三元组为真实图像、真实前景光照图像和真实背景光照图像。Step 602, obtaining an image data set, the image data set includes multiple groups of data, each group of data consists of two triplet data, one triplet is the original synthetic image, the mask of the foreground image and the mask of the background image, and the other triplet is the real image, the real foreground illumination image and the real background illumination image.

步骤604,设计损失函数,损失函数包括光照特征损失函数、非光照特征损失函数、感知损失函数、对抗性损失函数、以及绝对值偏差函数。Step 604, designing a loss function, the loss function includes an illumination feature loss function, a non-illumination feature loss function, a perceptual loss function, an adversarial loss function, and an absolute value deviation function.

步骤606,将图像数据集中原始合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和原始合成图像得到第三合成图像。Step 606, input the masks of the original synthetic image and the foreground image in the image data set to the first generator of the image processing model, extract the foreground light and shadow features from the mask of the foreground image, and obtain the third synthetic image based on the foreground light and shadow features and the original synthetic image.

步骤608,将原始合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第四合成图像。Step 608, input the original composite image and the mask of the background image to the second generator of the image processing model, extract the background light and shadow features from the mask of the background image, and obtain a fourth composite image based on the background light and shadow features and the composite image.

步骤610,将第三合成图像和第四合成图像融合,得到和谐处理图像。Step 610: merge the third synthesized image and the fourth synthesized image to obtain a harmonized image.

步骤612,根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像,计算模型损失。Step 612, calculating the model loss based on the harmonized image, the real image, the real foreground illumination image and the real background illumination image.

步骤614,基于模型损失的约束对第一生成器和第二生成器进行参数调整,得到训练好的图像处理模型;损失函数的约束包括:图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似。Step 614, adjusting the parameters of the first generator and the second generator based on the constraints of the model loss to obtain a trained image processing model; the constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

步骤616,将合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和合成图像得到第一合成图像。Step 616, input the mask of the composite image and the foreground image to the first generator of the image processing model, extract the foreground light and shadow features from the mask of the foreground image, and obtain the first composite image based on the foreground light and shadow features and the composite image.

其中,合成图像为需要进行和谐化处理的合成图像。The composite image is a composite image that needs to be harmonized.

步骤618,将合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第二合成图像;其中,图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似。Step 618, input the masks of the composite image and the background image to the second generator of the image processing model, extract the background light and shadow features from the mask of the background image, and obtain a second composite image based on the background light and shadow features and the composite image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

步骤620,将第一合成图像和第二合成图像融合,得到目标合成图像。Step 620: fuse the first composite image and the second composite image to obtain a target composite image.

在本实施例中,如图7所示,合成图像处理方法主要包括获取图像数据集、损失函数设计、图像处理模型训练和对合成图像进行和谐化处理四部分内容。首先通过图像数据集对图像处理模型进行训练,得到训练好的图像处理模型,根据训练好的图像处理模型,对合成图像进行和谐化处理,得到目标合成图像。In this embodiment, as shown in Figure 7, the synthetic image processing method mainly includes four parts: obtaining an image data set, designing a loss function, training an image processing model, and performing harmonization processing on the synthetic image. First, the image processing model is trained using the image data set to obtain a trained image processing model, and then the synthetic image is harmonized according to the trained image processing model to obtain a target synthetic image.

应该理解的是,虽然如上的各实施例所涉及的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,如上的各实施例所涉及的流程图中的至少一部分步骤可以包括多个步骤或者多个阶段,这些步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤中的步骤或者阶段的至少一部分轮流或者交替地执行。It should be understood that, although the steps in the flowcharts involved in the above embodiments are displayed in sequence according to the indication of the arrows, these steps are not necessarily executed in sequence according to the order indicated by the arrows. Unless there is a clear explanation in this article, the execution of these steps is not strictly limited in order, and these steps can be executed in other orders. Moreover, at least a part of the steps in the flowcharts involved in the above embodiments may include multiple steps or multiple stages, and these steps or stages are not necessarily executed at the same time, but can be executed at different times, and the execution order of these steps or stages is not necessarily carried out in sequence, but can be executed in turn or alternately with other steps or at least a part of the steps or stages in other steps.

基于同样的发明构思,本申请实施例还提供了一种用于实现上述所涉及的合成图像处理方法的合成图像处理装置。该装置所提供的解决问题的实现方案与上述方法中所记载的实现方案相似,故下面所提供的一个或多个合成图像处理装置实施例中的具体限定可以参见上文中对于合成图像处理方法的限定,在此不再赘述。Based on the same inventive concept, the embodiment of the present application also provides a synthetic image processing device for implementing the synthetic image processing method involved above. The implementation solution provided by the device to solve the problem is similar to the implementation solution recorded in the above method, so the specific limitations in one or more synthetic image processing device embodiments provided below can refer to the limitations of the synthetic image processing method above, and will not be repeated here.

在一个实施例中,如图8所示,提供了一种合成图像处理装置,包括:合成图像获取模块802、第一合成图像获取模块804、第二合成图像获取模块806和目标合成图像获取模块808,其中:In one embodiment, as shown in FIG8 , a synthetic image processing device is provided, including: a synthetic image acquisition module 802, a first synthetic image acquisition module 804, a second synthetic image acquisition module 806 and a target synthetic image acquisition module 808, wherein:

合成图像获取模块802,用于获取合成图像、前景图像的掩码和背景图像的掩码。The composite image acquisition module 802 is used to acquire a composite image, a mask of a foreground image, and a mask of a background image.

第一合成图像获取模块804,用于将合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和合成图像得到第一合成图像。The first composite image acquisition module 804 is used to input the composite image and the mask of the foreground image into the first generator of the image processing model, extract the foreground light and shadow features from the mask of the foreground image, and obtain the first composite image according to the foreground light and shadow features and the composite image.

第二合成图像获取模块806,用于将合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第二合成图像;其中,图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似。The second synthetic image acquisition module 806 is used to input the masks of the synthetic image and the background image into the second generator of the image processing model, extract the background light and shadow features from the mask of the background image, and obtain the second synthetic image based on the background light and shadow features and the synthetic image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

目标合成图像获取模块808,用于将第一合成图像和第二合成图像融合,得到目标合成图像。The target composite image acquisition module 808 is used to fuse the first composite image and the second composite image to obtain a target composite image.

在一个实施例中,训练图像处理模型的合成图像处理方法包括:In one embodiment, a synthetic image processing method for training an image processing model includes:

图像数据集获取模块,用于获取图像数据集,图像数据集包括多组数据,每组数据由两个三元组数据组成,其中一个三元组为原始合成图像、前景图像的掩码和背景图像的掩码,另一个三元组为真实图像、真实前景光照图像和真实背景光照图像。The image dataset acquisition module is used to acquire an image dataset, which includes multiple groups of data, each group of data consists of two triplet data, one of which is the original synthetic image, the mask of the foreground image and the mask of the background image, and the other triplet is the real image, the real foreground illumination image and the real background illumination image.

第三合成图像获取模块,用于将图像数据集中原始合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和原始合成图像得到第三合成图像。The third synthetic image acquisition module is used to input the masks of the original synthetic image and the foreground image in the image data set into the first generator of the image processing model, extract the foreground light and shadow features from the mask of the foreground image, and obtain the third synthetic image based on the foreground light and shadow features and the original synthetic image.

第四合成图像获取模块,用于将原始合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第四合成图像。The fourth composite image acquisition module is used to input the masks of the original composite image and the background image into the second generator of the image processing model, extract the background light and shadow features from the mask of the background image, and obtain the fourth composite image based on the background light and shadow features and the composite image.

和谐处理图像获取模块，用于将第三合成图像和第四合成图像融合，得到和谐处理图像。The harmonized image acquisition module is used to fuse the third composite image and the fourth composite image to obtain a harmonized image.

模型损失计算模块,用于根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像,计算模型损失。The model loss calculation module is used to calculate the model loss according to the harmonically processed image, the real image, the real foreground illumination image and the real background illumination image.

图像处理模型获取模块,用于基于模型损失的约束对第一生成器和第二生成器进行参数调整,得到训练好的图像处理模型;损失函数的约束包括:图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似。The image processing model acquisition module is used to adjust the parameters of the first generator and the second generator based on the constraints of the model loss to obtain a trained image processing model; the constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

在一个实施例中，模型损失包括：对抗性损失和光照特征损失；模型损失计算模块，用于将和谐处理图像和真实图像输入图像处理模型的判别器，判别器用于判断和谐处理图像和真实图像的真实性；根据判别器的输出，计算对抗性损失；根据真实前景光照图像与和谐处理图像的前景光照图像的差异，以及真实背景光照图像与和谐处理图像的背景光照图像的差异，计算光照特征损失。In one embodiment, the model loss includes adversarial loss and illumination feature loss. The model loss calculation module is used to input the harmonized image and the real image into the discriminator of the image processing model, where the discriminator judges the authenticity of the harmonized image and the real image; the adversarial loss is calculated from the discriminator's output; and the illumination feature loss is calculated from the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, together with the difference between the real background illumination image and the background illumination image of the harmonized image.
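A minimal numeric sketch of these two terms, assuming a standard binary cross-entropy adversarial objective and an L1 distance for the illumination term; the text does not fix either formula, so both choices are illustrative.

```python
import numpy as np

def discriminator_adversarial_loss(d_real: np.ndarray,
                                   d_fake: np.ndarray) -> float:
    """Binary cross-entropy on the discriminator outputs: push D(real)
    toward 1 and D(harmonized) toward 0."""
    eps = 1e-12  # avoid log(0)
    return float(-np.mean(np.log(d_real + eps))
                 - np.mean(np.log(1.0 - d_fake + eps)))

def illumination_feature_loss(pred_fg_light, real_fg_light,
                              pred_bg_light, real_bg_light) -> float:
    """Sum of the foreground and background illumination differences,
    each measured as a mean absolute (L1) error."""
    return float(np.mean(np.abs(pred_fg_light - real_fg_light))
                 + np.mean(np.abs(pred_bg_light - real_bg_light)))
```

When the predicted illumination images match the ground truth exactly, the illumination term vanishes; a perfectly confident discriminator drives the adversarial term toward zero.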

在一个实施例中，模型损失还包括：非光照特征损失；合成图像处理装置还包括：非光照特征损失计算模块，用于根据相同前景图像在不同光照下的非照明特征，计算得到非光照特征损失；其中，非光照特征损失函数的目标是使不同照明条件下的非照明特征的差异最小。In one embodiment, the model loss also includes a non-illumination feature loss, and the synthetic image processing apparatus also includes a non-illumination feature loss calculation module, which calculates the non-illumination feature loss from the non-illumination features of the same foreground image under different illuminations; the goal of the non-illumination feature loss function is to minimize the difference between non-illumination features under different illumination conditions.
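One way to realize this term is to compare content features of the same foreground extracted under two lighting conditions. The mean-squared form below is an assumption; the text only states the minimization goal.

```python
import numpy as np

def non_illumination_feature_loss(feat_light_a: np.ndarray,
                                  feat_light_b: np.ndarray) -> float:
    """Mean squared distance between content (non-illumination) features of
    the same foreground rendered under two different illuminations;
    minimizing it pushes the features to be invariant to lighting."""
    return float(np.mean((feat_light_a - feat_light_b) ** 2))
```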

在一个实施例中，模型损失还包括：感知损失；合成图像处理装置还包括：感知损失计算模块，用于利用预先训练好的神经网络模型提取和谐处理图像的特征图；利用预先训练好的神经网络模型提取和谐处理图像对应的真实图像的特征图；根据和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图，确定感知损失；感知损失函数的目标是使和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图的差异最小。In one embodiment, the model loss also includes a perceptual loss, and the synthetic image processing apparatus also includes a perceptual loss calculation module, which uses a pre-trained neural network model to extract a feature map of the harmonized image and a feature map of its corresponding real image, and determines the perceptual loss from the two feature maps; the goal of the perceptual loss function is to minimize the difference between the feature map of the harmonized image and that of its corresponding real image.
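The perceptual term can be sketched as follows. Here `extract_features` stands in for the unnamed pre-trained network (VGG-style feature extractors are a common choice, but the text does not name one), and the L1 feature distance is an assumption.

```python
import numpy as np

def perceptual_loss(harmonized: np.ndarray, real: np.ndarray,
                    extract_features) -> float:
    """Distance between feature maps of the harmonized image and its
    ground-truth image, both produced by the same frozen, pre-trained
    feature extractor."""
    return float(np.mean(np.abs(extract_features(harmonized)
                                - extract_features(real))))
```

Because the extractor is frozen, only the generators receive gradients from this term during training.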

在一个实施例中，模型损失计算模块，用于计算光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和，作为模型损失。In one embodiment, the model loss calculation module is used to calculate the weighted sum of the illumination feature loss, non-illumination feature loss, perceptual loss, adversarial loss, and absolute value deviation loss as the model loss.
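The weighted sum can be written directly; the five weights are hyperparameters that the text leaves unspecified, so the uniform default below is only a placeholder.

```python
def total_model_loss(l_illum: float, l_non_illum: float, l_perc: float,
                     l_adv: float, l_l1: float,
                     weights=(1.0, 1.0, 1.0, 1.0, 1.0)) -> float:
    """Weighted sum of the five loss terms named in the text: illumination
    feature, non-illumination feature, perceptual, adversarial, and
    absolute value (L1) deviation losses."""
    terms = (l_illum, l_non_illum, l_perc, l_adv, l_l1)
    return sum(w * t for w, t in zip(weights, terms))
```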

上述合成图像处理装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。Each module in the above-mentioned synthetic image processing device can be implemented in whole or in part by software, hardware or a combination thereof. Each module can be embedded in or independent of a processor in a computer device in the form of hardware, or can be stored in a memory in a computer device in the form of software, so that the processor can call and execute the operations corresponding to each module.

在一个实施例中,提供了一种计算机设备,该计算机设备可以是服务器,其内部结构图可以如图9所示。该计算机设备包括通过系统总线连接的处理器、存储器和网络接口。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质和内存储器。该非易失性存储介质存储有操作系统、计算机程序和数据库。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的数据库用于存储图像处理相关的数据。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机程序被处理器执行时以实现一种合成图像处理方法。In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be shown in FIG9 . The computer device includes a processor, a memory, and a network interface connected via a system bus. The processor of the computer device is used to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data related to image processing. The network interface of the computer device is used to communicate with an external terminal via a network connection. When the computer program is executed by the processor, a synthetic image processing method is implemented.

本领域技术人员可以理解,图9中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。Those skilled in the art will understand that the structure shown in FIG. 9 is merely a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. The specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.

在一个实施例中,提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现以下步骤:In one embodiment, a computer device is provided, including a memory and a processor, wherein a computer program is stored in the memory, and when the processor executes the computer program, the following steps are implemented:

获取合成图像、前景图像的掩码和背景图像的掩码;Get the composite image, the mask of the foreground image, and the mask of the background image;

将合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和合成图像得到第一合成图像;Inputting the composite image and the mask of the foreground image into a first generator of the image processing model, extracting foreground light and shadow features from the mask of the foreground image, and obtaining a first composite image according to the foreground light and shadow features and the composite image;

将合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第二合成图像;其中,图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似;Inputting the masks of the composite image and the background image into the second generator of the image processing model, extracting background light and shadow features from the mask of the background image, and obtaining a second composite image according to the background light and shadow features and the composite image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator;

将第一合成图像和第二合成图像融合,得到目标合成图像。The first composite image and the second composite image are fused to obtain a target composite image.
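The four inference steps above can be sketched end to end. `generator1` and `generator2` stand in for the trained first and second generators (any callables with this signature work for the sketch), and the mask-weighted blend in the last line is one plausible reading of the fusion step.

```python
import numpy as np

def harmonize_composite(composite, fg_mask, bg_mask, generator1, generator2):
    """End-to-end inference: each generator adjusts the whole composite
    using one region's light/shadow features, and the two outputs are
    blended by the complementary region masks."""
    first = generator1(composite, fg_mask)    # foreground-harmonized image
    second = generator2(composite, bg_mask)   # background-harmonized image
    fg = fg_mask[..., None]                   # broadcast over RGB channels
    bg = bg_mask[..., None]
    return fg * first + bg * second           # target composite image
```

With identity generators the pipeline returns the input unchanged, which makes the data flow easy to verify before plugging in trained models.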

在一个实施例中,处理器执行计算机程序时还实现以下步骤:In one embodiment, when the processor executes the computer program, the processor further implements the following steps:

获取图像数据集，图像数据集包括多组数据，每组数据由两个三元组数据组成，其中一个三元组为原始合成图像、前景图像的掩码和背景图像的掩码，另一个三元组为真实图像、真实前景光照图像和真实背景光照图像；将图像数据集中原始合成图像和前景图像的掩码输入至图像处理模型的第一生成器，从前景图像的掩码提取前景光影特征，根据前景光影特征和原始合成图像得到第三合成图像；将原始合成图像和背景图像的掩码输入至图像处理模型的第二生成器，从背景图像的掩码提取背景光影特征，根据背景光影特征和原始合成图像得到第四合成图像；将第三合成图像和第四合成图像融合，得到和谐处理图像；根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像，计算模型损失；基于模型损失的约束对第一生成器和第二生成器进行参数调整，得到训练好的图像处理模型；损失函数的约束包括：图像处理模型的第一生成器提取的前景光影特征，与第二生成器提取的背景光影特征相似。An image dataset is acquired; it includes multiple groups of data, each group consisting of two triplets: one triplet is an original composite image, the mask of the foreground image, and the mask of the background image, and the other triplet is a real image, a real foreground illumination image, and a real background illumination image. The original composite image and the mask of the foreground image are input to the first generator of the image processing model, which extracts foreground light and shadow features from the foreground mask and obtains a third composite image from those features and the original composite image. The original composite image and the mask of the background image are input to the second generator of the image processing model, which extracts background light and shadow features from the background mask and obtains a fourth composite image from those features and the original composite image. The third and fourth composite images are fused to obtain a harmonized image; the model loss is calculated from the harmonized image, the real image, the real foreground illumination image, and the real background illumination image; and the parameters of the first and second generators are adjusted under the constraint of the model loss to obtain the trained image processing model. The constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

在一个实施例中,处理器执行计算机程序时还实现以下步骤:In one embodiment, when the processor executes the computer program, the processor further implements the following steps:

将和谐处理图像和真实图像输入图像处理模型的判别器,判别器用于判断和谐处理图像和真实图像的真实性;根据判别器的输出,计算对抗性损失;根据真实前景光照图像与和谐处理图像的前景光照图像的差异,以及真实背景光照图像与和谐处理图像的背景光照图像的差异,计算光照特征损失。The harmonized image and the real image are input into the discriminator of the image processing model, and the discriminator is used to judge the authenticity of the harmonized image and the real image; the adversarial loss is calculated according to the output of the discriminator; the illumination feature loss is calculated according to the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, and the difference between the real background illumination image and the background illumination image of the harmonized image.

在一个实施例中,处理器执行计算机程序时还实现以下步骤:In one embodiment, when the processor executes the computer program, the processor further implements the following steps:

根据相同前景图像在不同光照下的非照明特征，计算得到非光照特征损失；其中，非光照特征损失函数的目标是使不同照明条件下的非照明特征的差异最小。The non-illumination feature loss is calculated from the non-illumination features of the same foreground image under different illuminations; the goal of the non-illumination feature loss function is to minimize the difference between non-illumination features under different illumination conditions.

在一个实施例中,处理器执行计算机程序时还实现以下步骤:In one embodiment, when the processor executes the computer program, the processor further implements the following steps:

利用预先训练好的神经网络模型提取和谐处理图像的特征图;利用预先训练好的神经网络模型提取和谐处理图像对应的真实图像的特征图;根据和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图,确定感知损失;感知损失函数的目标是使和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图的差异最小。A feature map of the harmonized image is extracted using a pre-trained neural network model; a feature map of the real image corresponding to the harmonized image is extracted using a pre-trained neural network model; a perceptual loss is determined based on the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image; the goal of the perceptual loss function is to minimize the difference between the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image.

在一个实施例中,处理器执行计算机程序时还实现以下步骤:In one embodiment, when the processor executes the computer program, the processor further implements the following steps:

模型损失为光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和。The model loss is the weighted sum of illumination feature loss, non-illumination feature loss, perceptual loss, adversarial loss, and absolute value deviation loss.

在一个实施例中,提供了一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现以下步骤:In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:

获取合成图像、前景图像的掩码和背景图像的掩码;Get the composite image, the mask of the foreground image, and the mask of the background image;

将合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和合成图像得到第一合成图像;Inputting the composite image and the mask of the foreground image into a first generator of the image processing model, extracting foreground light and shadow features from the mask of the foreground image, and obtaining a first composite image according to the foreground light and shadow features and the composite image;

将合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第二合成图像;其中,图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似;Inputting the masks of the composite image and the background image into the second generator of the image processing model, extracting background light and shadow features from the mask of the background image, and obtaining a second composite image according to the background light and shadow features and the composite image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator;

将第一合成图像和第二合成图像融合,得到目标合成图像。The first composite image and the second composite image are fused to obtain a target composite image.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

获取图像数据集，图像数据集包括多组数据，每组数据由两个三元组数据组成，其中一个三元组为原始合成图像、前景图像的掩码和背景图像的掩码，另一个三元组为真实图像、真实前景光照图像和真实背景光照图像；将图像数据集中原始合成图像和前景图像的掩码输入至图像处理模型的第一生成器，从前景图像的掩码提取前景光影特征，根据前景光影特征和原始合成图像得到第三合成图像；将原始合成图像和背景图像的掩码输入至图像处理模型的第二生成器，从背景图像的掩码提取背景光影特征，根据背景光影特征和原始合成图像得到第四合成图像；将第三合成图像和第四合成图像融合，得到和谐处理图像；根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像，计算模型损失；基于模型损失的约束对第一生成器和第二生成器进行参数调整，得到训练好的图像处理模型；损失函数的约束包括：图像处理模型的第一生成器提取的前景光影特征，与第二生成器提取的背景光影特征相似。An image dataset is acquired; it includes multiple groups of data, each group consisting of two triplets: one triplet is an original composite image, the mask of the foreground image, and the mask of the background image, and the other triplet is a real image, a real foreground illumination image, and a real background illumination image. The original composite image and the mask of the foreground image are input to the first generator of the image processing model, which extracts foreground light and shadow features from the foreground mask and obtains a third composite image from those features and the original composite image. The original composite image and the mask of the background image are input to the second generator of the image processing model, which extracts background light and shadow features from the background mask and obtains a fourth composite image from those features and the original composite image. The third and fourth composite images are fused to obtain a harmonized image; the model loss is calculated from the harmonized image, the real image, the real foreground illumination image, and the real background illumination image; and the parameters of the first and second generators are adjusted under the constraint of the model loss to obtain the trained image processing model. The constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

将和谐处理图像和真实图像输入图像处理模型的判别器,判别器用于判断和谐处理图像和真实图像的真实性;根据判别器的输出,计算对抗性损失;根据真实前景光照图像与和谐处理图像的前景光照图像的差异,以及真实背景光照图像与和谐处理图像的背景光照图像的差异,计算光照特征损失。The harmonized image and the real image are input into the discriminator of the image processing model, and the discriminator is used to judge the authenticity of the harmonized image and the real image; the adversarial loss is calculated according to the output of the discriminator; the illumination feature loss is calculated according to the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, and the difference between the real background illumination image and the background illumination image of the harmonized image.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

根据相同前景图像在不同光照下的非照明特征,计算得到非光照特征损失;其中,非光照特征损失函数的目标是使不同照明条件下的非照明特征的差异最小。According to the non-illumination features of the same foreground image under different illumination conditions, the non-illumination feature loss is calculated; wherein the goal of the non-illumination feature loss function is to minimize the difference in non-illumination features under different illumination conditions.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

利用预先训练好的神经网络模型提取和谐处理图像的特征图;利用预先训练好的神经网络模型提取和谐处理图像对应的真实图像的特征图;根据和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图,确定感知损失;感知损失函数的目标是使和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图的差异最小。A feature map of the harmonized image is extracted using a pre-trained neural network model; a feature map of the real image corresponding to the harmonized image is extracted using a pre-trained neural network model; a perceptual loss is determined based on the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image; the goal of the perceptual loss function is to minimize the difference between the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

模型损失为光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和。The model loss is the weighted sum of illumination feature loss, non-illumination feature loss, perceptual loss, adversarial loss, and absolute value deviation loss.

在一个实施例中,提供了一种计算机程序产品,包括计算机程序,该计算机程序被处理器执行时实现以下步骤:In one embodiment, a computer program product is provided, comprising a computer program, which, when executed by a processor, implements the following steps:

获取合成图像、前景图像的掩码和背景图像的掩码;Get the composite image, the mask of the foreground image, and the mask of the background image;

将合成图像和前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据前景光影特征和合成图像得到第一合成图像;Inputting the composite image and the mask of the foreground image into a first generator of the image processing model, extracting foreground light and shadow features from the mask of the foreground image, and obtaining a first composite image according to the foreground light and shadow features and the composite image;

将合成图像和背景图像的掩码输入至图像处理模型的第二生成器,从背景图像的掩码提取背景光影特征,根据背景光影特征和合成图像得到第二合成图像;其中,图像处理模型的第一生成器提取的前景光影特征,与第二生成器提取的背景光影特征相似;Inputting the masks of the composite image and the background image into the second generator of the image processing model, extracting background light and shadow features from the mask of the background image, and obtaining a second composite image according to the background light and shadow features and the composite image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator;

将第一合成图像和第二合成图像融合,得到目标合成图像。The first composite image and the second composite image are fused to obtain a target composite image.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

获取图像数据集，图像数据集包括多组数据，每组数据由两个三元组数据组成，其中一个三元组为原始合成图像、前景图像的掩码和背景图像的掩码，另一个三元组为真实图像、真实前景光照图像和真实背景光照图像；将图像数据集中原始合成图像和前景图像的掩码输入至图像处理模型的第一生成器，从前景图像的掩码提取前景光影特征，根据前景光影特征和原始合成图像得到第三合成图像；将原始合成图像和背景图像的掩码输入至图像处理模型的第二生成器，从背景图像的掩码提取背景光影特征，根据背景光影特征和原始合成图像得到第四合成图像；将第三合成图像和第四合成图像融合，得到和谐处理图像；根据和谐处理图像、真实图像、真实前景光照图像和真实背景光照图像，计算模型损失；基于模型损失的约束对第一生成器和第二生成器进行参数调整，得到训练好的图像处理模型；损失函数的约束包括：图像处理模型的第一生成器提取的前景光影特征，与第二生成器提取的背景光影特征相似。An image dataset is acquired; it includes multiple groups of data, each group consisting of two triplets: one triplet is an original composite image, the mask of the foreground image, and the mask of the background image, and the other triplet is a real image, a real foreground illumination image, and a real background illumination image. The original composite image and the mask of the foreground image are input to the first generator of the image processing model, which extracts foreground light and shadow features from the foreground mask and obtains a third composite image from those features and the original composite image. The original composite image and the mask of the background image are input to the second generator of the image processing model, which extracts background light and shadow features from the background mask and obtains a fourth composite image from those features and the original composite image. The third and fourth composite images are fused to obtain a harmonized image; the model loss is calculated from the harmonized image, the real image, the real foreground illumination image, and the real background illumination image; and the parameters of the first and second generators are adjusted under the constraint of the model loss to obtain the trained image processing model. The constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

将和谐处理图像和真实图像输入图像处理模型的判别器,判别器用于判断和谐处理图像和真实图像的真实性;根据判别器的输出,计算对抗性损失;根据真实前景光照图像与和谐处理图像的前景光照图像的差异,以及真实背景光照图像与和谐处理图像的背景光照图像的差异,计算光照特征损失。The harmonized image and the real image are input into the discriminator of the image processing model, and the discriminator is used to judge the authenticity of the harmonized image and the real image; the adversarial loss is calculated according to the output of the discriminator; the illumination feature loss is calculated according to the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, and the difference between the real background illumination image and the background illumination image of the harmonized image.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

根据相同前景图像在不同光照下的非照明特征,计算得到非光照特征损失;其中,非光照特征损失函数的目标是使不同照明条件下的非照明特征的差异最小。According to the non-illumination features of the same foreground image under different illumination conditions, the non-illumination feature loss is calculated; wherein the goal of the non-illumination feature loss function is to minimize the difference in non-illumination features under different illumination conditions.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

利用预先训练好的神经网络模型提取和谐处理图像的特征图;利用预先训练好的神经网络模型提取和谐处理图像对应的真实图像的特征图;根据和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图,确定感知损失;感知损失函数的目标是使和谐处理图像的特征图和和谐处理图像对应的真实图像的特征图的差异最小。A feature map of the harmonized image is extracted using a pre-trained neural network model; a feature map of the real image corresponding to the harmonized image is extracted using a pre-trained neural network model; a perceptual loss is determined based on the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image; the goal of the perceptual loss function is to minimize the difference between the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image.

在一个实施例中,计算机程序被处理器执行时还实现以下步骤:In one embodiment, when the computer program is executed by a processor, the following steps are also implemented:

模型损失为光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和。The model loss is the weighted sum of illumination feature loss, non-illumination feature loss, perceptual loss, adversarial loss, and absolute value deviation loss.

本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程，是可以通过计算机程序来指令相关的硬件来完成，所述的计算机程序可存储于一非易失性计算机可读取存储介质中，该计算机程序在执行时，可包括如上述各方法的实施例的流程。其中，本申请所提供的各实施例中所使用的对存储器、数据库或其它介质的任何引用，均可包括非易失性和易失性存储器中的至少一种。非易失性存储器可包括只读存储器(Read-Only Memory，ROM)、磁带、软盘、闪存、光存储器、高密度嵌入式非易失性存储器、阻变存储器(ReRAM)、磁变存储器(Magnetoresistive Random Access Memory，MRAM)、铁电存储器(Ferroelectric Random Access Memory，FRAM)、相变存储器(Phase Change Memory，PCM)、石墨烯存储器等。易失性存储器可包括随机存取存储器(Random Access Memory，RAM)或外部高速缓冲存储器等。作为说明而非局限，RAM可以是多种形式，比如静态随机存取存储器(Static Random Access Memory，SRAM)或动态随机存取存储器(Dynamic Random Access Memory，DRAM)等。本申请所提供的各实施例中所涉及的数据库可包括关系型数据库和非关系型数据库中至少一种。非关系型数据库可包括基于区块链的分布式数据库等，不限于此。本申请所提供的各实施例中所涉及的处理器可为通用处理器、中央处理器、图形处理器、数字信号处理器、可编程逻辑器、基于量子计算的数据处理逻辑器等，不限于此。Those of ordinary skill in the art can understand that all or part of the processes in the above-mentioned embodiment methods can be completed by instructing the relevant hardware through a computer program, and the computer program can be stored in a non-volatile computer-readable storage medium. When the computer program is executed, it can include the processes of the embodiments of the above-mentioned methods. Among them, any reference to the memory, database or other medium used in the embodiments provided in the present application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. As an illustration and not limitation, RAM can be in various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The database involved in each embodiment provided in this application may include at least one of a relational database and a non-relational database. Non-relational databases may include distributed databases based on blockchains, etc., but are not limited to this. The processor involved in each embodiment provided in this application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic device, a data processing logic device based on quantum computing, etc., but are not limited to this.

以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。The technical features of the above embodiments may be arbitrarily combined. To make the description concise, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对本申请专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请的保护范围应以所附权利要求为准。The above-described embodiments only express several implementation methods of the present application, and the descriptions thereof are relatively specific and detailed, but they cannot be understood as limiting the scope of the present application. It should be pointed out that, for a person of ordinary skill in the art, several modifications and improvements can be made without departing from the concept of the present application, and these all belong to the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the attached claims.

Claims (10)

1.一种合成图像处理方法，其特征在于，所述方法包括：1. A synthetic image processing method, characterized in that the method comprises:

获取合成图像、前景图像的掩码和背景图像的掩码；Get the composite image, the mask of the foreground image, and the mask of the background image;

将所述合成图像和所述前景图像的掩码输入至图像处理模型的第一生成器，从前景图像的掩码提取前景光影特征，根据所述前景光影特征和所述合成图像得到第一合成图像；Inputting the composite image and the mask of the foreground image into a first generator of an image processing model, extracting foreground light and shadow features from the mask of the foreground image, and obtaining a first composite image according to the foreground light and shadow features and the composite image;

将所述合成图像和所述背景图像的掩码输入至图像处理模型的第二生成器，从所述背景图像的掩码提取背景光影特征，根据所述背景光影特征和所述合成图像得到第二合成图像；其中，所述图像处理模型的第一生成器提取的所述前景光影特征，与第二生成器提取的所述背景光影特征相似；Inputting the composite image and the mask of the background image into the second generator of the image processing model, extracting background light and shadow features from the mask of the background image, and obtaining a second composite image according to the background light and shadow features and the composite image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator;

将所述第一合成图像和所述第二合成图像融合，得到目标合成图像。The first composite image and the second composite image are fused to obtain a target composite image.

2.根据权利要求1所述的方法，其特征在于，训练所述图像处理模型的方法包括：2. The method according to claim 1, characterized in that the method of training the image processing model comprises:

获取图像数据集，所述图像数据集包括多组数据，每组数据由两个三元组数据组成，其中一个所述三元组为原始合成图像、前景图像的掩码和背景图像的掩码，另一个所述三元组为真实图像、真实前景光照图像和真实背景光照图像；Acquire an image data set, the image data set comprising multiple groups of data, each group of data consisting of two triplets, one of the triplets being an original composite image, a mask of a foreground image and a mask of a background image, and the other triplet being a real image, a real foreground illumination image and a real background illumination image;

将图像数据集中所述原始合成图像和所述前景图像的掩码输入至图像处理模型的第一生成器，从前景图像的掩码提取前景光影特征，根据所述前景光影特征和所述原始合成图像得到第三合成图像；Inputting the original composite image and the mask of the foreground image in the image data set into a first generator of the image processing model, extracting foreground light and shadow features from the mask of the foreground image, and obtaining a third composite image according to the foreground light and shadow features and the original composite image;

将所述原始合成图像和所述背景图像的掩码输入至图像处理模型的第二生成器，从所述背景图像的掩码提取背景光影特征，根据所述背景光影特征和所述合成图像得到第四合成图像；Inputting the original composite image and the mask of the background image into a second generator of the image processing model, extracting background light and shadow features from the mask of the background image, and obtaining a fourth composite image according to the background light and shadow features and the composite image;

将所述第三合成图像和所述第四合成图像融合，得到和谐处理图像；Fusing the third composite image and the fourth composite image to obtain a harmonized image;

根据所述和谐处理图像、所述真实图像、所述真实前景光照图像和所述真实背景光照图像，计算模型损失；Calculating a model loss according to the harmonized image, the real image, the real foreground illumination image, and the real background illumination image;

基于所述模型损失的约束对所述第一生成器和所述第二生成器进行参数调整，得到训练好的图像处理模型；所述损失函数的约束包括：所述图像处理模型的第一生成器提取的所述前景光影特征，与第二生成器提取的所述背景光影特征相似。Adjusting the parameters of the first generator and the second generator based on the constraint of the model loss to obtain a trained image processing model; the constraints of the loss function include: the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator.

3.根据权利要求2所述的方法，其特征在于，所述模型损失包括：对抗性损失和光照损失；3. The method according to claim 2, characterized in that the model loss includes: adversarial loss and illumination loss;

根据所述和谐处理图像、所述真实图像、所述真实前景光照图像和所述真实背景光照图像，计算模型损失，包括：Calculating a model loss according to the harmonized image, the real image, the real foreground illumination image, and the real background illumination image includes:

将所述和谐处理图像和所述真实图像输入所述图像处理模型的判别器，所述判别器用于判断所述和谐处理图像和所述真实图像的真实性；Inputting the harmonized image and the real image into a discriminator of the image processing model, the discriminator being used to judge the authenticity of the harmonized image and the real image;

根据所述判别器的输出，计算对抗性损失；Calculating adversarial loss based on the output of the discriminator;

根据所述真实前景光照图像与所述和谐处理图像的前景光照图像的差异，以及所述真实背景光照图像与所述和谐处理图像的背景光照图像的差异，计算光照特征损失。The illumination feature loss is calculated according to the difference between the real foreground illumination image and the foreground illumination image of the harmonized image, and the difference between the real background illumination image and the background illumination image of the harmonized image.

4.根据权利要求3所述的方法，其特征在于，所述模型损失还包括：非光照特征损失；4. The method according to claim 3, characterized in that the model loss further comprises: non-illumination feature loss;

所述方法还包括：根据相同前景图像在不同光照下的非照明特征，计算得到非光照特征损失；其中，非光照特征损失函数的目标是使不同照明条件下的非照明特征的差异最小。The method further includes: calculating a non-illumination feature loss based on the non-illumination features of the same foreground image under different illumination conditions; wherein the goal of the non-illumination feature loss function is to minimize the difference in non-illumination features under different illumination conditions.

5.根据权利要求4所述的方法，其特征在于，所述模型损失还包括：感知损失；5. The method according to claim 4, characterized in that the model loss further comprises: perceptual loss;

所述方法还包括：The method further comprises:

利用预先训练好的神经网络模型提取所述和谐处理图像的特征图；Extracting a feature map of the harmonized image using a pre-trained neural network model;

利用预先训练好的神经网络模型提取所述和谐处理图像对应的真实图像的特征图；Extracting a feature map of the real image corresponding to the harmonized image using a pre-trained neural network model;

根据所述和谐处理图像的特征图和所述和谐处理图像对应的真实图像的特征图，确定感知损失；所述感知损失函数的目标是使所述和谐处理图像的特征图和所述和谐处理图像对应的真实图像的特征图的差异最小。The perceptual loss is determined based on the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image; the goal of the perceptual loss function is to minimize the difference between the feature map of the harmonized image and the feature map of the real image corresponding to the harmonized image.

6.根据权利要求5所述的方法，其特征在于，所述模型损失为光照特征损失、非光照特征损失、感知损失、对抗性损失以及绝对值偏差损失的加权和。6. The method according to claim 5, characterized in that the model loss is a weighted sum of illumination feature loss, non-illumination feature loss, perceptual loss, adversarial loss and absolute value deviation loss.

7.一种合成图像处理装置，其特征在于，所述装置包括：7. A synthetic image processing apparatus, characterized in that the apparatus comprises:
A synthetic image processing device, characterized in that the device comprises: 合成图像获取模块,用于获取合成图像、前景图像的掩码和背景图像的掩码;A composite image acquisition module, used to acquire a composite image, a mask of a foreground image, and a mask of a background image; 第一合成图像获取模块,用于将所述合成图像和所述前景图像的掩码输入至图像处理模型的第一生成器,从前景图像的掩码提取前景光影特征,根据所述前景光影特征和所述合成图像得到第一合成图像;a first synthetic image acquisition module, configured to input the synthetic image and the mask of the foreground image into a first generator of an image processing model, extract foreground light and shadow features from the mask of the foreground image, and obtain a first synthetic image according to the foreground light and shadow features and the synthetic image; 第二合成图像获取模块,用于将所述合成图像和所述背景图像的掩码输入至图像处理模型的第二生成器,从所述背景图像的掩码提取背景光影特征,根据所述背景光影特征和所述合成图像得到第二合成图像;其中,所述图像处理模型的第一生成器提取的所述前景光影特征,与第二生成器提取的所述背景光影特征相似;A second synthetic image acquisition module, used for inputting the synthetic image and the mask of the background image into the second generator of the image processing model, extracting background light and shadow features from the mask of the background image, and obtaining a second synthetic image according to the background light and shadow features and the synthetic image; wherein the foreground light and shadow features extracted by the first generator of the image processing model are similar to the background light and shadow features extracted by the second generator; 目标合成图像获取模块,用于将所述第一合成图像和所述第二合成图像融合,得到目标合成图像。The target composite image acquisition module is used to fuse the first composite image and the second composite image to obtain a target composite image. 8.一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至6中任一项所述的方法的步骤。8. A computer device, comprising a memory and a processor, wherein the memory stores a computer program, wherein the processor implements the steps of the method according to any one of claims 1 to 6 when executing the computer program. 
9.一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至6中任一项所述的方法的步骤。9. A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 6 are implemented. 10.一种计算机程序产品,包括计算机程序,其特征在于,该计算机程序被处理器执行时实现权利要求1至6中任一项所述的方法的步骤。10. A computer program product, comprising a computer program, characterized in that when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 6 are implemented.
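The fusion step at the end of claim 1 — combining the foreground-harmonized composite and the background-harmonized composite into the target image — can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the claims state only that the two composites are "fused", so the mask-weighted blend and the function name `fuse_composites` are assumptions.

```python
import numpy as np

def fuse_composites(first_img, second_img, fg_mask):
    """Blend the two harmonized composites from claim 1.

    first_img, second_img: float arrays of shape (H, W, 3) in [0, 1],
        the outputs of the first (foreground) and second (background)
        generators respectively.
    fg_mask: float array of shape (H, W), 1.0 inside the foreground region.

    The mask-weighted blend below is an assumed fusion operator; the
    claims do not fix the exact combination rule.
    """
    m = fg_mask[..., None]  # broadcast the mask over the channel axis
    return m * first_img + (1.0 - m) * second_img

# Toy 2x2 example: the foreground pixel takes the first composite's value,
# background pixels take the second composite's value.
first = np.full((2, 2, 3), 0.8)
second = np.full((2, 2, 3), 0.2)
mask = np.array([[1.0, 0.0],
                 [0.0, 0.0]])
fused = fuse_composites(first, second, mask)
```

With this blend, pixels under the foreground mask come from the generator conditioned on the foreground mask, and the remaining pixels from the generator conditioned on the background mask, which matches the per-region roles the claims assign to the two generators.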
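Claim 6 defines the training objective as a weighted sum of five terms. A minimal sketch of that combination is below; the weight values are hyperparameters that the claim leaves unspecified, so the defaults here are illustrative only.

```python
def total_model_loss(l_illum, l_non_illum, l_perc, l_adv, l_l1,
                     w=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the five loss terms named in claim 6.

    l_illum:     illumination feature loss (claim 3)
    l_non_illum: non-illumination feature loss (claim 4)
    l_perc:      perceptual loss (claim 5)
    l_adv:       adversarial loss from the discriminator (claim 3)
    l_l1:        absolute-value (L1) deviation loss
    w:           per-term weights; the claim does not specify values,
                 so uniform weights are used here as a placeholder.
    """
    return (w[0] * l_illum + w[1] * l_non_illum + w[2] * l_perc
            + w[3] * l_adv + w[4] * l_l1)
```

During training (claim 2), this scalar would be backpropagated to adjust the parameters of both generators jointly, so that the foreground and background light-and-shadow features they extract stay similar.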
CN202310090153.9A 2023-02-07 2023-02-07 Synthetic image processing method, apparatus, computer device, and storage medium Pending CN116051687A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310090153.9A CN116051687A (en) 2023-02-07 2023-02-07 Synthetic image processing method, apparatus, computer device, and storage medium


Publications (1)

Publication Number Publication Date
CN116051687A (en) 2023-05-02

Family

ID=86116311

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310090153.9A Pending CN116051687A (en) 2023-02-07 2023-02-07 Synthetic image processing method, apparatus, computer device, and storage medium

Country Status (1)

Country Link
CN (1) CN116051687A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119810592A (en) * 2025-01-08 2025-04-11 大连理工大学 A small sample target detection algorithm based on high-quality synthetic image data

Citations (4)

Publication number Priority date Publication date Assignee Title
US20210398334A1 (en) * 2020-06-22 2021-12-23 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for creating image editing model, and electronic device and storage medium thereof
US20220172462A1 (en) * 2020-02-13 2022-06-02 Tencent Technology (Shenzhen) Company Limited Image processing method, apparatus, and device, and storage medium
CN115205544A (en) * 2022-07-26 2022-10-18 福州大学 Synthetic image harmony method and system based on foreground reference image
CN115456921A (en) * 2022-08-30 2022-12-09 北京邮电大学 Synthetic image harmony model training method, harmony method and device


Non-Patent Citations (1)

Title
Zhongyun Bao et al., "Deep Image-based Illumination Harmonization", IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2022), 31 December 2022, pages 18542-18551 *


Similar Documents

Publication Publication Date Title
WO2020168844A1 (en) Image processing method, apparatus, equipment, and storage medium
CN116580257A (en) Feature fusion model training and sample retrieval method, device and computer equipment
CN112487207A (en) Image multi-label classification method and device, computer equipment and storage medium
US20200082249A1 (en) Image stylization based on learning network
Ge et al. Neural-sim: Learning to generate training data with nerf
CN117635275B (en) Intelligent electronic commerce operation commodity management platform and method based on big data
CN114067119A (en) Training method of panorama segmentation model, panorama segmentation method and device
WO2023151529A1 (en) Facial image processing method and related device
TW202536715A (en) Method and apparatus related to data generation framework
CN117934654A (en) Image generation model training, image generation method, device and computer equipment
JP2023503731A (en) Point cloud data processing method, apparatus, equipment and storage medium
CN116051687A (en) Synthetic image processing method, apparatus, computer device, and storage medium
CN120544204B (en) Data processing method, device, equipment and medium based on multistage contrast learning
CN117315030A (en) Three-dimensional visual positioning method and system based on progressive point cloud-text matching
CN115115979B (en) Identification and replacement methods of video elements and video recommendation methods
CN112364192B (en) A zero-shot hash retrieval method based on ensemble learning
CN119580034A (en) Training method for generating picture description model, picture description generation method, device, equipment, medium and program product
CN109583406B (en) Facial Expression Recognition Method Based on Feature Attention Mechanism
CN115861041B (en) Image style transfer methods, apparatus, computer equipment, storage media and products
CN113988270B (en) Visual simulation method and system for season and time transformation
Debuysère et al. Synthesizing SAR images with generative AI: Expanding to large-scale imagery
CN117036554A (en) Image processing methods, devices, computer equipment, storage media and program products
CN120103877B (en) Portrait posture control method, device, computer equipment and storage medium
Yu et al. DbRMP: Predicting Douban Rating of Movies with high-dimensional Features by Comprehensive Machine Learning Algorithms
CN119273558A (en) Image fusion method, device, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination