CN115496818B

CN115496818B - Semantic map compression method and device based on dynamic object segmentation

Info

Publication number: CN115496818B
Application number: CN202211390992.4A
Authority: CN
Inventors: 高健健; 华炜; 明彬彬; 谢天
Original assignee: Zhejiang Lab
Current assignee: Zhejiang Lab
Priority date: 2022-11-08
Filing date: 2022-11-08
Publication date: 2023-03-10
Anticipated expiration: 2042-11-08
Also published as: CN115496818A

Abstract

The invention discloses a semantic map compression method and device based on dynamic object segmentation. The method first divides the simulation scene into a static background and dynamic objects, and renders the scene to obtain a semantic map. All dynamic objects in the semantic map are then segmented out to obtain semantic sub-maps, and the pixels adjacent to each dynamic object semantic sub-map are used to fill the corresponding region of the static background semantic map. Finally, an encoding algorithm separately encodes all dynamic object semantic sub-maps and the filled static background semantic map. By separating the static background from the dynamic objects and splitting one semantic map into a static background semantic map and multiple dynamic object semantic sub-maps, the method reduces abrupt pixel changes in the semantic map, increases the continuity of the data distribution, and significantly improves the compression ratio of semantic map encoding.

Description

A Semantic Map Compression Method and Device Based on Dynamic Object Segmentation

Technical Field

The invention relates to the field of image data compression, and in particular to a semantic map compression method and device based on dynamic object segmentation.

Background

As a core technology of computer vision and image understanding, object detection is an important research direction in computer vision and the basis of other complex vision tasks. Its application scenarios include image captioning, scene understanding, image segmentation, object tracking and event detection, and it is also widely used in decision-making scenarios such as autonomous driving, medical imaging and drone navigation.

Machine-learning-based object detection algorithms require a large amount of ground-truth data during training, and this ground-truth data mainly consists of semantic segmentation maps, i.e. semantic maps. A semantic map divides an image or video frame into multiple regions by category, achieving pixel-level classification. Ground-truth semantic maps are generally annotated manually or generated automatically by a simulation program. Because simulation is highly customizable and can generate massive amounts of data quickly, it has become one of the important methods for generating ground-truth semantic map datasets. Conventional simulated semantic maps are generally either not encoded at all or encoded directly with a compression algorithm such as run-length encoding, which does not take full account of the characteristics of the simulation scene. The result is a poor compression ratio, which in turn limits the efficiency of storage and data transmission.

Summary of the Invention

To overcome the deficiencies of the prior art and achieve more efficient semantic map compression, the invention adopts the following technical solution:

A semantic map compression method based on dynamic object segmentation, comprising the following steps:

S1, initialize a simulation scene, the simulation scene consisting of a static background and dynamic objects;

S2, update and render the simulation scene to obtain a semantic map and the two-dimensional bounding volumes of all dynamic objects in the semantic map coordinate system;

S3, use the two-dimensional bounding volumes to segment out the semantic data of the dynamic objects, forming several dynamic object semantic sub-maps; in the remaining static background semantic map, fill the image region corresponding to each dynamic object semantic sub-map using the pixels adjacent to that sub-map;

S4, use an encoding algorithm to separately encode all dynamic object semantic sub-maps and the filled static background semantic map.

Further, every object in the simulation scene of S1 is assigned an ID, and objects with the same semantic class share the same ID; each ID corresponds to exactly one color, and different IDs correspond to different colors.

Further, the update of the simulation scene in S2 includes updating the poses of all dynamic objects and of the rendering viewpoint; each pixel in the semantic map corresponds to exactly one object and is colored with the color corresponding to that object's ID.

Further, in S2, the two-dimensional bounding volume of a dynamic object in the semantic map coordinate system must contain all pixels of that dynamic object on the semantic map.

Further, in S3, the two-dimensional bounding volumes from S2 are used to segment out the semantic data of the dynamic objects, forming several semantic sub-maps, specifically through the following sub-steps:

(1) According to the size of the two-dimensional bounding volume of each dynamic object, initialize the corresponding semantic sub-map data, with every element of the semantic sub-map initialized to (R:0, G:0, B:0);

(2) Traverse every pixel of the semantic map and process each pixel as follows:

Determine whether the pixel coordinates lie within the two-dimensional bounding volume of some dynamic object. If so, compute the relative coordinates of the pixel in the coordinate system of that dynamic object's semantic sub-map and write the pixel into the corresponding element of the sub-map at those relative coordinates; otherwise ignore the pixel and continue with the next one. This yields the semantic sub-map of each dynamic object.

Further, in S3, the image region corresponding to each dynamic object semantic sub-map in the remaining static background semantic map is filled using the pixels adjacent to that sub-map, through the following sub-steps:

(1) Find the edge pixels surrounding the dynamic object semantic sub-map; an edge pixel that falls outside the semantic map is treated as invalid;

(2) Collect all valid edge pixels and, based on their values, fill the image region of the semantic map that corresponds to the dynamic object semantic sub-map.

Further, in S4, the encoding result of each dynamic object semantic sub-map must be accompanied by the corresponding two-dimensional bounding volume information.

A semantic map compression device based on dynamic object segmentation, the device comprising one or more processors configured to implement the above semantic map compression method based on dynamic object segmentation.

A computer-readable storage medium on which a program is stored, wherein the program, when executed by a processor, implements the semantic map compression method based on dynamic object segmentation.

The beneficial effects of the invention are as follows:

The invention discloses a semantic map compression method based on dynamic object segmentation. The method first divides the simulation scene into a static background and dynamic objects, and renders the scene to obtain a semantic map. All dynamic objects in the semantic map are segmented out to obtain semantic sub-maps, and the pixels adjacent to each dynamic object semantic sub-map are used to fill the static background semantic map. Finally, an encoding algorithm separately encodes all dynamic object semantic sub-maps and the filled static background semantic map. By separating the static background from the dynamic objects and splitting one semantic map into a static background semantic map and multiple dynamic object semantic sub-maps, the method reduces abrupt pixel changes in the semantic map, increases the continuity of the data distribution, and significantly improves the compression ratio of semantic map encoding.

Description of Drawings

Fig. 1 is a flow diagram of the steps of a semantic map compression method based on dynamic object segmentation in an exemplary embodiment.

Fig. 2 is a schematic diagram of the static background and dynamic objects of a simulation scene in an exemplary embodiment.

Fig. 3 is a schematic diagram of dynamic object segmentation and compression of a semantic map in an exemplary embodiment.

Fig. 4 is a schematic diagram of a semantic map compression device based on dynamic object segmentation in an exemplary embodiment.

Detailed Description

To make the objectives, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not represent all of its embodiments. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.

In one embodiment, as shown in Fig. 1, a semantic map compression method and device based on dynamic object segmentation is proposed. The method first divides the simulation scene into a static background and dynamic objects, and renders the scene to obtain a semantic map. All dynamic objects in the semantic map are segmented out to obtain semantic sub-maps, and the pixels adjacent to each dynamic object semantic sub-map are used to fill the static background semantic map. Finally, an encoding algorithm separately encodes all dynamic object semantic sub-maps and the filled static background semantic map.

The method specifically includes the following steps:

Step 1, initialize the simulation scene, which consists of a static background and dynamic objects.

As shown in Fig. 2, the simulation scene in this embodiment is a digital twin of a campus. It is composed of several kinds of models, namely oblique photography models, hand-made artist models and procedurally generated models, and is divided into two broad categories: static background and dynamic objects. The static background includes the sky, terrain, road surfaces, curbs, street lights, buildings, plants and so on; the dynamic objects include non-motorized vehicles, motor vehicles, pedestrians and so on.

The image rendering interface of the simulation program uses the open graphics library OpenGL (Open Graphics Library), a cross-language, cross-platform application programming interface (API) for rendering 2D and 3D vector graphics. The OpenGL interface is typically used to interact with a graphics processing unit (GPU) to achieve hardware-accelerated rendering. It consists of nearly 350 different function calls, used to draw everything from simple graphics primitives to complex three-dimensional scenes. OpenGL is commonly used in CAD, virtual reality, scientific visualization and video game development, and provides seven major functions: 3D modeling, graphics transformation, color modes, lighting and material settings, texture mapping, bitmap display and image enhancement, and double buffering.

Every object in the simulation scene is assigned an ID, and objects with the same semantic class share the same ID, where "the same semantic class" means belonging to the same object type in the object detection algorithm. Each ID corresponds to exactly one color, and different IDs correspond to different colors. The correspondence between object IDs and colors is shown in Table 1.

A simulated semantic map camera is also placed in the simulation scene and generates semantic maps by rendering the scene. Its field of view is set to 72 degrees, its aspect ratio to 16:9, and the resolution of the target image to 1280×720. The simulated camera also contains a motion control module that governs how it moves through the scene: a motion trajectory is preset, each point of which carries a set of pose data; the camera moves along this trajectory at a preset constant speed and renders a semantic map at fixed time intervals.

Table 1: Correspondence between object IDs and colors

Step 2, update and render the simulation scene to obtain the semantic map and the two-dimensional bounding volumes of all dynamic objects in the semantic map coordinate system.

The two-dimensional bounding volume here may be a two-dimensional bounding box such as a rectangle, another regular shape such as a circle or an ellipse, or an irregular shape.

Semantic segmentation is an important part of image understanding in image processing and computer vision. It classifies every pixel in an image, determining the category to which each pixel belongs, such as background, edge or body. This distinguishes it from instance segmentation: semantic segmentation does not separate instances of the same class and is concerned only with the class of each pixel, so if the input contains two objects of the same class, the segmentation itself does not distinguish them as separate objects. A semantic segmentation algorithm outputs a semantic segmentation image, referred to as a semantic map for short.

In this embodiment, the semantic map divides an image or video frame into multiple regions by category, achieving pixel-level classification. Every pixel of the rendered semantic map corresponds to exactly one object and is colored with the color corresponding to that object's ID.

In this embodiment, updating the simulation scene includes updating the poses of all dynamic objects and of the simulated semantic map camera, with the update frequency set to 60 Hz. All dynamic objects, including non-motorized vehicles, motor vehicles and pedestrians, are skeletal-skinned models, and their pose update consists of the following two steps:

(1) Receive the dynamic object state data from the server, parse out the position and rotation data, and assign it to the corresponding dynamic object according to the object identifier string;

(2) Based on the current simulation time and the animation state of the dynamic object, perform GPU-based animation skinning to update every vertex of the dynamic object model.

The simulated semantic map camera updates its position and orientation by sampling the preset motion trajectory, as follows:

(1) From the current simulation time and the preset movement speed of the simulated camera, compute the path length travelled so far;

(2) If the path length is greater than the length of the preset motion trajectory, stop the simulation and exit the program; otherwise sample the preset motion trajectory at that path length and assign the resulting position and orientation to the simulated camera.
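The two camera-update steps above can be sketched as follows. This is only an illustration under stated assumptions: the trajectory is represented as a hypothetical list of (x, y, z) waypoints joined by straight segments, only the position is interpolated (orientation sampling is omitted), and the function name `sample_trajectory` is not taken from the patent.

```python
import math


def sample_trajectory(points, speed, t):
    """Return the camera position at simulation time t, or None when the
    travelled path length exceeds the trajectory length (stop condition).

    points: list of (x, y, z) waypoints (assumed polyline representation)
    speed:  preset constant speed, in scene units per second
    """
    distance = speed * t  # step (1): path length travelled so far
    for a, b in zip(points, points[1:]):
        seg = math.dist(a, b)
        if seg > 0 and distance <= seg:
            f = distance / seg  # fraction of the current segment covered
            return tuple(pa + f * (pb - pa) for pa, pb in zip(a, b))
        distance -= seg
    return None  # step (2): past the end of the trajectory


waypoints = [(0, 0, 0), (10, 0, 0), (10, 10, 0)]
print(sample_trajectory(waypoints, 2.0, 2.5))
print(sample_trajectory(waypoints, 2.0, 11.0))
```

A caller would treat a `None` result as the signal to stop the simulation, mirroring the exit condition in step (2).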

Once the poses of all dynamic objects and of the simulated semantic map camera have been updated, the rendering operation begins. Before each scene object is drawn, the glUniform1i function is called to pass the object's ID to the pixel (fragment) shader as a uniform variable, and the glDrawElements function is then called to draw the model's vertex data. In the pixel shader, the output pixel color PixelColor is computed from the object ID as follows:

PixelColor.r = ID * 20

PixelColor.g = ID * 20

PixelColor.b = ID * 20
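As a small illustration of this mapping, the following sketch converts between IDs and colors on the CPU side. It assumes 8-bit color channels, which the description does not state explicitly; under that assumption only IDs 0 to 12 fit, since 13 * 20 exceeds 255. The function names are hypothetical.

```python
def id_to_color(object_id, scale=20):
    """Map an object ID to its gray RGB color, assuming 8-bit channels."""
    value = object_id * scale
    if not 0 <= value <= 255:
        raise ValueError("ID does not fit in an 8-bit channel at this scale")
    return (value, value, value)


def color_to_id(color, scale=20):
    """Recover the object ID from a color produced by id_to_color."""
    return color[0] // scale


print(id_to_color(3))
print(color_to_id((60, 60, 60)))
```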

In addition to the rendered semantic map data, this embodiment also generates the two-dimensional bounding boxes of all dynamic objects in the semantic map coordinate system; the region covered by each two-dimensional bounding box contains all pixels of the corresponding dynamic object on the semantic map. For each dynamic object, the two-dimensional bounding box is generated as follows:

(1) Traverse all vertices of the dynamic object and compute the maximum and minimum values along each of the three axes, constructing an axis-aligned three-dimensional bounding box (AABB, Axis-Aligned Bounding Box) that contains all vertices of the dynamic object;

(2) Traverse the 8 three-dimensional corner points of the bounding box and transform each of them, successively through the model matrix, view matrix, perspective matrix and viewport transformation matrix, into a two-dimensional corner point in the semantic map coordinate system;

(3) Traverse the two-dimensional corner points obtained in the previous step and compute the maximum and minimum values along each of the two axes, constructing an axis-aligned two-dimensional bounding box that contains all the two-dimensional corner points; this is the two-dimensional bounding box of the dynamic object.
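The three steps above can be sketched in plain Python as follows. The sketch makes several assumptions that are not in the description: the model, view and perspective matrices are supplied pre-multiplied as one 4×4 row-major matrix `mvp`, the viewport transform is written out as a formula with the image origin at the top-left, and every corner is assumed to lie in front of the camera (no clipping is handled). All function names are hypothetical.

```python
from itertools import product


def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def aabb(vertices):
    """Step (1): axis-aligned 3D bounding box as (min corner, max corner)."""
    xs, ys, zs = zip(*vertices)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def project_bbox(vertices, mvp, width, height):
    """Steps (2) and (3): project the 8 AABB corners and take their 2D extent.

    Returns (u_min, v_min, u_max, v_max) in pixel coordinates.
    """
    lo, hi = aabb(vertices)
    us, vs = [], []
    for corner in product(*zip(lo, hi)):  # the 8 corners of the box
        x, y, _, w = mat_vec(mvp, [*corner, 1.0])
        ndc_x, ndc_y = x / w, y / w  # perspective divide (assumes w > 0)
        us.append((ndc_x + 1) * 0.5 * width)   # viewport transform, x
        vs.append((1 - ndc_y) * 0.5 * height)  # viewport transform, y flipped
    return min(us), min(vs), max(us), max(vs)


# Toy projection looking down the negative z axis: w = -z.
toy_mvp = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, -1, 0]]
print(project_bbox([(-1, -1, -3), (1, 1, -2)], toy_mvp, 100, 100))
```

Because the box is built from the projected AABB corners rather than from the object's own pixels, it is generally somewhat larger than the tightest possible box, which is consistent with the requirement that it contain all of the object's pixels.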

Step 3, use the two-dimensional bounding volumes from step 2 to segment out the semantic data of the dynamic objects, forming several semantic sub-maps; in the remaining static background semantic map, fill the image region corresponding to each dynamic object semantic sub-map using the pixels adjacent to that sub-map.

In this embodiment, as shown in Fig. 3, the two-dimensional bounding boxes of the dynamic objects are used to segment the semantic data of all dynamic objects out of the original semantic map, forming the semantic sub-maps. Semantic map data and semantic sub-map data are stored as two-dimensional arrays whose elements are the RGB color values of the pixels.

According to the size of the two-dimensional bounding volume of each dynamic object, the corresponding semantic sub-map data is initialized, with every element of the semantic sub-map initialized to (R:0, G:0, B:0). Every pixel of the original semantic map is then traversed and processed as follows:

Determine whether the pixel coordinates lie within the two-dimensional bounding volume of some dynamic object. If so, compute the relative coordinates of the pixel in the coordinate system of that dynamic object's semantic sub-map and write the pixel into the corresponding element of the sub-map at those relative coordinates; otherwise ignore the pixel and continue with the next one. This yields the semantic sub-map of each dynamic object.
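A minimal sketch of this segmentation step is given below. It assumes rectangular bounding boxes given as hypothetical `(x0, y0, x1, y1)` tuples with inclusive pixel bounds, and semantic maps stored as nested lists of RGB tuples; neither representation is prescribed by the description.

```python
def extract_submaps(semantic_map, boxes):
    """Copy the pixels inside each bounding box into its own sub-map.

    semantic_map: 2D list (rows) of (r, g, b) tuples
    boxes:        list of (x0, y0, x1, y1), inclusive pixel bounds (assumed)
    """
    # Initialize every sub-map element to (0, 0, 0), sized by its box.
    submaps = [
        [[(0, 0, 0)] * (x1 - x0 + 1) for _ in range(y1 - y0 + 1)]
        for x0, y0, x1, y1 in boxes
    ]
    for y, row in enumerate(semantic_map):
        for x, pixel in enumerate(row):
            for sub, (x0, y0, x1, y1) in zip(submaps, boxes):
                if x0 <= x <= x1 and y0 <= y <= y1:
                    # Relative coordinates inside the sub-map.
                    sub[y - y0][x - x0] = pixel
    return submaps


B, C = (20, 20, 20), (200, 200, 200)  # background and object colors
image = [[B, B, B, B],
         [B, C, C, B],
         [B, C, B, B]]
print(extract_submaps(image, [(1, 1, 2, 2)]))
```

Note that a sub-map copies every pixel inside its box, including background pixels, so the original image can later be restored by pasting the sub-map back.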

For each dynamic object semantic sub-map, find all edge pixels in the ring immediately surrounding the sub-map in the original semantic map; edge pixels that fall outside the semantic map are treated as invalid. The mode of all valid edge pixels is selected as the designated adjacent pixel and is used to fill the image region of the original semantic map that corresponds to the dynamic object semantic sub-map. The edge may consist of one ring or several rings of pixels, and instead of the mode, the mean or the median of all valid edge pixels may also be chosen as the designated adjacent pixel.
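The fill step with a single ring and the mode can be sketched as follows, again assuming a hypothetical inclusive `(x0, y0, x1, y1)` box and a nested-list image of RGB tuples. The function modifies the image in place.

```python
from collections import Counter


def fill_with_border_mode(semantic_map, box):
    """Fill the box region with the most common valid pixel of the ring
    immediately surrounding it (one ring, mode variant)."""
    x0, y0, x1, y1 = box
    height, width = len(semantic_map), len(semantic_map[0])
    ring = []
    for y in range(y0 - 1, y1 + 2):
        for x in range(x0 - 1, x1 + 2):
            on_ring = x in (x0 - 1, x1 + 1) or y in (y0 - 1, y1 + 1)
            # Ring pixels outside the image are invalid and skipped.
            if on_ring and 0 <= x < width and 0 <= y < height:
                ring.append(semantic_map[y][x])
    if not ring:
        return semantic_map  # box covers the whole image: nothing to sample
    fill = Counter(ring).most_common(1)[0][0]
    for y in range(max(y0, 0), min(y1, height - 1) + 1):
        for x in range(max(x0, 0), min(x1, width - 1) + 1):
            semantic_map[y][x] = fill
    return semantic_map


S, B, C = (40, 40, 40), (20, 20, 20), (200, 200, 200)
image = [[S, S, B, B],
         [B, C, C, B],
         [B, C, B, B]]
print(fill_with_border_mode(image, (1, 1, 2, 2)))
```

The mode is a natural choice for label images because it always yields an existing label color, whereas a mean of colors could produce a value that corresponds to no object ID.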

In this embodiment, if there are no dynamic objects in the simulation scene, the above steps of dynamic object segmentation and filling of the original semantic map are not performed.

Step 4, use an encoding algorithm to separately encode all dynamic object semantic sub-maps and the filled static background semantic map.

In this embodiment, run-length encoding is used to compress all dynamic object semantic sub-maps and the filled static background semantic map. Run-length encoding (RLE) is a form of lossless data compression in which runs of data (sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and a count rather than as the original run. It is most effective on data containing many such runs, for example simple graphic images such as icons, line drawings and animations. For files without many repeated values, run-length encoding may actually increase the file size.

In addition, run-length encoding is sensitive to transmission errors: if a single symbol is corrupted, the correctness of the entire encoded sequence is affected and the original data can no longer be recovered, so row and column synchronization is generally used to confine errors to a single row or column. In short, run-length encoding is well suited to compressing highly repetitive, computer-generated images, especially binary images, and it is also well suited to image data such as semantic maps.

All dynamic object semantic sub-maps and the filled static background semantic map are compressed with run-length encoding, and the encoding process is the same for each. The compression process is described here using the filled static background semantic map (the background semantic map for short) as an example:

Create a variable Header, initialized to the first pixel of the background semantic map; create a variable Count, initialized to 0; create an empty one-dimensional array to store the compressed result, whose elements are 24-bit integer values. Traverse every pixel of the background semantic map and perform the following operation for each pixel: determine whether the color value of the current pixel is the same as Header. If it is, increment Count by one; otherwise write the current Count value into the compressed result array, combine the three RGB channel values of the Header color into a single integer and write it into the compressed result array, then assign the color value of the current pixel to Header, reset Count to 0, and continue with the next pixel.
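The encoder can be sketched as below. Read literally, the description would not count the pixel that starts a new run and does not say when the final run is written out; this sketch therefore assumes that the count restarts at 1 on a color change and that the last run is flushed after the loop. The packing order (R in the highest byte) and the function names are also assumptions.

```python
def pack_rgb(pixel):
    """Combine 8-bit R, G, B channels into one 24-bit integer (R highest)."""
    r, g, b = pixel
    return (r << 16) | (g << 8) | b


def rle_encode(pixels):
    """Run-length encode a flat, row-major list of (r, g, b) tuples into
    alternating [count, packed_color, count, packed_color, ...] entries."""
    encoded = []
    if not pixels:
        return encoded
    header, count = pixels[0], 0
    for pixel in pixels:
        if pixel == header:
            count += 1
        else:
            encoded += [count, pack_rgb(header)]
            header, count = pixel, 1  # assumption: the new run starts at 1
    encoded += [count, pack_rgb(header)]  # assumption: flush the final run
    return encoded


row = [(20, 20, 20)] * 3 + [(40, 40, 40)] * 2
print(rle_encode(row))
```

A two-dimensional semantic map would first be flattened row by row before being passed to `rle_encode`.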

In addition to the run-length-encoded data, the compression result of a dynamic object semantic sub-map must also store the sub-map's two-dimensional bounding volume information on the original semantic map. The final compressed result consists of three parts:

(1) The run-length-encoded array of the background semantic map;

(2) The run-length-encoded arrays of all dynamic object semantic sub-maps;

(3) The two-dimensional bounding volume information of all dynamic object semantic sub-maps.

In this embodiment, the complete compressed result is transmitted over the network to the algorithm side, which restores the original semantic map with the corresponding decompression method: first the background semantic map data is decompressed from its run-length-encoded array, then the data of all dynamic object semantic sub-maps is decompressed from their run-length-encoded arrays, and finally, using the two-dimensional bounding volume information, the data of every dynamic object semantic sub-map is written over the corresponding position of the background semantic map data to obtain the original semantic map data.

The original data is decompressed from a run-length-encoded array as follows:

(1) Create a variable Count, initialized to 0; read the first array element and assign it to Count;

(2) Read the subsequent Count array elements, split each array element into its three RGB channel values to form a pixel color value, and write it into the original data;

(3) If all array elements have been traversed, decompression is complete and the program exits; otherwise read the next array element, assign it to Count, and return to step (2).
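A decoder sketch follows. So that it is the inverse of an encoder that writes one count followed by one packed color per run, it assumes the array consists of (count, color) pairs and repeats each color `count` times; this is one interpretation of step (2), not a literal transcription. The unpacking order (R in the highest byte) and the function names are likewise assumptions.

```python
def unpack_rgb(value):
    """Split a 24-bit integer into 8-bit (r, g, b) channels (R highest)."""
    return ((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)


def rle_decode(encoded):
    """Expand [count, packed_color, ...] pairs into a flat pixel list."""
    pixels = []
    for i in range(0, len(encoded), 2):
        count, value = encoded[i], encoded[i + 1]
        pixels.extend([unpack_rgb(value)] * count)
    return pixels


print(rle_decode([3, 0x141414, 2, 0x282828]))
```

The flat pixel list is then reshaped into rows using the known image width (or the bounding volume width, for a sub-map).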

After all dynamic object semantic sub-maps and the background semantic map have been decompressed, the final stitching step is performed: traverse the two-dimensional bounding volume information of all dynamic object semantic sub-maps; for the dynamic object semantic sub-map corresponding to the current two-dimensional bounding volume, traverse all of its pixels, use the bounding volume to transform each pixel's coordinates in the sub-map into coordinates in the background semantic map, and write the pixel to the corresponding position of the background semantic map.
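The stitching step can be sketched as follows, assuming rectangular bounding boxes given as hypothetical `(x0, y0, x1, y1)` tuples with inclusive bounds and images stored as nested lists of RGB tuples. The background is modified in place.

```python
def paste_submaps(background, submaps, boxes):
    """Write every sub-map pixel back into the background at the position
    given by its bounding box, restoring the original semantic map."""
    height, width = len(background), len(background[0])
    for sub, (x0, y0, _x1, _y1) in zip(submaps, boxes):
        for sy, row in enumerate(sub):
            for sx, pixel in enumerate(row):
                x, y = x0 + sx, y0 + sy  # sub-map to image coordinates
                if 0 <= x < width and 0 <= y < height:
                    background[y][x] = pixel
    return background


B, C = (20, 20, 20), (200, 200, 200)
filled = [[B] * 4 for _ in range(3)]
print(paste_submaps(filled, [[[C, C], [C, B]]], [(1, 1, 2, 2)]))
```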

The results achieved by the semantic map compression method based on dynamic object segmentation are shown in Table 2.

Table 2: Comparison of the compression ratios achieved by three existing methods and the method of the invention

                    RLE      TRLE     FastPFor   Ours
Compression ratio   22.4%    14.6%    9.3%       5.9%

Table 2 compares the compression ratios of four encoding methods: conventional run-length encoding (RLE, Run-Length Encoding), turbo run-length encoding (TRLE, Turbo Run-Length Encoding), fast integer sequence compression (FastPFor) and semantic map compression based on dynamic object segmentation. Because it makes full use of the characteristics of the simulation scene, separating dynamic objects from static ones and maximizing the continuity of the semantic map data ahead of the subsequent compression encoding, the semantic map compression method based on dynamic object segmentation has the lowest compression ratio, that is, the smallest compressed output relative to the original, which demonstrates the beneficial effect of the invention.

Corresponding to the foregoing embodiments, the invention also provides an embodiment of a semantic map compression device based on dynamic object segmentation. As shown in Fig. 4, the device includes one or more processors configured to implement the above semantic map compression method based on dynamic object segmentation.

The embodiment of the semantic map compression device based on dynamic object segmentation of the present invention can be applied to any device with data processing capability, such as a computer or a similar apparatus. The device embodiment can be implemented in software, in hardware, or in a combination of software and hardware.

Taking a software implementation as an example, the device in the logical sense is formed when the processor of the device with data processing capability on which it resides reads the corresponding computer program instructions from non-volatile memory into memory and runs them. At the hardware level, in addition to the processor, memory, network interface, and non-volatile memory, the device with data processing capability on which the apparatus of the embodiment resides may include other hardware, depending on its actual function; this is not described further here.

For details of how the functions and roles of each unit in the above device are implemented, see the implementation of the corresponding steps in the above method; they are not repeated here.

Because the device embodiment essentially corresponds to the method embodiment, the relevant parts of the method embodiment's description apply to it. The device embodiment described above is only illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the present invention. Those of ordinary skill in the art can understand and implement it without creative effort.

An embodiment of the present invention also provides a computer-readable storage medium on which a program is stored. When the program is executed by a processor, it implements the semantic map compression method based on dynamic object segmentation of the above embodiments.

The computer-readable storage medium may be an internal storage unit of any device with data processing capability described in any of the foregoing embodiments, such as a hard disk or memory. The computer-readable storage medium may also be an external storage device, such as a plug-in hard disk, a SmartMedia card (SMC), an SD card, or a flash card fitted to the device. Further, the computer-readable storage medium may include both an internal storage unit of the device with data processing capability and an external storage device. The computer-readable storage medium is used to store the computer program and the other programs and data required by the device with data processing capability, and may also be used to temporarily store data that has been output or is about to be output.

The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A semantic graph compression method based on dynamic object segmentation is characterized by comprising the following steps:

S1, initializing a simulation scene, wherein the simulation scene consists of a static background and dynamic objects;

S2, updating and drawing the simulation scene to obtain a semantic graph and the two-dimensional bounding volumes of all dynamic objects in the semantic graph coordinate system;

S3, segmenting the semantic data of the dynamic objects by using the two-dimensional bounding volumes to form a plurality of dynamic object semantic subgraphs; and, in the remaining static background semantic graph, filling the image area corresponding to each dynamic object semantic subgraph with the pixels adjacent to that subgraph;

wherein, in step S3, segmenting the semantic data of the dynamic objects by using the two-dimensional bounding volumes from step S2 to form the plurality of semantic subgraphs is specifically realized by the following sub-steps:

(1) Initializing the corresponding semantic subgraph data according to the size of the two-dimensional bounding volume of each dynamic object, wherein each element of the semantic subgraph is initialized to (R: 0, G: 0, B: 0);

(2) Traversing each pixel of the semantic graph, and for each pixel, performing the following processing:

judging whether the pixel coordinate of the pixel lies within the range of the two-dimensional bounding volume of a certain dynamic object; if so, calculating the relative coordinate of the pixel in the semantic subgraph coordinate system of that dynamic object and writing the pixel into the corresponding element of the semantic subgraph of that dynamic object; otherwise, ignoring the pixel and continuing to traverse the next pixel, thereby obtaining the semantic subgraph of that dynamic object;

S4, respectively encoding all the dynamic object semantic subgraphs and the filled static background semantic graph by using an encoding algorithm.

2. The semantic graph compression method based on dynamic object segmentation according to claim 1, wherein every object of the simulation scene in S1 is assigned an ID, and objects of the same semantic class have the same ID; each ID uniquely corresponds to one color, and different IDs correspond to different colors.

3. The semantic graph compression method based on dynamic object segmentation according to claim 1, wherein the updating of the simulation scene in S2 includes pose updates of all dynamic objects and of the rendering view angle; each pixel in the semantic graph uniquely corresponds to one object and is colored with the color corresponding to that object's ID.

4. The semantic graph compression method based on dynamic object segmentation according to claim 1, wherein in S2, the two-dimensional bounding volume of a dynamic object in the semantic graph coordinate system contains all pixels of that dynamic object in the semantic graph.

5. The semantic graph compression method based on dynamic object segmentation according to claim 1, wherein in step S3, filling, in the remaining static background semantic graph, the image area corresponding to each dynamic object semantic subgraph with the pixels adjacent to that subgraph is implemented by the following sub-steps:

(1) Searching for the edge pixels surrounding the dynamic object semantic subgraph, wherein an edge pixel that falls outside the range of the semantic graph is determined to be invalid;

(2) Counting all valid edge pixels, and filling the image area of the semantic graph corresponding to the dynamic object semantic subgraph according to the values of the edge pixels.

6. The semantic graph compression method based on dynamic object segmentation according to claim 1, wherein the encoding result of each dynamic object semantic subgraph in S4 is accompanied by corresponding two-dimensional bounding volume information.

7. A semantic graph compression device based on dynamic object segmentation is characterized by comprising one or more processors and being used for realizing the semantic graph compression method based on dynamic object segmentation in any one of claims 1 to 6.

8. A computer-readable storage medium, on which a program is stored, which, when executed by a processor, implements the semantic graph compression method based on dynamic object segmentation according to any one of claims 1 to 6.
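As a reading aid for the segmentation sub-steps of claim 1 and the filling sub-steps of claim 5, the following minimal sketch shows one possible realization. The function names, the list-of-lists image layout, and the `(x_min, y_min, width, height)` bounding-volume layout are hypothetical. The claims only state that the area is filled "according to the values of the edge pixels"; using the most frequent valid edge value is an assumption made here for illustration.

```python
from collections import Counter
from typing import List, Tuple

Pixel = Tuple[int, int, int]       # (R, G, B) semantic color
Image = List[List[Pixel]]          # row-major: image[y][x]
BBox = Tuple[int, int, int, int]   # assumed layout: (x_min, y_min, width, height)


def extract_submap(image: Image, bbox: BBox) -> Image:
    """Copy every pixel that lies inside the bounding volume into a sub-map."""
    x_min, y_min, width, height = bbox
    submap = [[(0, 0, 0)] * width for _ in range(height)]  # sub-step (1)
    for y, row in enumerate(image):                        # sub-step (2)
        for x, pixel in enumerate(row):
            u, v = x - x_min, y - y_min                    # relative coordinate
            if 0 <= u < width and 0 <= v < height:
                submap[v][u] = pixel
    return submap


def fill_region(image: Image, bbox: BBox) -> None:
    """Fill the bounding-volume area in place from its valid edge pixels."""
    x_min, y_min, width, height = bbox
    rows, cols = len(image), len(image[0])
    edge = []
    for y in range(y_min - 1, y_min + height + 1):
        for x in range(x_min - 1, x_min + width + 1):
            on_ring = x in (x_min - 1, x_min + width) or y in (y_min - 1, y_min + height)
            # Edge pixels outside the image are treated as invalid and skipped.
            if on_ring and 0 <= x < cols and 0 <= y < rows:
                edge.append(image[y][x])
    if not edge:
        return
    fill_value = Counter(edge).most_common(1)[0][0]  # assumed fill rule
    for y in range(y_min, y_min + height):
        for x in range(x_min, x_min + width):
            image[y][x] = fill_value


# Toy usage: a 4x3 semantic graph with a 2x1 dynamic object.
BG, CAR = (10, 10, 10), (200, 0, 0)
image = [[BG] * 4, [BG, CAR, CAR, BG], [BG] * 4]
bbox = (1, 1, 2, 1)
submap = extract_submap(image, bbox)
fill_region(image, bbox)
```

The sketch assumes the bounding volume lies entirely inside the image. After both calls, `submap` and the filled `image` would each be passed to the encoder separately, with the bounding volume stored alongside the sub-map as in claim 6.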