Human head locking method based on RGB-D camera
Technical Field
The invention relates to a human head locking method based on an RGB-D camera.
Background
With the development of camera technology, the RGB-D camera has emerged as a new technology and, as its price has fallen in recent years, has come into wide use. RGB-D cameras are built on several sensing principles, such as structured-light speckle and time-of-flight (TOF), and are gradually being applied in many fields, such as three-dimensional reconstruction, image understanding and video monitoring. An advantage of an RGB-D camera is that the distance from the scene to the camera can be obtained directly and presented to the user as an image (referred to as a depth map or depth image), which is more accurate than a depth map obtained with a conventional binocular (stereo) camera. These advantages make the RGB-D camera very convenient for people counting in complex environments.
People counting has always been one of the core tasks of video monitoring, and it has long remained poorly solved. The main reason is that scenes contain not only human targets but also other objects, and in some crowded scenes, such as public transport, the targets have no obvious colour or edge features, so algorithms designed for a traditional RGB camera often fail to segment them. In a supermarket aisle, for example, bags, carts and purchased articles appear alongside people; such false targets (bags and carried articles) share no obvious distinguishing features with human heads, so a traditional RGB camera can hardly lock onto people accurately.
Disclosure of Invention
Aiming at the defects in the prior art, the invention aims to provide a human head locking method based on an RGB-D camera, which can accurately lock the human head.
In order to achieve the purpose, the invention adopts the following technical scheme:
a human head locking method based on an RGB-D camera comprises the following steps:
step one: erecting an RGB-D camera in a channel scene, calibrating the camera, and calculating a parameter matrix of the camera, wherein the channel has two opposite directions, A and B;
step two: continuously shooting the channel containing human targets with the camera to obtain N depth maps; obtaining a top view of each depth map; and obtaining the background image I_b from all the obtained top views;
Step three: shooting the channel containing human targets with the camera to obtain the depth map at a certain moment m; obtaining the corresponding top view of the depth map; performing a background-removal operation on the top view to obtain a foreground picture; performing a blocking operation on the foreground picture to obtain a blocked picture; searching the blocked picture for local maximum regions to obtain a local maximum region set; performing an expansion operation on the local maximum region set to obtain an expanded local maximum region set; and performing filtering rectangular-frame processing on the expanded local maximum region set to obtain a rectangular frame set S_Fm containing several elements, thereby achieving the purpose of locking the human heads.
Specifically, in the second step and the third step, the top view of each depth map is obtained by the following formula:
len=m*r
where θ is the angle between the ground plane and the ray corresponding to the point P(x_p, y_p, z_p) of the depth map; G(x_G, y_G, 0) is the intersection of the line through the point P with the ground plane; H_C is the camera height; m (0 < m < D) is the depth value of the point P in the depth map, D being the maximum pixel value set by the user; and r is the distance in world space corresponding to a unit depth value;
the top view I is obtained using the following formula:
wherein (u, v) is the pixel in the top view I corresponding to the point P of the depth map, and I(u, v) is the pixel value at the pixel (u, v);
and for each point in the depth map, the corresponding pixel in the top view and its pixel value are obtained; all the pixel values form the top view I.
Specifically, in step three, the background-removal operation is performed on the top view to obtain the foreground picture using the following formula:
where ε_F is the threshold set by the user for extracting the foreground, I_F(u, v) is the pixel value at pixel (u, v) of the foreground picture I_F, I_b(u, v) is the pixel value at pixel (u, v) of the background image I_b, and I_m(u, v) is the pixel value at pixel (u, v) of the top view I_m.
Specifically, in step three, the blocking operation is performed on the foreground picture to obtain the blocked picture using the following formula:
where I_F(u, v) is the pixel value at coordinates (u, v) of the foreground picture I_F, I_B(x, y) is the pixel value at pixel (x, y) of the picture I_B, and the size of each block is w_b × w_b.
Specifically, in step three, searching the blocked picture for local maximum regions to obtain the local maximum region set comprises the following steps:
for each pixel (x, y) of the picture I_B, examine the eight surrounding pixels; if the pixel value of (x, y) is larger than the pixel values of all eight neighbours, put the pixel into the local maximum region set S_L. S_L^(i) denotes a member of S_L, and S_L^(i) = (u_i, v_i, d_i), where (u_i, v_i) is the pixel and d_i is the pixel value of (u_i, v_i) in the picture I_B.
Specifically, in step three, expanding the local maximum regions in the local maximum region set to obtain the expanded local maximum region set comprises the following steps:
for each element S_L^(i) of the local maximum region set S_L, find the pixel position of S_L^(i) in the foreground picture I_F using the following formula:
where (x_i, y_i) is the position of S_L^(i) in the foreground picture I_F; let S_S^(i) = (x_i, y_i, z_i), where (x_i, y_i) is the pixel of S_L^(i) in the foreground picture I_F, to obtain the set S_S, S_S^(i) being an element of the set S_S;
for each member S_S^(i) = (x_i, y_i, z_i) of S_S, take S_S^(i) as a seed and expand outward by seed filling; the condition of the expansion is: if |I_F(x_i, y_i) − z_i| ≤ ε_E, then a rectangular frame S_E^(i) = (u_i, v_i, H_i, W_i, z_i) is used to frame all the pixels satisfying the condition, where (u_i, v_i) is the upper-left corner of the rectangular frame, (H_i, W_i) are its height and width, z_i is the original pixel value, and ε_E is a specified threshold; the expanded regions form the set S_E, S_E^(i) being an element of the set S_E.
Specifically, in step three, performing the filtering rectangular-frame processing on the expanded local maximum region set to obtain the rectangular frame set containing several elements comprises the following steps:
the elements of S_E are filtered using two conditions:
(1) if the element S_E^(i) satisfies the following condition, the element is deleted;
(2) if two rectangular frames S_E^(i) = (u_i, v_i, H_i, W_i, z_i) and S_E^(j) = (u_j, v_j, H_j, W_j, z_j) satisfy the following condition, S_E^(i) and S_E^(j) are judged to coincide; if they coincide, the rectangular frame with the larger of z_i and z_j is retained;
the retained rectangular frames form the rectangular frame set S_Fm; an element of S_Fm is S_Fm^(i), where m denotes the moment.
Compared with the prior art, the invention has the following technical effect: an RGB-D camera is erected over the channel, the channel containing human targets is shot with the camera to obtain several depth maps, the top views corresponding to the depth maps are obtained, and the rectangular frame set is formed from the top views, so that the human heads can be accurately locked.
The embodiments of the invention are explained in further detail below with reference to the figures and the detailed description.
Drawings
FIG. 1 is a scene model without a coordinate system;
FIG. 2 is a channel model of the world coordinate system;
FIG. 3 is a schematic diagram of a top view image blocking operation;
FIG. 4 is a schematic illustration of finding a local maximum; wherein, (a) represents the picture area in which the maximum value is sought, (b) represents the process of finding the local maximum value, and (c) represents the final finding of the local maximum value;
FIG. 5 is a schematic view of a camera mounting location;
FIG. 6 is a diagram illustrating the selection of six sets of world coordinates and their corresponding image coordinates;
FIG. 7 is a schematic diagram of obtaining a top view from a depth map, wherein (a) is the background image of the channel scene, (b) is the obtained depth map, (c) is the foreground picture obtained by the background-removal operation, and (d) is the top view;
FIG. 8 is a schematic diagram of obtaining the filtered rectangular frame set from a top view, wherein (a) shows the result of the blocking operation, (b) shows the local maximum region set, (c) shows the rectangular frame set after expanding the local maximum regions, and (d) shows the rectangular frame set after the filtering rectangular-frame processing.
Detailed Description
The invention discloses a human head locking method based on an RGB-D camera, which comprises the following steps:
erecting an RGB-D camera in a channel scene, calibrating the camera, and calculating a parameter matrix P of the camera;
step 1.1: select a channel as the scene for people counting; referring to fig. 1, install the camera directly above the channel; several human targets walk along the channel in direction A or direction B, the two directions being opposite;
step 1.2: establish a world coordinate system. Referring to fig. 2, the camera lies on the Z-axis of the world coordinate system, the direction along the channel is the Y-axis direction, the direction perpendicular to the channel is the X-axis direction, and the position of the camera in the world coordinate system is (0, 0, H), where H is the distance from the camera to the origin of the world coordinate system.
Step 1.3: calibrate the camera. Using a calibration support, select N (N ≥ 6) groups of image coordinates and the world coordinates corresponding to them:
the parameter matrix P of the camera is calculated using the following formula:
wherein,
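The calibration formula itself is not reproduced in the text; a standard way to realise it is the direct linear transform (DLT), which recovers the 3 × 4 matrix P from N ≥ 6 world–image point pairs. A minimal sketch follows (the function name and the use of NumPy's SVD are illustrative, not part of the patent):

```python
import numpy as np

def calibrate_dlt(world_pts, image_pts):
    """Estimate the 3x4 camera matrix P from >= 6 point pairs via DLT.

    world_pts: (N, 3) world coordinates; image_pts: (N, 2) pixel coordinates.
    """
    A = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        Xh = np.array([X, Y, Z, 1.0])
        # each correspondence contributes two linear equations in the 12
        # entries of P
        A.append([*Xh, 0.0, 0.0, 0.0, 0.0, *(-u * Xh)])
        A.append([0.0, 0.0, 0.0, 0.0, *Xh, *(-v * Xh)])
    # the solution is the right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)
```

P is recovered only up to an overall scale, which is all a projective camera matrix requires; at least six non-coplanar world points are needed for the system to be well conditioned.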
step two: continuously shooting the channel containing human targets with the camera to obtain N (N ≥ 50) depth maps; obtaining a top view of each depth map; and obtaining the background image I_b from the top views.
The method for obtaining the top view of each depth map comprises the following steps:
The depth values in the depth map represent points in world coordinate space; for example, the distance len from a point P to the camera is the length of the hypotenuse of the small right triangle in the figure. From the geometric relationships in the world coordinate system we obtain the following formula:
len=m*r (4)
where θ is the angle between the ground plane and the ray corresponding to the point P on the depth map; G(x_G, y_G, 0) is the intersection of the line through P with the ground plane; H_C is the camera height; m (0 < m < D) is the depth value of the point P in the depth map, D being the maximum pixel value set by the user; and r is the distance in world space corresponding to a unit depth value.
After the coordinates of the point P are obtained, P is scaled and translated so that it lies at the centre of the top view I; then:
where (u, v) is the pixel in the top view I corresponding to the point P, I(u, v) is the pixel value at the pixel (u, v), (r_x, r_y) are the scaling coefficients applied to (x_p, y_p) of the point P, and (d_x, d_y) are the translation coefficients applied to (x_p, y_p).
For each point in the depth map, the corresponding pixel in the top view and its pixel value are obtained; all the pixel values form the top view I. Applying this method to the N depth maps yields N top views I_i (i = 1, ..., N).
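The projection of equations (4)–(5) can be sketched as follows, assuming a pinhole camera looking straight down. The intrinsics (fx, fy, cx, cy) and the scaling/translation coefficients are illustrative parameters, and storing the height of the highest point per top-view cell is an assumption consistent with the later search for head tops:

```python
import numpy as np

def depth_to_topview(depth, r, H_c, fx, fy, cx, cy,
                     rx, ry, dx, dy, out_shape=(240, 320)):
    """Project a depth map to a top view whose pixel value is height above ground.

    Assumptions: depth * r is the metric distance along the optical axis of a
    camera looking straight down from height H_c; (rx, ry, dx, dy) scale and
    translate world (x, y) so the scene is centred in the top view.
    """
    I = np.zeros(out_shape)
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth * r                          # len = m * r  (equation 4)
    x = (u - cx) * z / fx                  # back-projection to world x
    y = (v - cy) * z / fy                  # back-projection to world y
    tu = np.round(rx * x + dx).astype(int)  # scale + translate (equation 5)
    tv = np.round(ry * y + dy).astype(int)
    height = H_c - z                       # height of the point above the ground
    ok = (tu >= 0) & (tu < out_shape[1]) & (tv >= 0) & (tv < out_shape[0]) \
         & (depth > 0)
    # keep the highest point that lands in each top-view cell
    np.maximum.at(I, (tv[ok], tu[ok]), height[ok])
    return I
```

A head, being closer to the downward-looking camera than the floor, produces a larger value in the top view, which is what the later local-maximum search exploits.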
The background image I_b is obtained from the top views using the following formula:
where H is the height of the top view, W is the width of the top view, and I_b(x, y) is the pixel value at pixel (x, y) of the background image I_b; the background image I_b is thus obtained.
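The background formula image is not reproduced; one reading consistent with "using all the obtained top views" is a per-pixel temporal mean, sketched below (the mean is an assumption; a per-pixel median would be more robust if people pass through during capture):

```python
import numpy as np

def build_background(topviews):
    """Per-pixel background image I_b from the N top views I_i.

    Assumption: the missing formula is a temporal mean over the N views.
    """
    return np.mean(np.stack(topviews, axis=0), axis=0)
```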
Step three: shooting the channel containing human targets with the camera to obtain the depth map at a certain moment; obtaining the corresponding top view of the depth map; and performing background removal, blocking, local-maximum-region search, local-maximum-region expansion and filtering rectangular-frame processing on the top view to obtain the rectangular frame set S_Fm. This step specifically comprises:
step 3.1: shoot the channel containing human targets with the RGB-D camera to obtain the depth map at a certain moment m (m = 1, 2, ...);
step 3.2: obtain the top view I_m corresponding to the shot depth map, using the same method as for obtaining the top views in step two.
Step 3.3, for top view I obtained in step 3.2mPerforming background removal, blocking, local maximum area searching, local maximum area expanding and rectangular frame filtering processing to obtain a rectangular frame set SFmThe specific treatment process is as follows:
Background removal: for the top view I_m, the foreground picture I_F is obtained using formula (8):
where ε_F is the threshold set by the user for extracting the foreground, and I_F(u, v) is the pixel value at pixel (u, v) of the foreground picture I_F.
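Formula (8) itself is not reproduced; given the definitions of I_m, I_b and ε_F, a thresholded difference of the following form is a natural reading (the keep-value-or-zero form is an assumption):

```python
import numpy as np

def remove_background(I_m, I_b, eps_F):
    """Foreground picture from a top view and a background image.

    Assumed form of equation (8):
    I_F(u, v) = I_m(u, v) if |I_m(u, v) - I_b(u, v)| > eps_F, else 0.
    """
    return np.where(np.abs(I_m - I_b) > eps_F, I_m, 0.0)
```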
Blocking operation: the foreground picture I_F is divided into blocks of size w_b × w_b to obtain the picture I_B, using the following formula:
where I_F(u, v) is the pixel value at coordinates (u, v) of the foreground picture I_F, and I_B(x, y) is the pixel value at pixel (x, y) of the picture I_B.
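The blocking formula is likewise not reproduced; since the next step searches the blocked picture for local maxima of height, taking the maximum of each w_b × w_b block is a natural reading (an assumption):

```python
import numpy as np

def block_picture(I_F, w_b):
    """Downsample I_F into w_b x w_b blocks, keeping each block's maximum.

    Assumption: the missing blocking formula is a block maximum, which
    preserves the head-top heights that the local-maximum search relies on.
    """
    H, W = I_F.shape
    H2, W2 = H // w_b, W // w_b           # drop any partial border blocks
    blocks = I_F[:H2 * w_b, :W2 * w_b].reshape(H2, w_b, W2, w_b)
    return blocks.max(axis=(1, 3))
```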
Searching for local maximum regions: for each pixel (x, y) of the picture I_B, examine the eight surrounding pixels; if the pixel value of (x, y) is larger than the pixel values of all eight neighbours, put the pixel into the local maximum region set S_L. S_L^(i) denotes a member of S_L, and S_L^(i) = (u_i, v_i, d_i), where (u_i, v_i) is the pixel and d_i is the pixel value of (u_i, v_i) in the picture I_B.
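The eight-neighbour comparison can be sketched directly:

```python
import numpy as np

def find_local_maxima(I_B):
    """Return S_L = [(u, v, d), ...]: pixels strictly greater than all
    eight neighbours, with d the pixel value in I_B."""
    S_L = []
    H, W = I_B.shape
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            centre = I_B[y, x]
            window = I_B[y - 1:y + 2, x - 1:x + 2]
            # only the centre itself may reach the centre's value
            if np.sum(window >= centre) == 1:
                S_L.append((x, y, centre))
    return S_L
```

Border pixels are skipped here for simplicity, since they lack a full eight-neighbourhood.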
Expanding the local maximum regions: for each element S_L^(i) of the local maximum region set S_L, find the pixel position of S_L^(i) in the foreground picture I_F using the following formula:
where (x_i, y_i) is the position of S_L^(i) in the foreground picture I_F. Let S_S^(i) = (x_i, y_i, z_i), where (x_i, y_i) is the pixel of S_L^(i) in the foreground picture I_F; this yields the set S_S, S_S^(i) being an element of the set S_S.
For each member S_S^(i) = (x_i, y_i, z_i) of S_S, take S_S^(i) as a seed and expand outward by seed filling; the condition of the expansion is: if |I_F(x_i, y_i) − z_i| ≤ ε_E, where ε_E is a threshold set to 10, then a rectangular frame S_E^(i) = (u_i, v_i, H_i, W_i, z_i) is used to frame all the pixels satisfying the condition, where (u_i, v_i) is the upper-left corner of the rectangular frame, (H_i, W_i) are its height and width, and z_i is the original pixel value (i.e. the spatial height of the rectangular frame). The expanded regions finally form the set S_E, S_E^(i) being an element of the set S_E.
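The seed-filling expansion can be sketched as follows (4-connectivity and breadth-first filling are assumptions; the patent states only the expansion condition):

```python
from collections import deque

import numpy as np

def expand_region(I_F, seed, eps_E=10.0):
    """Grow a region outward from seed = (x, y, z) by seed filling.

    A 4-connected pixel joins the region if |I_F(y', x') - z| <= eps_E
    (4-connectivity is an assumption).  Returns the bounding rectangle
    (u, v, H_i, W_i, z) with (u, v) the upper-left corner and (H_i, W_i)
    its height and width.
    """
    x0, y0, z = seed
    h, w = I_F.shape
    seen = {(x0, y0)}
    queue = deque([(x0, y0)])
    xs, ys = [x0], [y0]
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < w and 0 <= ny < h and (nx, ny) not in seen \
               and abs(I_F[ny, nx] - z) <= eps_E:
                seen.add((nx, ny))
                queue.append((nx, ny))
                xs.append(nx)
                ys.append(ny)
    u, v = min(xs), min(ys)
    return (u, v, max(ys) - v + 1, max(xs) - u + 1, z)
```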
Filtering rectangular frames: after the expanded regions are obtained, coincident regions and abnormal regions must be filtered out, using two conditions: (1) if the rectangular frame S_E^(i) satisfies the following condition, it is not retained; (2) if two rectangular frames S_E^(i) = (u_i, v_i, H_i, W_i, z_i) and S_E^(j) = (u_j, v_j, H_j, W_j, z_j) satisfy the following condition, S_E^(i) and S_E^(j) are judged to coincide, and if they coincide, the rectangular frame with the larger of z_i and z_j is retained.
The retained rectangular frames form the rectangular frame set S_Fm; an element of S_Fm is S_Fm^(i). This completes the human head locking task.
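The two filtering conditions' formulas are not reproduced in the text; the sketch below assumes condition (1) drops frames smaller than a plausible head size and condition (2) treats intersecting rectangles as coincident, keeping the one with the larger height z:

```python
def filter_boxes(S_E, min_side=3):
    """Filter expanded boxes (u, v, H, W, z) into the set S_Fm.

    Assumptions: condition (1) drops boxes with a side below min_side;
    condition (2) treats boxes whose rectangles intersect as coincident,
    keeping the one with the larger height z.
    """
    boxes = [b for b in S_E if b[2] >= min_side and b[3] >= min_side]
    boxes.sort(key=lambda b: b[4], reverse=True)   # taller heads first
    kept = []
    for b in boxes:
        if not any(_overlap(b, k) for k in kept):
            kept.append(b)
    return kept

def _overlap(a, b):
    """True if the two rectangles (u, v, H, W, z) intersect."""
    au, av, ah, aw, _ = a
    bu, bv, bh, bw, _ = b
    return not (au + aw <= bu or bu + bw <= au
                or av + ah <= bv or bv + bh <= av)
```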
Examples
In this embodiment, the sampling frequency is 25 frames/second, the frame image size is 320 × 240, and the scene is the front-door scene of a bus.
The camera is mounted on the bus at the position shown in fig. 5, and a world coordinate system is established with the camera height H = 254 cm. Six groups of world coordinates and the corresponding image coordinates are selected using the calibration frame, and the parameter matrix P of the camera is calculated according to fig. 6.
Here the foreground-extraction threshold is set to 10. As shown in fig. 7, (a) is the background image of the channel scene, (b) is the obtained depth map, (c) is the foreground picture obtained by the background-removal operation, and (d) is the top view.
As shown in fig. 8, the block size is chosen as 5 × 5 and the expansion threshold as 10: (a) shows the result of the blocking operation, the white rectangular frames in (b) are the local maximum region set, the white rectangular frames in (c) are the expanded local maximum regions, and the white rectangular frames in (d) are the set after the filtering rectangular-frame processing.