KR102746357B1

KR102746357B1 - Pyramid history map generating method for calculating feature map in deep learning based on convolution neural network and feature map generating method

Info

Publication number: KR102746357B1
Application number: KR1020160105515A
Authority: KR
Inventors: 박재한; 이배근
Original assignee: 주식회사 케이티
Priority date: 2016-08-19
Filing date: 2016-08-19
Publication date: 2024-12-24
Anticipated expiration: 2036-08-19
Also published as: KR20180020724A

Abstract

A method for generating a pyramid history map (B) for a feature map (A) includes: determining a reference value for each convolution value of the feature map (A) based on whether the pixel values of a first frame and a second frame within the window area corresponding to that convolution value match, and generating a first layer including the plurality of reference values; dividing the first layer into a plurality of blocks, determining a block value for each block based on the reference values within that block, and generating a second layer including the plurality of block values; and generating a third layer based on the plurality of block values of the second layer.

Description

{PYRAMID HISTORY MAP GENERATING METHOD FOR CALCULATING FEATURE MAP IN DEEP LEARNING BASED ON CONVOLUTION NEURAL NETWORK AND FEATURE MAP GENERATING METHOD}

The present invention relates to a method for generating a pyramid history map and a method for generating a feature map using the pyramid history map. Specifically, it relates to a method for generating a pyramid history map for calculating a feature map in deep learning based on a convolutional neural network (CNN), and a method for generating a feature map using the pyramid history map.

Conventionally, image or scene search returned results by finding similar pictures or by matching text words in captions (annotations). This requires a vast database that holds the images or their captions.

Recently, advances in deep learning have greatly improved image recognition accuracy, to the point where the category of an object in an image can be classified or the objects present in an image can be recognized. Deep learning has further advanced to the level of generating sentences that describe an image or scene in words.

Deep learning extracts image features using a convolutional neural network. To search and analyze video, every image must be processed by convolution layers, which requires a large amount of computation. Moreover, the amount of computation grows multiplicatively as the network becomes deeper.

Regarding methods for extracting such image features, Korean Patent No. 10-1349672, a prior art document, relates to a method for high-speed extraction of image features and a device supporting the same. It discloses a method that obtains the size and step distance of a sliding window, detects duplicate images in the analyzed image data based on the sliding window information, and, based on the detected duplicates, analyzes only the data of the newly designated area while excluding the duplicate portion of the image data newly designated by the sliding window.

The present invention aims to provide a method for generating a pyramid history map that allows the feature map of a previous frame to be reused for the current frame through a comparison between the previous frame and the current frame, and a method for generating a feature map using the pyramid history map.

The present invention also aims to provide a method for generating a pyramid history map that, by encoding the portions that overlap between the previous frame and the current frame in a three-level pyramid, allows the feature map calculated for the previous frame to be reused efficiently and reduces the amount of computation for the feature map, and a method for generating a feature map using the pyramid history map.

However, the technical problems to be solved by the present embodiment are not limited to those described above, and other technical problems may exist.

As a technical means for solving the above technical problems, one embodiment of the present invention can provide a method for generating a pyramid history map (B) for a feature map (A), the method including: determining a reference value for each convolution value of the feature map (A) based on whether the pixel values of a first frame and a second frame within the window area corresponding to that convolution value match, and generating a first layer including the plurality of reference values; dividing the first layer into a plurality of blocks, determining a block value for each block based on the reference values within that block, and generating a second layer including the plurality of block values; and generating a third layer based on the plurality of block values of the second layer.

The generating of the first layer may include determining the reference value corresponding to a convolution value as a first value when all pixel values of the first frame and the second frame within the window area corresponding to that convolution value match, and as a second value otherwise.

The generating of the second layer may include determining the block value of a block as a third value when all reference values within the block are the first value, and as a fourth value otherwise.

The generating of the third layer may include determining the value of the third layer as a fifth value when all block values in the second layer are the third value, and as a sixth value otherwise.

The method may further include reconstructing the pyramid history map (B) for the feature map (A) to generate a pyramid history map (B') for a feature map (A') of the next depth.

The generating of the pyramid history map (B') may include: determining reference values of the pyramid history map (B') based on the reference values of the pyramid history map (B), and generating a first layer of the pyramid history map (B') including the plurality of reference values; dividing the first layer of the pyramid history map (B') into a plurality of blocks, determining a block value for each block based on the reference values within that block, and generating a second layer of the pyramid history map (B') including the plurality of block values; and generating a third layer of the pyramid history map (B') based on the plurality of block values of the second layer of the pyramid history map (B').

In the generating of the first layer of the pyramid history map (B'), when the window area is a k*l matrix (k and l are natural numbers) and the position of each second value in the first layer of the pyramid history map (B) is denoted (x, y) (x and y are integers greater than or equal to 0), the reference values within the rectangular area [x-k+1, x]*[y-l+1, y] of the first layer of the pyramid history map (B') may be determined as the second value.

If the first frame and the second frame are m*n matrices (m and n are natural numbers) and the window area is a k*l matrix (k and l are natural numbers), the first layer may be an (m-k+1)*(n-l+1) matrix.

The second layer may be a matrix of a first size, and the third layer may be a matrix of a second size smaller than the first size.

The first frame and the second frame may be consecutive frames of a single video.

Another embodiment of the present invention can provide a method for generating a pyramid history map (B) for a feature map (A), the method including: determining a reference value for each convolution value of the feature map (A) based on whether the difference between the pixel values of a first frame and a second frame within the window area corresponding to that convolution value is less than a threshold, and generating a first layer including the plurality of reference values; dividing the first layer into a plurality of blocks, determining a block value for each block based on the reference values within that block, and generating a second layer including the plurality of block values; and generating a third layer based on the plurality of block values of the second layer.
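The thresholded comparison in this embodiment can be sketched as follows. This is only an illustration under stated assumptions: the difference is taken as the per-pixel absolute difference, and every pixel pair in the window must fall below the threshold. The text fixes neither choice, and the names `window_matches` and `threshold` are hypothetical.

```python
def window_matches(window_a, window_b, threshold):
    """Return 1 if every pixel pair in the two windows differs by less
    than `threshold`, otherwise 0.

    Assumption (not stated in the source): the difference is the
    per-pixel absolute difference.
    """
    for row_a, row_b in zip(window_a, window_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            if abs(pixel_a - pixel_b) >= threshold:
                return 0
    return 1
```

With integer pixel values and a threshold of 1, this reduces to the exact-match test of the first embodiment.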

Yet another embodiment of the present invention can provide a method for generating a feature map using a pyramid history map that is generated based on the pixel values of a first frame and a second frame and that includes a first layer, a second layer, and a third layer, the method including: generating a feature map (a) of the first frame by calculating convolution values obtained by applying a kernel to the first frame; determining, with reference to the pyramid history map, an overlapping area between the feature map (a) of the first frame and a feature map (b) of the second frame; using the convolution values within the overlapping area of the feature map (a) as the convolution values within the overlapping area of the feature map (b); and generating the feature map (b) of the second frame by calculating, for the remaining area of the feature map (b) excluding the overlapping area, convolution values obtained by applying the kernel to the second frame.
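The reuse step can be sketched as below. This is a simplified illustration, not the claimed procedure: it consults only the first layer of the history map (value 1 meaning the window is unchanged) and does not use the second and third layers to skip whole regions. The function name and argument layout are hypothetical.

```python
def feature_map_with_reuse(frame_b, kernel, feature_a, layer1):
    """Build the feature map of the second frame, copying convolution
    values from the first frame's feature map wherever the first layer
    marks the window as unchanged (1) and recomputing them elsewhere (0).
    """
    k, l = len(kernel), len(kernel[0])
    rows = len(frame_b) - k + 1
    cols = len(frame_b[0]) - l + 1
    feature_b = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            if layer1[r][c] == 1:
                # Overlapping area: reuse the previously computed value.
                out_row.append(feature_a[r][c])
            else:
                # Remaining area: apply the kernel to the second frame.
                out_row.append(sum(kernel[i][j] * frame_b[r + i][c + j]
                                   for i in range(k) for j in range(l)))
        feature_b.append(out_row)
    return feature_b
```

Only the positions marked 0 cost a kernel evaluation, which is where the reduction in computation comes from.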

The above means for solving the problems are merely illustrative and should not be construed as limiting the present invention. In addition to the illustrative embodiments described above, additional embodiments may exist in the drawings and the detailed description.

According to any one of the above means for solving the problems, a method for generating a pyramid history map that allows the information of a previous frame to be reused through a comparison between the previous frame and the current frame, and a method for generating a feature map using the pyramid history map, can be provided.

By encoding the portions that overlap between the previous frame and the current frame in a three-level pyramid, a method for generating a pyramid history map that allows the feature map calculated for the previous frame to be reused efficiently and reduces the amount of computation for the feature map, and a method for generating a feature map using the pyramid history map, can be provided.

When an image is processed by a neural network such as a CNN (convolutional neural network) for recognition tasks such as face recognition, object recognition, and object classification in video, encoding the overlapping areas in pyramid form can greatly reduce the amount of computation, so that real-time processing becomes feasible and the results of a deeper neural network can be obtained faster than before. A method for generating such a pyramid history map and a method for generating a feature map using the pyramid history map can thus be provided.

FIG. 1 is a flowchart illustrating a method for generating a pyramid history map according to one embodiment of the present invention.
FIG. 2 is a diagram for explaining a window area corresponding to a convolution value according to one embodiment of the present invention.
FIG. 3 is a diagram for explaining a method for generating a first layer of a pyramid history map according to one embodiment of the present invention.
FIG. 4 is a diagram for explaining a method for generating a second layer and a third layer of a pyramid history map according to one embodiment of the present invention.
FIG. 5 is a diagram for explaining a method for reconstructing a pyramid history map according to one embodiment of the present invention.
FIG. 6 is a diagram for explaining a method for generating a first layer of a pyramid history map according to another embodiment of the present invention.
FIG. 7 is a flowchart illustrating a method for generating a feature map using a pyramid history map according to one embodiment of the present invention.
FIG. 8 is a diagram for explaining a method for generating a feature map of a first frame according to one embodiment of the present invention.
FIG. 9 is a diagram for explaining a method for determining an overlapping area between a first frame and a second frame according to one embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described herein. To describe the present invention clearly, parts unrelated to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between. Also, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them, and it should be understood that this does not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In this specification, some of the operations or functions described as being performed by a terminal or device may instead be performed by a server connected to that terminal or device. Likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a flowchart illustrating a method for generating a pyramid history map according to one embodiment of the present invention.

Referring to FIG. 1, in step S102, the pixel values of a first frame and a second frame are compared to generate a first layer.

The first frame and the second frame may each be an image or a video frame composed of a plurality of pixels. The first frame or the second frame may be a rectangular frame that is m pixels wide and n pixels high (m and n are natural numbers), in which case it can be expressed as an m*n matrix.

The first frame and the second frame may be consecutive frames of a single video. For example, they may be two consecutive frames of a video running at 30 frames per second, and some or all of their pixels may be identical.

The comparison of the pixel values of the first frame and the second frame in step S102 may be performed for each window area corresponding to a convolution value of the feature map. A reference value corresponding to each convolution value is determined based on whether the pixel values of the first frame and the second frame within that window area match, and a first layer including the plurality of reference values is generated.

A feature map is the result of extracting feature values by applying a convolution kernel to a frame, and consists of a matrix containing a plurality of convolution values.

The window area corresponding to a convolution value of the feature map is the region of the frame's pixels to which the convolution kernel is applied when that convolution value is calculated. The window area corresponding to a convolution value is described below with reference to FIG. 2.

FIG. 2 is a diagram for explaining a window area corresponding to a convolution value according to one embodiment of the present invention.

Referring to FIG. 2, a 3*3 convolution kernel (203) can be applied to a 5*5 frame (201) to generate a 3*3 feature map (205). Applying the convolution kernel (203) to the first region (209) of the frame (201) gives 1*1 + 1*0 + 1*1 + 0*0 + 1*1 + 1*0 + 0*0 + 1*1 = 4, so the first convolution value (207) of the feature map (205) is 4. In the same manner, the remaining convolution values are calculated while moving the convolution kernel (203) one pixel at a time within the frame (201).
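The sliding-kernel calculation can be sketched in plain Python as below. The frame and kernel values are illustrative placeholders chosen so that the first convolution value is 4; they are not taken from FIG. 2. As is usual for CNNs, the kernel is applied without flipping, and the name `convolve_valid` is hypothetical.

```python
def convolve_valid(frame, kernel):
    """Slide `kernel` over `frame` one pixel at a time and return the
    feature map (no padding, stride 1, kernel not flipped)."""
    k, l = len(kernel), len(kernel[0])
    rows = len(frame) - k + 1
    cols = len(frame[0]) - l + 1
    feature_map = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Window area for output (r, c): frame[r..r+k-1][c..c+l-1].
            out_row.append(sum(kernel[i][j] * frame[r + i][c + j]
                               for i in range(k) for j in range(l)))
        feature_map.append(out_row)
    return feature_map


# Illustrative 5*5 frame and 3*3 kernel (assumed values).
frame = [
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
]
kernel = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
feature_map = convolve_valid(frame, kernel)
```

Each entry of `feature_map` depends only on its own 3*3 window of the frame, which is why an unchanged window guarantees an unchanged convolution value.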

Here, the window area corresponding to the first convolution value (207) is the first region (209) of the frame (201), and the window area corresponding to the second convolution value (211) is the second region (213) of the frame (201).

Next, a method for generating the first layer of the pyramid history map is described in detail with reference to FIG. 3.

FIG. 3 is a diagram for explaining a method for generating a first layer of a pyramid history map according to one embodiment of the present invention.

Referring to FIG. 3, the pixel values of an 8*8 first frame (301) and an 8*8 second frame (303) can be compared to generate a 6*6 first layer (305). The size of the first layer (305) equals the size of the feature map to be generated using the pyramid history map, and is determined by the frame size and the size of the convolution kernel used to generate the feature map. If the frame size is m*n and the convolution kernel size is k*l, the size of the first layer is (m-k+1)*(n-l+1) (m, n, k, and l are natural numbers). The first layer (305) shown in FIG. 3 is the first layer of a pyramid history map that can be used when generating feature maps by applying a 3*3 convolution kernel to the first frame (301) and the second frame (303).
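As a quick check of the size relation, the small helper below evaluates (m-k+1)*(n-l+1) for the sizes mentioned in the text. The function name is hypothetical, and the sketch assumes a stride of one pixel with no padding, as in the examples.

```python
def first_layer_shape(m, n, k, l):
    """Size of the first layer (and of the feature map) for an m*n frame
    and a k*l kernel, assuming stride 1 and no padding."""
    if k > m or l > n:
        raise ValueError("kernel must not be larger than the frame")
    return (m - k + 1, n - l + 1)


shape_fig3 = first_layer_shape(8, 8, 3, 3)  # 8*8 frame, 3*3 kernel
shape_fig2 = first_layer_shape(5, 5, 3, 3)  # 5*5 frame, 3*3 kernel
```

These evaluate to a 6*6 first layer for the 8*8 frames of FIG. 3 and a 3*3 feature map for the 5*5 frame of FIG. 2.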

The first layer (305) includes a plurality of reference values, each of which is determined as a first value or a second value. For example, the first layer (305) shown in FIG. 3 includes 6*6 = 36 reference values, where the first value is 1 and the second value is 0. The following description assumes that the first value is 1 and the second value is 0.

A reference value of the first layer (305) is determined as 1 when all pixel values of the first frame and the second frame within the window area corresponding to the convolution value match, and as 0 otherwise, that is, when at least one pixel value does not match.

For example, comparing the pixel values in the window area (307) of the first frame (301) corresponding to the first convolution value with the pixel values in the window area (309) of the second frame (303), all of them match: 12, 20, 30, 8, 11, 2, 5, 7, and 3. The reference value corresponding to the first convolution value is therefore determined as 1.

In the same manner, the reference values are determined by moving the window area one pixel at a time within the first frame (301) and the second frame (303) and comparing the pixel values in the windows of the two frames.

For the first frame (301) and the second frame (303) shown in FIG. 3, all pixel values are identical except those within the rectangular area (313) at the upper right. Therefore, when the window area is moved one pixel at a time and a reference value is determined for every window area, the reference values in the rectangular area (315) at the upper right of the first layer (305) are determined as 0 and the reference values elsewhere are determined as 1.
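The first-layer construction can be sketched as follows, with the first value taken as 1 and the second value as 0. The frame contents are made up for illustration (the pixel values of FIG. 3 are not reproduced here), and `build_first_layer` is a hypothetical name.

```python
def build_first_layer(frame_a, frame_b, k, l):
    """Return the first layer: 1 where every pixel of the k*l window is
    identical in both frames, 0 where at least one pixel differs."""
    rows = len(frame_a) - k + 1
    cols = len(frame_a[0]) - l + 1
    layer1 = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            same = all(frame_a[r + i][c + j] == frame_b[r + i][c + j]
                       for i in range(k) for j in range(l))
            out_row.append(1 if same else 0)
        layer1.append(out_row)
    return layer1


# Illustrative 8*8 frames that differ only in a 2*2 patch at the upper right.
frame_a = [[r * 8 + c for c in range(8)] for r in range(8)]
frame_b = [row[:] for row in frame_a]
for r in (0, 1):
    for c in (6, 7):
        frame_b[r][c] += 1

layer1 = build_first_layer(frame_a, frame_b, 3, 3)
```

With these inputs the zeros of `layer1` form a rectangle in its upper-right corner, mirroring the situation described for regions (313) and (315).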

Returning to FIG. 1, in step S104 the first layer is divided into a plurality of blocks and a block value is determined for each block to generate a second layer. Then, in step S106, a third layer is generated based on the block values of the second layer. The generation of the second layer and the third layer is described below with reference to FIG. 4.

FIG. 4 is a diagram for explaining a method for generating a second layer and a third layer of a pyramid history map according to one embodiment of the present invention.

Referring to FIG. 4, a first layer (410) can be divided into a plurality of blocks, for example four blocks (411, 413, 415, 417), a block value can be determined for each block based on the reference values within that block, and a second layer (430) including the plurality of block values can be generated.

For example, if all reference values within a block are 1, the block value of that block is determined as a third value, and if at least one reference value within the block is 0, the block value is determined as a fourth value. FIG. 4 shows the case where the third value is 1 and the fourth value is 0. The following description assumes that the third value is 1 and the fourth value is 0.

Referring to FIG. 4, all reference values within the first block (411) of the first layer (410) are 1, so the block value (431) of the first block (411) is determined as 1. The second block (413), on the other hand, contains an area whose reference values are 0, so the block value (433) of the second block (413) is determined as 0. Likewise, the block value (435) of the third block (415) is determined as 1, and the block value (437) of the fourth block (417) is determined as 0.
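The block reduction can be sketched as below, with the third value taken as 1 and the fourth value as 0. The block size is a free parameter in this sketch, and edge blocks are simply clipped when the layer size is not a multiple of the block size; the text specifies neither, so both are assumptions. The first-layer contents are illustrative rather than those of FIG. 4.

```python
def build_second_layer(layer1, block_rows, block_cols):
    """Divide `layer1` into blocks and return the second layer: 1 for a
    block whose reference values are all 1, otherwise 0."""
    rows, cols = len(layer1), len(layer1[0])
    layer2 = []
    for r0 in range(0, rows, block_rows):
        out_row = []
        for c0 in range(0, cols, block_cols):
            block = [layer1[r][c]
                     for r in range(r0, min(r0 + block_rows, rows))
                     for c in range(c0, min(c0 + block_cols, cols))]
            out_row.append(1 if all(v == 1 for v in block) else 0)
        layer2.append(out_row)
    return layer2


# Illustrative 6*6 first layer with zeros in the upper-right corner.
layer1 = [[1, 1, 1, 1, 0, 0],
          [1, 1, 1, 1, 0, 0]] + [[1] * 6 for _ in range(4)]
layer2 = build_second_layer(layer1, 3, 3)
```

Splitting the 6*6 layer into four 3*3 blocks yields a 2*2 second layer in which only the block containing the zeros is marked 0.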

상술한 바와 같이, 제 1 레이어의 참조값에 기초하여 복수개의 블록값을 포함하는 제 2 레이어(430)를 생성할 수 있다. 제 2 레이어는 제 1 크기의 행렬일 수 있으며, 예를 들어, 도 4에 도시된 바와 같이 2*2 크기의 행렬일 수 있다. As described above, a second layer (430) including a plurality of block values can be generated based on the reference values of the first layer. The second layer can be a matrix of the first size, for example, a matrix of the size of 2*2 as illustrated in FIG. 4.

이어서 제 2 레이어(430)의 블록값에 기초하여 제 3 레이어(450)를 생성하는 방법을 설명한다. Next, a method of generating a third layer (450) based on the block value of the second layer (430) is described.

제 2 레이어(430)의 블록값이 모두 1인 경우, 제 3 레이어를 제 5 값으로 결정하고, 적어도 하나의 블록값이 0인 경우, 제 3 레이어를 제 6 값으로 결정하여 제 3 레이어를 생성할 수 있다. 예를 들어, 제 5 값은 1이고, 제 6 값은 0일 수 있다. 이하에서는, 제 5 값이 1이고, 제 6 값이 0인 것으로 가정하여 설명하기로 한다.If all block values of the second layer (430) are 1, the third layer can be determined as the fifth value, and if at least one block value is 0, the third layer can be determined as the sixth value to generate the third layer. For example, the fifth value can be 1 and the sixth value can be 0. In the following, it will be explained assuming that the fifth value is 1 and the sixth value is 0.

In the embodiment illustrated in FIG. 4, since the second layer (430) includes a block whose block value is 0, the value of the third layer (450) is determined to be 0.
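As a short sketch of this rule, assuming the third layer is a single value and using the hypothetical function name `build_third_layer`:

```python
def build_third_layer(second_layer):
    """Return 1 if every block value is 1, otherwise 0 (fifth value = 1, sixth value = 0)."""
    return 1 if all(value == 1 for row in second_layer for value in row) else 0


# A second layer that contains a block value of 0, as in the case described for FIG. 4.
third_layer = build_third_layer([[1, 0], [1, 0]])
# A second layer in which no block changed between the two frames.
third_layer_unchanged = build_third_layer([[1, 1], [1, 1]])
print(third_layer, third_layer_unchanged)  # 0 1
```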

As described above, the first layer (410), the second layer (430), and the third layer (450) constituting the pyramid history map can be generated.

When a pyramid history map (B) for a feature map (A) has been generated according to the method of the embodiment illustrated in FIG. 1, the generated pyramid history map (B) can be reconstructed to generate a pyramid history map (B') for the feature map (A') of the next depth.

FIG. 5 is a diagram for explaining a method for reconstructing a pyramid history map according to one embodiment of the present invention.

Referring to FIG. 5, the first layer (510) of the pyramid history map (B) for the feature map (A) can be reconstructed to generate the first layer (530) of the pyramid history map (B') for the feature map (A') of the next depth. The embodiment illustrated in FIG. 5 uses a 3*3 convolution kernel, so if the size of the first layer (510) is 6*6, the size of the reconstructed first layer (530) is 4*4.

The reference values of the reconstructed first layer (530) can be determined based on the reference values of the first layer (510). For example, as illustrated in FIG. 5, if the reference value at a first position (511) of the first layer (510) is 0, the reference values of the rectangular area (531) of the reconstructed first layer (530), which extends two cells to the left and two cells upward from the first position, can be determined to be 0.

In other words, if the reference value at row 4, column 3 (511) of the first layer (510) is 0, the reference values of the rectangular area (531) from row 2, column 1 to row 4, column 3 of the reconstructed first layer (530) can be determined to be 0. This follows from considering which positions of a 3*3 kernel, moved one pixel at a time over the first layer (510), contain the reference value of 0.

For an arbitrary position, when the convolution kernel is a k*l matrix and the reference value at row x, column y of the first layer (510) is 0, the reference values within the rectangular area [x-k+1, x]*[y-l+1, y] of the reconstructed first layer (530), that is, from row (x-k+1), column (y-l+1) to row x, column y, can be determined to be 0.
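The reconstruction rule can be sketched as follows. This is an illustrative sketch rather than the patented implementation: the function name `reconstruct_first_layer` is hypothetical, positions are treated as 1-based to match the row/column wording above, and clipping the rectangle to the bounds of the smaller reconstructed layer is an assumption, since the text does not state how out-of-range indices are handled.

```python
def reconstruct_first_layer(layer, k, l):
    """Derive the next-depth first layer from the current one for a k*l kernel.

    Every 0 at (x, y) in `layer` zeroes the rectangle
    [x-k+1, x] * [y-l+1, y] of the reconstructed layer (1-based indices).
    """
    rows, cols = len(layer), len(layer[0])
    out_rows, out_cols = rows - k + 1, cols - l + 1
    out = [[1] * out_cols for _ in range(out_rows)]
    for x in range(1, rows + 1):
        for y in range(1, cols + 1):
            if layer[x - 1][y - 1] != 0:
                continue
            # Assumption: the rectangle is clipped to the reconstructed layer.
            for i in range(max(1, x - k + 1), min(x, out_rows) + 1):
                for j in range(max(1, y - l + 1), min(y, out_cols) + 1):
                    out[i - 1][j - 1] = 0
    return out


# 6*6 first layer whose only 0 is at row 4, column 3, as in the FIG. 5 description.
layer_510 = [[1] * 6 for _ in range(6)]
layer_510[3][2] = 0
layer_530 = reconstruct_first_layer(layer_510, 3, 3)
for row in layer_530:
    print(row)
# [1, 1, 1, 1]
# [0, 0, 0, 1]
# [0, 0, 0, 1]
# [0, 0, 0, 1]
```

The zeroed area runs from row 2, column 1 to row 4, column 3 of the 4*4 result, in agreement with the example above.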

As described above, the first layer (530) for the feature map (A') of the next depth can be reconstructed based on the reference values of the first layer (510) for the feature map (A). The method for generating the second layer and the third layer of the pyramid history map (B') is the same as the method for generating the second layer and the third layer of the pyramid history map (B), and its description is therefore omitted.

As described above, the pyramid history map for the feature map of the next depth can be generated simply by reconstructing the pyramid history map generated for the feature map of the previous depth, so there is no need to compare pixels again to generate a pyramid history map for every feature map of a different depth.

FIG. 6 is a diagram for explaining a method for generating a first layer of a pyramid history map according to another embodiment of the present invention.

Referring to FIG. 6, the second frame (603) differs from the first frame (601) in pixel values in a first pixel area (605) and a second pixel area (607). In the first pixel area (605) the difference in pixel values is small (less than 2), whereas in the second pixel area (607) the difference is large.

In the method for generating a pyramid history map according to the embodiment described with reference to FIG. 1, the reference value is determined based on whether the pixel values of the first frame (601) and the second frame (603) match. Therefore, in the first layer (608) generated according to that embodiment, the reference value of the area (609) corresponding to the first pixel area (605) is determined to be 0.

The method for generating a pyramid history map according to another embodiment of the present invention differs in that the reference value is determined based on whether the difference between the pixel values of the first frame (601) and the second frame (603) is less than a threshold value. The threshold value may be determined based on the mean and variance of the pixel values of the first frame (601) or the second frame (603).

In the first layer (611) generated according to this other embodiment, if the threshold value is 2 or more, the reference value of the area (613) corresponding to the first pixel area (605) is determined to be 1.

According to the method for generating a pyramid history map according to this other embodiment, even if the pixel values of the two frames differ, the two frames are judged to overlap in the corresponding area when the difference is slight. This is because treating an area with only a slight difference in pixel values as an overlapping area does not affect feature extraction from the frame.
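The threshold-based variant can be sketched as below. The function name `first_layer_with_threshold` and the sample frames are assumptions for illustration. The threshold is passed in directly, because the text says only that it may be derived from the mean and variance of the pixel values and gives no formula.

```python
def first_layer_with_threshold(frame1, frame2, k, l, threshold):
    """Build the first layer for a k*l window using a difference threshold.

    A reference value is 1 when every pixel pair in the window differs by
    less than `threshold`, otherwise 0.
    """
    rows, cols = len(frame1), len(frame1[0])
    layer = []
    for r in range(rows - k + 1):
        layer_row = []
        for c in range(cols - l + 1):
            similar = all(
                abs(frame1[r + i][c + j] - frame2[r + i][c + j]) < threshold
                for i in range(k)
                for j in range(l)
            )
            layer_row.append(1 if similar else 0)
        layer.append(layer_row)
    return layer


# Hypothetical 4*4 frames: one slight change (by 1) and one large change (by 50).
frame1 = [[10, 10, 10, 10] for _ in range(4)]
frame2 = [row[:] for row in frame1]
frame2[0][0] += 1    # slight difference, below a threshold of 2
frame2[3][3] += 50   # large difference

tolerant = first_layer_with_threshold(frame1, frame2, 3, 3, threshold=2)
exact = first_layer_with_threshold(frame1, frame2, 3, 3, threshold=1)
print(tolerant)  # [[1, 1], [1, 0]]
print(exact)     # [[0, 1], [1, 0]]
```

For integer pixel values a threshold of 1 reproduces the exact-match rule of the first embodiment, while a threshold of 2 treats the slightly changed window as overlapping.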

The methods for generating a pyramid history map according to one embodiment and another embodiment of the present invention have been described above. A method for utilizing the pyramid history map is described below.

FIG. 7 is a flowchart illustrating a method for generating a feature map using a pyramid history map according to one embodiment of the present invention.

Referring to FIG. 7, a feature map of the first frame is generated in step S701. The feature map generation in step S701 is the same as a conventional feature map generation method. That is, the feature map is generated by applying a convolution kernel to the entire area of the first frame and calculating the convolution values.

In step S703, an overlapping area between the first frame and the second frame is determined, and in step S705, the convolution values of the feature map of the first frame are reused in the feature map of the second frame for the overlapping area. The amount of computation needed to generate the feature map of the second frame is therefore reduced by the number of convolution values that are reused.

In step S707, the feature map of the second frame is generated by calculating convolution values for the remaining area, excluding the overlapping area.
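Steps S701 to S707 can be sketched as follows. The function names, the 4*4 sample frames, and the hand-written overlap mask are assumptions made for illustration. The mask simply stands in for the result of step S703.

```python
KERNEL = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]


def conv_value(frame, kernel, r, c):
    """Convolution value for the window whose top-left corner is (r, c)."""
    return sum(
        frame[r + i][c + j] * kernel[i][j]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
    )


def full_feature_map(frame, kernel):
    """S701: apply the kernel over the whole frame (no padding, stride 1)."""
    k, l = len(kernel), len(kernel[0])
    return [
        [conv_value(frame, kernel, r, c) for c in range(len(frame[0]) - l + 1)]
        for r in range(len(frame) - k + 1)
    ]


def reuse_feature_map(frame2, kernel, map1, overlap):
    """S705 and S707: reuse values where overlap is 1, compute the rest."""
    computed = 0
    map2 = []
    for r, mask_row in enumerate(overlap):
        out_row = []
        for c, reusable in enumerate(mask_row):
            if reusable:
                out_row.append(map1[r][c])          # S705: reuse
            else:
                out_row.append(conv_value(frame2, kernel, r, c))  # S707
                computed += 1
        map2.append(out_row)
    return map2, computed


frame1 = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
frame2 = [row[:] for row in frame1]
frame2[3][3] = 99                     # only the bottom-right pixel changes
map1 = full_feature_map(frame1, KERNEL)
overlap = [[1, 1], [1, 0]]            # S703 result, assumed given here
map2, computed = reuse_feature_map(frame2, KERNEL, map1, overlap)
print(computed)                       # 1
```

In this example only one of the four convolution values is recomputed, and the result is identical to a full convolution of the second frame.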

Hereinafter, a method for generating the feature map of the first frame through a convolution operation is described with reference to FIG. 8, followed by a method for determining the overlapping area between the first frame and the second frame using the pyramid history map and reusing the convolution values, described with reference to FIG. 9.

FIG. 8 is a diagram for explaining a method for generating a feature map of the first frame according to one embodiment of the present invention.

Referring to FIG. 8, a convolution kernel (803) is applied to the first frame (801) to generate a feature map (805). The first convolution value of the feature map (805), 61, is obtained through 9 multiplications and 8 additions: 12*1 + 20*0 + 30*1 + 8*0 + 11*1 + 2*0 + 5*1 + 7*0 + 3*1 = 61. Accordingly, generating the feature map (805), which has 36 convolution values, requires a total of 9*36 = 324 multiplications and 8*36 = 288 additions.
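The arithmetic above can be checked with a few lines of Python. The pixel and kernel values are taken from the expression in the text, in the order in which they appear there. The variable names are illustrative.

```python
# Window pixels and kernel weights in the order used in the text's expression.
window_values = [12, 20, 30, 8, 11, 2, 5, 7, 3]
kernel_values = [1, 0, 1, 0, 1, 0, 1, 0, 1]

products = [p * w for p, w in zip(window_values, kernel_values)]
first_value = sum(products)

mults_per_value = len(products)      # one multiplication per kernel element
adds_per_value = len(products) - 1   # summing 9 products takes 8 additions
num_values = 36                      # number of convolution values in the feature map

total_mults = mults_per_value * num_values
total_adds = adds_per_value * num_values
print(first_value, total_mults, total_adds)  # 61 324 288
```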

FIG. 9 is a diagram for explaining a method for determining the overlapping area between the first frame and the second frame according to one embodiment of the present invention.

FIG. 9 illustrates a process of determining the overlapping area between the feature map (805) of the first frame and the feature map (903 or 915) of the second frame by referring sequentially to the third layer (901), the second layer (905), and the first layer (909) of the pyramid history map.

First, the third layer (901) is referenced. If the value of the third layer is 1, the entire area of the feature map (805) of the first frame and the feature map (903) of the second frame can be determined to be the overlapping area. In this case, all convolution values of the feature map (805) of the first frame can be reused in the feature map (903) of the second frame.

If the value of the third layer (901) is 0, the second layer (905) is referenced. The area (907) of the second frame corresponding to each block whose block value in the second layer is 1 is determined to be an overlapping area, and the convolution values of the feature map (805) of the first frame can be reused for that overlapping area.

Next, for each block whose block value in the second layer (905) is 0, the first layer (909) is referenced, and the area corresponding to the area (911) where the reference value of the first layer (909) is 1 can additionally be determined to be an overlapping area.

For the overlapping area (913) determined by referring sequentially to the third layer (901), the second layer (905), and the first layer (909), the convolution values of the feature map (805) of the first frame can be reused. Since the convolution operation then needs to be performed only for the remaining area outside the overlapping area, the amount of computation needed to generate the feature map of the second frame is reduced and the feature map can be generated efficiently.
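The top-down lookup can be sketched as below. The function name `overlap_mask`, the block size, and the sample layers are assumptions for illustration. The lookup counter is added only to show how many first-layer entries need to be inspected.

```python
def overlap_mask(first_layer, second_layer, third_layer, block_h, block_w):
    """Determine the overlapping area by consulting layers 3, 2, 1 in order.

    Returns the mask (1 = reusable convolution value) and the number of
    first-layer entries that had to be read.
    """
    rows, cols = len(first_layer), len(first_layer[0])
    if third_layer == 1:
        # Nothing changed: the whole feature map is the overlapping area.
        return [[1] * cols for _ in range(rows)], 0
    mask = [[0] * cols for _ in range(rows)]
    lookups = 0
    for br, block_row in enumerate(second_layer):
        for bc, block_value in enumerate(block_row):
            for r in range(br * block_h, (br + 1) * block_h):
                for c in range(bc * block_w, (bc + 1) * block_w):
                    if block_value == 1:
                        mask[r][c] = 1               # whole block overlaps
                    else:
                        mask[r][c] = first_layer[r][c]  # fall back to layer 1
                        lookups += 1
    return mask, lookups


# Hypothetical pyramid: 6*6 first layer, 3*3 blocks, 2*2 second layer.
first_layer = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
]
second_layer = [[1, 0], [1, 0]]
third_layer = 0
mask, lookups = overlap_mask(first_layer, second_layer, third_layer, 3, 3)
print(lookups)  # 18
```

For a consistent pyramid the mask equals the first layer. The benefit in this sketch is that only the blocks marked 0 in the second layer (18 of 36 entries here) are inspected at the finest level.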

The pyramid history map according to the present invention can also be used in a system that uses a 3D convolutional neural network to recognize objects or people and the actions of those objects or people. When a group of 3 to 5 frames is treated as one block for motion recognition, comparing the block at a given time with the next block and constructing the pyramid history map as described above reduces redundant calculations for pixels that have not changed relative to the previous block.

The pyramid history map according to the present invention has the advantage that the reduction in the amount of calculation grows as the screen size increases. This is because recognizing people or objects with a convolutional neural network requires pixel-level operations.

For example, for UHD 4K, applying a 3x3 convolution kernel to 4096 x 2160 pixels requires approximately 8 million calculations. If about three convolution kernels are used, 24 million calculations are required, and if the depth is 10, 240 million calculations are required. If the previous frame and the current frame overlap by 1/3, the amount of calculation can be reduced by approximately 1/3, reducing it to 80 million calculations.
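The estimate can be reproduced with the short calculation below. It is a rough sketch that counts one calculation per pixel position per kernel per depth level, which is an assumption matching the rounded figures in the text. Border effects and the multiplications inside each window are ignored.

```python
from fractions import Fraction

WIDTH, HEIGHT = 4096, 2160
KERNELS = 3
DEPTH = 10

positions = WIDTH * HEIGHT                 # about 8 million in the text
per_frame = positions * KERNELS * DEPTH    # about 240 million in the text


def remaining(total, reuse_fraction):
    """Calculations left when `reuse_fraction` of the values is reused."""
    return int(total * (1 - reuse_fraction))


after_one_third = remaining(per_frame, Fraction(1, 3))
after_two_thirds = remaining(per_frame, Fraction(2, 3))
print(positions)         # 8847360
print(per_frame)         # 265420800
print(after_one_third)   # 176947200
print(after_two_thirds)  # 88473600
```

Under this counting, reusing one third of the values leaves about two thirds of the work (roughly 160 million with the text's rounding to 240 million), whereas the quoted figure of 80 million corresponds to reusing about two thirds of the values.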

The method for generating a pyramid history map described with reference to FIG. 1 and the method for generating a feature map using the pyramid history map described with reference to FIG. 7 can also be implemented in the form of a program stored in a medium and executed by a computer, or a recording medium including instructions executable by a computer. The computer-readable medium can be any available medium that can be accessed by a computer, and includes both volatile and nonvolatile media and both removable and non-removable media. In addition, the computer-readable medium can include both computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave, or another transport mechanism, and include any information delivery medium.

The above description of the present invention is illustrative, and those skilled in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be interpreted as falling within the scope of the present invention.

410: first layer
430: second layer
450: third layer

Claims

A method for generating a pyramid history map (B) for a feature map (A), performed by a program stored in a medium and executed by a computer, the method comprising:
determining a reference value corresponding to each convolution value of the feature map (A) based on whether the pixel values of a first frame and a second frame included in a window area corresponding to that convolution value match, and generating a first layer including a plurality of reference values;
dividing the first layer into a plurality of blocks, determining a block value corresponding to each block based on the reference values within that block, and generating a second layer including a plurality of block values; and
generating a third layer based on the plurality of block values of the second layer.

The method of claim 1,
wherein generating the first layer comprises:
determining the reference value corresponding to a convolution value to be a first value if the pixel values of the first frame and the second frame included in the window area corresponding to that convolution value all match, and otherwise determining the reference value to be a second value.

The method of claim 2,
wherein generating the second layer comprises:
determining the block value of a block to be a third value if all reference values within that block are the first value, and otherwise determining the block value to be a fourth value.

The method of claim 3,
wherein generating the third layer comprises:
determining the value of the third layer to be a fifth value if all block values of the second layer are the third value, and otherwise determining the value of the third layer to be a sixth value.

The method of claim 4, further comprising:
reconstructing the pyramid history map (B) for the feature map (A) to generate a pyramid history map (B') for a feature map (A') of the next depth,
wherein generating the pyramid history map (B') comprises:
determining reference values of the pyramid history map (B') based on the reference values of the pyramid history map (B), and generating a first layer of the pyramid history map (B') including a plurality of reference values;
dividing the first layer of the pyramid history map (B') into a plurality of blocks, determining a block value corresponding to each block based on the reference values within that block, and generating a second layer of the pyramid history map (B') including a plurality of block values; and
generating a third layer of the pyramid history map (B') based on the plurality of block values of the second layer of the pyramid history map (B').

The method of claim 5,
wherein, in generating the first layer of the pyramid history map (B'),
when the window area is a k*l matrix (k and l are natural numbers) and the position of each second value in the first layer of the pyramid history map (B) is represented as (x, y) (x and y are integers greater than or equal to 0), the reference values within the rectangular area [x-k+1, x]*[y-l+1, y] of the first layer of the pyramid history map (B') are determined to be the second value.

The method of claim 1,
wherein, when the first frame and the second frame are m*n matrices (m and n are natural numbers) and the window area is a k*l matrix (k and l are natural numbers), the first layer is an (m-k+1)*(n-l+1) matrix.

The method of claim 1,
wherein the second layer is a matrix of a first size.

The method of claim 8,
wherein the third layer is a matrix of a second size smaller than the first size.

The method of claim 1,
wherein the first frame and the second frame are consecutive frames of one video.

A method for generating a pyramid history map (B) for a feature map (A), performed by a program stored in a medium and executed by a computer, the method comprising:
determining a reference value corresponding to each convolution value of the feature map (A) based on whether the difference between the pixel values of a first frame and a second frame included in a window area corresponding to that convolution value is less than a threshold value, and generating a first layer including a plurality of reference values;
dividing the first layer into a plurality of blocks, determining a block value corresponding to each block based on the reference values within that block, and generating a second layer including a plurality of block values; and
generating a third layer based on the plurality of block values of the second layer.

The method of claim 11,
wherein the threshold value is determined based on the mean and variance of the pixel values of the first frame or the second frame.

A method for generating a feature map using a pyramid history map that includes a first layer, a second layer, and a third layer and is generated based on pixel values of a first frame and a second frame, the method being performed by a program stored in a medium and executed by a computer and comprising:
generating a feature map (a) of the first frame by applying a kernel to the first frame to calculate convolution values;
determining an overlapping area between the feature map (a) of the first frame and a feature map (b) of the second frame by referring to the pyramid history map;
using the convolution values within the overlapping area of the feature map (a) as the convolution values within the overlapping area of the feature map (b); and
generating the feature map (b) of the second frame by applying the kernel to the second frame to calculate convolution values for the remaining area of the feature map (b), excluding the overlapping area,
wherein the pyramid history map is generated by:
determining a reference value corresponding to each convolution value of the feature map (a) based on whether the pixel values of the first frame and the second frame included in a window area corresponding to that convolution value match, and generating the first layer including a plurality of reference values;
dividing the first layer into a plurality of blocks, determining a block value corresponding to each block based on the reference values within that block, and generating the second layer including a plurality of block values; and
generating the third layer based on the plurality of block values of the second layer.

The method of claim 13,
wherein determining the overlapping area comprises determining the overlapping area by referring sequentially to the third layer, the second layer, and the first layer of the pyramid history map.

delete

The method of claim 14,
wherein generating the first layer comprises determining the reference value corresponding to a convolution value to be a first value if the pixel values of the first frame and the second frame included in the window area corresponding to that convolution value all match, and otherwise determining the reference value to be a second value,
generating the second layer comprises determining the block value of a block to be a third value if all reference values within that block are the first value, and otherwise determining the block value to be a fourth value, and
generating the third layer comprises determining the value of the third layer to be a fifth value if all block values of the second layer are the third value, and otherwise determining the value of the third layer to be a sixth value.

The method of claim 16,
wherein determining the overlapping area comprises:
referring to the third layer and, if the value of the third layer is the fifth value, determining the entire area of the feature map (a) and the feature map (b) to be the overlapping area;
if the value of the third layer is the sixth value, referring to the second layer and determining the area of the feature map (a) and the feature map (b) corresponding to each block whose block value is the third value to be the overlapping area; and
for each block whose block value is the fourth value, referring to the first layer and determining the area of the feature map (a) and the feature map (b) corresponding to an area where the reference value is the first value to be the overlapping area.