KR102586816B1

KR102586816B1 - Image enhancement apparatus and method for obtaining image in which image quality is enhanced based on multiple transformation function estimation

Info

Publication number: KR102586816B1
Application number: KR1020220063171A
Authority: KR
Inventors: 이철; 박재민; 비엔지아안; 김한울
Original assignee: 동국대학교 산학협력단; 서울과학기술대학교 산학협력단
Priority date: 2022-05-24
Filing date: 2022-05-24
Publication date: 2023-10-11
Anticipated expiration: 2042-05-24

Abstract

The present invention relates to a technology that estimates multiple transformation functions with various color characteristics by using the spatial information and statistical characteristics of an image, and obtains an enhanced image by improving image quality with the estimated transformation functions. According to an embodiment of the present invention, an image enhancement device may include: a transformation function estimation unit that estimates multiple transformation functions by using spatial information based on an input image and statistical characteristic information based on a histogram of the input image; a weight map generation unit that generates pixel-wise weights for the estimated multiple transformation functions and generates a weight map based on the generated pixel-wise weights; and an enhanced image acquisition unit that obtains an enhanced image from the input image as a weighted sum of the generated weight map and the images obtained by transforming the pixel values of the input image with each of the multiple transformation functions.

Description

Image enhancement device and method for obtaining an enhanced image based on estimation of multiple transformation functions

The present invention relates to an image enhancement device and method for obtaining an enhanced image based on the estimation of multiple transformation functions. More specifically, it relates to a technology that estimates multiple transformation functions with various color characteristics by using the spatial information and statistical characteristics of an image, and obtains an enhanced image by improving image quality with the estimated transformation functions.

With recent advances in imaging technology, many people record their daily lives in a wide range of environments.

However, incorrect camera settings and unavoidable environmental conditions can degrade image quality.

Moreover, such low-quality images degrade the performance of computer vision applications that have recently become essential, such as autonomous driving.

Accordingly, many image enhancement algorithms have been developed to obtain high-quality images from low-quality ones.

With the growth of large-scale datasets and deep learning, the performance of recent image enhancement techniques has improved markedly.

Most image enhancement algorithms follow an image-to-image approach, in which the input image is enhanced by a pixel-wise mapping.

Because the enhancement process in this approach is itself a deep learning network with a complex internal structure, the process is very difficult to analyze.

Therefore, techniques based on the image-to-transformation function (ITF) approach, which estimates transformation functions or coefficients that can enhance a low-quality image, have recently been developed.

These techniques transform the pixel values of the low-quality input image with the estimated transformation functions or coefficients, so the enhancement process can be interpreted.

However, whereas traditional image enhancement techniques enhance an image based on its statistical characteristics, ITF techniques estimate the transformation functions or coefficients from the input image alone.

In addition, transforming the pixel values of an image with a single transformation function cannot take the spatial information of the image into account and is limited in the range of colors it can express.

Image enhancement is a technology for improving low-quality images that result from the physical limitations of cameras, the influence of the surrounding environment, and similar causes.

Although the spread of smartphones and digital single-lens reflex (DSLR) cameras has made photography popular, non-experts are unfamiliar with camera settings and find it difficult to obtain high-quality photographs.

A low-quality image is not only aesthetically unsatisfying; its details and the information about objects in it are also distorted, so information that could be obtained from the image is lost and its value as a record is reduced.

Furthermore, low-quality images degrade performance in computer vision, an essential technology of the Fourth Industrial Revolution, so technology that can effectively enhance low-quality images is essential.

Korean Patent No. 10-1468433, "Dynamic range expansion device and method using combined color channel conversion map"
Korean Patent No. 10-1341616, "Image improvement device and method by detailed information estimation"
Korean Patent No. 10-2124497, "Image enhancement device"

An object of the present invention is to estimate multiple transformation functions with various color characteristics by using the spatial information and statistical characteristics of an image, and to obtain an enhanced image by improving image quality with the estimated transformation functions.

Another object of the present invention is to estimate multiple transformation functions for image enhancement from a low-quality image to be enhanced and the histogram of that image, and to obtain an enhanced image by appropriately combining the estimated transformation functions on a pixel-by-pixel basis.

A further object of the present invention is to apply pixel-wise weights to the transformation functions estimated by a histogram-based multiple transformation function estimation network and a weight generation network, so that image colors are expressed with spatial information taken into account, thereby enhancing not only ordinary low-quality images but also extremely low-light images and underwater images into high-quality images.

According to an embodiment of the present invention, an image enhancement device may include: a transformation function estimation unit that estimates multiple transformation functions by using spatial information based on an input image and statistical characteristic information based on a histogram of the input image; a weight map generation unit that generates pixel-wise weights for the estimated multiple transformation functions and generates a weight map based on the generated pixel-wise weights; and an enhanced image acquisition unit that obtains an enhanced image from the input image as a weighted sum of the generated weight map and the images obtained by transforming the pixel values of the input image with each of the multiple transformation functions.

The transformation function estimation unit may extract a feature map from the input image, expand the channels of the extracted feature map using a convolution (Conv) block, reduce the resolution of the feature map and increase its number of channels using a plurality of self-fusion convolution (SFC) blocks, and generate a first feature map of a specific size using an average pooling block.

The transformation function estimation unit may extract a second feature map from the histogram using a plurality of self-fusion histogram feature extraction (SHFE) blocks, combine the first feature map and the second feature map to generate a third feature map that reflects the statistical characteristic information, and generate a fourth feature map by applying the third feature map as an attention feature map for the first feature map according to a residual attention mechanism.

The transformation function estimation unit may pass the generated fourth feature map to each of a plurality of branches, and estimate the multiple transformation functions by estimating, in each branch, a transformation function for a specific pixel in a specific channel of the input image.

The weight map generation unit may use the input image and the images obtained by transforming its pixel values with each of the multiple transformation functions to output the weights applied to the transformed images, thereby generating the weight map as a combination of transformation functions that takes the spatial information into account.

The enhanced image acquisition unit may obtain the weighted sum as an element-wise weighted sum using Equation 3 below, and obtain the enhanced image from the input image according to the obtained weighted sum.

[Equation 3]

$$\hat{I} = \sum_{n} W_n \odot \tilde{I}_n$$

Here, $\hat{I}$ denotes the enhanced image, $n$ denotes the index, $\tilde{I}_n$ denotes the n-th transformed image, $W_n$ denotes the pixel-wise weight of the n-th transformed image, and $\odot$ denotes the element-wise product.

According to an embodiment of the present invention, an image enhancement method may include: estimating, in a transformation function estimation unit, multiple transformation functions by using spatial information based on an input image and statistical characteristic information based on a histogram of the input image; generating, in a weight map generation unit, pixel-wise weights for the estimated multiple transformation functions and generating a weight map based on the generated pixel-wise weights; and obtaining, in an enhanced image acquisition unit, an enhanced image from the input image as a weighted sum of the generated weight map and the images obtained by transforming the pixel values of the input image with each of the multiple transformation functions.

The step of estimating the multiple transformation functions by using the spatial information based on the input image and the statistical characteristic information based on the histogram of the input image may include: extracting a feature map from the input image, expanding the channels of the extracted feature map using a convolution (Conv) block, reducing the resolution of the feature map and increasing its number of channels using a plurality of self-fusion convolution (SFC) blocks, and generating a first feature map of a specific size using an average pooling block; extracting a second feature map from the histogram using a plurality of self-fusion histogram feature extraction (SHFE) blocks, combining the first feature map and the second feature map to generate a third feature map that reflects the statistical characteristic information, and generating a fourth feature map by applying the third feature map as an attention feature map for the first feature map according to a residual attention mechanism; and passing the generated fourth feature map to each of a plurality of branches and estimating the multiple transformation functions by estimating, in each branch, a transformation function for a specific pixel in a specific channel of the input image.

The step of generating the pixel-wise weights for the estimated multiple transformation functions and generating the weight map based on the generated pixel-wise weights may include using the input image and the images obtained by transforming its pixel values with each of the multiple transformation functions to output the weights applied to the transformed images, thereby generating the weight map as a combination of transformation functions that takes the spatial information into account.

The step of obtaining the enhanced image from the input image as the weighted sum of the generated weight map and the images obtained by transforming the pixel values of the input image with each of the multiple transformation functions may include obtaining the weighted sum as an element-wise weighted sum using Equation 3 below, and obtaining the enhanced image from the input image according to the obtained weighted sum.

[Equation 3]

$$\hat{I} = \sum_{n} W_n \odot \tilde{I}_n$$

Here, $\hat{I}$ denotes the enhanced image, $n$ denotes the index, $\tilde{I}_n$ denotes the n-th transformed image, $W_n$ denotes the pixel-wise weight of the n-th transformed image, and $\odot$ denotes the element-wise product.

The present invention can estimate multiple transformation functions with various color characteristics by using the spatial information and statistical characteristics of an image, and can obtain an enhanced image by improving image quality with the estimated transformation functions.

The present invention can estimate multiple transformation functions for image enhancement from a low-quality image to be enhanced and the histogram of that image, and can obtain an enhanced image by appropriately combining the estimated transformation functions on a pixel-by-pixel basis.

The present invention applies pixel-wise weights to the transformation functions estimated by a histogram-based multiple transformation function estimation network and a weight generation network, so that image colors are expressed with spatial information taken into account. It can thereby enhance not only ordinary low-quality images but also extremely low-light images and underwater images into high-quality images.

FIG. 1 is a diagram illustrating an image enhancement device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an image enhancement algorithm performed by an image enhancement device according to an embodiment of the present invention.
FIGS. 3A to 3C are diagrams illustrating the structure of the blocks of the histogram-based multiple transformation function estimation network used in an image enhancement algorithm according to an embodiment of the present invention.
FIGS. 4A to 4C are diagrams illustrating the image enhancement performance of an image enhancement device according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an image enhancement method according to an embodiment of the present invention.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings.

The embodiments and the terms used in them are not intended to limit the technology described in this document to any specific embodiment, and should be understood to include various modifications, equivalents, and/or substitutes of the embodiments.

In the following description of the embodiments, a detailed description of a related known function or configuration is omitted where it would unnecessarily obscure the gist of the invention.

The terms used below are defined in consideration of their functions in the various embodiments and may vary depending on the intention or custom of the user or operator. Their definitions should therefore be based on the content of this specification as a whole.

In the description of the drawings, similar reference numerals may be used for similar components.

A singular expression may include the plural unless the context clearly indicates otherwise.

In this document, expressions such as "A or B" or "at least one of A and/or B" may include all possible combinations of the items listed together.

Expressions such as "first" and "second" may modify the corresponding components regardless of order or importance. They are used only to distinguish one component from another and do not limit the components.

When a (for example, first) component is said to be "(functionally or communicatively) connected" or "coupled" to another (for example, second) component, the first component may be connected to the second component directly or through yet another component (for example, a third component).

In this specification, "configured to" (or "set to") may be used interchangeably with, for example, "suitable for," "having the capacity to," "adapted to," "made to," "capable of," or "designed to," in terms of hardware or software, depending on the situation.

In some situations, the expression "a device configured to" may mean that the device is "capable of" operating together with other devices or components.

For example, the phrase "a processor configured (or set) to perform A, B, and C" may refer to a dedicated processor (for example, an embedded processor) for performing those operations, or to a general-purpose processor (for example, a CPU or an application processor) that can perform those operations by executing one or more software programs stored in a memory device.

In addition, the term "or" means an inclusive or rather than an exclusive or.

That is, unless otherwise stated or clear from the context, the expression "x uses a or b" means any of the natural inclusive permutations.

Terms such as "unit" and "-er" used below refer to a unit that processes at least one function or operation, and may be implemented in hardware, in software, or in a combination of hardware and software.

FIG. 1 is a diagram illustrating an image enhancement device according to an embodiment of the present invention.

FIG. 1 illustrates the components of the image enhancement device according to an embodiment of the present invention.

Referring to FIG. 1, the image enhancement device 100 according to an embodiment of the present invention may include a transformation function estimation unit 110, a weight map generation unit 120, and an enhanced image acquisition unit 130.

According to an embodiment of the present invention, the transformation function estimation unit 110 estimates multiple transformation functions in order to obtain an enhanced image by improving the quality of the input image.

For example, after receiving the input image to be enhanced, the transformation function estimation unit 110 may estimate the multiple transformation functions by using spatial information based on the input image and statistical characteristic information based on the histogram of the input image.

According to an embodiment of the present invention, the transformation function estimation unit 110 may extract a feature map from the input image and expand the channels of the extracted feature map using a convolution (Conv) block.

In addition, the transformation function estimation unit 110 may reduce the resolution of the feature map and increase its number of channels using a plurality of self-fusion convolution (SFC) blocks, and may generate a first feature map (f_img) of a specific size using an average pooling block.
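The final pooling step can be illustrated with a short NumPy sketch. It is only an illustration under assumptions: the function name `adaptive_average_pool`, the channel-first layout, and the output size of 4 are hypothetical, and the Conv and SFC blocks that precede the pooling are not modeled. The point is that average pooling over a fixed grid yields a feature map of the same size regardless of the input resolution.

```python
import numpy as np


def adaptive_average_pool(features: np.ndarray, out_size: int) -> np.ndarray:
    """Average-pool a C x H x W feature map down to C x out_size x out_size.

    The layout and the square output are assumptions made for this sketch.
    """
    channels, height, width = features.shape
    if height < out_size or width < out_size:
        raise ValueError("feature map is smaller than the requested output size")
    # Integer edges that split each axis into out_size non-empty regions.
    row_edges = [(k * height) // out_size for k in range(out_size + 1)]
    col_edges = [(k * width) // out_size for k in range(out_size + 1)]
    pooled = np.empty((channels, out_size, out_size), dtype=np.float64)
    for i in range(out_size):
        for j in range(out_size):
            region = features[:, row_edges[i]:row_edges[i + 1],
                              col_edges[j]:col_edges[j + 1]]
            pooled[:, i, j] = region.mean(axis=(1, 2))
    return pooled


rng = np.random.default_rng(0)
small = rng.random((2, 8, 12))    # toy feature maps of two different resolutions
large = rng.random((2, 31, 17))
f_img_small = adaptive_average_pool(small, out_size=4)
f_img_large = adaptive_average_pool(large, out_size=4)
```

Both calls return arrays of shape (2, 4, 4), which is what lets later stages work with a fixed-size image feature.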

According to an embodiment of the present invention, the transformation function estimation unit 110 may extract a second feature map (f_h) from the histogram using a plurality of self-fusion histogram feature extraction (SHFE) blocks.

In addition, the transformation function estimation unit 110 may combine the first feature map (f_img) and the second feature map (f_h) to generate a third feature map (f_s) that reflects the statistical characteristic information, and may generate a fourth feature map (f_HFF) by applying the third feature map (f_s) as an attention feature map for the first feature map (f_img) according to a residual attention mechanism.
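A minimal sketch of one common residual attention formulation may help make this step concrete. The text does not give the exact formula, so the form used here, f_HFF = f_img * (1 + sigmoid(f_s)), is an assumption borrowed from the usual residual attention design, and the function names are hypothetical. How f_img and f_h are combined into f_s is also not specified, so f_s is simply passed in.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def residual_attention(f_img: np.ndarray, f_s: np.ndarray) -> np.ndarray:
    """Assumed form: f_HFF = f_img * (1 + sigmoid(f_s)).

    The "1 +" term is the residual path: even where the attention is close
    to zero, the image feature f_img passes through unchanged.
    """
    if f_img.shape != f_s.shape:
        raise ValueError("f_img and f_s must have the same shape in this sketch")
    attention = sigmoid(f_s)
    return f_img * (1.0 + attention)


rng = np.random.default_rng(0)
f_img = rng.standard_normal((8, 4, 4))   # toy image feature map
f_s = rng.standard_normal((8, 4, 4))     # stand-in for the fused feature map
f_hff = residual_attention(f_img, f_s)
```

With this form the histogram-informed feature can only emphasize parts of f_img (by a factor between 1 and 2); it never suppresses them entirely, which is the usual motivation for the residual variant.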

For example, the transformation function estimation unit 110 may pass the generated fourth feature map (f_HFF) to each of a plurality of branches, and may estimate the multiple transformation functions by estimating, in each branch, a transformation function for a specific pixel in a specific channel of the input image.

Here, the specific channel and the specific pixel refer to any one of the pixels that make up any one of the channels into which the input image is separated.

According to an embodiment of the present invention, the weight map generation unit 120 may generate pixel-wise weights for the multiple transformation functions and generate a weight map based on the generated pixel-wise weights.

For example, the pixel-wise weights of the multiple transformation functions may be generated in order to realize a complex color mapping based on spatial information and on the relationship between the input image and the images transformed with the estimated multiple transformation functions.

For example, an image transformed with the estimated multiple transformation functions may be an image obtained by transforming the pixel values of the input image with one of the multiple transformation functions.
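The pixel-value transformation can be sketched as a per-channel lookup. The representation used here is an assumption: each transformation function is stored as a 256-entry table per color channel with outputs in [0, 1], and the name `apply_transformation_functions` is hypothetical. In the sketch the tables are hand-made (an identity curve and a gamma curve) rather than estimated by a network.

```python
import numpy as np


def apply_transformation_functions(image: np.ndarray,
                                   functions: np.ndarray) -> np.ndarray:
    """Transform an image with each of N transformation functions.

    image:     H x W x C array of uint8 pixel values.
    functions: N x C x 256 array; functions[n, c, v] is the output of the
               n-th function for value v in channel c (assumed layout).
    Returns an N x H x W x C array of transformed images.
    """
    n_funcs, channels, _ = functions.shape
    if image.shape[2] != channels:
        raise ValueError("channel count of image and functions must match")
    transformed = np.empty((n_funcs,) + image.shape, dtype=np.float64)
    for n in range(n_funcs):
        for c in range(channels):
            # Each pixel value indexes into the lookup table of its channel.
            transformed[n, :, :, c] = functions[n, c][image[:, :, c]]
    return transformed


identity = np.linspace(0.0, 1.0, 256)
brighten = identity ** 0.5                      # gamma curve that lifts dark values
functions = np.stack([np.tile(identity, (3, 1)),
                      np.tile(brighten, (3, 1))])   # shape (2, 3, 256)

rng = np.random.default_rng(0)
image = rng.integers(0, 64, size=(16, 16, 3), dtype=np.uint8)  # dark toy image
transformed = apply_transformation_functions(image, functions)
```

Because every function is just a curve over pixel values, each transformed image can be inspected directly, which is the interpretability advantage the background section attributes to the ITF approach.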

For example, the weight map generation unit 120 generates a pixel-wise weight for each of the estimated multiple transformation functions, based on spatial information related to the colors that make up the input image and statistical characteristic information based on the histogram derived from the input image.

According to an embodiment of the present invention, the weight map generation unit 120 may use the input image and the images obtained by transforming its pixel values with each of the multiple transformation functions to output the weights applied to the transformed images, thereby generating the weight map as a combination of transformation functions that takes the spatial information into account.
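The data flow around the weight map generation unit can be sketched as follows. Both helpers are hypothetical and rest on assumptions the text does not confirm: that the input image and the transformed images are concatenated along the channel axis before entering the weight network, and that the raw per-pixel scores are normalized with a softmax so the weights of the N transformed images sum to 1 at every pixel. The network itself is not modeled; random scores stand in for its output.

```python
import numpy as np


def weightnet_input(image: np.ndarray, transformed: np.ndarray) -> np.ndarray:
    """Concatenate the input image (H x W x C) with N transformed images
    (N x H x W x C) along the channel axis (assumed input layout)."""
    parts = [image] + [t for t in transformed]
    return np.concatenate(parts, axis=-1)


def softmax_weight_maps(scores: np.ndarray) -> np.ndarray:
    """Turn N x H x W raw scores into weights that sum to 1 at each pixel
    (assumed normalization)."""
    shifted = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))
transformed = rng.random((4, 16, 16, 3))     # N = 4 transformed images
net_input = weightnet_input(image, transformed)
scores = rng.standard_normal((4, 16, 16))    # stand-in for the network output
weights = softmax_weight_maps(scores)
```

If the weights are normalized this way, each output pixel is a convex combination of the transformed pixels, so it stays within the range of values the transformation functions produce.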

For example, the combination of transformation functions may be used to express the complex color mapping of the enhanced image.

According to an embodiment of the present invention, the enhanced image acquisition unit 130 may obtain an enhanced image from the input image as a weighted sum of the weight map generated by the weight map generation unit 120 and the images obtained by transforming the pixel values of the input image with each of the multiple transformation functions.
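The element-wise weighted sum referred to as Equation 3 can be written in a few lines of NumPy. The array layout is an assumption: N transformed images of shape H x W x C, and one H x W weight map per transformed image that is shared across color channels. The function name `fuse_transformed_images` is hypothetical.

```python
import numpy as np


def fuse_transformed_images(transformed: np.ndarray,
                            weights: np.ndarray) -> np.ndarray:
    """Element-wise weighted sum over the N transformed images.

    transformed: N x H x W x C array of transformed images.
    weights:     N x H x W array of pixel-wise weights (assumed to be
                 shared across the C channels).
    Returns the H x W x C fused image.
    """
    if transformed.shape[:3] != weights.shape:
        raise ValueError("weights must have shape N x H x W")
    # Broadcasting the weights over the channel axis gives the
    # element-wise product; summing over n completes the weighted sum.
    return np.sum(weights[..., np.newaxis] * transformed, axis=0)


# Toy example: a dark image (0.2) and a bright image (0.8).
transformed = np.stack([np.full((2, 4, 3), 0.2), np.full((2, 4, 3), 0.8)])
weights = np.zeros((2, 2, 4))
weights[0, :, :2] = 1.0   # left half of the output follows the first image
weights[1, :, 2:] = 1.0   # right half follows the second image
enhanced = fuse_transformed_images(transformed, weights)
```

In this toy case the left half of `enhanced` takes the value 0.2 and the right half 0.8, which shows how the weight map lets different regions follow different transformation functions.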

According to an embodiment of the present invention, the image enhancement device 100 may receive a low-quality image to be enhanced as the input image, estimate multiple transformation functions for image enhancement using the histogram of the input image, and appropriately combine the transformation functions on a pixel-by-pixel basis.

For example, the image enhancement device 100 may apply pixel-wise weights to the transformation functions estimated by the histogram-based multiple transformation function estimation network (Histogram Multiple Transformation Function Network, HMTF-Net) and the weight generation network (WeightNet), and may thereby obtain an enhanced image whose colors are expressed with spatial information taken into account.

In addition, the image enhancement device 100 can enhance not only ordinary low-quality images but also a variety of other target images, such as extremely low-light images and underwater images, into high-quality images.

Accordingly, the present invention can estimate multiple transformation functions with various color characteristics by using the spatial information and statistical characteristics of an image, and can obtain an enhanced image by improving image quality with the estimated transformation functions.

FIG. 2 is a diagram illustrating an image improvement algorithm performed in an image improvement device according to an embodiment of the present invention.

FIG. 2 illustrates the configuration of the image improvement algorithm performed in the image improvement device according to an embodiment of the present invention.

Referring to FIG. 2, according to an embodiment of the present invention, the image improvement algorithm 200 may be an image quality improvement algorithm that uses a histogram-based multiple transformation function estimation network 220 and a weight generation network 240 to obtain an improved image 250 from an input image 210, which is a low-quality image to be improved.

For example, the image improvement algorithm 200 may provide a multiple transformation function estimation technique for image improvement.

The image improvement algorithm 200 receives an input image 210 from a user, extracts a histogram 211 from the input image 210 as statistical characteristic information, and inputs the input image 210 and the histogram 211 into the histogram-based multiple transformation function estimation network 220.
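The histogram extraction step can be illustrated with a minimal sketch. The description does not state the number of bins or how the image is stored, so the 256 bins per channel, the nested-list image layout, and the function name `channel_histograms` are all assumptions made for illustration.

```python
def channel_histograms(image, bins=256):
    """Count how often each intensity 0..bins-1 occurs in every color channel.

    `image` is a nested list indexed as image[row][col][channel] with integer
    intensities; the result is one histogram (a list of counts) per channel.
    The 256-bin default is an assumption, not a value from the description.
    """
    num_channels = len(image[0][0])
    hists = [[0] * bins for _ in range(num_channels)]
    for row in image:
        for pixel in row:
            for c, value in enumerate(pixel):
                hists[c][value] += 1
    return hists


# Tiny hypothetical 2x2 RGB image.
image = [
    [(0, 10, 255), (0, 10, 0)],
    [(5, 10, 255), (0, 20, 0)],
]
hists = channel_histograms(image)
print(hists[0][0], hists[1][10], hists[2][255])  # 3 3 2
```

Each per-channel histogram sums to the number of pixels, which is what makes it a purely statistical summary with no spatial information.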

The histogram-based multiple transformation function estimation network 220 estimates the multiple transformation functions by using the input image 210 and the histogram 211 together for effective transformation function estimation.

The histogram-based multiple transformation function estimation network 220 consists of an image feature-map extraction (IFE) block, a histogram feature-map fusion (HFF) block, and a transformation function generation (TFG) block.

The histogram-based multiple transformation function estimation network 220 may be driven by the transformation function estimation unit described with reference to FIG. 1.

The image feature-map extraction block, the histogram feature-map fusion block, and the transformation function generation block are described further with reference to FIGS. 3A to 3C.

The histogram-based multiple transformation function estimation network 220 performs the role of the transformation function estimation unit described with reference to FIG. 1.

The histogram-based multiple transformation function estimation network 220 estimates the multiple transformation functions 221 based on the input image 210 and the histogram 211, and the multiple transformation functions 221 may be estimated as shown in Equation 1 below.

[Equation 1]

$$T = \{t_1, t_2, \ldots, t_N\}$$

In Equation 1, T denotes the set of channel-wise transformation functions, t_n denotes the n-th channel-wise transformation function, and N denotes the number of transformation functions.

Each channel-wise transformation function can be expressed as a 768-dimensional vector that is composed of the transformation functions of the individual color channels.
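As a rough illustration of the 768-dimensional representation, the sketch below splits one flat vector into three 256-entry lookup tables. The R, G, B ordering, the back-to-back layout, the normalization of outputs to [0, 1], and the helper name `split_channelwise` are assumptions; the description only states that the vector has 768 dimensions and is built from the per-channel functions.

```python
def split_channelwise(t, levels=256, channels=("R", "G", "B")):
    """Split a flat transformation vector into one lookup table per channel.

    Assumes the vector stores the channels back to back in the given order.
    """
    if len(t) != levels * len(channels):
        raise ValueError("unexpected vector length")
    return {
        name: t[i * levels:(i + 1) * levels]
        for i, name in enumerate(channels)
    }


# Hypothetical 768-dimensional vector: an identity curve for every channel,
# with outputs normalized to [0, 1].
t = [k / 255 for k in range(256)] * 3
luts = split_channelwise(t)
print(len(t), [len(luts[c]) for c in "RGB"])  # 768 [256, 256, 256]
```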

The image improvement algorithm 200 inputs the multiple transformation functions 221 to the pixel value conversion unit 230.

The pixel value conversion unit 230 generates pixel-value-converted images 231 based on the multiple transformation functions 221.

The pixel value conversion unit 230 may be operated by the improved image acquisition unit described with reference to FIG. 1.

The multiple transformation functions 221 are estimated so that pixel value conversion can be performed with spatial relationships taken into account.

The image improvement algorithm 200 inputs the input image 210 into the weight generation network 240, and the weight generation network 240 estimates pixel-wise weights of the transformation functions based on the input image 210 and the pixel-value-converted images 231 to generate the weight maps 241.

Accordingly, there is a need to adaptively output the weights applied to the images converted by each transformation function, so as to generate a combination of transformation functions that takes the spatial information of the image into account.

The weight generation network 240 is constructed by modifying the numbers of convolutional output channels of U-Net to five sizes: 16, 32, 64, 128, and 256.

The weight generation network 240 takes both the input image 210 and the converted images 231 as input in order to capture the relationship between the converted images and the transformation functions that produced them.

The weight generation network 240 may generate the weight maps 241 based on Equation 2 below.

[Equation 2]

$$W = \phi(f_n)$$

In Equation 2, W denotes the weight map, φ(·) denotes the sigmoid activation function, and f_n denotes the last feature map output from the U-Net.
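The role of the sigmoid in Equation 2 can be seen in a small sketch: applied element-wise, it maps every value of the last feature map into the open interval (0, 1), so each output can act as a per-pixel weight. The 2×2 feature values and the function names below are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def weight_map(feature_map):
    """Apply the sigmoid element-wise to a 2D feature map (list of rows)."""
    return [[sigmoid(v) for v in row] for row in feature_map]


# Hypothetical 2x2 slice of the last U-Net feature map.
f = [[0.0, 2.0], [-2.0, 10.0]]
w = weight_map(f)
print(w[0][0])  # 0.5
```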

The weight generation network 240 may be driven by the weight map generation unit described with reference to FIG. 1.

The image improvement algorithm 200 obtains the improved image 250 by computing a weighted sum of the converted images 231 using the weight maps 241.

The image improvement algorithm 200 may obtain the improved image 250 based on Equation 3 below.

[Equation 3]

$$\tilde{I} = \sum_{n=1}^{N} W_n \odot I_n$$

In Equation 3, Ĩ denotes the improved image, n denotes the index, I_n denotes the n-th converted image, W_n denotes the pixel-wise weights of the n-th converted image, and ⊙ denotes the element-wise product.
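The weighted sum described for Equation 3 can be sketched as follows for a single channel. The list-based image layout, the example values, and the name `weighted_sum` are illustrative assumptions.

```python
def weighted_sum(converted, weights):
    """Combine N converted images with N weight maps, pixel by pixel.

    converted[n][i][j] and weights[n][i][j] are scalars here (one channel);
    each output pixel is the sum over n of weights[n][i][j] * converted[n][i][j].
    """
    height, width = len(converted[0]), len(converted[0][0])
    out = [[0.0] * width for _ in range(height)]
    for img, w in zip(converted, weights):
        for i in range(height):
            for j in range(width):
                out[i][j] += w[i][j] * img[i][j]
    return out


# Two hypothetical 1x2 converted images and their weight maps.
converted = [[[0.2, 0.8]], [[0.6, 0.4]]]
weights = [[[1.0, 0.5]], [[0.0, 0.5]]]
enhanced = weighted_sum(converted, weights)
```

In this example the first pixel is taken entirely from the first converted image, while the second pixel is an equal blend of both, which shows how the weights let the combination vary from pixel to pixel.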

The image improvement algorithm 200 performs concatenation where the symbol (ⓒ) appears, element-wise multiplication where the symbol (×) appears, and element-wise addition where the symbol (+) appears.

According to an embodiment of the present invention, the image improvement algorithm 200 may be trained in an end-to-end manner using a total loss function.

The total loss function may be defined by Equation 4 below.

[Equation 4]

$$L_{total} = \lambda_{img} L_{img} + \lambda_{col} L_{col} + \lambda_{ent} L_{ent} + \lambda_{tv} L_{tv}$$

In Equation 4, L_total denotes the total loss function, L_img denotes the image loss, L_col denotes the color loss, L_ent denotes the entropy loss, L_tv denotes the total variation loss, and each λ is a hyperparameter for balancing the four losses.

The image loss may be defined as the l₂-norm between the output image, that is, the improved image, and the ground-truth image, as shown in Equation 5 below.

[Equation 5]

$$L_{img} = \lVert \hat{I} - \tilde{I} \rVert_2$$

In Equation 5, L_img denotes the image loss, Î denotes the ground-truth image, and Ĩ denotes the improved image.

The color loss is introduced to preserve color information and may be defined as shown in Equation 6 below.

[Equation 6]

In Equation 6, L_col denotes the color loss, H denotes the height of the image, W denotes the width of the image, and the two color-vector terms denote the RGB color vectors at pixel position (i, j) of the ground-truth image and of the improved image, respectively.

When the image improvement algorithm 200 estimates diverse transformation functions, it can express colors in consideration of various characteristics of the input image.

To secure diversity among the transformation functions, the image improvement algorithm 200 defines an entropy loss, shown in Equation 7, that minimizes the entropy of the weight maps.

[Equation 7]

In Equation 7, L_ent denotes the entropy loss, the weight term denotes the pixel-wise weights of the n-th converted image, N denotes the number of weight maps, H denotes the height of the image, and W denotes the width of the image.

However, because the entropy loss considers only the diversity of the pixel distribution and not the spatial information of the image, minimizing it may increase noise in the image.
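The exact form of the entropy loss is not recoverable from the text here, so the following is only a sketch of one plausible reading: the N weights at each pixel are normalized to sum to 1, their Shannon entropy is computed, and the result is averaged over all pixels. The normalization step, the averaging, and the name `entropy_loss` are assumptions, not the patent's definition.

```python
import math


def entropy_loss(weights, eps=1e-12):
    """Mean per-pixel Shannon entropy of the N weights at each pixel.

    weights[n][i][j] is the weight of the n-th converted image at (i, j).
    The weights at each pixel are first normalized to sum to 1 (assumption).
    """
    height, width = len(weights[0]), len(weights[0][0])
    total = 0.0
    for i in range(height):
        for j in range(width):
            column = [w[i][j] for w in weights]
            s = sum(column) + eps
            probs = [v / s for v in column]
            total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / (height * width)


# One pixel dominated by a single function, one pixel shared equally.
peaked = [[[1.0]], [[0.0]], [[0.0]]]
uniform = [[[0.5]], [[0.5]], [[0.5]]]
low, high = entropy_loss(peaked), entropy_loss(uniform)
```

Under this reading, a pixel that relies on one transformation function has entropy near zero, while a pixel that blends all three equally reaches the maximum of log 3, so minimizing the loss pushes each pixel toward a decisive choice.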

Accordingly, to reduce noise amplification, the image improvement algorithm 200 may define a total variation loss function that makes the weight maps spatially smooth, as shown in Equation 8 below.

[Equation 8]

In Equation 8, L_tv denotes the total variation loss, the weight term denotes the pixel-wise weights of the n-th converted image, ∇_x denotes the horizontal gradient operation, and ∇_y denotes the vertical gradient operation.
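A minimal sketch of a total variation penalty on one weight map is given below. The text names horizontal and vertical gradient operations but the norm is not recoverable here, so the forward differences and the sum of absolute values (the anisotropic form) are assumptions, as is the name `total_variation`.

```python
def total_variation(weight_map):
    """Sum of absolute horizontal and vertical differences of a 2D map."""
    height, width = len(weight_map), len(weight_map[0])
    tv = 0.0
    for i in range(height):
        for j in range(width):
            if j + 1 < width:   # horizontal gradient (forward difference)
                tv += abs(weight_map[i][j + 1] - weight_map[i][j])
            if i + 1 < height:  # vertical gradient (forward difference)
                tv += abs(weight_map[i + 1][j] - weight_map[i][j])
    return tv


smooth = [[0.5, 0.5], [0.5, 0.5]]
noisy = [[0.0, 1.0], [1.0, 0.0]]
print(total_variation(smooth), total_variation(noisy))  # 0.0 4.0
```

A constant map costs nothing and a checkerboard costs the most, which is why adding this term favors spatially smooth weight maps.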

As shown in Equation 4, the hyperparameter (λ) of each loss function is used to adjust the balance among the losses.

If the parameter (λ_tv) of the total variation loss function had the same magnitude as those of the other loss functions, the spatial information of the weight maps would vanish; it therefore needs to be set, as an exception, to a fixed value of 10^-4.

Because the losses are balanced according to priority values, the image improvement algorithm 200 sets the priority values of the image loss function, the color loss function, and the entropy loss function to 0.5, 0.3, and 0.2, respectively.

According to an embodiment of the present invention, the image improvement device 100 further includes a loss function calculation unit (not shown), and the loss function calculation unit may calculate the loss functions according to Equations 4 to 8 above for use in obtaining the improved image.
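The balancing of the four losses can be sketched as a plain weighted sum. Using the stated priority values 0.5, 0.3, and 0.2 directly as the λ coefficients is an assumption of this sketch, since the description does not spell out how priorities map to λ. The loss values are hypothetical.

```python
# Weights taken from the description; using them directly as the lambda
# coefficients of the total loss is an assumption of this sketch.
LAMBDAS = {"img": 0.5, "col": 0.3, "ent": 0.2, "tv": 1e-4}


def total_loss(losses, lambdas=LAMBDAS):
    """Weighted sum of the four individual loss values."""
    return sum(lambdas[name] * losses[name] for name in lambdas)


# Hypothetical loss values for one training batch.
losses = {"img": 0.10, "col": 0.20, "ent": 1.00, "tv": 50.0}
value = total_loss(losses)
```

With these numbers the total variation term contributes only 0.005 despite its large raw value, which illustrates why its coefficient is kept several orders of magnitude below the others.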

FIGS. 3A to 3C are diagrams illustrating the structures of the blocks of the histogram-based multiple transformation function estimation network used in the image improvement algorithm according to an embodiment of the present invention.

FIG. 3A illustrates the structure of the image feature-map extraction block according to an embodiment of the present invention.

Referring to FIG. 3A, the image feature-map extraction block 300 according to an embodiment of the present invention extracts a feature map from the input image (I).

The image feature-map extraction block 300 expands the channels of the feature map with a convolution (Conv) block consisting of a convolution operation, batch normalization, and a ReLU activation function.

A plurality of self-fusion convolution (SFC) blocks are then used to reduce the resolution of the feature map and to increase the number of channels to 768.

Here, the convolution block and the first SFC block do not reduce the resolution, in order to minimize data loss.

Finally, an average pooling (Avg. Pool) block is used to generate a first feature map (f_img) with a size of 1×1×768.
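The final pooling step can be sketched as an average over all spatial positions, which is what a 1×1×768 output implies. The sketch uses 3 channels instead of 768 for readability; the list layout and the name `global_average_pool` are assumptions.

```python
def global_average_pool(feature_map):
    """Average an H x W x C feature map over its spatial positions.

    Returns a list of C values, i.e. a 1 x 1 x C feature vector.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    pooled = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                pooled[c] += value
    return [v / (height * width) for v in pooled]


# Hypothetical 2 x 2 feature map with 3 channels (768 in the description).
fmap = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 0.0, 2.0], [7.0, 4.0, 2.0]],
]
print(global_average_pool(fmap))  # [4.0, 1.0, 2.0]
```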

The image feature-map extraction block 300 may be driven by the transformation function estimation unit.

Accordingly, the transformation function estimation unit may extract a feature map from the input image and expand the channels of the extracted feature map using a convolution (Conv) block.

In addition, the transformation function estimation unit may reduce the resolution of the feature map and increase the number of channels using a plurality of SFC blocks, and may generate a first feature map having a specific size using an average pooling block.

FIG. 3B illustrates the structure of the histogram feature-map fusion block according to an embodiment of the present invention.

Referring to FIG. 3B, the histogram feature-map fusion block 310 according to an embodiment of the present invention fuses the feature map (f_img) extracted from the input image with the feature map (f_h) extracted from the histogram.

By using the input image and the histogram together, the histogram feature-map fusion block 310 according to an embodiment of the present invention generates a feature map (f_HHF) that takes both the spatial information and the statistical characteristics of the image into account.

The histogram feature-map fusion block 310 according to an embodiment of the present invention extracts the histogram feature map (f_h) through four self-fusion histogram feature extraction (SHFE) blocks.

Here, the SHFE block is a block that extends the SFC block to 1D convolution operations, and it extracts a 1D feature map from a 1D histogram vector effectively and efficiently.

Specifically, the histogram feature-map fusion block 310 splits the input feature map channel-wise and then performs channel expansion, fusion, and compression operations on each feature map using 3×1 convolution blocks that include BN and a ReLU activation function.

The histogram feature-map fusion block 310 then concatenates the feature maps and preserves the relationships between channels through a point-wise operation.

In addition, the histogram feature-map fusion block 310 concatenates the extracted feature map (f_h) with the previously extracted feature map (f_img) and generates a feature map (fs) by passing the result through two convolution blocks, a fully-connected (FC) layer, BN, and a sigmoid activation function.

The histogram feature-map fusion block 310 generates the feature map (f_HHF) by applying the generated feature map (fs) as an attention feature map for the feature map (f_img) according to a residual attention mechanism.

For example, the transformation function estimation unit may extract a second feature map (f_h) of the histogram using a plurality of SHFE blocks, generate a third feature map (fs) in which the statistical characteristic information is taken into account by combining the first feature map (f_img) and the second feature map (f_h), and generate a fourth feature map (f_HHF) by applying the third feature map (fs) as an attention feature map for the first feature map (f_img) according to the residual attention mechanism.
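The residual attention step can be sketched under one common formulation, in which the attention values re-weight the input and the input is added back, giving f_img × (1 + fs) per channel. The description does not give the exact formula, so this form, the 4-channel example, and the name `residual_attention` are assumptions.

```python
def residual_attention(f_img, f_s):
    """Re-weight f_img with attention values f_s and add the input back.

    Computes f_img[k] * (1 + f_s[k]) for every channel k (assumed form).
    """
    if len(f_img) != len(f_s):
        raise ValueError("feature vectors must have the same length")
    return [x * (1.0 + a) for x, a in zip(f_img, f_s)]


# Hypothetical 4-channel example (the description uses 768 channels).
f_img = [2.0, -1.0, 0.5, 4.0]
f_s = [0.0, 1.0, 0.5, 0.25]   # attention values in [0, 1]
f_hhf = residual_attention(f_img, f_s)
print(f_hhf)  # [2.0, -2.0, 0.75, 5.0]
```

The residual form means that an attention value of 0 leaves the image feature unchanged rather than suppressing it, so the histogram branch can only emphasize channels, never erase them.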

FIG. 3C illustrates the structure of the transformation function generation block according to an embodiment of the present invention.

Referring to FIG. 3C, the transformation function generation block 320 according to an embodiment of the present invention consists of a plurality of branches, and, for each index n up to the number of branches, the n-th transformation function (t_n) can be estimated.

The transformation function generation block 320 uses three branches, and each branch consists of three FC layers and a sigmoid activation function.

The transformation function (t_n) is used for pixel value conversion. When I_c(p) denotes the input pixel value of pixel p in channel c, the transformation function for an RGB image can be expressed as a 256×3 matrix.

Column c of the matrix is the transformation function of channel c. Accordingly, the converted pixel value of pixel p in channel c of the image (I_n) converted by the n-th transformation function (t_n) can be expressed as Equation 9 below.

[Equation 9]

$$I_{n,c}(p) = v_p^{\top} t_{n,c}$$

In Equation 9, I_n,c(p) denotes the converted pixel value of pixel p in channel c of the image (I_n) converted by the n-th transformation function (t_n), v_p is a 256-dimensional one-hot vector in which only the element corresponding to I_c(p) is 1 and all other elements are 0, and t_n,c is the n-th transformation function that converts the pixel values of channel c.
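The one-hot formulation of Equation 9 amounts to a table lookup, as the following sketch shows for a single channel. The inverting transformation function and the helper names are hypothetical examples.

```python
def one_hot(index, size=256):
    """Vector of length `size` with a single 1 at position `index`."""
    return [1 if k == index else 0 for k in range(size)]


def convert_pixel(value, lut):
    """Converted pixel value as the dot product of a one-hot vector and the LUT."""
    v_p = one_hot(value, len(lut))
    return sum(v * t for v, t in zip(v_p, lut))


# Hypothetical transformation function for one channel: invert the intensity.
lut = [255 - k for k in range(256)]
print(convert_pixel(40, lut), lut[40])  # 215 215
```

Because the one-hot vector selects exactly one entry, the dot product and direct indexing give the same result; the vector form is convenient mainly because it expresses the lookup as a matrix operation.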

FIGS. 4A to 4C are diagrams illustrating the image improvement performance of the image improvement device according to an embodiment of the present invention.

FIGS. 4A to 4C illustrate resulting improved images that demonstrate the performance of obtaining an improved image using the image improvement device according to an embodiment of the present invention.

Referring to FIG. 4A, picture 400 may represent a low-quality image, picture 401 may represent an improved image, and picture 402 may represent a ground-truth image.

Referring to FIG. 4B, picture 410 may represent a low-quality image, picture 411 may represent an improved image, and picture 412 may represent a ground-truth image.

Referring to FIG. 4C, picture 420 may represent a low-quality image, picture 421 may represent an improved image, and picture 422 may represent a ground-truth image.

Comparing pictures 400, 410, and 420 with pictures 402, 412, and 422, respectively, confirms that pictures 400, 410, and 420 are of low quality in terms of image quality, such as illuminance.

Comparing pictures 401, 411, and 421 with pictures 402, 412, and 422, respectively, confirms that the quality of pictures 401, 411, and 421 has been improved to be close to that of the ground-truth images in terms of image quality, such as illuminance. Here, a picture may be, for example, one of a plurality of pictures that make up an image.

Therefore, it can be confirmed that the present invention, by applying pixel-wise weights to the transformation functions estimated using the histogram-based multiple transformation function estimation network and the weight generation network, expresses the colors of the image in consideration of spatial information and can thereby improve not only low-quality images but also extreme low-light images and underwater images into high-quality images.

FIG. 5 is a diagram illustrating an image improvement method according to an embodiment of the present invention.

FIG. 5 illustrates a procedure in which the image improvement method according to an embodiment of the present invention estimates multiple transformation functions and generates a weight map of the pixel-wise weights of the multiple transformation functions to obtain an improved image.

Referring to FIG. 5, in step 501, the image improvement method according to an embodiment of the present invention estimates multiple transformation functions using the spatial information and the statistical characteristic information of the input image.

That is, the image improvement method may receive an input image from a user and estimate the multiple transformation functions using spatial information based on the input image and statistical characteristic information based on the histogram of the input image.

In step 502, the image improvement method according to an embodiment of the present invention generates pixel-wise weights of the multiple transformation functions and generates a weight map based on the generated weights.

That is, the image improvement method may generate the pixel-wise weights of the multiple transformation functions and generate the weight map based on the generated pixel-wise weights.

In step 503, the image improvement method according to an embodiment of the present invention may obtain an improved image based on the weight map and the images converted with the multiple transformation functions.

That is, the image improvement method may obtain the improved image from the input image as a weighted sum of the images obtained by converting the pixel values of the input image with each of the multiple transformation functions, using weights based on the generated weight map.

In addition, the image improvement method may obtain the improved image based on Equation 3 described above.

Therefore, the present invention can estimate multiple transformation functions for image improvement using a low-quality image to be improved and the histogram of that image, and can obtain an improved image by appropriately combining the estimated multiple transformation functions on a pixel-by-pixel basis.

The device described above may be implemented with hardware components, software components, and/or a combination of hardware components and software components. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable array (FPA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described as being used, but a person of ordinary skill in the art will recognize that a processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, a processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may command the processing device independently or collectively. Software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by a processing device or to provide instructions or data to a processing device. Software may be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art can make various modifications and variations based on the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or components of the described system, structure, device, circuit, and the like are combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims set out below.

100: Image improvement device 110: Transformation function estimation unit
120: Weight map generation unit 130: Improved image acquisition unit

Claims

An image enhancement device comprising:
a transformation function estimation unit that estimates multiple transformation functions using spatial information based on an input image and statistical characteristic information based on a histogram of the input image;
a weight map generation unit that generates pixel-wise weights of the estimated multiple transformation functions and generates a weight map based on the generated pixel-wise weights; and
an improved image acquisition unit that obtains an improved image from the input image using a weighted sum of the generated weight map and images obtained by converting the pixel values of the input image with each of the multiple transformation functions.

2. The image enhancement device of claim 1, wherein the transformation function estimation unit extracts a feature map from the input image, expands the channels of the extracted feature map using a convolution (Conv) block, reduces the resolution of the feature map and increases its number of channels using a plurality of self-fusion convolution (SFC) blocks, and generates a first feature map of a specific size using an average pooling block.
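
The last step of the claim above, producing a feature map of a specific size by average pooling, can be sketched as follows. This is an illustrative assumption about how a fixed output size might be reached from an arbitrary input resolution, using floor and ceiling bin edges. The function name `adaptive_avg_pool` is hypothetical, a single channel is shown, and the Conv and SFC blocks are not modeled.

```python
def adaptive_avg_pool(feature, out_h, out_w):
    """Average-pool a 2-D feature map (list of rows) down to out_h x out_w."""
    in_h, in_w = len(feature), len(feature[0])
    pooled = []
    for i in range(out_h):
        # Rows covered by output row i: floor start, ceiling end.
        r0 = (i * in_h) // out_h
        r1 = -((-(i + 1) * in_h) // out_h)
        row = []
        for j in range(out_w):
            c0 = (j * in_w) // out_w
            c1 = -((-(j + 1) * in_w) // out_w)
            window = [feature[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Toy 4x4 single-channel feature map with values 0..15.
feature = [[4 * r + c for c in range(4)] for r in range(4)]
pooled_2x2 = adaptive_avg_pool(feature, 2, 2)
pooled_1x1 = adaptive_avg_pool(feature, 1, 1)
```

Because the bin edges depend only on the ratio of input to output size, the same call yields the same output shape for any input resolution.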

3. The image enhancement device of claim 2, wherein the transformation function estimation unit extracts a second feature map from the histogram using a plurality of self-fusion histogram feature extraction (SHFE) blocks, combines the first feature map and the second feature map to generate a third feature map in which the statistical characteristic information is taken into account, and generates a fourth feature map by using the third feature map as an attention feature map for the first feature map according to a residual attention mechanism.
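
The residual attention step in the claim above can be illustrated with one common formulation, in which the attention map is squashed by a sigmoid and the attended features are added back to the original ones. The claim does not spell out the exact formula, so the expression `first * (1 + sigmoid(third))` and the names `sigmoid` and `residual_attention` are assumptions made for this sketch. Feature maps are flattened to plain lists.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def residual_attention(first, third):
    """Assumed form: fourth = first + first * sigmoid(third), element by element."""
    return [f * (1.0 + sigmoid(t)) for f, t in zip(first, third)]


first_map = [1.0, 2.0, -3.0]      # toy first feature map (spatial branch)
third_map = [0.0, 10.0, -10.0]    # toy third feature map (histogram-aware)
fourth_map = residual_attention(first_map, third_map)
```

The residual term means that a feature is never suppressed below its original value: where the attention is close to zero the first feature map passes through almost unchanged, and where it is close to one the feature is roughly doubled.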

4. The image enhancement device of claim 3, wherein the transformation function estimation unit forwards the generated fourth feature map to each of a plurality of branches and estimates the multiple transformation functions by estimating, in each of the plurality of branches, a transformation function for a specific pixel in a specific channel of the input image.

5. The image enhancement device of claim 1, wherein the weight map generation unit uses the input image and the images whose pixel values have been converted by each of the multiple transformation functions to output the weights applied to the converted images, and thereby generates the weight map using a combination of the transformation functions that takes the spatial information into account.
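
One plausible way to turn per-pixel scores into pixel-wise weights is a softmax across the transformation functions, so that the weights at every pixel are positive and sum to one. The claim above does not state that a softmax is used, so the normalization and the name `softmax_weights` are assumptions of this sketch. The `scores` stand in for the raw outputs of a weight map generation network, which is not modeled here.

```python
import math


def softmax_weights(scores):
    """scores[n][y][x] -> weights[n][y][x], normalized over n at every pixel."""
    n_funcs, height, width = len(scores), len(scores[0]), len(scores[0][0])
    weights = [[[0.0] * width for _ in range(height)] for _ in range(n_funcs)]
    for y in range(height):
        for x in range(width):
            column = [scores[n][y][x] for n in range(n_funcs)]
            peak = max(column)  # subtract the maximum for numerical stability
            exps = [math.exp(s - peak) for s in column]
            total = sum(exps)
            for n in range(n_funcs):
                weights[n][y][x] = exps[n] / total
    return weights


# Hypothetical scores for two transformation functions on a 1x2 image.
scores = [[[0.0, 2.0]], [[0.0, -2.0]]]
weights = softmax_weights(scores)
```

At the first pixel both functions receive equal weight, while at the second pixel the first function dominates, which is the kind of spatially varying combination the claim describes.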

6. The image enhancement device of claim 1, wherein the improved image acquisition unit obtains the weighted sum as an element-wise weighted sum using Equation 3 below and obtains the improved image from the input image according to the obtained weighted sum,
[Equation 3]

where the symbols in Equation 3 denote, respectively, the improved image, the n-th converted image, and the pixel-wise weight of the n-th converted image, n denotes the index, and ⊙ denotes the element-wise product.
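
Reading only the symbol definitions given in the claim above, an element-wise weighted sum of this kind can be sketched as follows. The symbol names $\tilde{I}$ (improved image), $I_n$ (n-th converted image), $W_n$ (pixel-wise weight of the n-th converted image), and $N$ (number of transformation functions) are assumptions chosen for illustration and may differ from the notation of Equation 3.

```latex
\tilde{I} = \sum_{n=1}^{N} W_n \odot I_n
```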

7. An image enhancement method comprising:
estimating, in a transformation function estimation unit, multiple transformation functions using spatial information based on an input image and statistical characteristic information based on a histogram of the input image;
generating, in a weight map generation unit, pixel-wise weights for the estimated multiple transformation functions and generating a weight map based on the generated pixel-wise weights; and
obtaining, in an improved image acquisition unit, an improved image from the input image using a weighted sum of the images, obtained by converting the pixel values of the input image with each of the multiple transformation functions, and the generated weight map.

8. The image enhancement method of claim 7, wherein the estimating of the multiple transformation functions using the spatial information based on the input image and the statistical characteristic information based on the histogram of the input image comprises:
extracting a feature map from the input image, expanding the channels of the extracted feature map using a convolution (Conv) block, reducing the resolution of the feature map and increasing its number of channels using a plurality of self-fusion convolution (SFC) blocks, and generating a first feature map of a specific size using an average pooling block;
extracting a second feature map from the histogram using a plurality of self-fusion histogram feature extraction (SHFE) blocks, combining the first feature map and the second feature map to generate a third feature map in which the statistical characteristic information is taken into account, and generating a fourth feature map by using the third feature map as an attention feature map for the first feature map according to a residual attention mechanism; and
forwarding the generated fourth feature map to each of a plurality of branches and estimating the multiple transformation functions by estimating, in each of the plurality of branches, a transformation function for a specific pixel in a specific channel of the input image.

9. The image enhancement method of claim 7, wherein the generating of the pixel-wise weights for the estimated multiple transformation functions and the generating of the weight map based on the generated pixel-wise weights comprises:
using the input image and the images whose pixel values have been converted by each of the multiple transformation functions to output the weights applied to the converted images, thereby generating the weight map using a combination of the transformation functions that takes the spatial information into account.

10. The image enhancement method of claim 7, wherein the obtaining of the improved image from the input image using the weighted sum of the images, obtained by converting the pixel values of the input image with each of the multiple transformation functions, and the generated weight map comprises:
obtaining the weighted sum as an element-wise weighted sum using Equation 3 below, and obtaining the improved image from the input image according to the obtained weighted sum,
[Equation 3]

where the symbols in Equation 3 denote, respectively, the improved image, the n-th converted image, and the pixel-wise weight of the n-th converted image, n denotes the index, and ⊙ denotes the element-wise product.
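
The element-wise weighted sum recited in the claim above can be sketched in a few lines. The sketch assumes single-channel images stored as nested lists and weight maps that already sum to one at every pixel. The function name `weighted_sum` and the toy values are hypothetical and are not taken from the patent.

```python
def weighted_sum(converted_images, weight_maps):
    """Element-wise weighted sum: out[y][x] = sum over n of W_n[y][x] * I_n[y][x]."""
    height, width = len(converted_images[0]), len(converted_images[0][0])
    out = [[0.0] * width for _ in range(height)]
    for image, weights in zip(converted_images, weight_maps):
        for y in range(height):
            for x in range(width):
                out[y][x] += weights[y][x] * image[y][x]
    return out


converted_images = [
    [[100.0, 200.0]],   # image converted by hypothetical function 1
    [[50.0, 100.0]],    # image converted by hypothetical function 2
]
weight_maps = [
    [[0.25, 1.0]],      # pixel-wise weights for function 1
    [[0.75, 0.0]],      # pixel-wise weights for function 2
]
improved = weighted_sum(converted_images, weight_maps)
```

Because the weights vary per pixel, each location of the improved image can draw on a different mixture of the converted images: here the first pixel blends both, and the second pixel takes its value entirely from the first converted image.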