RU2538319C1

RU2538319C1 - Device of searching image duplicates

Info

Publication number: RU2538319C1
Application number: RU2013127087/08A
Authority: RU
Inventors: Владимир Иванович Марчук; Вячеслав Владимирович Воронин; Марина Михайловна Письменскова; Татьяна Владимировна Морозова
Priority date: 2013-06-13
Filing date: 2013-06-13
Publication date: 2015-01-10
Also published as: RU2013127087A

Abstract

FIELD: information technology.

SUBSTANCE: device comprises the pretreatment units of first and second images, the recording units of first and second images, the conversion units of first and second images into a color space YIQ, the enhancing units of the real component of first and second images, the image forming units as a result of rotation of the first and second image, the units of image forming in changing the angle of inclination of the first and second images, the units of storage of simulated images for the first and second images, the unit of application of the method SIFT, the calculation unit of quantity of equal descriptors, the unit of storage of the found pair of duplicates.

EFFECT: ensuring the ability to compare the descriptors applied to the task of searching image duplicates.

5 dwg

Description

Изобретение относится к способам обработки цифровых изображений и может быть использовано в системах компьютерного зрения для идентификации и регистрации объектов на изображении, мультимедийных приложениях, работающих с визуальными данными.The invention relates to methods for processing digital images and can be used in computer vision systems for identification and registration of objects in the image, multimedia applications that work with visual data.

Упрощенная математическая модель изображения представляет собой модель в цветовом пространстве RGB в виде массива S_i,j,k, где $\bar{i = 1, I}, j = \bar{1, J}$

- столбцы и строки изображения,

k = \bar{1, 3}

- цветовой канал. При использовании изображения в градациях серого, каждая точка представлена уровнем яркости в диапазоне от 0 (черный) до 255 (белый), с промежуточными значениями, представляющими различные уровни серого. Соответственно такое изображение обозначим как S_i,j.A simplified mathematical image model is a model in the RGB color space in the form of an array S _{i, j, k} , where

\bar{i = one, I}, j = \bar{one, J}

- columns and rows of the image,

k = \bar{one, 3}

- color channel. When using grayscale images, each dot is represented by a brightness level ranging from 0 (black) to 255 (white), with intermediate values representing different gray levels. Accordingly, we denote such an image as S _{i, j} .

Основная решаемая задача - сопоставление дескрипторов применительно к задаче поиска дубликатов изображений.The main task to be solved is matching the descriptors as applied to the problem of finding duplicate images.

Поиск точечных соответствий между изображениями одного и того же объекта имеет большое значение для идентификации сцены, реконструкции объектов трехмерного пространства, поиска изображений, распознавания объектов.The search for point correspondences between images of the same object is of great importance for scene identification, reconstruction of three-dimensional space objects, image search, and object recognition.

Методы поиска соответствий на изображениях основаны на построении дескрипторов. Дескриптор - вектор признаков, который рассчитывается для каждой точечной особенности изображения и описывающий структуру ее окрестности. Как правило, эти вектора формируются на основе набора значений первых и вторых производных изображения в точке.Image matching methods are based on the construction of descriptors. A descriptor is a feature vector that is calculated for each point feature of the image and describes the structure of its surroundings. As a rule, these vectors are formed on the basis of a set of values of the first and second derivatives of the image at a point.

Дескрипторы могут быть представлены в виде небольших инвариантных участков, точек изображения, особых областей. При выборе способа представления областей возникает следующая альтернатива: область можно представить ее внешними характеристиками (т.е. границей) или внутренними характеристиками (совокупность элементов составляющих эту область). Внешнее представление обычно выбирается в тех случаях, когда основное внимание обращено на характеристики формы области. Внутреннее представление выбирается, если интерес представляют свойства самой области, например цвет или текстура. Иногда используются оба способа представления одновременно. Данные, представленные в форме множества пикселей, расположенных вдоль границы или внутри области, используются для получения дескрипторов, которые принимают значения в пространстве признаков. Если на таком пространстве задать меру, то можно сравнивать изображения друг с другом, вычисляя расстояние между соответствующими векторами признаков.Descriptors can be represented in the form of small invariant sections, image points, special areas. When choosing a method for representing regions, the following alternative arises: a region can be represented by its external characteristics (i.e., its boundary) or internal characteristics (a set of elements that make up this region). The external representation is usually chosen in those cases when the main attention is paid to the characteristics of the shape of the region. An internal representation is selected if the properties of the area itself, such as color or texture, are of interest. Sometimes both methods of presentation are used simultaneously. Data presented in the form of a plurality of pixels located along a boundary or within an area is used to obtain descriptors that take values in a feature space. If you set a measure on such a space, then you can compare the images with each other, calculating the distance between the corresponding feature vectors.

Для того чтобы удовлетворять задаче сопоставления изображений и их точечных особенностей, дескрипторы должны быть специфичны и выбраны таким образом, чтобы с высокой вероятностью дескрипторы разных изображений (но, принадлежащие одинаковым точечным особенностям) были правильно согласованы.In order to satisfy the task of comparing images and their point features, the descriptors must be specific and selected so that with high probability the descriptors of different images (but belonging to the same point features) are correctly matched.

Дескрипторы должны быть:Descriptors must be:

- специфичны (отличать разные точки);- specific (distinguish different points);

- локальны (зависеть только от небольшой окрестности);- local (depend only on a small neighborhood);

- инвариантны (поворот, растяжение, сжатие, монотонное изменение яркости, аффинные и проективные преобразования);- are invariant (rotation, tension, compression, monotonic change in brightness, affine and projective transformations);

- просты в вычислении.- easy to calculate.

Общая проблема методов сопоставления дескрипторов заключается в распознавании объекта на фотографиях, сделанных с разного ракурса, под разными углами обзора, в разном масштабе, при разном освещении. Это приводит к тому, что один и тот же объект в зависимости от условий съемки будет характеризоваться различными векторами-признаками. Следовательно, методы распознания должно быть инвариантны относительно таких изменений.A common problem with descriptor matching methods is to recognize an object in photographs taken from different angles, at different viewing angles, at different scales, under different lighting conditions. This leads to the fact that the same object, depending on the shooting conditions, will be characterized by various feature vectors. Therefore, recognition methods must be invariant with respect to such changes.

Анализ существующих литературных источников позволяет выделить наиболее популярные методы. Главной проблемой всех методов сопоставления дескрипторов на изображениях является нахождение аффинных инвариантных локальных особенностей. Практически ни один из рассмотренных методов не удовлетворяет данному требованию. Методы Hessian-Affine и Harris-Affine - не устойчивы к изменению угла обзора и масштабу, поэтому не реализуют в должной мере аффинную инвариантность. Метод MSER - не полностью инвариантен масштабу: не справляется с резкими изменениями уровня геометрической размытости. Постоянно расширяется число различных вариантов метода SIFT, таких как PCA-SIFT, GLOH, SURF. Они широко применяются в области распознания, детектирования, регистрации движения, но не решают проблемы аффинной инвариантности.Analysis of existing literary sources makes it possible to single out the most popular methods. The main problem of all methods of matching descriptors in images is finding affine invariant local features. Almost none of the methods considered satisfies this requirement. The Hessian-Affine and Harris-Affine methods are not resistant to changes in viewing angle and scale; therefore, they do not adequately implement affine invariance. The MSER method is not completely scale invariant: it cannot cope with sharp changes in the level of geometric blur. The number of different variants of the SIFT method, such as PCA-SIFT, GLOH, SURF, is constantly expanding. They are widely used in the field of recognition, detection, motion registration, but do not solve the problem of affine invariance.

Известен способ распознавания объектов на изображении [Патент RU 2438174 C1, МПК G06K 9/68]. Изобретение относится к способам распознавания объектов в системах машинного зрения, телевизионных системах наблюдения, информационно-управляющих системах робототехнических комплексов.A known method of recognizing objects in an image [Patent RU 2438174 C1, IPC G06K 9/68]. The invention relates to methods for recognizing objects in machine vision systems, television surveillance systems, information and control systems of robotic complexes.

Основная техническая задача состоит в создании способа, позволяющего повысить точность распознавания за счет повышения стабильности работы детекторов ключевых областей на изображении и увеличения количества инвариантных характеристик данных детекторов. На первом этапе входное изображение сворачивается с заданной функцией Грина. Далее полученные свертки вычитают друг из друга для получения конечно-разностной аппроксимации первой производной свертки входного изображения с фильтром, при поиске локального экстремума данной свертки приравнивают к нулю соответствующие первые производные. На следующем этапе находят все локальные экстремумы и проводят адаптивную пороговую фильтрацию для отсечения незначительных особенностей, при этом выделенные точки служат центрами окрестностей, для которых строят произвольные дескрипторы.The main technical task is to create a method that allows to increase the recognition accuracy by increasing the stability of the detectors of key areas in the image and increasing the number of invariant characteristics of these detectors. At the first stage, the input image is minimized with the given Green function. Next, the resulting convolutions are subtracted from each other to obtain a finite-difference approximation of the first derivative of the convolution of the input image with the filter; when searching for the local extremum of this convolution, the corresponding first derivatives are equal to zero. At the next stage, all local extrema are found and adaptive threshold filtering is performed to cut off minor features, while the selected points serve as the centers of the neighborhoods for which arbitrary descriptors are constructed.

Признаки способа-аналога, совпадающие с признаками заявляемого технического решения, следующие:The signs of the method-analogue, coinciding with the signs of the proposed technical solution, are as follows:

- построение дескрипторов изображения.- construction of image descriptors.

Недостатками известного способа и устройства его реализующего являются:The disadvantages of the known method and device that implements it are:

- низкая точность определения локальных дескрипторов. Причины, препятствующие достижению требуемого технического результата, заключаются в следующем:- low accuracy of determining local descriptors. The reasons that impede the achievement of the required technical result are as follows:

- использование способа свертки с функцией Грина не позволяет детектировать различные типы текстур.- the use of the convolution method with the Green function does not allow to detect various types of textures.

Известен способ и устройство для обнаружения объекта на изображении [Патент US №6711293, МПК G06K 9/68]. Изобретение относится к способам распознавания объектов в системах машинного зрения.A known method and device for detecting an object in the image [US Patent No. 6711293, IPC G06K 9/68]. The invention relates to methods for recognizing objects in machine vision systems.

Основная техническая задача состоит в распознавании объектов на изображении с использованием инвариантной функции масштаба. На первом этапе вычисляются разности изображений: выполняется свертка изображения с функцией Гаусса, далее еще раз вычисляется свертка свернутого изображения с функцией Гаусса для построения разностного изображения и из входного изображения вычитают разностное изображение. На втором этапе находят локальные экстремумы значений пикселей. На третьем этапе выделяют области вокруг точек экстремума.The main technical task is to recognize objects in the image using an invariant scale function. At the first stage, image differences are calculated: the convolution of the image with the Gaussian function is performed, then the convolution of the minimized image with the Gaussian function is again calculated to construct the difference image, and the difference image is subtracted from the input image. At the second stage, local extremes of pixel values are found. At the third stage, areas around the points of extremum are distinguished.

На четвертом этапе области разбиваются на подобласти и на пятом этапе производят множество компонент - дескрипторов подобластей.At the fourth stage, regions are divided into subdomains and at the fifth stage, many components are produced - descriptors of subdomains.

- нахождение и сопоставление дескрипторов;- finding and matching descriptors;

- инвариантность дескрипторов к изменению масштаба.- invariance of descriptors to zooming.

- невысокая точность построения дескрипторов.- low accuracy of constructing descriptors.

Причины, препятствующие достижению требуемого технического результата, заключаются в следующем:The reasons that impede the achievement of the required technical result are as follows:

- функция Гаусса обладает ограниченным набором инвариантных характеристик, характеризующих особенности изображения, описываемые векторами-признаками, и, тем самым, данные вектора являются менее информативными.- The Gaussian function has a limited set of invariant characteristics characterizing the image features described by feature vectors, and, therefore, these vectors are less informative.

Известен способ компьютерного распознавания объектов [Патент RU №2234127 C2, МПК G06K 9/68]. Изобретение относится к области автоматики и вычислительной техники, а именно к системам искусственного интеллекта.A known method of computer recognition of objects [Patent RU No. 2234127 C2, IPC G06K 9/68]. The invention relates to the field of automation and computer technology, namely to artificial intelligence systems.

Основная техническая задача заключается в увеличении скорости распознавания объектов, вводимых в компьютер. На первом этапе происходит предварительное приведение изображения объекта, вводимого в компьютер, к нормальному, стандартному для данного способа виду - изменению масштаба, поворот в требуемое положение, центрирование, вписание в прямоугольник требуемого размера, преобразование изображения объекта в изображение, выполненное в градациях - различных степенях яркости - одного цвета. Затем на него последовательно, поочередно накладываются изображения хранящихся в памяти компьютера шаблонов. Программа распознавания объектов пошагово совмещает нормализованные изображения распознаваемых объектов, центрированных и вписанных в одинаковых размеров ячейки таблицы и шаблонов, центрированных и вписанных в аналогичные ячейки таблицы шаблонов, с шагом, равным высоте строки с ячейками или ширине столбца ячеек таблиц. При этом в каждом из столбцов или в каждой из строк таблицы шаблонов, число которых равно числу столбцов или строк в таблице распознаваемых объектов, находится полный комплект шаблоновThe main technical problem is to increase the recognition rate of objects entered into the computer. At the first stage, the image of the object introduced into the computer is preliminarily brought to a normal, standard form for this method - zooming, rotation to the desired position, centering, inscribing the rectangle of the required size, converting the image of the object into an image made in gradations - of various degrees brightness - one color. Then, images sequentially stored in the computer memory are sequentially and alternately superimposed on it. The object recognition program step-by-step combines normalized images of recognizable objects centered and inscribed in the same cell size of the table and patterns centered and inscribed in the same cells of the template table with a step equal to the height of the row with cells or the column width of the table cells. At the same time, in each of the columns or in each row of the pattern table, the number of which is equal to the number of columns or rows in the table of recognized objects, there is a complete set of patterns

- сопоставление дескрипторов двух изображений.- matching descriptors of two images.

- отсутствие инвариантности к аффинным преобразованиям;- lack of invariance to affine transformations;

- необходимость обеспечения центрирования и нормализации изображений;- the need to ensure centering and normalization of images;

- необходимость полного перебора шаблонов при распознавании.- the need for a complete enumeration of patterns in recognition.

- при пошаговом совмещении изображений даже небольшие различия в структуре изображений приводят к резкому увеличению ошибки распознавания.- when combining images step by step, even small differences in the structure of images lead to a sharp increase in recognition error.

Известен способ и устройство для распознавания изображений объектов (Патент RU №2361273 C2, МПК G06K 9/62). Изобретение относится к вычислительной технике и может быть использовано в системах технического зрения для идентификации объектов на изображении.A known method and device for recognizing images of objects (Patent RU No. 2361273 C2, IPC G06K 9/62). The invention relates to computer technology and can be used in vision systems to identify objects in the image.

Технической задачей является повышение точности и качества распознавания за счет использования трехмерной векторной модели эталона объекта. Технический результат достигается следующим образом: эталонное изображение хранят в виде векторной трехмерной модели; для каждой такой модели фиксируют набор параметров для аффинных преобразований: углы поворота по осям х, у, z и масштаб. Этот набор параметров определяют для каждой модели с учетом сложности ее формы: чем сложнее форма, тем большее количество ракурсов необходимо для наиболее полного описания возможных вариантов положения объекта в пространстве с тем, чтобы наиболее точно произвести идентификацию.The technical task is to increase the accuracy and quality of recognition through the use of a three-dimensional vector model of the standard of the object. The technical result is achieved as follows: the reference image is stored in the form of a vector three-dimensional model; for each such model, a set of parameters for affine transformations is fixed: rotation angles along the x, y, z axes and scale. This set of parameters is determined for each model, taking into account the complexity of its shape: the more complex the shape, the greater the number of angles necessary for the most complete description of possible options for the position of the object in space in order to most accurately identify.

Производят следующую последовательность действий: получают векторную трехмерную модель эталонного объекта путем геометрического построения, затем, изменяя ее положение в пространстве (поворот, отражение, масштабирование), получают ряд вышеуказанных параметров, которые сохраняют и используют в дальнейшем при распознавании для воссоздания соответствующего ракурса эталона объекта. При распознавании трехмерный образ поворачивается, каждый раз генерируется ряд плоских изображений, пока не удастся найти совпадение.The following sequence of actions is performed: a vector three-dimensional model of the reference object is obtained by geometric construction, then, changing its position in space (rotation, reflection, scaling), a number of the above parameters are obtained, which are saved and used in the future for recognition to recreate the corresponding angle of the object’s standard. Upon recognition, the three-dimensional image is rotated, each time a series of flat images are generated until a match can be found.

- моделирование аффинных преобразований объекта.- modeling of affine transformations of an object.

- наличие этапа предварительной обработки;- the presence of a preliminary processing stage;

- определение ряда параметров объекта, класс, к которому относится данный объект, габаритные размеры. Этот набор параметров определяют для каждой модели с учетом сложности ее формы.- determination of a number of parameters of the object, the class to which the given object belongs, overall dimensions. This set of parameters is determined for each model, taking into account the complexity of its shape.

- большие вычислительные затраты.- high computing costs.

Наиболее близким к заявленному решению, выбранному нами за прототип, является способ и устройство аффинно-инвариантного распознавания шаблона [Патент №US 2011/0069889 A1, МПК G06К 9/46]. Изобретение относится к распознаванию объектов цифровых изображений.Closest to the claimed solution, chosen by us for the prototype, is a method and device for affine-invariant pattern recognition [Patent No. US 2011/0069889 A1, IPC G06K 9/46]. The invention relates to the recognition of digital image objects.

Основным техническим требованием является распознавание объекта, инвариантного повороту, перемещению и масштабу. Данное требование достигается путем добавления аффинного инвариантного расширения к методу SIFT (Scale-invariant feature transform). ASIFT (Affine-scale-invariant feature transform) позволяет надежно идентифицировать функции, которые прошли сильные аффинные искажения. Вместо построения аффинно-инвариантных дескрипторов моделирует, с достаточной точностью, все искажения вызванные изменением положения оптических камер. Масштаб и изменение положения камеры (включает два параметра положения оси камеры) - моделируемые параметры; вращение и перемещение (также включает в себя 2 параметра изменения положения) - нормализуемые параметры. ASIFT сначала достигает аффинной инвариантности дескрипторов изображения, затем использует SIFT который стимулирует масштаб и нормализует вращение и перемещение.The main technical requirement is the recognition of an object that is invariant to rotation, displacement, and scale. This requirement is achieved by adding an affine invariant extension to the SIFT (Scale-invariant feature transform) method. ASIFT (Affine-scale-invariant feature transform) allows you to reliably identify functions that have passed strong affine distortion. Instead of constructing affine-invariant descriptors, it models, with sufficient accuracy, all distortions caused by a change in the position of the optical cameras. Scale and change of camera position (includes two parameters of the camera axis position) - simulated parameters; rotation and movement (also includes 2 parameters of change of position) - normalized parameters. ASIFT first achieves the affinity invariance of image descriptors, then uses SIFT which stimulates scale and normalizes rotation and movement.

Рассматриваемое устройство-прототип предполагает выполнение следующих операций:Consider a prototype device involves the following operations:

1. Каждое изображение преобразуется, путем симулирования всех возможных аффинных искажений, вызванных изменением положения камеры с фронтальной позиции. Эти искажения зависят от двух параметров: долготы φ и широты θ. В результате на первом шаге данного алгоритма получатся определенное множество изображений, отличающихся всевозможными аффинными преобразованиями;1. Each image is converted by simulating all possible affine distortions caused by a change in camera position from the front position. These distortions depend on two parameters: longitude φ and latitude θ. As a result, at the first step of this algorithm, we obtain a certain set of images that differ in all kinds of affine transformations;

2. Полученные после аффинных преобразований изображения обрабатываются алгоритмом SIFT;2. Images obtained after affine transformations are processed by the SIFT algorithm;

3. SIFT включает в себя следующие операции:3. SIFT includes the following operations:

- Нахождение особых точек, путем построения пирамиды гауссианов (Gaussian) и разностей гауссианов (Difference of Gaussian, DoG).- Finding singular points by constructing a pyramid of Gaussian (Gaussian) and differences of Gaussian (Difference of Gaussian, DoG).

- Уточнение особых точек с помощью аппроксимации функции DoG (разность гауссианов) многочленом Тейлора второго порядка, взятого в точке вычисленного экстремума.- Refinement of singular points by approximating the DoG function (Gaussian difference) by a second-order Taylor polynomial taken at the point of the calculated extremum.

- Нахождение ориентации ключевой точки, которая вычисляется исходя из направлений градиентов смежных точек.- Finding the orientation of the key point, which is calculated based on the directions of the gradients of adjacent points.

- Построение дескрипторов. В методе SIFT дескриптором является вектор. Как и направление ключевой точки, дескриптор вычисляется на гауссиане, ближайшем по масштабу к ключевой точке, и исходя из градиентов в некотором окне ключевой точки.- Building descriptors. In the SIFT method, the descriptor is a vector. Like the direction of the cue point, the handle is calculated on the Gaussian closest in scale to the cue point, and based on the gradients in a certain cue point window.

Недостатками известного устройства-прототипа являются:The disadvantages of the known prototype device are:

- Различны условия освещения (например, день/ночь).- Different lighting conditions (e.g. day / night).

- Объект имеет отражающую поверхность (как правило, автомобили, зеркала).- The object has a reflective surface (usually cars, mirrors).

- Объект имеет сильную 3-D структуру.- The object has a strong 3-D structure.

- Объект имеет себе подобные дескрипторы или периодическую структуру.- The object has similar descriptors or a periodic structure.

- методом ASIFT рассматривается массив изображения в градации серого S_i,j (массив изображения принимает значения в диапазоне от 0 до 255), такой способ позволяет сохранить устойчивость к изменению цвета. Но при таком подходе отражающие поверхности сохраняют свои особенности, что значительно ухудшает процесс поиска соответствия между дескрипторами.- the ASIFT method examines the image array in grayscale S _{i, j} (the image array takes values in the range from 0 to 255), this method allows you to maintain resistance to color change. But with this approach, reflective surfaces retain their features, which significantly worsens the process of finding correspondence between descriptors.

Предлагаемое устройство поиска дубликатов изображений позволяет решить одну из проблем оригинального устройства путем использования цветового пространства YIQ. Устройство реализуют следующий алгоритм. На первом этапе представим массив изображения S_i,j,k в цветовом пространстве YIQ. Это формат телевизионного стандарта NTSC. Данное цветовое пространство, так же как и человеческое зрение, более чувствительно к световой яркости (интенсивности), а не цвету. Изображение состоит из трех компонент - яркость (Y) и двух искусственных цветоразностных (I и Q) составляющих. Сигнал I называется синфазным, Q - квадратурным.The proposed device for searching for duplicate images allows you to solve one of the problems of the original device by using the YIQ color space. The device implements the following algorithm. At the first stage, we present an array of images S _{i, j, k} in the YIQ color space. This is the NTSC television standard format. This color space, as well as human vision, is more sensitive to light brightness (intensity), rather than color. The image consists of three components - brightness (Y) and two artificial color difference (I and Q) components. The signal I is called in-phase, Q - quadrature.

Разложим канал YIQ, на цветовые составляющие. Конверсия из RGB в YIQ осуществляется по следующим формулам:We decompose the YIQ channel into color components. The conversion from RGB to YIQ is carried out according to the following formulas:

$Y = 0,299 \cdot R + 0,587 \cdot G + 0,114 \cdot B$

,

Y = 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B

,

$I = 0,596 \cdot R-0,274 \cdot G-0,322 \cdot Â$

,

I = 0.596 \cdot R-0,274 \cdot G-0,322 \cdot Â

,

$Q = 0,211 \cdot R-0,522 \cdot G + 0,311 \cdot Â$

.

Q = 0.211 \cdot R-0,522 \cdot G + 0.311 \cdot Â

.

Анализ изображений различных компонент пространства YIQ показал, что компоненты I и Q не содержат блики и отражения, присущие зеркальным поверхностям. Данные компоненты не содержат яркостной информации, а показывают только цветовую, при этом отражающая структура поверхности характеризуется в большей степени изменением яркости. Выберем канал I для поиска соответствия между дескрипторами на отражающих поверхностях.An analysis of the images of the various components of the YIQ space showed that components I and Q do not contain the glare and reflection inherent in mirror surfaces. These components do not contain brightness information, but show only color, while the reflective surface structure is characterized to a greater extent by a change in brightness. We choose channel I to search for correspondence between descriptors on reflective surfaces.

Далее полученные изображения обрабатываются алгоритмом ASIFT.Further, the obtained images are processed by the ASIFT algorithm.

Изображения преобразуются путем симулирования всех возможных аффинных искажений, вызванных изменением положения камеры с фронтальной позиции. Эти искажения получаются путем изменения двух параметров: φ долготы и широты θ. В результате на первом шаге данного способа имеется определенное множество изображений, отличающихся различными аффинными преобразованиями. Вращения и наклоны моделируются для конечных и небольших значений широты и долготы угла, выбор шага этих параметров обеспечивается моделируемым изображением, которое принимает любую возможность обзора при других значениях широты и долготы.Images are transformed by simulating all possible affine distortions caused by changing the position of the camera from the front position. These distortions are obtained by changing two parameters: φ longitude and latitude θ. As a result, at the first step of this method there is a certain set of images that differ in various affine transformations. Rotations and tilts are modeled for finite and small values of latitude and longitude of the angle, the choice of the step of these parameters is provided by a simulated image that takes any viewing opportunity at other values of latitude and longitude.

Затем выполняется этап нахождение особых точек. Основным моментом в детектировании особых точек является построение пирамиды гауссианов и разностей гауссианов. Гауссианом (или, изображением размытым гауссовым фильтром) является изображение:Then the step of finding the singular points is performed. The main point in the detection of singular points is the construction of a pyramid of Gaussians and differences of Gaussians. Gaussian (or, image blurred Gaussian filter) is the image:

$L (x, y, σ) = G (x, y, σ) * I (x, y)$

,

L (x, y, σ) = G (x, y, σ) * I (x, y)

,

где L - значение гауссиана в точке с координатами (x, y), σ - радиус размытия, G - гауссово ядро, I - значение исходного изображения, * - оператор свертки.where L is the Gaussian value at the point with coordinates (x, y), σ is the blur radius, G is the Gaussian core, I is the value of the original image, * is the convolution operator.

Разностью гауссианов называют изображение, полученное путем попиксельного вычитания одного гауссиана исходного изображения из изображения с другим радиусом размытия.Gaussian difference is an image obtained by pixel-by-pixel subtraction of one Gaussian source image from an image with a different blur radius.

$D (x, y, σ) = (G (x, y, k σ) - G (x, y, σ)) * I (x, y) = L (x, y, k σ) - L (x, y, σ) .$

D (x, y, σ) = (G (x, y, k σ) - G (x, y, σ)) * I (x, y) = L (x, y, k σ) - L (x, y, σ) .

Масштабируемым пространством изображения является набор всевозможных, сглаженных некоторым фильтром, версий исходного изображения. Доказано, что гауссово пространство является линейным, инвариантным относительно сдвигов, вращений, масштаба, не смещающим локальные экстремумы, и обладает свойством полугрупп. Различная степень размытия изображения гауссовым фильтром может быть принята за исходное изображение, взятое в некотором масштабе.Scalable image space is a set of various versions of the original image smoothed by some filter. It is proved that the Gaussian space is linear, invariant with respect to shifts, rotations, scale, not displacing local extrema, and has the property of semigroups. A different degree of image blurring by a Gaussian filter can be taken as the original image taken at a certain scale.

Инвариантность относительно масштаба достигается за счет нахождения ключевых точек для исходного изображения, взятого в разных масштабах. Для этого строится пирамида гауссианов: все масштабируемое пространство разбивается на некоторые участки - октавы, причем часть масштабируемого пространства, занимаемого следующей октавой, в два раза больше части, занимаемой предыдущей. К тому же, при переходе от одной октавы к другой делается ресэмплинг изображения, его размеры уменьшаются вдвое. Естественно, что каждая октава охватывает бесконечное множество гауссианов изображения, поэтому строится только некоторое их количество N, с определенным шагом по радиусу размытия. С тем же шагом достраиваются два дополнительных гауссиана (всего получается N+2), выходящих за пределы октавы. Масштаб первого изображения следующей октавы равен масштабу изображения из предыдущей октавы с номером N.Scale invariance is achieved by finding key points for the original image taken at different scales. To do this, a Gaussian pyramid is built: the entire scalable space is divided into some sections - octaves, and part of the scalable space occupied by the next octave is two times larger than the part occupied by the previous one. In addition, when moving from one octave to another, resampling of the image is done, its size is halved. Naturally, each octave spans an infinite number of Gaussian images, so only a certain number N of them is built, with a certain step along the blur radius. With the same step, two additional Gaussians (totaling N + 2), extending beyond the octave, are being completed. The scale of the first image of the next octave is equal to the scale of the image from the previous octave with the number N.

Параллельно с построением пирамиды гауссианов строится пирамида разностей гауссианов, состоящая из разностей соседних изображений в пирамиде гауссианов (фиг.1). Соответственно, количество изображений в этой пирамиде будет N+1.In parallel with the construction of the pyramid of Gaussians, a pyramid of differences of Gaussians is constructed, consisting of the differences of neighboring images in the pyramid of Gaussians (Fig. 1). Accordingly, the number of images in this pyramid will be N + 1.

Каждая разность получается из двух соседних гауссианов, количество разностей на единицу меньше количества гауссианов, при переходе к следующей октаве размер изображений уменьшается вдвое.Each difference is obtained from two neighboring Gaussians, the number of differences is one less than the number of Gaussians, when moving to the next octave, the image size is halved.

После построения пирамид находим особые точки. Точка будет считаться особой, если она является локальным экстремумом разности гауссианов. На фиг.2 показан этап определения точек экстремума. Она считается таковой, если значение разности гауссианов в помеченной точке, больше (меньше) всех остальных значений в точках.After building the pyramids we find the singular points. A point will be considered singular if it is a local extremum of the Gaussian difference. Figure 2 shows the step of determining the points of the extremum. It is considered such if the value of the difference of the Gaussians at the marked point is greater (less) than all other values at the points.

В каждом изображении из пирамиды разности гауссианов определяются точки локального экстремума. Каждая точка текущего изображения DoG сравнивается с ее восьмью соседями и с девятью соседями в DoG, находящихся на уровень выше и ниже в пирамиде. Если эта точка больше (меньше) всех соседей, то она принимается за точку локального экстремума.In each image from the pyramid of the difference of the Gaussians, local extremum points are determined. Each point of the current DoG image is compared with its eight neighbors and nine neighbors in the DoG, which are one level higher and lower in the pyramid. If this point is larger (less) than all neighbors, then it is taken as a point of local extremum.

Следующим шагом будет уточнение особых точек - проверка пригодности точки экстремума на роль особой.The next step will be to refine the singular points - to check the suitability of the extremum point for the singular role.

Определяются координаты особой точки с субпиксельной точностью. Это достигается с помощью аппроксимирования функции DoG многочленом Тейлора второго порядка, взятого в точке вычисленного экстремума.The coordinates of the singular point are determined with subpixel accuracy. This is achieved by approximating the DoG function by the second-order Taylor polynomial taken at the point of the calculated extremum.

$D (x) = D + \frac{\partial D^{T}}{\partial x} x + \frac{1}{2} x^{T} \frac{\partial^{2} D}{\partial x^{2}} x$

,

D (x) = D + \frac{\partial D^{T}}{\partial x} x + \frac{one}{2} x^{T} \frac{\partial^{2} D}{\partial x^{2}} x

,

где D - функция DoG, x=(x, y, σ) - вектор смещения относительно точки разложения, первая производная DoG - градиент, вторая производная DoG - матрица Гессе.where D is the DoG function, x = (x, y, σ) is the displacement vector relative to the decomposition point, the first DoG derivative is the gradient, the second DoG derivative is the Hessian matrix.

Экстремум многочлена Тейлора находится путем вычисления производной и приравнивания ее к нулю. В итоге получается смещение точки вычисленного экстремума, относительно точного:The extremum of the Taylor polynomial is found by calculating the derivative and equating it to zero. The result is a shift of the point of the calculated extremum, relatively accurate:

$\overset{\land}{х} = - \frac{\partial^{2} D^{- 1}}{\partial x^{2}} \frac{\partial D}{\partial x}$

.

\overset{\land}{x} = - \frac{\partial^{2} D^{- one}}{\partial x^{2}} \frac{\partial D}{\partial x}

.

Если одна из компонент ∧вектора $\overset{\land}{х}$

больше 0,5*шаг сетки в этом направлении, то это означает, что на самом деле точка экстремума была вычислена неверно и нужно сдвинуться к соседней точке в направлении указанных компонент. Для соседней точки все повторяется заново. Если таким образом мы вышли за пределы октавы, то следует исключить данную точку из рассмотрения.If one of the components of the vector

\overset{\land}{x}

more than 0.5 * grid spacing in this direction, this means that in fact the extremum point was calculated incorrectly and you need to move to an adjacent point in the direction of the indicated components. For a neighboring point, everything repeats again. If in this way we went beyond the octave, then this point should be excluded from consideration.

Когда положение точки экстремума вычислено, проверяется само значение DoG в этой точке по формуле:When the position of the extremum point is calculated, the DoG value at this point is checked by the formula:

$D (\overset{\land}{х}) = D + \frac{1}{2} \frac{\partial D^{T}}{\partial x} x$

.

D (\overset{\land}{x}) = D + \frac{one}{2} \frac{\partial D^{T}}{\partial x} x

.

Если эта проверка не проходит, то точка исключается как точка с малым контрастом.If this test fails, then the point is excluded as a point with low contrast.

Последняя проверка включает проверку, если особая точка лежит на границе какого-то объекта или плохо освещена, в этом случае такую точку можно исключить из рассмотрения. Эти точки имеют большой изгиб (одна из компонент второй производной) вдоль границы и малый в перпендикулярном направлении. Этот большой изгиб определяется матрицей Гессе H. Для проверки подойдет H размера 2×2.The last check includes checking if a particular point lies on the boundary of some object or is poorly lit, in which case such a point can be excluded from consideration. These points have a large bend (one of the components of the second derivative) along the boundary and small in the perpendicular direction. This large bend is determined by the Hessian matrix H. An H of size 2 × 2 is suitable for testing.

$H = [\begin{matrix} D_{x x} & D_{x y} \\ D_{x y} & D_{y y} \end{matrix}]$

.

H = [\begin{matrix} D_{x x} & D_{x y} \\ D_{x y} & D_{y y} \end{matrix}]

.

Пусть Tr(H) - след матрицы, a Det(H)- ее определитель.Let Tr (H) be the trace of the matrix, and Det (H) be its determinant.

$T r (H) = D_{x x} + D_{y y} = α + β$

,

T r (H) = D_{x x} + D_{y y} = α + β

,

$D e t (H) = D_{x x} D_{y y} - {(D_{x y})}^{2} = α β$

.

D e t (H) = D_{x x} D_{y y} - {(D_{x y})}^{2} = α β

.

Пусть r - отношение большего изгиба к меньшему,Let r be the ratio of the larger bend to the smaller,

$α = r β$

,

α = r β

,

ТогдаThen

$\frac{T r {(H)}^{2}}{D e t (H)} = \frac{{(α + β)}^{2}}{α β} = \frac{{(r α + β)}^{2}}{r β^{2}} = \frac{{(r + 1)}^{2}}{r}$

,

\frac{T r {(H)}^{2}}{D e t (H)} = \frac{{(α + β)}^{2}}{α β} = \frac{{(r α + β)}^{2}}{r β^{2}} = \frac{{(r + one)}^{2}}{r}

,

и точка рассматривается дальше, еслиand the point is considered further if

$\frac{T r {(H)}^{2}}{D e t (H)} < \frac{{(r + 1)}^{2}}{r}$

.

\frac{T r {(H)}^{2}}{D e t (H)} < \frac{{(r + one)}^{2}}{r}

.

Направление ориентации ключевой точки вычисляется исходя из направлений градиентов точек, соседних с особой. Все вычисления градиентов производятся на изображении в пирамиде гауссианов, с масштабом, наиболее близким к масштабу ключевой точки. Величина и направление градиента в точке (x, y) вычисляются по формулам:The orientation direction of the key point is calculated based on the direction of the gradients of the points adjacent to the singular. All calculations of the gradients are performed on the image in the pyramid of the Gaussians, with the scale closest to the scale of the key point. The magnitude and direction of the gradient at the point (x, y) are calculated by the formulas:

$m (x, y) = \sqrt{{(L (x + 1, y) - L (x - 1, y))}^{2} + {(L (x, y + 1) - L (x, y - 1))}^{2}}$

,

m (x, y) = \sqrt{{(L (x + one, y) - L (x - one, y))}^{2} + {(L (x, y + one) - L (x, y - one))}^{2}}

,

$θ (x, y) = \tan^{- 1} (\frac{L (x, y + 1) - L (x, y - 1)}{L (x + 1, y) - L (x - 1, y)}$

,

θ (x, y) = \tan^{- one} (\frac{L (x, y + one) - L (x, y - one)}{L (x + one, y) - L (x - one, y)}

,

где $m$

- величина градиента,

θ

- его направление.Where

m

is the magnitude of the gradient

θ

- his direction.

Для начала определим окно (окрестность) ключевой точки, в котором будут рассмотрены градиенты. Это будет окно, требуемое для свертки с гауссовым ядром, оно будет круглым и радиус размытия для этого ядра (σ) равен 1,5*масштаб ключевой точки. Для гауссова ядра действует, так называемое, правило «трех сигм». Оно состоит в том, что значение гауссова ядра очень близко к нулю на расстоянии, превышающем 3*σ. Таким образом, радиус окна определяется как [3*σ].First, we define a window (neighborhood) of the key point at which the gradients will be considered. This will be the window required for convolution with the Gaussian core, it will be round and the blur radius for this core (σ) is 1.5 * the scale of the key point. For the Gaussian kernel, the so-called “three sigma” rule applies. It consists in the fact that the value of the Gaussian nucleus is very close to zero at a distance exceeding 3 * σ. Thus, the radius of the window is defined as [3 * σ].

Направление особой точки находится из гистограммы направлений (фиг.3). Гистограмма состоит из 36 компонент, которые равномерно покрывают промежуток в 360 градусов, и формируется она следующим образом: каждая точка окна (x, y) вносит вклад, равный m*G(x, y, σ), в ту компоненту гистограммы, которая покрывает промежуток, содержащий направление градиента θ(x, y).The direction of the singular point is found from the histogram of directions (figure 3). The histogram consists of 36 components that evenly cover the gap of 360 degrees, and it is formed as follows: each point of the window (x, y) contributes m * G (x, y, σ) to that component of the histogram that covers the gap containing the direction of the gradient θ (x, y).

Направление ключевой точки лежит в промежутке, покрываемом максимальной компонентой гистограммы. Значения максимальной компоненты (max) и двух соседних с ней интерполируются параболой, и точка максимума этой параболы берется в качестве направления ключевой точки. Если в гистограмме есть еще компоненты с величинами не меньше 0.8*max, то они аналогично интерполируются и дополнительные направления приписываются ключевой точке.The direction of the key point lies in the gap covered by the maximum histogram component. The values of the maximum component (max) and two neighboring components are interpolated by a parabola, and the maximum point of this parabola is taken as the direction of the key point. If there are more components in the histogram with values of at least 0.8 * max, then they are likewise interpolated and additional directions are assigned to the key point.

На следующем этапе происходит построение дескрипторов. В методе SIFT дескриптором является вектор. Как и направление ключевой точки, дескриптор вычисляется на гауссиане, ближайшем по масштабу к ключевой точке, и исходя из градиентов в некотором окне ключевой точки. Перед вычислением дескриптора это окно поворачивают на угол направления ключевой точки, чем и достигается инвариантность относительно поворота.The next step is the construction of descriptors. In the SIFT method, the descriptor is a vector. Like the direction of the cue point, the handle is calculated on the Gaussian closest in scale to the cue point, and based on the gradients in a certain cue point window. Before calculating the descriptor, this window is rotated by the angle of direction of the key point, thereby achieving invariance with respect to rotation.

Фиг.3 схематично показывает часть изображения и полученный на ее основе дескриптор. Здесь изображены пиксели, обозначенные маленькими квадратиками. Эти пиксели берутся из квадратного окна дескриптора, которое в свою очередь поделено еще на четыре равные части (регионы). Центр этого окна находится между пикселями. Его надо выбирать как можно ближе к точным координатам ключевой точки. Круг обозначает окно свертки с гауссовым ядром (аналогично окну для вычисления направления ключевой точки). Для этого ядра определяется а, равное половине ширины окна дескриптора. В дальнейшем значение каждой точки окна дескриптора будет домножаться на значение гауссова ядра в этой точке, как на весовой коэффициент.Figure 3 schematically shows a part of the image and a descriptor derived from it. Here are the pixels indicated by small squares. These pixels are taken from the square window of the descriptor, which in turn is divided into four more equal parts (regions). The center of this window is between the pixels. It should be chosen as close as possible to the exact coordinates of the key point. A circle denotes a convolution window with a Gaussian core (similar to a window for calculating the direction of a key point). For this kernel, a is determined equal to half the width of the handle window. In the future, the value of each point in the window of the descriptor will be multiplied by the value of the Gaussian core at this point, as a weight coefficient.

Справа схематически изображен дескриптор особой точки, размерности 2×2×8. Первые две цифры в значении размерности - это количество регионов по горизонтали и вертикали. Те квадраты, которые охватывали некоторый регион пикселей на левом изображении, справа охватывают гистограммы, построенные на пикселях этих регионов. Соответственно, третья цифра в размерности дескриптора означает количество компонент гистограммы этих регионов.On the right, the descriptor of a singular point, dimension 2 × 2 × 8, is schematically shown. The first two digits in the dimension value are the number of regions horizontally and vertically. Those squares that covered a certain region of pixels in the left image, on the right, cover histograms built on the pixels of these regions. Accordingly, the third digit in the dimension of the descriptor means the number of components of the histogram of these regions.

Каждому градиенту в окне дескриптора можно приписать три вещественные координаты (x, y, n), где x - расстояние до градиента по горизонтали, y - расстояние по вертикали, n - расстояние до направления градиента в гистограмме (имеется в виду, соответствующая гистограмма дескриптора, в которую вносит вклад этот градиент). За точку отсчета принимается левый нижний угол окна дескриптора и начальное значение гистограммы. За единичные отрезки берутся размеры регионов по горизонтали и вертикали для x и y соответственно, и количество градусов в компоненте гистограммы для n. Коэффициент трилинейной интерполяции определяется для каждой координаты (x, y, n) градиента как 1-d, где d равно расстоянию от координаты градиента до середины того единичного промежутка в который эта координата попала. Каждое вхождение градиента в гистограмму умножается на все три весовых коэффициента трилинейной интерполяции.Three real coordinates (x, y, n) can be assigned to each gradient in the descriptor window, where x is the horizontal distance to the gradient, y is the vertical distance, n is the distance to the gradient direction in the histogram (meaning the corresponding histogram of the descriptor, to which this gradient contributes). The reference point is the lower left corner of the handle window and the initial value of the histogram. For single segments, we take the horizontal and vertical sizes of the regions for x and y, respectively, and the number of degrees in the histogram component for n. The coefficient of trilinear interpolation is determined for each coordinate (x, y, n) of the gradient as 1-d, where d is the distance from the coordinate of the gradient to the middle of the unit interval in which this coordinate fell. Each occurrence of the gradient in the histogram is multiplied by all three weighting coefficients of trilinear interpolation.

Дескриптор ключевой точки состоит из всех полученных гистограмм. Полученный дескриптор нормализуется, после чего все его компоненты, значение которых больше 0,2, урезаются до значения 0,2 и затем дескриптор нормализуется еще раз. В таком виде дескрипторы готовы к использованию.The cue point descriptor consists of all the histograms received. The resulting descriptor is normalized, after which all its components, the value of which is greater than 0.2, are trimmed to 0.2 and then the descriptor is normalized again. As such, the descriptors are ready for use.

На следующих этапах происходит сопоставление дескрипторов исходных изображений и изображений в цветовом пространстве YIQ. На заключительном этапе происходит объединение результатов сопоставления.In the next steps, the descriptors of the original images are compared with the images in the YIQ color space. At the final stage, the results of the comparison are combined.

Устройство поиска дубликатов изображений (фиг.4) содержит вход, который подключен к входу блока предобработки первого изображения 1 и к входу блока предобработки второго изображения 2, блок 1 состоит из блока регистрации первого изображения 1.1, выход которого подключен к входу блока преобразования первого изображения в цветовое пространство YIQ 1.2, выход которого подключен к входу блока выделения синфазной составляющей первого изображения 1.3, выход которого подключен к входу блока формирования изображений в результате вращения первого изображения 1.4, выход которого подключен к входу блока формирования изображения при изменении угла наклона первого изображения 1.5, выход которого подключен к входу блока хранения моделированных изображений для первого изображения 1.6, выход которого подключен к первому входу блока применения метода SIFT; выход блока регистрации второго изображения 2.1 подключен ко входу блока преобразования второго изображения в цветовое пространство YIQ 2.2, выход которого подключен ко входу блока выделения синфазной составляющей второго изображения 2.3, выход которого подключен к входу блока формирования изображений в результате вращения второго изображения 2.4, выход которого подключен к входу блока формирования изображения при изменении угла наклона второго изображения 2.5, выход которого подключен к входу блока хранения моделированных изображений для второго изображения 2.6, выход которого подключен ко второму входу блока применения метода SIFT (блок-схема устройства представлена на фиг.5, патент № US6711293B1, МПК G06К 9/68), выход которого подключен к входу блока вычисления количества одинаковых дескрипторов 4, выход которого подключен к входу блока хранения найденной пары дубликатов 5, выход которого является информационным выходом устройства.The duplicate image search device (Fig. 4) contains an input that is connected to the input of the preprocessing unit of the first image 1 and to the input of the preprocessing unit of the second image 2, block 1 consists of a first image 1.1 registration unit, the output of which is connected to the input of the first image conversion unit YIQ 1.2 color space, the output of which is connected to the input of the in-phase component extraction unit of the first image 1.3, the output of which is connected to the input of the image forming unit as a result of rotation of the first 1.4 expressions, whose output is connected to the input of the imaging unit by changing the angle of inclination of the first picture 1.5, the output of which is connected to an input of the simulated image storage unit 1.6 for the first image, whose output is connected to the first input of the SIFT method of application; the output of the second image registration unit 2.1 is connected to the input of the second image conversion unit in the YIQ 2.2 color space, the output of which is connected to the input of the in-phase component extraction unit of the second image 2.3, the output of which is connected to the input of the image forming unit as a result of rotation of the second image 2.4, the output of which is connected to the input of the imaging unit when changing the angle of inclination of the second image 2.5, the output of which is connected to the input of the storage unit of simulated images for the second image 2.6, the output of which is connected to the second input of the SIFT method application block (the block diagram of the device is shown in FIG. connected to the input of the storage unit of the found pair of duplicates 5, the output of which is the information output of the device.

Устройство поиска дубликатов изображений реализуется и работает следующим образом. На входы блоков регистрации первого 1.1 и второго 2.1 изображения поступает два изображения, на которых необходимо найти соответствующие дескрипторы и установить их похожесть. С целью обработки изображений с зеркальными поверхностями изображения преобразуются в цветовое пространство YIQ в блоках преобразования изображения в цветовое пространство YIQ 1.2 и 2.2. Далее происходит выделение синфазной составляющей цветового пространства YIQ в блоках 1.3 и 2.3. В следующих блоках 1.4, 2.4, 1.5, 2.5 формируются изображения в результате вращения и изменения угла наклона, то есть применения аффинных преобразований. В блоках 1.6 и 2.6 записываются все полученные изображения в результате данных преобразований. В блоке применения метода SIFT осуществляется построение SIFT-дескрипторов всех изображений указанным методом. Полученные дескрипторы для пары изображений поступают на вход блока 4, в котором вычисляется количество одинаковых дескрипторов и принимается решение о том, что второе изображение является дубликатом первого, после чего найденные дубликаты записываются в блок хранения найденной пары дубликатов 5.A device for searching for duplicate images is implemented and operates as follows. Two images enter the inputs of the registration blocks of the first 1.1 and second 2.1 images, on which it is necessary to find the corresponding descriptors and establish their similarity. For the purpose of processing images with mirror surfaces, images are converted into YIQ color space in blocks for converting images into YIQ 1.2 and 2.2 color space. Next, the in-phase component of the YIQ color space is selected in blocks 1.3 and 2.3. In the following blocks 1.4, 2.4, 1.5, 2.5, images are formed as a result of rotation and a change in the angle of inclination, that is, the use of affine transformations. In blocks 1.6 and 2.6, all received images are recorded as a result of these transformations. In the application block of the SIFT method, the SIFT descriptors of all images are constructed by the specified method. The resulting descriptors for a pair of images are sent to the input of block 4, in which the number of identical descriptors is calculated and the decision is made that the second image is a duplicate of the first, after which the found duplicates are written to the storage unit of the found pair of duplicates 5.

Технический результат - сопоставление дескрипторов применительно к задаче поиска дубликатов изображений.EFFECT: comparison of descriptors as applied to the task of searching for duplicate images.

Claims

A duplicate image retrieval device comprising a preprocessing unit for a first image and a preprocessing unit for a second image, a preprocessing unit for a first image comprising a first image registration unit; an image forming unit as a result of rotation of the first image, the output of which is connected to the input of the image forming unit when the angle of inclination of the first image changes, the output of which is connected to the input of the simulated image storage unit for the first image, the output of which is connected to the first input of the SIFT method application block; the second image registration unit, the image forming unit as a result of rotation of the second image, the output of which is connected to the input of the image forming unit when the angle of the second image changes, the output of which is connected to the input of the simulated image storage unit for the second image, the output of which is connected to the second input of the application unit SIFT method, the storage unit for the found pair of duplicates, the output of which is the information output of the device, characterized in that the output of the register block radios of the first image are connected to the input of the unit for converting the first image into the YIQ color space, the output of which is connected to the input of the in-phase component extraction unit of the first image; the output of the second image registration unit is connected to the input of the second image conversion unit in the YIQ color space, the output of which is connected to the input of the in-phase component extraction unit of the second image; the output of the application block of the SIFT method is connected to the input of the unit for calculating the number of identical descriptors, the output of which is connected to the input of the storage unit of the found pair of duplicates.