JP2015158848A - Image retrieval method, server, and image retrieval system

Info

Publication number: JP2015158848A
Application number: JP2014034008A
Authority: JP
Inventors: Daisuke Matsubara (松原 大輔); Atsushi Hiroike (廣池 敦)
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2014-02-25
Filing date: 2014-02-25
Publication date: 2015-09-03
Also published as: WO2015129318A1

Abstract

PROBLEM TO BE SOLVED: To keep operational costs from growing excessively when learning data is created after cameras are installed. SOLUTION: A computer including a processor and a memory detects a first object and a second object in an input image, extracts a first image feature amount of the first object and a second image feature amount of the second object, determines that the first object and the second object are different objects, generates a transformation matrix that increases the variance between the first and second image feature amounts, which belong to different objects, converts image feature amounts with the transformation matrix, and stores the converted image feature amounts.

Description

The present invention relates to an image search system and method, and to information retrieval on a computer.

In recent years, with the rise in violent crime and growing security awareness, many surveillance cameras are being installed in places where people gather, such as stores, airports, and roads. Video captured by these cameras is stored in a storage device such as a surveillance recorder and viewed as needed. However, the spread of IP cameras (network-connected cameras) has made it possible to connect a large number of cameras over a network, and storage capacity has grown, so an enormous amount of video is being accumulated. It has therefore become very difficult to check all video data visually, as was done in the past.

Various similarity search techniques have therefore been proposed for finding and presenting scenes in which a specific person or object appears among the large amount of video data in a storage device. A similarity search technique searches the target data for items similar to a search query specified by the user and presents the results. In particular, a similar image search technique uses feature amounts extracted from the image itself, such as color tone, shape, and composition, to find data with a high similarity between feature amounts. For example, when searching for a person, vector data such as the edge pattern of a face image or the color histogram of clothing can be used as the feature amount. The smaller the distance between feature vectors, the higher the similarity.

However, such feature amounts are generally high-dimensional vectors with several hundred to several thousand dimensions, so the large amount of computation needed to calculate distances between feature vectors is a problem.

It is therefore necessary to compress the high-dimensional feature vectors to a lower dimension and reduce the number of distance calculations. Discriminant analysis has been proposed as a method for compressing high-dimensional vectors to a low dimension.

Patent Document 1 discloses a technique that transforms a feature vector using discriminant analysis in order to obtain, from an input feature vector, a feature vector effective for discriminating character images or face images.

Patent Document 2 discloses a technique that improves accuracy by using both high-quality and low-quality image data when performing dimension compression of character images using discriminant analysis.

JP 2009-140513 A
JP 2004-310639 A

Discriminant analysis is a supervised dimensionality reduction method that, given learning data in the form of class and feature vector pairs, finds a feature vector transformation matrix that increases the variance between classes and decreases the variance within each class. Hereinafter, this transformation matrix is called the discriminant matrix.

When dimension compression by discriminant analysis is applied to feature vectors extracted from face images, the set of face images of one person is treated as one class. A discriminant matrix is therefore obtained that makes the distance between vectors of the same person small and the distance between vectors of different people large. In other words, the transformation increases the similarity between images of the same person even when face orientation, facial expression, or lighting conditions differ, and decreases the similarity between different people even when they are photographed in the same environment.

When this is applied to similar image search, a single discriminant matrix is created from the entire learning data, and all feature vectors extracted from face images are projected with this discriminant matrix. The distances between the projected feature vectors are then calculated, and similar face images are retrieved by sorting in ascending order of distance. Performing similar image search with feature vectors whose dimensions have been reduced by discriminant analysis is thus expected to improve the accuracy of finding the same person.
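The projection-and-sort search described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes NumPy and Euclidean distance, the helper name `rank_by_distance` is hypothetical, and a random matrix `phi` stands in for a learned discriminant matrix.

```python
import numpy as np

def rank_by_distance(query, database, phi):
    """Project features with phi (d x d') and rank database rows by distance.

    query:    (d,) feature vector of the query face
    database: (n, d) feature vectors of the stored faces
    Returns the database indices in ascending order of distance, and
    the corresponding distances.
    """
    q = phi.T @ query                  # projected query, shape (d',)
    db = database @ phi                # each row equals phi^T x, shape (n, d')
    dists = np.linalg.norm(db - q, axis=1)
    order = np.argsort(dists)          # smallest distance = most similar
    return order, dists[order]

# Toy data: 5 stored faces with 8-dimensional features, compressed to 3 dimensions.
rng = np.random.default_rng(0)
phi = rng.standard_normal((8, 3))      # placeholder for a learned discriminant matrix
database = rng.standard_normal((5, 8))
query = database[2].copy()             # query identical to stored face 2
order, dists = rank_by_distance(query, database, phi)
```

Because the query is a copy of stored item 2, that item is ranked first. In practice the database rows would be projected once at registration time rather than on every query.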

A dimension compression method using discriminant analysis is described below, specifically how to generate a discriminant matrix Φ that converts a d-dimensional feature vector x extracted from a face image into a d'-dimensional feature vector. Here, d is the number of dimensions of the feature amount extracted from the face image, and d' is the number of dimensions after compression, set according to the required accuracy, the performance of the computer, and so on.

First, as shown in the following equations, the intra-class variance matrix W is calculated from data of the same person, that is, data belonging to the same class, and the inter-class variance matrix B is calculated from data of different people, that is, data belonging to different classes.
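The two equations referred to here are not reproduced in this text. Assuming the standard discriminant-analysis definitions, with X_i the data set of class i, n_i its size, x̄_i its mean, and x̄ the mean of all n data items, they would take a form like the following. The 1/n normalization is an assumption; the original may scale the matrices differently.

```latex
W = \frac{1}{n} \sum_{i=1}^{c} \sum_{x \in X_i} (x - \bar{x}_i)(x - \bar{x}_i)^{T} \qquad (1)

B = \frac{1}{n} \sum_{i=1}^{c} n_i \, (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^{T} \qquad (2)
```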

Here, the number of classes is c ≥ 2, the total number of data items is n, the data set is X = {x}, and the mean of all data is x̄. The data set of class i is X_i, the number of data items in X_i is n_i, and the mean of those n_i data items is x̄_i. T denotes the transpose.

Using the intra-class variance matrix W and the inter-class variance matrix B, an eigenvector matrix Ψ and an eigenvalue matrix Λ that satisfy the following equation (3) are obtained.

BΨ = WΨΛ   (3)

Here, Ψ is the matrix whose column vectors are the eigenvectors ψ_i (i = 1, …, d), and Λ is the matrix whose diagonal elements are the eigenvalues λ_i (λ_1 ≥ λ_2 ≥ … ≥ λ_d). The matrix Φ = {φ_1, φ_2, …, φ_d'}, formed by arranging d' of the eigenvectors obtained in this way in descending order of eigenvalue, is the discriminant matrix. The space obtained by projection with this discriminant matrix Φ is called the discriminant space.

Using the d-dimensional feature vector X before compression and the discriminant matrix Φ, the d'-dimensional feature vector Y after compression is given by the following equation (4).

Y = Φ^T X   (4)
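Equations (3) and (4) can be illustrated with a short sketch. It assumes NumPy and solves the generalized eigenproblem by Cholesky whitening, which is one common approach and not necessarily the one used in the source. The function name `discriminant_matrix`, the `ridge` regularization term, and the toy matrices are all hypothetical.

```python
import numpy as np

def discriminant_matrix(W, B, d_out, ridge=1e-6):
    """Solve B psi = lambda W psi and keep the d_out leading eigenvectors.

    A small ridge is added to W so that it is positive definite; this is
    an assumption made for numerical stability, not part of the source.
    """
    d = W.shape[0]
    L = np.linalg.cholesky(W + ridge * np.eye(d))     # W' = L L^T
    # Whitened symmetric problem: (L^-1 B L^-T) v = lambda v
    Linv_B = np.linalg.solve(L, B)
    M = np.linalg.solve(L, Linv_B.T).T
    M = (M + M.T) / 2                                 # remove rounding asymmetry
    eigvals, V = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:d_out]         # descending eigenvalues
    Phi = np.linalg.solve(L.T, V[:, order])           # psi = L^-T v
    return Phi, eigvals[order]

# Toy matrices: a positive definite W and a rank-2 B in 6 dimensions.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
W = A @ A.T + np.eye(6)
C = rng.standard_normal((6, 2))
B = C @ C.T

Phi, lam = discriminant_matrix(W, B, d_out=2)
x = rng.standard_normal(6)     # feature vector before compression (d = 6)
y = Phi.T @ x                  # equation (4): compressed vector (d' = 2)
```

The columns of `Phi` satisfy equation (3) with respect to the regularized W, and `y` has d' = 2 components.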

The number of dimensions after compression d' and the number of classes c in the learning data are related as in the following equation (5).

d' ≤ (c − 1)   (5)
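The bound in equation (5) follows from the rank of the inter-class variance matrix: the c class-mean deviations, weighted by class size, sum to zero, so B has at most c − 1 independent directions. The sketch below checks this numerically. It assumes NumPy and the standard class-mean form of B; the helper name `between_class_scatter` and the random data are hypothetical.

```python
import numpy as np

def between_class_scatter(X, labels):
    """Inter-class variance matrix from class means (standard form, assumed)."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    B = np.zeros((d, d))
    for label in np.unique(labels):
        Xc = X[labels == label]
        diff = (Xc.mean(axis=0) - mean_all)[:, None]   # column vector (d, 1)
        B += len(Xc) * (diff @ diff.T)
    return B / len(X)

rng = np.random.default_rng(0)
c, per_class, d = 4, 10, 20                  # 4 classes, 20-dimensional features
X = rng.standard_normal((c * per_class, d))
labels = np.repeat(np.arange(c), per_class)

B = between_class_scatter(X, labels)
rank_B = np.linalg.matrix_rank(B)            # cannot exceed c - 1
```

With only 4 classes, at most 3 eigenvalues of B are nonzero, however large d is. This is why compressing to several hundred dimensions requires several hundred classes.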

It is also possible to create the discriminant matrix without the intra-class variance matrix W, using only the inter-class variance matrix B, by obtaining an eigenvector matrix Ψ' and an eigenvalue matrix Λ' that satisfy the following equation (6).

BΨ' = Ψ'Λ'   (6)
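Since B is symmetric, equation (6) is an ordinary symmetric eigenproblem. A minimal sketch, assuming NumPy, is shown below; the function name `discriminant_matrix_between_only` and the diagonal toy matrix are hypothetical.

```python
import numpy as np

def discriminant_matrix_between_only(B, d_out):
    """Keep the d_out eigenvectors of B with the largest eigenvalues."""
    eigvals, vecs = np.linalg.eigh(B)              # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d_out]      # reorder to descending
    return vecs[:, order], eigvals[order]

# Toy inter-class variance matrix with known eigenvalues 5, 2, and 1.
B = np.array([[2.0, 0.0, 0.0],
              [0.0, 5.0, 0.0],
              [0.0, 0.0, 1.0]])
Phi, lam = discriminant_matrix_between_only(B, 2)
```

Here `lam` holds the two largest eigenvalues in descending order, and the columns of `Phi` are the corresponding eigenvectors.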

When dimension compression is performed with discriminant analysis in this way, learning data containing face images of people must be prepared in advance and each face image must be classified by person. In addition, because there is a limit to how far a feature amount can be compressed while retaining the characteristics of a person, compressing a feature amount of several thousand dimensions is generally expected to yield a feature amount of several hundred dimensions.

Therefore, as equation (5) shows, images of several hundred or more different people must be collected as learning data. Furthermore, many items of learning data for the same person are needed to calculate the intra-class variance. For these reasons, creating learning data by hand has required a great deal of time.

On the other hand, when the lighting conditions of the shooting environment and the orientation and size of the face image are controlled and do not change, as with a face authentication device, a discriminant space created once can be expected to work at other sites as well. Thus, when dimension compression is performed for a controlled environment, creating the initial learning data takes a lot of time, but the same learning data can be reused.

However, when the shooting parameters of the cameras differ, or when the shooting environment differs, for example in ambient lighting or in the angle and size at which people appear, the appropriate discriminant space is likely to differ. For example, if the discriminant space is learned from face photographs taken facing the camera, such as ID photographs, appropriate projection cannot be performed for face images turned at an angle or for dimly lit environments.

Therefore, when similar face image search is performed on images captured in uncontrolled conditions, such as those from surveillance cameras, where the surrounding environment and people's behavior cannot be predicted, it is desirable to create the learning data from face images captured by the surveillance cameras concerned.

For these reasons, high-accuracy dimension compression cannot be achieved with a discriminant space created in advance in a different environment. It is therefore necessary to learn the discriminant space from face images captured by the many cameras installed at the actual site, create one discriminant matrix, and project the feature vectors with it. In this case the learning data cannot be created beforehand and must be created after the cameras are installed, so the problem is that operational costs become very high.

An image search method for searching for images with a computer having a processor and a memory includes: a first step in which the computer detects a first object and a second object in an input image; a second step in which the computer extracts a first image feature amount of the first object and a second image feature amount of the second object; a third step in which the computer determines that the first object and the second object are different objects; a fourth step in which the computer generates a transformation matrix that increases the variance between the first image feature amount and the second image feature amount, which belong to different objects; and a fifth step in which the computer stores the image feature amounts after converting them with the transformation matrix.

According to the present invention, objects appearing in the same image are determined to be different objects in order to increase the variance B between the feature amounts of different objects, so a better transformation matrix can be generated and search accuracy improves. In addition, because the learning data used to create the transformation matrix can be collected automatically, the work of creating learning data is reduced and the operational cost of the system can be kept down.

FIG. 1 is a block diagram showing the configuration of the image search system according to the first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing the feature amount management information according to the first embodiment.
FIG. 3A shows generation of different-person information according to the first embodiment and is an image from one camera.
FIG. 3B shows generation of different-person information according to the first embodiment and is an image from another camera.
FIG. 4 is a block diagram showing the discriminant matrix generation processing according to the first embodiment.
FIG. 5 is a flowchart showing the feature vector registration processing according to the first embodiment.
FIG. 6 is a flowchart showing the search processing according to the first embodiment.
FIG. 7 is a schematic diagram showing generation of different-person information and same-person information according to the second embodiment.
FIG. 8 is a flowchart showing the discriminant matrix generation processing according to the second embodiment.

Embodiments of the present invention are described below with reference to the accompanying drawings.

The image search system of the first embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram showing the configuration of the image search system of the first embodiment.

The image search system of the first embodiment includes a server computer 110, a client computer 130, discriminant matrix information 140, a search database 150, and cameras 160. The devices are connected to one another by a communication infrastructure 120.

The server computer 110 includes an external interface 111, a central processing unit (CPU) 112, a memory 113, and a large-capacity external storage device (HD) 114.

The external interface 111 is an interface (I/F) for connecting the server computer 110 to the communication infrastructure 120. The CPU 112 is a processor that executes the processing of the server computer 110. The memory 113 is a work area for processing executed by the CPU 112 and stores various data and the programs loaded from the HD 114. The HD 114 is a large-capacity storage device such as a hard disk and stores the programs executed by the CPU 112 and data (the discriminant matrix information 140 and the search database 150). The HD 114 may be an external storage device connected to the server computer 110.

The client computer 130 is a computer connected to the communication infrastructure 120. FIG. 1 shows one client computer 130, but any number of client computers 130 may be provided. If the server computer 110 has functions equivalent to those of the client computer 130, all processing may be performed by the server computer 110.

The client computer 130 may be a computer of any configuration. FIG. 1 shows a typical configuration: the client computer 130 of FIG. 1 includes a CPU 131, a memory 132, an I/F 133, an input device 134, and an output device 135.

The CPU 131 is a processor that executes the programs stored in the memory 132. The memory 132 is a storage device that stores the programs executed by the CPU 131 and other data. The I/F 133 is an interface connected to the communication infrastructure 120 and used for communication between the client computer 130 and the server computer 110. The input device 134 is a device that accepts input from the user of the client computer 130, for example a keyboard or a mouse. The output device 135 is a device that displays information to the user of the client computer 130, for example an image display device such as a CRT or a liquid crystal display. A display equipped with a touch sensor may be used as a combined input/output device in place of the input device 134 and the output device 135.

In the image search system of this embodiment, the server computer 110 and the client computer 130, connected via the communication infrastructure 120 (a network), provide the service; however, a general personal computer may instead provide the service through an image search application.

The discriminant matrix information 140 stores a discriminant matrix (or transformation matrix) 300 used for dimension compression of feature vectors. The transpose of the discriminant matrix 300 may be stored instead.

The search database 150 is a database that stores image feature amounts (feature vectors) extracted from the images to be searched; for example, it stores feature amount management information 200 (see FIG. 2).

The cameras 160a to 160n are cameras installed in the area to be monitored. Hereinafter, the cameras 160a to 160n are collectively referred to as the camera 160. The camera 160 need not be provided if the video or images to be processed have been captured in advance and are all transmitted from the client computer 130 to the server computer 110. Alternatively, the image data (video or images) to be processed may be stored in the HD 114 in advance, or image data received from the camera 160 may be stored in the HD 114.

The CPU 112 operates as functional units that provide predetermined functions by executing the processing of each program. For example, the CPU 112 functions as a discriminant matrix generation unit by performing processing in accordance with a discriminant matrix generation program 400. As shown in FIG. 4, the discriminant matrix generation unit includes an image acquisition unit 401, a face detection processing unit 402, a person information generation unit 403, a feature amount extraction unit 404, an inter-class variance calculation unit 405, a discriminant matrix generation unit 406, and a discriminant matrix storage unit 407.

The CPU 112 also functions as a search feature amount conversion unit by performing processing in accordance with a search feature amount conversion program 500. As shown in FIG. 5, the search feature amount conversion unit includes an image acquisition unit 501, a face detection processing unit 502, a feature amount extraction unit 503, a feature amount conversion unit 504, and a feature amount storage unit 505.

The CPU 112 also functions as a search unit by performing processing in accordance with a search program 600. As shown in FIG. 6, the search unit includes an image input unit 601, a face detection processing unit 602, a feature amount extraction unit 603, a feature amount conversion unit 604, a similarity search unit 605, and a search result output unit 606.

In this way, the CPU 112 also operates as functional units that provide the respective functions of the multiple processes executed by each program. A computer and a computer system are a device and a system that include these functional units.

Information such as the programs and tables that implement the functions of the server computer 110 can be stored in the HD 114, in a storage device such as a nonvolatile semiconductor memory, a hard disk drive, or an SSD (Solid State Drive), or on a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.

FIG. 2 is an explanatory diagram showing the feature amount management information 200 of the first embodiment.

The feature amount management information 200 includes a search data ID 201 and a search target image feature amount 202. The search data ID 201 is an identifier for identifying a feature amount and is used to link it to image data and the like. The search target image feature amount 202 is a feature vector extracted from an image and then converted with the discriminant matrix 300.
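One way to picture the feature amount management information 200 is as a record that pairs an identifier with an already-projected feature vector. The sketch below assumes NumPy and an in-memory dictionary as the store; the names `FeatureRecord` and `register` are hypothetical, and a trivial `phi` that keeps the first two components stands in for the discriminant matrix 300.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class FeatureRecord:
    search_data_id: int      # corresponds to the search data ID 201
    feature: np.ndarray      # corresponds to the search target image feature amount 202

def register(store, search_data_id, raw_feature, phi):
    """Project a raw feature with phi and store it under its ID."""
    record = FeatureRecord(search_data_id, phi.T @ raw_feature)
    store[search_data_id] = record
    return record

store = {}
phi = np.eye(4)[:, :2]       # placeholder matrix: keeps the first two components
rec = register(store, 1, np.array([1.0, 2.0, 3.0, 4.0]), phi)
```

The stored vector is the compressed one, so a later search only needs to project the query and compare it with the stored `feature` values.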

The discriminant matrix information 140 and the search database 150 may be stored in the HD 114 of the server computer 110 or on another hard disk different from the HD 114. When image data is stored in the HD 114, the search data ID 201 is assigned to the image data corresponding to the search target image feature amount 202. Alternatively, when the image data is stored in another device, an instruction to assign the search data ID 201 to the image data corresponding to the search target image feature amount 202 may be transmitted.

FIGS. 3A and 3B show generation of different-person information in the first embodiment and are images from the cameras 160a and 160b. FIG. 4 is a block diagram showing an example of the discriminant matrix generation processing performed by the discriminant matrix generation program 400 of the first embodiment.

The discriminant matrix generation processing is described below with reference to FIGS. 3A, 3B, and 4.

In this embodiment, the server computer 110 functions as the discriminant matrix generation unit by executing the discriminant matrix generation program 400. The discriminant matrix generation unit generates the discriminant matrix 300 by means of the image acquisition unit 401, the face detection processing unit 402, the person information generation unit 403, the feature amount extraction unit 404, the inter-class variance calculation unit 405, the discriminant matrix generation unit 406, and the discriminant matrix storage unit 407.

The CPU 112 shown in FIG. 1 loads the programs stored in the HD 114 into the memory 113, reads the loaded programs, and executes them, thereby realizing the functional units described above: the image acquisition unit 401, the face detection processing unit 402, the person information generation unit 403, the feature amount extraction unit 404, the inter-class variance calculation unit 405, the discriminant matrix generation unit 406, and the discriminant matrix storage unit 407.

First, in the image acquisition unit 401, the server computer 110 acquires images from the camera 160 via the communication infrastructure 120. The image acquisition unit 401 acquires the images as learning data. The server computer 110 may acquire video from the camera 160, decode it, and obtain an image for each frame. Images or video captured by the camera 160 may also be saved temporarily on the client computer 130, transmitted from the client computer 130 to the server computer 110 via the communication infrastructure 120, and received by the image acquisition unit 401. Alternatively, images captured in advance may be stored in the HD 114 as learning data and acquired (or input) from the HD 114.

Next, the face detection processing unit 402 performs face detection processing on the acquired images and obtains the face regions of the people appearing in them. Any well-known technique may be used for face detection, so it is not described in detail here.

次に、人物情報生成部（オブジェクト情報生成部）４０３では、顔検知処理部４０２で検知した顔領域を対象に、一台のカメラ１６０から取得した画像に複数の人物が写っている場合、同時に写っている人物は別人であるとして、人物情報を生成する。 Next, in the person information generation unit (object information generation unit) 403, when a plurality of persons are captured in an image acquired from one camera 160 for the face area detected by the face detection processing unit 402, The person information is generated assuming that the person in the picture is a different person.

例えば、図３Ａ、図３Ｂを用いて説明すると、カメラ１６０ａから取得した画像３１０に人物３２０Ａ、３２０Ｂ、３２０Ｃが映っていた場合、人物３２０Ａ、３２０Ｂ、３２０Ｃはそれぞれ別の人物であると推定される。よって、人物情報生成部４０３は人物３２０Ａと３２０Ｂと３０２Ｃは各々が別人であるという情報を生成する。 For example, with reference to FIGS. 3A and 3B, when the persons 320A, 320B, and 320C appear in the image 310 acquired from the camera 160a, it is estimated that the persons 320A, 320B, and 320C are different persons. . Therefore, the person information generation unit 403 generates information that the persons 320A, 320B, and 302C are different persons.

また、カメラ１６０ｂから取得した画像３３０に人物３４０Ａ、３４０Ｂ、３４０Ｃ、３４０Ｄが映っていた場合、人物３４０Ａ、３４０Ｂ、３４０Ｃ、３４０Ｄもそれぞれ別の人物であると推定される。よって、人物情報生成部４０３は３４０Ａと３４０Ｂと３４０Ｃと３４０Ｄも各々が別人であるという情報を生成する。 Further, when the persons 340A, 340B, 340C, and 340D are shown in the image 330 acquired from the camera 160b, the persons 340A, 340B, 340C, and 340D are also estimated to be different persons. Therefore, the person information generation unit 403 generates information that 340A, 340B, 340C, and 340D are different persons.

As the information indicating the same person or different persons, the person information generation unit 403 may, for example, assign an identifier to each person's face area, giving different identifiers to different persons.
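The labelling rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `label_frame`, the use of integer identifiers, and the region labels passed in are all assumptions made for the example.

```python
from itertools import combinations

def label_frame(face_regions, next_id=0):
    """Give every face region detected in one frame its own identifier.

    face_regions: list of hashable region labels (hypothetical, e.g. "320A").
    Returns (ids, different_pairs, next_id).
    """
    ids = {}
    for region in face_regions:
        ids[region] = next_id  # faces that co-occur in a frame never share an identifier
        next_id += 1
    # Every pair of faces from the same frame is treated as "different persons".
    different_pairs = list(combinations(face_regions, 2))
    return ids, different_pairs, next_id

# Frames corresponding to images 310 and 330 in the description.
ids_310, pairs_310, nxt = label_frame(["320A", "320B", "320C"])
ids_330, pairs_330, nxt = label_frame(["340A", "340B", "340C", "340D"], nxt)
```

In this sketch, identifiers only express "different within one frame"; nothing is implied about whether a face in image 310 and a face in image 330 belong to the same person.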

次に、特徴量抽出部４０４では、顔検知処理部４０２で検知した顔領域から、顔画像特徴量としてｄ次元の特徴量ベクトルを抽出する。顔画像特徴量は例えば、エッジパターンや色ヒストグラム等に基づいて作成される多次元ベクトルである。なお、特徴量ベクトルの算出については前記エッジパターンや色ヒストグラム等の周知または公知の技術を用いればよいのでここでは詳述しない。 Next, the feature quantity extraction unit 404 extracts a d-dimensional feature quantity vector as a face image feature quantity from the face area detected by the face detection processing unit 402. The face image feature amount is, for example, a multidimensional vector created based on an edge pattern, a color histogram, or the like. Note that the calculation of the feature vector is not described in detail here because a known or publicly known technique such as the edge pattern or the color histogram may be used.

なお、前記人物情報生成部４０３と特徴量抽出部４０４の処理は、並列して行っても良いし、どちらかを先に行っても良い。 Note that the processing of the person information generation unit 403 and the feature amount extraction unit 404 may be performed in parallel, or one of them may be performed first.

次に、クラス間分散計算部４０５では、次の（７）式に従って、顔領域から抽出した特徴量ベクトルを用いて、クラス間分散Ｂを計算する。 Next, the interclass variance calculation unit 405 calculates the interclass variance B using the feature quantity vector extracted from the face area according to the following equation (7).

Here, n_f is the total number of frames in the learning data, the number of classes c_j ≥ 2 is the number of face images (face areas) detected in the j-th frame image, x_ij is the feature vector extracted from the i-th face area of the j-th frame image, and x̄_j is the mean of the feature vector data.
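Equation (7) is not reproduced in this text. One form consistent with the definitions above is given here as an assumption only. In particular, it assumes that x̄_j is the mean over the faces of frame j and that there is no normalization constant, neither of which can be confirmed from this text:

```latex
B = \sum_{j=1}^{n_f} \sum_{i=1}^{c_j}
    \left( x_{ij} - \bar{x}_j \right) \left( x_{ij} - \bar{x}_j \right)^{\mathsf T},
\qquad
\bar{x}_j = \frac{1}{c_j} \sum_{i=1}^{c_j} x_{ij}
```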

次に、判別行列生成部（変換行列生成部）４０６では、次の（８）式を満たす固有ベクトル行列Ψ_Bと固有値行列Λ_Bを求める。 Next, the discriminant matrix generation unit (conversion matrix generation unit) 406 obtains an eigenvector matrix Ψ _B and an eigenvalue matrix Λ _B that satisfy the following equation (8).

BΨ_B = Ψ_BΛ_B ... (8)

Here, Ψ_B is the matrix whose column vectors are the eigenvectors ψ_Bi (i = 1, …, d), and Λ_B is the matrix whose diagonal elements are the eigenvalues λ_Bi (λ_B1 ≥ λ_B2 ≥ … ≥ λ_Bd). The matrix Φ_B = [Φ_B1, Φ_B2, …, Φ_Bd′], obtained by arranging the d′ eigenvectors ψ_Bi with the largest eigenvalues λ_Bi in descending order of eigenvalue, is the discriminant matrix 300 with d columns and d′ rows. Applying the conversion described later with this discriminant matrix Φ_B increases the inter-class variance B.
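The eigen-decomposition of equation (8) and the selection of the d′ leading eigenvectors can be sketched with NumPy as follows. The function name `discriminant_matrix` and the toy matrix are illustrative assumptions. The sketch also assumes that B is symmetric, which holds when B is built from sums of outer products.

```python
import numpy as np

def discriminant_matrix(B, d_prime):
    """Return a (d_prime x d) matrix whose rows are the top eigenvectors of B."""
    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(B)
    order = np.argsort(eigenvalues)[::-1][:d_prime]  # indices of the d' largest eigenvalues
    # Columns of `eigenvectors` are the eigenvectors; transpose so each row is one of them.
    return eigenvectors[:, order].T

# Toy symmetric matrix standing in for the inter-class variance B (d = 3).
B = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 0.5]])
Phi = discriminant_matrix(B, d_prime=2)
```

Storing the eigenvectors as rows gives a matrix with d columns and d′ rows, matching the layout stated in the text.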

最後に、判別行列格納部４０７では、判別行列（変換行列）３００を判別行列情報（変換行列情報）１４０に格納する。 Finally, the discriminant matrix storage unit 407 stores the discriminant matrix (transformation matrix) 300 in the discriminant matrix information (transformation matrix information) 140.

以上の処理により、サーバ計算機１１０は入力された画像から顔領域を抽出し、顔領域から人物情報と特徴量ベクトルを抽出する。そして、サーバ計算機１１０は、抽出した人物情報と特徴量ベクトルからクラス間分散Ｂが大きくなるような判別行列３００を算出し、判別行列情報１４０に格納する。 Through the above processing, the server computer 110 extracts a face area from the input image, and extracts person information and a feature vector from the face area. Then, the server computer 110 calculates a discriminant matrix 300 such that the inter-class variance B becomes large from the extracted person information and feature quantity vector, and stores the discriminant matrix 300 in the discriminant matrix information 140.

図５は、第１の実施例の検索用特徴量変換プログラム５００で行われる特徴量ベクトル登録処理を示すブロック図である。 FIG. 5 is a block diagram illustrating a feature vector registration process performed by the search feature conversion program 500 according to the first embodiment.

In this embodiment, the server computer 110 functions as a search feature amount conversion unit by executing the search feature amount conversion program 500. Using the image acquisition unit 501, the face detection processing unit 502, the feature amount extraction unit 503, the feature amount conversion unit 504, and the feature amount storage unit 505, the search feature amount conversion unit registers feature vectors in the feature amount management information 200. The image acquisition unit 501, the face detection processing unit 502, and the feature amount extraction unit 503 may be the same as, or different from, the image acquisition unit 401, the face detection processing unit 402, and the feature amount extraction unit 404 shown in FIG. 4, respectively.

First, the image acquisition unit 501 acquires, from the camera 160 via the communication infrastructure 120, the images to be searched in the similar image search. Video may be acquired and then decoded to obtain the images. Alternatively, the images or video to be searched may be transmitted from the client computer 130 via the communication infrastructure 120 and received by the image acquisition unit 501. As a further alternative, images captured in advance may be stored in the HD 114 and acquired (or input) from the HD 114.

次に、顔検知処理部５０２では、取得した画像に対して顔検知処理を実行し、画像に写った人物の顔領域を取得する。顔検知処理は、前記図４の顔検知処理部４０２と同様であり、周知または公知の技術を適用すればよい。 Next, the face detection processing unit 502 performs face detection processing on the acquired image, and acquires the face area of the person shown in the image. The face detection process is the same as that of the face detection processing unit 402 in FIG. 4, and a known or publicly known technique may be applied.

次に、特徴量抽出部５０３では、顔検知処理部５０２で検知した顔領域から、顔画像の特徴量としてｄ次元の特徴量ベクトルを抽出する。顔画像特徴量は例えば、エッジパターンや色ヒストグラムに基づいて作成される多次元ベクトルである。顔検知処理部５０２で複数の顔領域を検出した場合は、全ての顔領域からｄ次元の特徴量ベクトルを抽出する。なお、特徴量ベクトルは前記図４の特徴量抽出部４０４と同様であり、周知または公知の技術を用いればよい。 Next, the feature quantity extraction unit 503 extracts a d-dimensional feature quantity vector as the feature quantity of the face image from the face area detected by the face detection processing unit 502. The face image feature amount is, for example, a multidimensional vector created based on an edge pattern or a color histogram. When the face detection processing unit 502 detects a plurality of face areas, d-dimensional feature quantity vectors are extracted from all the face areas. Note that the feature quantity vector is the same as that of the feature quantity extraction unit 404 of FIG. 4, and a known or publicly known technique may be used.

Next, the feature amount conversion unit 504 calculates the product of the d-dimensional feature vector extracted by the feature amount extraction unit 503 and the discriminant matrix 300 acquired from the discriminant matrix information 140, thereby converting it into a d′-dimensional feature vector. Since d′ < d, the discriminant matrix 300 compresses the feature vector.
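The conversion step is a single matrix-vector product. A minimal sketch follows; the function name `convert_feature` and the toy values are assumptions, and the discriminant matrix is taken to have d′ rows and d columns.

```python
import numpy as np

def convert_feature(Phi, x):
    """Project a d-dimensional feature vector to d' dimensions (d' < d)."""
    Phi = np.asarray(Phi, dtype=float)
    x = np.asarray(x, dtype=float)
    if Phi.shape[1] != x.shape[0]:
        raise ValueError("discriminant matrix and feature vector dimensions disagree")
    return Phi @ x  # (d' x d) times (d,) gives (d',)

Phi = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 0.5, 0.5, 0.0]])  # toy matrix with d' = 2 and d = 4
x = np.array([2.0, 4.0, 6.0, 8.0])
z = convert_feature(Phi, x)
```

The same function would serve both registration and search, since both stages must apply the same stored matrix for the distances to be comparable.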

最後に、特徴量格納部５０５では、特徴量変換部５０４で得たｄ’次元の特徴量ベクトルを検索データベース１５０の特徴量管理情報２００に格納する。 Finally, the feature quantity storage unit 505 stores the d′-dimensional feature quantity vector obtained by the feature quantity conversion unit 504 in the feature quantity management information 200 of the search database 150.

Here, the feature amount storage unit 505 stores the feature vector in the search target image feature amount 202 of the feature amount management information 200 and assigns the corresponding search data ID 201 to it. To enable high-speed search during search processing, clustering may be performed or hashes may be generated, and the resulting index information may be stored in the feature amount management information 200 as well. The identifier or location (file path or the like) of the image corresponding to the search target image feature amount 202 may also be added to the feature amount management information 200.

上記処理によって、サーバ計算機１１０は、入力された画像（または映像）からｄ次元の特徴量ベクトルを算出し、判別行列３００を用いてｄ’次元の特徴量ベクトルに変換し、次元圧縮を行って特徴量管理情報２００に特徴量ベクトルを格納する。 Through the above processing, the server computer 110 calculates a d-dimensional feature vector from the input image (or video), converts it into a d′-dimensional feature vector using the discriminant matrix 300, and performs dimensional compression. The feature quantity vector is stored in the feature quantity management information 200.

図６は、第１の実施例の検索プログラム６００で行われる検索処理を示すブロック図である。 FIG. 6 is a block diagram showing search processing performed by the search program 600 of the first embodiment.

In this embodiment, the server computer 110 functions as a search unit by executing the search program 600. The search unit executes search processing using the image input unit 601, the face detection processing unit 602, the feature amount extraction unit 603, the feature amount conversion unit 604, the similarity search unit 605, and the search result output unit 606. The face detection processing unit 602, the feature amount extraction unit 603, and the feature amount conversion unit 604 may be the same as, or different from, the face detection processing unit 402 and the feature amount extraction unit 404 shown in FIG. 4 and the feature amount conversion unit 504 shown in FIG. 5, respectively.

First, the image input unit 601 accepts an image (search target image), input from the client computer 130 via the communication infrastructure 120, showing the person who serves as the search key (search target) of the similar image search.

次に、顔検知処理部６０２では、入力された画像（検索対象画像）に対して顔検知処理を実行し、画像に写った人物の顔領域を取得する。顔検知処理は、前記図４の顔検知処理部４０２と同様である。 Next, the face detection processing unit 602 executes face detection processing on the input image (search target image), and acquires the face area of the person shown in the image. The face detection processing is the same as the face detection processing unit 402 in FIG.

Next, the feature amount extraction unit 603 extracts a d-dimensional feature vector as the face image feature amount from the face area detected by the face detection processing unit 602. The face image feature amount is, for example, a multidimensional vector created from an edge pattern or a color histogram. When the face detection processing unit 602 detects a plurality of face areas, the face area to serve as the search key may be designated from the client computer 130, or feature vectors may be extracted from all of the face areas and used in the subsequent processing. The feature vector is extracted in the same way as by the feature amount extraction unit 404 of FIG. 4.

Next, the feature amount conversion unit 604 calculates the product of the d-dimensional feature vector extracted by the feature amount extraction unit 603 and the discriminant matrix 300 acquired from the discriminant matrix information 140 to obtain a d′-dimensional feature vector. When a plurality of search keys are used, all of their feature vectors are converted with the discriminant matrix 300. The conversion is the same as that performed by the feature amount conversion unit 504 of FIG. 5.

次に、類似検索部６０５では、検索キーである特徴量ベクトルと、検索データベース１５０に格納されている検索対象画像特徴量２０２のベクトル間距離を計算する。そして、ベクトル間距離の小さいものから昇順に、検索データＩＤ２０１を並べる。 Next, the similarity search unit 605 calculates a distance between vectors of the feature quantity vector as a search key and the search target image feature quantity 202 stored in the search database 150. Then, the search data IDs 201 are arranged in ascending order from the smallest vector distance.
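The ranking performed by the similarity search unit can be sketched as below. The description does not name a distance metric, so Euclidean distance is an assumption here, as are the function name `similar_search`, the dictionary layout, and the optional `limit` argument that caps the number of results.

```python
import math

def similar_search(query, registered, limit=None):
    """Rank registered feature vectors by Euclidean distance to the query.

    registered: dict mapping a search data ID to its d'-dimensional vector.
    Returns a list of (search_data_id, distance), nearest first.
    """
    ranked = sorted(
        ((data_id, math.dist(query, vec)) for data_id, vec in registered.items()),
        key=lambda pair: pair[1],
    )
    return ranked if limit is None else ranked[:limit]

registered = {1: [0.0, 0.0], 2: [3.0, 4.0], 3: [1.0, 0.0]}
results = similar_search([0.9, 0.0], registered, limit=2)
```

A linear scan like this is only practical for small databases; the index information mentioned for the feature amount management information 200 (clustering or hashing) would replace it at scale.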

最後に、検索結果出力部６０６では、並び替えられた検索データＩＤ２０１を元に検索結果をクライアント計算機１３０に出力する。例えば、検索データＩＤ２０１に画像データが紐付けられている場合は、画像データ列を出力する。 Finally, the search result output unit 606 outputs the search result to the client computer 130 based on the sorted search data ID 201. For example, when image data is associated with the search data ID 201, an image data string is output.

Through the above processing, the server computer 110 calculates a d′-dimensional feature vector for the search target image input from the client computer 130 and calculates the inter-vector distance between it and each search target image feature amount 202 in the search database 150. The server computer 110 then transmits the search data IDs 201 or the images to the client computer 130 as the search result, in ascending order of inter-vector distance. The number of search data IDs 201 or images that the server computer 110 transmits to the client computer 130 as the search result may be limited to a predetermined value or less.

Although the first embodiment has been described using face feature amounts extracted from detected face areas, anything that can be detected in an image can be the subject of a feature amount. For example, a person feature amount extracted from a person area, or the feature amount of an object other than a person, may be used.

Based on the above, the image search system of the first embodiment is characterized in that it detects a first object and a second object from an input image, extracts a first image feature amount of the first object and a second image feature amount of the second object, determines that the first object and the second object are different objects, generates a transformation matrix (discriminant matrix 300) that increases the variance B between the first image feature amount and the second image feature amount, which belong to different objects, and performs a search using the image feature amounts converted with the transformation matrix.

With these features, a transformation matrix that makes the inter-vector distance small between images of the same person and large between images of different persons can be generated without human intervention, which improves search accuracy. In addition, because the learning data used to create the transformation matrix can be collected automatically, the work of creating learning data is reduced and the operating cost of the system can be kept down.

以下、本発明の第２の実施例の画像検索システムについて、図７、図８に従って説明する。 The image search system according to the second embodiment of the present invention will be described below with reference to FIGS.

The image search system of the second embodiment is implemented on the same computer system as the image search system of the first embodiment. The block diagram showing the configuration, the explanatory diagram of the feature amount management information, the block diagram of the feature vector registration processing, and the block diagram of the search processing are the same as in the first embodiment.

図７は、第２の実施例の別人情報および同一人物情報の生成を示す模式図であり、図８は、第２の実施例の判別行列生成処理を示すブロック図である。 FIG. 7 is a schematic diagram showing generation of different person information and identical person information in the second embodiment, and FIG. 8 is a block diagram showing discrimination matrix generation processing in the second embodiment.

以下、図７と図８を使用して、第２の実施例の判別行列生成処理について説明する。 Hereinafter, the discriminant matrix generation process of the second embodiment will be described with reference to FIGS.

本実施例２では、サーバ計算機１１０は判別行列生成プログラム４００を実行することで、判別行列生成部として機能する。判別行列生成部は、図８で示すように、画像取得部８０１、顔検知処理部８０２、人物追跡部８０３、人物情報生成部８０４、特徴量抽出部８０５、クラス間分散計算部８０６、クラス内分散計算部８０７、判別行列生成部８０８、及び判別行列格納部８０９によって、判別行列３００の生成を実行する。 In the second embodiment, the server computer 110 functions as a discriminant matrix generation unit by executing the discriminant matrix generation program 400. As shown in FIG. 8, the discriminant matrix generation unit includes an image acquisition unit 801, a face detection processing unit 802, a person tracking unit 803, a person information generation unit 804, a feature amount extraction unit 805, an interclass variance calculation unit 806, an intraclass The discriminant matrix 300 is generated by the variance calculation unit 807, the discriminant matrix generation unit 808, and the discriminant matrix storage unit 809.

まず、画像取得部８０１では、サーバ計算機１１０が、カメラ１６０から通信基盤１２０を経由して、画像を取得する。画像取得部８０１では、前記実施例１と同様に、学習データとして画像を取得する。 First, in the image acquisition unit 801, the server computer 110 acquires an image from the camera 160 via the communication infrastructure 120. The image acquisition unit 801 acquires an image as learning data, as in the first embodiment.

なお、サーバ計算機１１０は、カメラ１６０から映像を取得した後に映像をデコードして、フレーム毎の画像を取得しても良い。また、カメラ１６０または他のカメラで撮影した画像もしくは映像を一旦クライアント計算機１３０に保存しておき、クライアント計算機１３０から通信基盤１２０を経由して、サーバ計算機１１０に画像もしくは映像を送信し、画像取得部８０１で受信しも良い。あるいは、学習データとして予め撮影した画像をＨＤ１１４に格納しておき、ＨＤ１１４から画像を取得（または入力）しても良い。 The server computer 110 may acquire the image for each frame by decoding the video after acquiring the video from the camera 160. In addition, an image or video captured by the camera 160 or another camera is temporarily stored in the client computer 130, and the image or video is transmitted from the client computer 130 to the server computer 110 via the communication infrastructure 120 to obtain an image. It may be received by the unit 801. Alternatively, an image captured in advance as learning data may be stored in the HD 114, and an image may be acquired (or input) from the HD 114.

Next, the face detection processing unit 802 executes face detection processing on the acquired image and obtains the face areas of the persons in the image. The face detection processing is the same as that of the face detection processing unit 402 shown in FIG. 4 of the first embodiment, and a well-known or publicly known technique may be applied.

次に、人物追跡部８０３では、連続したフレーム（画像）中に写った人物を追跡する。顔検知処理部８０２で複数の顔領域を検知した場合は、それぞれの顔領域を追跡する。人物追跡部８０３の顔領域の追跡は、異なるフレーム間で同一人物の顔領域を関連付けるもので、周知または公知の技術を用いればよいので、ここでは詳述しない。 Next, the person tracking unit 803 tracks a person captured in successive frames (images). When the face detection processing unit 802 detects a plurality of face areas, each face area is tracked. The tracking of the face area of the person tracking unit 803 associates the face area of the same person between different frames, and a well-known or publicly known technique may be used, and will not be described in detail here.

Next, for the face areas detected by the face detection processing unit 802, when a plurality of persons appear in an image acquired from a single camera 160, the person information generation unit 804 generates different-person information on the assumption that persons appearing at the same time are different persons. As the different-person information, a person ID may be assigned to each face area in the image, or only the fact that the persons are different may be retained. In addition, the face areas tracked by the person tracking unit 803 are regarded as belonging to the same person, and same-person information is generated. As the same-person information, a person ID may be assigned so that the face areas are grouped. In this way, the person tracking unit 803 identifies the first object or the second object as the same object (the same person) when it is the same across the plurality of images input to the server computer 110.

For example, referring to FIG. 7, when face areas (persons) 720A, 720B, and 720C appear in the image (frame) 710 acquired from the camera 160a, the persons 720A, 720B, and 720C are presumed to be different persons. The person tracking unit 803 therefore generates information indicating that the persons 720A, 720B, and 720C are different from one another. The same applies to the persons 740A, 740B, and 740C in the image 730 and to the persons 760A, 760B, and 760C in the image 750. Further, when the person tracking unit 803 tracks the persons 720A, 720B, and 720C of the image 710 through the images 730 and 750 and, as a result, detects 740A, 740B, and 740C in the image 730 and 760A, 760B, and 760C in the image 750, it generates information indicating that the persons 720A, 740A, and 760A are the same person, that the persons 720B, 740B, and 760B are the same person, and that the persons 720C, 740C, and 760C are the same person.

人物情報生成部８０４は、同一人物という情報や別人の情報として、例えば、顔領域の人物毎に識別子を付与し、同一人物には同一の識別子を付与し、別人には異なる識別子を付与すれば良い。 For example, the person information generation unit 804 assigns an identifier for each person in the face area as information about the same person or information about another person, assigns the same identifier to the same person, and assigns a different identifier to another person. good.

次に、特徴量抽出部８０５では、顔検知処理部８０２で検知した顔領域から、顔画像特徴量としてd次元の特徴量ベクトルを抽出する。顔画像特徴量は例えば、エッジパターンや色ヒストグラムに基づいて作成される多次元ベクトルである。なお、人物追跡部８０３と人物情報生成部８０４の処理と、特徴量抽出部８０５の処理は、並列して行っても良いし、どちらかを先に行っても良い。なお、特徴量ベクトルは前記実施例１の特徴量抽出部４０４と同様であり、周知または公知の技術を用いればよい。 Next, the feature quantity extraction unit 805 extracts a d-dimensional feature quantity vector as a face image feature quantity from the face area detected by the face detection processing unit 802. The face image feature amount is, for example, a multidimensional vector created based on an edge pattern or a color histogram. Note that the processing of the person tracking unit 803 and the person information generation unit 804 and the processing of the feature amount extraction unit 805 may be performed in parallel, or one of them may be performed first. Note that the feature quantity vector is the same as that of the feature quantity extraction unit 404 of the first embodiment, and a known or publicly known technique may be used.

次に、クラス間分散計算部８０６では、前記実施例１に示した（７）式に従って、顔領域から抽出した特徴量ベクトルを用いて、クラス間分散Ｂを計算する。 Next, the inter-class variance calculation unit 806 calculates the inter-class variance B using the feature quantity vector extracted from the face area according to the equation (7) shown in the first embodiment.

次に、クラス内分散計算部８０７では、次の（９）式に従って、顔領域から抽出した特徴量ベクトルを用いて、クラス内分散Ｗを計算する。 Next, the intra-class variance calculation unit 807 calculates the intra-class variance W using the feature quantity vector extracted from the face area according to the following equation (9).

Here, n_p is the number of persons tracked by the person tracking unit 803, p_j ≥ 2 is the number of face images (face areas) detected for the j-th person, x_ij is the feature vector extracted from the i-th face area of the j-th person, and x̄_j is the mean of the feature vector data.
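Equation (9) is not reproduced in this text. The sketch below assumes the usual scatter form: for each tracked person, sum the outer products of each face vector's deviation from that person's mean, with no normalization. The function name `scatter_about_group_means` and the sample values are illustrative only.

```python
import numpy as np

def scatter_about_group_means(groups):
    """Sum of (x - group mean)(x - group mean)^T over every vector in every group."""
    d = np.asarray(groups[0]).shape[1]
    S = np.zeros((d, d))
    for vectors in groups:
        X = np.asarray(vectors, dtype=float)  # shape (p_j, d)
        centered = X - X.mean(axis=0)         # subtract the group's mean vector
        S += centered.T @ centered            # adds the p_j outer products at once
    return S

# Two tracked persons, each with two face vectors (d = 2); the values are made up.
person_tracks = [
    [[1.0, 0.0], [3.0, 0.0]],
    [[0.0, 2.0], [0.0, 4.0]],
]
W = scatter_about_group_means(person_tracks)
```

Grouping the vectors by frame instead of by person would give a matrix of the same form, which is how this sketch would approximate the inter-class variance B, again as an assumption.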

次に、判別行列生成部８０８では、次の（１０）式を満たす固有ベクトル行列Ψ_BWと固有値行列Λ_BWを求める。 Next, the discriminant matrix generation unit 808 obtains an eigenvector matrix Ψ _BW and an eigenvalue matrix Λ _BW that satisfy the following equation (10).

BΨ_BW = WΨ_BWΛ_BW ... (10)

Here, Ψ_BW is the matrix whose column vectors are the eigenvectors ψ_BWi (i = 1, …, d), and Λ_BW is the matrix whose diagonal elements are the eigenvalues λ_BWi (λ_BW1 ≥ λ_BW2 ≥ … ≥ λ_BWd). The matrix Φ_BW = [Φ_BW1, Φ_BW2, …, Φ_BWd′], obtained by arranging the d′ eigenvectors ψ_BWi with the largest eigenvalues in descending order, is the discriminant matrix 300 with d columns and d′ rows. This yields a discriminant matrix 300 for which the inter-class variance B is large and the intra-class variance is small.
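Equation (10) is a generalized symmetric eigenproblem. One common way to solve it with NumPy alone is to whiten with the Cholesky factor of W, as sketched below. The function name `lda_discriminant_matrix` and the small `ridge` term, added so that W is positive definite, are assumptions of this sketch and are not taken from the description.

```python
import numpy as np

def lda_discriminant_matrix(B, W, d_prime, ridge=1e-8):
    """Top-d' solutions of B psi = lambda W psi, returned as rows (d' x d)."""
    d = B.shape[0]
    # A small ridge keeps W positive definite so the Cholesky factor exists (assumption).
    L = np.linalg.cholesky(W + ridge * np.eye(d))
    L_inv = np.linalg.inv(L)
    # Whitening turns the generalized problem into an ordinary symmetric one.
    C = L_inv @ B @ L_inv.T
    eigenvalues, V = np.linalg.eigh(C)
    order = np.argsort(eigenvalues)[::-1][:d_prime]
    Psi = L_inv.T @ V[:, order]  # map back: psi = L^{-T} v
    return Psi.T, eigenvalues[order]

B = np.array([[4.0, 0.0], [0.0, 1.0]])
W = np.array([[1.0, 0.0], [0.0, 2.0]])
Phi, lam = lda_discriminant_matrix(B, W, d_prime=1)
```

The direction kept in this toy case is the first axis, where the ratio of inter-class to intra-class variance (4 to 1) is largest.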

最後に、判別行列格納部８０９では、上記算出された判別行列３００を判別行列情報１４０に格納する。 Finally, the discriminant matrix storage unit 809 stores the calculated discriminant matrix 300 in the discriminant matrix information 140.

以上により、複数の画像を入力した場合、異なるクラス（顔領域）間では分散Ｂが大きくなる変換行列（第１変換行列）を得るのに加え、同一のクラス（顔領域）内では分散Ｗが小さくなる変換行列（第２変換行列）を得ることができる。これにより、本実施例２では前記実施例１の効果に加えて、同一人物の検出精度を向上させることが可能となる。 As described above, when a plurality of images are input, in addition to obtaining a transformation matrix (first transformation matrix) in which the variance B increases between different classes (face regions), the variance W is obtained in the same class (face region). A smaller transformation matrix (second transformation matrix) can be obtained. Thereby, in the second embodiment, in addition to the effects of the first embodiment, it is possible to improve the detection accuracy of the same person.

The inter-class variance calculation unit 405 of the first embodiment calculates the inter-class variance B using feature vectors extracted from the face images (face areas) in a single image. In the third embodiment, the inter-class variance B need not be calculated from only the face images in a single image; as shown in the second embodiment, each face image (face area) may be tracked, and the plurality of face images regarded as the same person may be used in the calculation.

本実施例３では、次の（１１）式に従って、顔領域から抽出した特徴量ベクトルを用いて、クラス間分散Ｂを計算する。 In the third embodiment, the inter-class variance B is calculated using the feature vector extracted from the face area according to the following equation (11).

Here, n_f is the total number of frames in the learning data, and the number of classes c_j ≥ 2 is the number of face images (face areas) detected in the j-th frame image. Further, y_ij is the mean of the feature vectors extracted from the i-th face area of the j-th frame image and from the other face images regarded, as a result of tracking, as the same person, and ȳ_j is the mean of the feature vectors y_ij.
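Equation (11) is not reproduced in this text. A form consistent with the definitions above is given here as an assumption, with the per-person averaged vectors y_ij taking the place of individual feature vectors and with no normalization constant:

```latex
B = \sum_{j=1}^{n_f} \sum_{i=1}^{c_j}
    \left( y_{ij} - \bar{y}_j \right) \left( y_{ij} - \bar{y}_j \right)^{\mathsf T},
\qquad
\bar{y}_j = \frac{1}{c_j} \sum_{i=1}^{c_j} y_{ij}
```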

That is, when a plurality of images 710, 730, and 750 are input as learning data as shown in FIG. 7, the server computer 110 determines, for example, that the face area 720A of the image 710, the face area 740A of the image 730, and the face area 760A of the image 750 belong to the same person. Then, as described above, the server computer 110 calculates the inter-class variance B using the mean of the feature vectors of the three face areas 720A, 740A, and 760A.

以上のように、複数のフレーム（画像）で同一人物と見なされた顔領域の特徴量ベクトルの平均値からクラス間分散Ｂを演算することで、クラス間分散Ｂが大きくなるような判別行列３００の精度を向上させることが可能となる。なお、複数の画像は、連続した画像あるいは所定時間毎の画像であればよい。 As described above, the discriminant matrix 300 that increases the interclass variance B by calculating the interclass variance B from the average value of the feature amount vectors of the face regions regarded as the same person in a plurality of frames (images). It is possible to improve the accuracy. The plurality of images may be continuous images or images every predetermined time.

<Modification>
The inter-class variance calculation unit 405 of the first embodiment calculates the inter-class variance B using feature vectors extracted from the face images (face areas) in a single image; however, the inter-class variance B may also be calculated using the feature vectors of face areas in different images.

For example, when the images 710, 730, and 750 are input as learning data as shown in FIG. 7, the server computer 110 recognizes, on the basis of the second embodiment, that the face area (person) 720A of the image 710, the face area 740B of the image 730, and the face area 760C of the image 750 belong to different persons. The server computer 110 then calculates the inter-class variance B using the feature vectors of the three face areas 720A, 740B, and 760C.

以上のように、複数のフレーム（画像）で別人と見なされた顔領域の特徴量ベクトルからクラス間分散Ｂを演算することで、クラス間分散Ｂが大きくなるような判別行列３００の精度を向上させることが可能となる。 As described above, the accuracy of the discriminant matrix 300 that increases the inter-class variance B is improved by calculating the inter-class variance B from the feature vector of the face area regarded as a different person in a plurality of frames (images). It becomes possible to make it.

なお、本発明において説明した計算機等の構成、処理部及び処理手段等は、それらの一部又は全部を、専用のハードウェアによって実現してもよい。 The configuration of the computer, the processing unit, the processing unit, and the like described in the present invention may be partially or entirely realized by dedicated hardware.

The various software exemplified in the embodiments can be stored in various recording media, such as electromagnetic, electronic, and optical media (for example, non-transitory storage media), and can be downloaded to a computer through a communication network such as the Internet.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments have been described in detail for ease of understanding, and the invention is not necessarily limited to configurations having all of the elements described.

110 server computer
111 external interface
112 CPU (central processing unit)
113 memory (main storage device)
114 HD (large-capacity external storage device)
120 communication infrastructure
130 client computer
140 discriminant matrix information
150 search database
160a to 160n cameras
200 feature amount management information
401 image acquisition unit
402 face detection processing unit
403 person information generation unit
404 feature amount extraction unit
405 interclass variance calculation unit
406 discriminant matrix generation unit
407 discriminant matrix storage unit
501 image acquisition unit
502 face detection processing unit
503 feature amount extraction unit
504 feature amount conversion unit
505 feature amount storage unit

Claims

An image search method for searching for an image with a computer having a processor and a memory, the method comprising:
a first step in which the computer detects a first object and a second object from an input image;
a second step in which the computer extracts a first image feature quantity of the first object and a second image feature quantity of the second object;
a third step in which the computer determines that the first object and the second object are different objects;
a fourth step in which the computer generates a transformation matrix such that a variance between the first image feature quantity and the second image feature quantity of the different objects increases; and
a fifth step in which the computer transforms an image feature quantity using the transformation matrix and then stores the transformed image feature quantity.

The image search method according to claim 1,
further comprising a sixth step in which the computer receives an image to be searched for and searches for the received image using the image feature quantity transformed by the transformation matrix.
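
As an illustration of the search step recited above, the following sketch transforms a query feature with the same matrix used for the stored features and ranks the stored features by distance. It is not part of the claim. The function name `search`, the use of Euclidean distance, the toy matrix, and the identifiers `face-1` and `face-2` are assumptions made for the example.

```python
import numpy as np

def search(query_feature, transform, stored, top_k=3):
    """Transform the query, then rank stored (already transformed)
    features by Euclidean distance, nearest first."""
    q = transform @ np.asarray(query_feature, dtype=float)
    ids = list(stored)
    db = np.stack([stored[i] for i in ids])
    dist = np.linalg.norm(db - q, axis=1)
    order = np.argsort(dist)[:top_k]
    return [(ids[i], float(dist[i])) for i in order]

# Toy transformation that keeps the first two of three feature dimensions.
transform = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])

# Stored features are transformed once, at registration time.
stored = {
    "face-1": transform @ np.array([1.0, 0.0, 5.0]),
    "face-2": transform @ np.array([0.0, 1.0, -3.0]),
}
results = search([0.9, 0.1, 0.0], transform, stored)
```

Because the query and the stored features pass through the same matrix, distances are compared in the transformed space only.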

The image search method according to claim 1, wherein
the first step includes detecting a first object and a second object from each of a plurality of input images, and
the third step includes:
determining that the first object and the second object in the same image are different objects; and
identifying, for the first object or the second object, the same object across the plurality of images.

The image search method according to claim 3, wherein
the fourth step includes:
generating a first transformation matrix that increases the variance between the first image feature quantity and the second image feature quantity of the different objects; and
generating a second transformation matrix that reduces the variance, between images, of the first image feature quantity or the second image feature quantity of the object identified as the same object in the plurality of images.
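
The relationship between the two matrices recited above can be illustrated with a two-stage construction from classical discriminant analysis. This is an assumption made for illustration and is not taken from the claim. The names `whitening_matrix` and `top_directions`, the regularization term `eps`, and the toy scatter matrices W and B are hypothetical.

```python
import numpy as np

def whitening_matrix(W, eps=1e-6):
    """Hypothetical 'second matrix': rescales features so that the
    same-object scatter W becomes approximately the identity."""
    vals, vecs = np.linalg.eigh(W)
    return vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T

def top_directions(B, k):
    """Hypothetical 'first matrix': rows are the eigenvectors of B
    with the k largest eigenvalues."""
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order].T

# Toy scatter matrices (invented values).
W = np.array([[4.0, 0.0], [0.0, 1.0]])  # same object across images
B = np.array([[4.0, 0.0], [0.0, 3.0]])  # different objects

second = whitening_matrix(W)
first = top_directions(second @ B @ second.T, k=1)
combined = first @ second  # apply the second matrix, then the first
```

In this toy case, axis 0 has the larger variance between objects but also the larger variance for the same object, so after whitening the selected direction is axis 1.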

The image search method according to claim 3, wherein
the fourth step includes generating the transformation matrix from an average value of the first image feature quantities or the second image feature quantities of the object identified as the same object in the plurality of images.

The image search method according to claim 3, wherein
the fourth step includes, when the plurality of images include a first image and a second image each including a first object and a second object, generating the transformation matrix from the image feature quantity of the first object in the first image and the image feature quantity of the second object in the second image.

A server that has a processor and a memory and searches for an image, the server comprising:
a detection processing unit that detects a first object and a second object from an input image;
a feature amount extraction unit that extracts a first image feature amount of the first object and a second image feature amount of the second object;
an object information generation unit that determines that the first object and the second object are different objects; and
a transformation matrix generation unit that generates a transformation matrix such that a variance between the first image feature amount and the second image feature amount of the different objects increases.

The server according to claim 7,
further comprising a search unit that receives an image to be searched for and searches for the received image using the image feature amount transformed by the transformation matrix.

The server according to claim 7, wherein
the detection processing unit detects a first object and a second object from each of a plurality of input images, and
the object information generation unit determines that the first object and the second object in the same image are different objects, and identifies, for the first object or the second object, the same object across the plurality of images.

The server according to claim 9, wherein
the transformation matrix generation unit generates a first transformation matrix that increases the variance between the first image feature amount and the second image feature amount of the different objects, and generates a second transformation matrix that reduces the variance, between images, of the first image feature amount or the second image feature amount of the object identified as the same object in the plurality of images.

The server according to claim 9, wherein
the transformation matrix generation unit generates the transformation matrix from an average value of the first image feature amounts or the second image feature amounts of the object identified as the same object in the plurality of images.

The server according to claim 9, wherein
the transformation matrix generation unit, when the plurality of images include a first image and a second image each including a first object and a second object, generates the transformation matrix from the image feature amount of the first object in the first image and the image feature amount of the second object in the second image.

An image search system comprising:
a server having a processor and a memory; and
an imaging device connected to the server,
wherein the server includes:
a detection processing unit that detects a first object and a second object from an image input from the imaging device;
a feature amount extraction unit that extracts a first image feature amount of the first object and a second image feature amount of the second object;
an object information generation unit that determines that the first object and the second object are different objects; and
a transformation matrix generation unit that generates a transformation matrix such that a variance between the first image feature amount and the second image feature amount of the different objects increases.

The image search system according to claim 13, further comprising:
a client computer connected to the server; and
a search unit that receives a search target image from the client computer and searches for the received image using the image feature amount transformed by the transformation matrix.

The image search system according to claim 13, wherein
the detection processing unit detects a first object and a second object from each of a plurality of input images, and
the object information generation unit determines that the first object and the second object in the same image are different objects, and identifies, for the first object or the second object, the same object across the plurality of images.