JP2014092955A

JP2014092955A - Similar content search processing device, similar content search processing method and program

Info

Publication number: JP2014092955A
Application number: JP2012243341A
Authority: JP
Inventors: Ryoichi Kawanishi; 亮一川西; Keiji Icho; 圭司銀杏; Kaneto Ogawa; 兼人小川
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2012-11-05
Filing date: 2012-11-05
Publication date: 2014-05-19

Abstract

[Problem] To provide a similar content search processing device that can make similarity determinations that effectively match the user's intention, in accordance with the content the user holds.
[Solution] The similar content search processing device acquires a content group to be searched; receives at least one item of content data as the query source for the search; extracts attribute information from the acquired content or the input content; calculates difference information between the attribute information of the input content and the attribute information of at least one acquired content group; determines, based on the calculated difference information, a similarity criterion that serves as the basis for calculating similarity between contents; calculates the similarity between contents based on the determined criterion; and selects content that satisfies a predetermined condition based on the calculated similarity.
[Selected drawing] Figure 1

Description

The present invention relates to a content processing technique that enables a similar content search that takes the substance of the content into account, based on attribute information of a content group held by a user.

In recent years, DSCs (digital still cameras) that capture subject images, digital single-lens reflex cameras, digital movie cameras, and the cameras in mobile phones such as smartphones and in tablet terminals have become widespread, making casual photography and video recording possible. Recording media for storing content data have grown in capacity, and the server capacity of cloud services and the like continues to expand. In addition, with the development of social media, a wide variety of people share personal content with one another.

As a result, it has become natural for individual users to hold and share enormous amounts of content. However, to find the content needed at the time of use within that enormous collection, or to share and enjoy only the necessary content with other users, the content must be organized in advance, for example by manual tagging, which requires a great deal of time and effort. Opportunities are also increasing to use content data such as photos and videos for business promotion, and to use content as a menu or a list of actual items when purchasing food, clothing, furniture, and the like. The same problem arises in these cases.

For this reason, similarity search processing techniques are attracting attention. These techniques use feature quantities obtained from content to collect only the content that looks similar to content specified by the user, so that the user can efficiently find and use the desired content.

Techniques also exist for making similarity determinations that more effectively match the user's intention in similar image search over a large number of content images (see, for example, Patent Document 1 and Patent Document 2).

Patent Document 1: JP 2012-58940 A
Patent Document 2: JP 2011-203776 A

However, the conventional techniques described above have been required to perform similarity determination that matches the user's intention even more effectively.

A main object of the present invention is to provide a similar content search processing device that can make similarity determinations that effectively match the user's intention, in accordance with the content the user holds.

To solve the conventional problem described above, a similar content search processing device according to the present invention comprises: content data acquisition means for acquiring a content group to be searched; query information input means for inputting at least one item of content data serving as the query source for a search; content information extraction means for extracting attribute information of the content acquired by the content data acquisition means or of the content input by the query information input means; similarity criterion determination means for calculating difference information from the attribute information, extracted by the content information extraction means, of the content input by the query information input means and of at least one content group acquired by the content data acquisition means, and for determining, based on the calculated difference information, a similarity criterion that serves as the basis for calculating the similarity between the contents; similarity calculation means for calculating the similarity between the contents based on the similarity criterion determined by the similarity criterion determination means; and matching content selection means for selecting content that satisfies a predetermined condition based on the similarity calculated by the similarity calculation means.

With this configuration, similarity can be calculated according to differences in the attribute information of the content data, even for diverse content data. A similar content search can therefore be performed on the search target content group using a similarity suited to the content input as the search query.

According to the similar content search processing device and method of the present invention, the similarity to be calculated is determined, for the diverse content groups held by a user or for content groups intended for a specific use, based on the attribute information of the content serving as the search query and the attribute information of each content group to be searched. Content can therefore be searched with a similarity that fits the user's search purpose or search intention.

Block diagram of the similar content search processing device and method according to Embodiments 1 to 3 of the present invention
Flowchart showing the procedure of the similar content search control processing of Embodiment 1 of the present invention
Diagram showing an example of device metadata information in the content information extracted from a content group
Diagram showing an example of analysis metadata information in the content information extracted from a content group
Diagram showing an example of usage metadata information in the content information extracted from a content group
Block diagram showing the functional configuration of the similarity criterion determination means 4 of Embodiment 1 of the present invention
Flowchart showing the procedure of the similarity criterion determination processing between contents in Embodiment 1 of the present invention
Diagram showing an example of the difference information calculated in Embodiment 1 of the present invention
Flowchart showing the procedure of the similarity criterion determination processing between contents based on statistical characteristic items in Embodiment 2 of the present invention
Block diagram showing the functional configuration of the attribute information comparison means 43 of Embodiment 2 of the present invention
Flowchart showing the procedure of the similarity criterion determination processing between contents for collation determination in Embodiment 3 of the present invention
Block diagram showing the functional configuration of the reference information determination means 44 of Embodiment 3 of the present invention
Block diagram of the similar content search processing device and method according to Embodiment 4 of the present invention
Flowchart showing the procedure of the similar content search control processing of Embodiment 4 of the present invention
Diagram showing an example of a method of displaying the results of searching while switching the similarity criterion one-dimensionally as the attribute information changes in Embodiment 4 of the present invention
Diagram showing an example of a method of displaying the results of searching while switching the similarity criterion two-dimensionally as the attribute information changes in Embodiment 4 of the present invention
Diagram showing an example of a display method for searching content while switching similarity criteria by folder structure in Embodiment 4 of the present invention

(Background leading to one embodiment of the present invention)
The inventors studied in detail techniques for making similarity determinations that more effectively match the user's intention in similar image search over a large number of content images.

First, similar image search techniques for a large number of content images are described in detail.

In the conventional techniques described above, content whose feature quantities are highly similar between the search source content and the search target content group is typically presented for browsing, so that the user can look for the desired content. The content feature quantities are basically the color tone or shape of the whole image, assigned keywords, feature quantities specified by the user, and the like. This lets the user perform a similar image search according to a fixed criterion, for example searching for images with the same color tone. Searching by a fixed criterion is effective when the user is looking for content in an unknown content group. However, when the user has already seen the content, has shot and holds it, or when the content group being searched has some meaningful structure, calculating similarity for all content uniformly by one specific criterion does not necessarily give the user a similarity search method that makes content easy to find or easy to recall. Also, unlike searching across all images on the Web, searching within a specific menu or list can be difficult with a fixed-criterion similarity determination.

Next, conventional techniques that aim to make similarity determinations more effectively matching the user's intention within such conventional similar image search are described in detail.

First, the technique of Patent Document 1 evaluates the similarity between an input image and each image in the DB using a plurality of determination criteria, each obtained by weighting and combining the differences of a plurality of feature values, and extracts from each criterion a group of images satisfying a predetermined condition. This makes it possible to calculate similarity using the feature values that are effective for the content the user wants. Specifically, to realize a similar image search suited to each user, a plurality of similarity criteria based on combinations of feature quantities are prepared, and a similar image search interface is provided in which the user can make selections while those criteria are retained. By determining the similarity criterion suited to the user from the content the user selects, the technique can efficiently narrow down to the similar content the user wants and present the search results. However, because the technique of Patent Document 1 basically performs similar image search based on the content specified by the user, it can find only images close to that content and can perform only similarity search processing under a fixed criterion.

The technique of Patent Document 2 uses meta information other than content feature quantities. For each content image it holds an image feature quantity and a text feature quantity based on text related to the image, and when the user specifies an image, it integrates the text feature similarity with the similarity to the specified image, making it possible to find the target group of similar content images among a large number of images. By converting more reliable text information into features and using it together with image features for similarity determination, the technique of Patent Document 2 can retrieve images that are similar linguistically as well as visually. However, it presupposes that text information has been assigned, and even then it can find only content that is similar as text under a fixed criterion.

The inventors examined the problems of the conventional techniques described above for making similarity determinations that more effectively match the user's intention.

In those conventional techniques, when searching for similar content, feature quantities are calculated and similarity is determined under a fixed criterion, or by a method that converges to a fixed criterion, on the premise that unknown content is being searched. The processing therefore searches all content equally for items with high similarity. This is because the similarity calculation is performed without taking into account the information carried by content groups that the user has already structured or by content groups that share some common characteristic. Such processing cannot be said to achieve a similarity determination appropriate to the kind of content a user holds. As a result, differences in the various kinds of metadata are not considered for content groups held by the user or held for a specific use, and the approach is insufficient as a similarity search method for finding and presenting similar content suited to the user.

Through the above study, the inventors found that similar content can be searched from diverse viewpoints, in a way that satisfies the user and reflects the differences among diverse content, by not relying on a content group similarity search that calculates similarity under a fixed criterion based at least on feature quantities calculated by content analysis or on text information assigned to the content, and by instead using the attribute information and user operation information of the content group held by the user or of the content group intended for a specific use. This finding led to the present invention.

Embodiments of the present invention are described below with reference to the drawings.

(Embodiment 1)
Embodiments of the present invention are described below with reference to the drawings. Embodiment 1 concerns a similar content search processing device that performs search control processing in which data groups such as images, videos, documents, and music, which are diverse content data held by a user or intended for a specific use, form the search target, and content data having a certain degree of similarity to the content data input as a search query is extracted and presented. Specifically, it concerns a search control mechanism that uses the attribute information of each item of content data to determine a similarity criterion suited to each content group to be searched, calculates similarity based on that criterion, and extracts similar content data from each content group. FIG. 1 is a block diagram showing the basic configuration of the similar content search processing device of the present invention. In FIG. 1, the similar content search processing device comprises a content data storage unit 1, query information input means 2, content information extraction means 3, similarity criterion determination means 4, similarity calculation means 5, and matching content selection means 6.

The content data storage unit 1 is a storage medium that stores the diverse content file data held by the user or held for a specific use. For example, it stores photographic images and video data shot at events or acquired by a business operator, as well as documents and music data. The storage medium is, for example, a large-capacity media disc such as an HDD or DVD, or a storage device such as a semiconductor memory.

The query information input means 2 inputs content data selected and entered by the user, or selected by the system, as query information, that is, the search source data. For example, data of the same type as the data stored in the content data storage unit 1 is selected.

The content information extraction means 3 acquires attribute information of the content data. This includes device metadata information automatically assigned by content acquisition devices such as mobile phones, smartphones, single-lens reflex cameras, DSCs, and DVCs; analysis metadata information assigned by content analysis methods; and usage metadata information assigned from user input, usage history, and the like.

Device metadata information includes EXIF (Exchangeable Image File Format) information, video metadata, music metadata, and the like. Typical obtainable information includes date and time information at the time of shooting; GPS (Global Positioning System) information, which is location information; shooting mode information and the parameters of the content acquisition device at the time of shooting, which describe the shooting method; various sensor information; and music feature information.

Analysis metadata information obtained by content analysis methods includes features extracted by image analysis, ranging from basic low-level image features such as edges, color, and texture to high-level features, called local features, that are specific to an object and represent its shape. It may also include recognition results for recognizable categories, such as subject objects in the image (faces, human bodies, objects), specific landscape scenes (seaside, forest, indoors), and shooting composition. For documents, there is information such as image placement and layout structure; for video, time-series motion and scene analysis information can be calculated; and for audio, analysis information such as speech information and musical melody can be calculated.

Usage metadata information includes information entered directly by people, such as event names, operation information, personal names, and the photographer, as well as usage history information such as how often the content is viewed, and service usage history information for sharing services, editing services, and the like.

The similarity criterion determination means 4 takes the search source content data input by the query information input means 2 and at least one content group, in a certain unit, acquired from the content data storage unit 1, and calculates their difference information using the attribute information extracted by the content information extraction means 3. Based on the calculated difference information, it determines the similarity criterion that serves as the basis for calculating similarity between contents. The criterion is determined by using the difference information to extract common items, non-common items, and their degrees, comparing them to identify the items for which similarity is meaningful, and thereby deciding which feature quantities to use and how to calculate them. For example, if the shooting time period is common to the contents being compared and the landscape scene is not, the scenes differ, so local features that are robust to the background region, together with their calculation method, can be chosen as the similarity criterion.
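The common/non-common split described above can be sketched as follows. This is only a minimal illustration, assuming attributes are held as flat dictionaries; the function names, the attribute keys (`time_of_day`, `scene`), and the criterion labels are hypothetical and do not come from the patent, and the rule in `choose_criterion` mirrors only the single example given in the text.

```python
def attribute_difference(query_attrs, group_attrs):
    """Split shared attribute keys into common and non-common items."""
    common, non_common = {}, {}
    for key in query_attrs.keys() & group_attrs.keys():
        if query_attrs[key] == group_attrs[key]:
            common[key] = query_attrs[key]
        else:
            # Keep both values so later steps can judge the degree of difference.
            non_common[key] = (query_attrs[key], group_attrs[key])
    return common, non_common


def choose_criterion(common, non_common):
    """Hypothetical rule: differing scenes call for background-robust local features."""
    if "scene" in non_common:
        return "local_features"
    return "global_features"


# Illustrative attributes only (made-up values).
query = {"time_of_day": "evening", "scene": "beach"}
group = {"time_of_day": "evening", "scene": "forest"}
common, non_common = attribute_difference(query, group)
criterion = choose_criterion(common, non_common)
```

A real system would presumably need a richer rule table and graded differences (for example, how far apart two shooting times are) rather than a plain equality test.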

The similarity calculation means 5 calculates content feature quantities according to the similarity criterion determined by the similarity criterion determination means 4 and then calculates the similarity between contents one by one. For example, to calculate the similarity of whole images, it computes global features such as the color, edge, and texture features of the whole image and calculates whole-image similarity by a method such as pyramid matching. To calculate local similarity, it computes local features, that is, features of image parts that are invariant, and calculates local similarity by a method such as geometric matching.
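As a rough illustration of a whole-image (global) similarity, the sketch below compares normalized color histograms by histogram intersection. This is a deliberately simplified stand-in, not the pyramid matching method named in the text, and the function names and the choice of 4 bins per channel are assumptions made for the example.

```python
def color_histogram(pixels, bins=4):
    """Build a normalized RGB histogram from (r, g, b) tuples with values 0..255."""
    step = 256 // bins
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1
        count += 1
    # Normalize so that images of different sizes are comparable.
    return [h / count for h in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the overlap between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Tiny synthetic "images" given as pixel lists (illustrative data only).
red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
half = red[:2] + blue[:2]
score = histogram_intersection(color_histogram(red), color_histogram(half))
```

Edge and texture features, and local features with geometric matching, would need an image processing library and are outside the scope of this sketch.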

Based on the similarity between contents calculated by the similarity calculation means 5, the matching content selection means 6 selects and extracts content whose similarity to the search query content satisfies a predetermined condition. For example, content whose similarity is equal to or greater than a threshold Th is selected as content that meets the condition.
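The threshold condition can be written in a few lines. The sketch below assumes similarities are held in a dictionary keyed by content ID; the function name and the ordering of results by descending similarity are illustrative choices, not something the text specifies.

```python
def select_matching(similarities, threshold):
    """Return IDs whose similarity is >= threshold, most similar first."""
    matched = [cid for cid, score in similarities.items() if score >= threshold]
    return sorted(matched, key=lambda cid: similarities[cid], reverse=True)


# Illustrative scores only; Th = 0.7 is an arbitrary example value.
scores = {"a": 0.9, "b": 0.4, "c": 0.7}
selected = select_matching(scores, threshold=0.7)
```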

The following describes in detail a method for similarity search over content data such as photos, videos, documents, and music, held by a user or intended for a specific use, based on similarity information between data. The difference between the attribute information of the content data serving as the query and that of the content data group to be searched is extracted as difference information and used to adaptively determine a similarity criterion matching the attribute differences between contents, and similarity is then calculated. Because similarity between diverse content data can thus be calculated with attribute differences taken into account, similar content search processing better suited to the characteristics of the content becomes possible.

When a user or system performs a content similarity search and specifies and inputs the content data of the query information, i.e., the search source, for the search target content data group, the content similarity criterion is determined in a way that reflects the differences from the attribute information of the search target content data, and the similar content search control processing is performed. When the search control processing starts, the attribute information of the input query content data is extracted, the attribute information of comparable content groups in a certain unit is extracted from the search target content data group, and the difference information between the contents is extracted. A similarity criterion suited to the attribute differences between contents is determined based on the extracted difference information. The similarity between contents is then calculated based on the determined criterion, and similar content satisfying a predetermined condition is selected. FIG. 2 is a flowchart showing the procedure for determining the content similarity criterion based on the attribute information of the query information and the search target content data, and then performing the similar content search control processing.
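The overall flow described above (extract attributes, take the difference per content group, pick a criterion, score, select) can be sketched as one function. This is a hedged outline only: the data layout (`attrs`, `data`, `items` keys), the callables passed in, and the toy similarity functions in the usage part are all assumptions made for the example, not the patent's implementation.

```python
def search_similar(query, groups, extract_attrs, decide_criterion,
                   similarity_fns, threshold):
    """Search each content group with a criterion chosen from attribute differences."""
    query_attrs = extract_attrs(query)
    hits = []
    for group in groups:
        group_attrs = extract_attrs(group)
        # Difference information: attributes whose values differ (or are missing).
        diff = {
            key: (query_attrs.get(key), group_attrs.get(key))
            for key in query_attrs.keys() | group_attrs.keys()
            if query_attrs.get(key) != group_attrs.get(key)
        }
        criterion = decide_criterion(diff)       # one criterion per content group
        similarity = similarity_fns[criterion]
        for item in group["items"]:
            score = similarity(query["data"], item)
            if score >= threshold:               # predetermined condition
                hits.append((item, criterion, score))
    return hits


# Toy usage: strings stand in for content, character sets for features.
def jaccard(a, b):
    return len(set(a) & set(b)) / len(set(a) | set(b))

fns = {"global": lambda a, b: 1.0 if a == b else 0.0, "local": jaccard}
query = {"attrs": {"scene": "beach"}, "data": "abc"}
groups = [
    {"attrs": {"scene": "beach"}, "items": ["abc", "abd"]},
    {"attrs": {"scene": "forest"}, "items": ["abd", "xyz"]},
]
hits = search_similar(
    query, groups,
    extract_attrs=lambda content: content["attrs"],
    decide_criterion=lambda diff: "local" if "scene" in diff else "global",
    similarity_fns=fns,
    threshold=0.5,
)
```

Passing the extraction, decision, and similarity steps in as callables keeps the sketch independent of any particular feature extractor, which seems closest to the separation of means 3 to 6 in the text.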

Content data serving as search query information is input by the user or the system through the query information input means, and the search control processing for similar content data starts (step S1). The user can input a search query by indirectly selecting content data shown on the display using an operating device such as a mouse or keyboard, or by directly selecting it using a touch panel or touch pad. A search query can also be selected by gaze detection using a camera or by pressure-sensitive selection using a floating touch panel. These operations can also be combined. The system can also input, as the first search query, content data chosen at random or content data likely to interest the user.

When the query content data is input, the search target content data is acquired from the content data storage unit 1, and the content information extraction means 3 extracts attribute information for both the query content and the search target content (step S2). The extracted attribute information includes device metadata information, analysis metadata information, usage metadata information, and the like.

FIG. 3 shows an example of device metadata information. Device metadata information is attribute information contained in the metadata attached by the content acquisition device. Examples of such metadata are EXIF (Exchangeable Image File Format) information, extended metadata for video, and music metadata such as CDDB information and ID3 tags. As attribute information in this metadata, for example, each content data item is assigned an ID number and has a file name, a data type, and an acquisition time. Image and video data additionally carry shooting time information, longitude and latitude obtained from GPS information as the geographical position at the time of shooting, and device parameter information such as ISO (International Organization for Standardization) sensitivity, which adjusts brightness at the time of shooting, exposure information, which adjusts brightness for proper viewing, the type of shooting device, and its shooting mode. Music data carries information such as sound quality and genre.

FIG. 4 shows an example of analysis metadata information, which is metadata obtained by content analysis techniques. Metadata obtained by image analysis ranges from low-dimensional features, the basic feature information of an image such as color, edge, and texture information, to high-dimensional features that can express the characteristics of a subject object. Color information can be calculated as in-image statistics of RGB color values, as hue information after conversion to the HSV or YUV color space, or as statistical information such as a color histogram or color moments. Edge information can be calculated, for example, as in-image statistics of detected line-segment edge features for each fixed angle range. High-dimensional features include features of local regions centered on characteristic points and features representing the shape of an object; specific examples are SIFT, SURF, and HOG. These features can be used to determine the similarity between contents and to detect identical images. In addition, the presence and number of faces are calculated as face information obtained from face detection technology, and image recognition information related to people, based on face size, clothing color and shape, person detection, and so on, may also be used. Results of image recognition technology can be used as well, such as vehicle detection, detection of pets such as dogs and cats, and generic object recognition. Furthermore, region information for a specific subject or background scene can be obtained with region extraction methods based on a Saliency Map or a Depth Map.
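As a rough illustration of the color histogram mentioned above, the following sketch quantizes RGB pixels into a normalized joint histogram. The function name `color_histogram`, the choice of 4 bins per channel, and the plain list-of-tuples pixel format are assumptions made for this example; the source does not specify how the histogram is built.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram for a list of (r, g, b) pixels."""
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each 0-255 channel value into one of bins_per_channel bins.
        ri, gi, bi = (min(int(c / step), bins_per_channel - 1) for c in (r, g, b))
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = len(pixels)
    # Normalize so that images of different sizes are comparable.
    return [h / total for h in hist] if total else hist


# Hypothetical 4-pixel image: three red pixels and one blue pixel.
pixels = [(255, 0, 0)] * 3 + [(0, 0, 255)]
hist = color_histogram(pixels)
```

Two such histograms can then be compared with a distance measure to judge whole-image similarity.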

FIG. 5 shows an example of usage metadata information. Usage metadata information is metadata attached through the use of content, such as information added by the user, usage history, and service usage history. Examples include operations that assign a shooting event name to content, and the people appearing in each content item or tag information that has been attached. Other examples are the number of times or frequency with which content has been played back or searched for on a display device, the parties it has been shared with, the services used, and the services that used the content, such as photo printing, packaging onto DVD, digital albums, and slideshows. Furthermore, when there are multiple acquired content data items and a set of data shot together is treated as a shooting event, an ID is assigned to each shooting event, and the above attribute information can be turned into statistical data calculated from combinations of information per shooting event, or kept and used as directly enterable usage metadata information.

Once the content attribute information has been extracted, the similarity criterion determination means 4 uses the attribute information of the query and of the search target content groups to determine the similarity criterion, that is, information specifying how the similarity between contents is to be calculated (step S3). An example of the functional configuration of the similarity criterion determination means 4, and the specific processing it performs, are described below with reference to FIG. 6.

FIG. 6 is a functional block diagram of the similarity criterion determination means 4. The similarity criterion determination means 4 comprises a query attribute extraction means 41, a target group attribute extraction means 42, an attribute information comparison means 43, and a reference information determination means 44. The query attribute extraction means 41 extracts, from the attribute information extracted for the input query content, only the attribute information to be used. The target group attribute extraction means 42 extracts only attribute information that can be compared with the attribute information extracted by the query attribute extraction means 41. The attribute information comparison means 43 compares the extracted attribute information item by item in a form that yields a difference value, and thereby extracts computable difference information. The reference information determination means 44 then determines, as the similarity criterion, the features and the calculation method to be used for the similarity between contents, based on the difference information calculated by the attribute information comparison means 43.

The basic processing for determining the similarity criterion used to calculate the similarity between the query content and the search target content groups is described in detail with reference to FIG. 7. FIG. 7 is a flowchart showing the procedure for determining that criterion from the attribute information of the query content and of the search target content groups.

When the query content and its attribute information are input, the query attribute extraction means 41 extracts the attribute information to be used as the comparison basis when calculating difference information (step S11). This attribute information consists of one or more comparable metadata items among the attribute information extracted by the content information extraction means 3. It may be extracted from the attribute information of the content itself or from the content group to which the content belongs. Here, the content group to which a content belongs is a group of contents that can be regarded as one set on the basis of attribute information: for example, the contents sorted into an event folder created by the user, the contents of a single day, the contents shot at the same place within a certain range, or the contents judged to share the same shooting scene or composition.

Then, when the attribute information of the search target content groups is input, the target group attribute extraction means 42 extracts the attribute information to be used as the comparison basis when calculating difference information (step S12). This is information with the same content as the attribute information extracted by the query attribute extraction means 41, or information from which a difference can be calculated. A search target group, like the content group to which the query belongs, is a set of multiple contents that can be judged to form one content group of a fixed unit.

Next, using the attribute information extracted in steps S11 and S12, the attribute information comparison means 43 calculates difference information between comparable attribute items (step S13). As difference information, for example, the degree of match of scene information is calculated from the analysis metadata information. With the degree of match normalized to the range 0 to 1 from low to high, the difference can be expressed as a degree of match based on scene information: for example, 1 if both scenes are indoor, 0 for indoor versus outdoor, and 0.5 for outdoor versus waterside. A group of multiple contents contains multiple pieces of scene information; the scene information of each content can be weighted equally, and the count of each scene treated as the scene information of the content group. For example, if a group has 10 contents with three scene types, indoor, waterside, and outdoor, occurring 7, 1, and 2 times respectively, the counts are normalized so that the scene information contains those scenes in proportions 0.7, 0.1, and 0.2, and the degree of match is calculated according to those proportions. Difference information is calculated in this way for each search target group.
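The scene-based degree of match in step S13 can be sketched as follows. The pairwise values for indoor/outdoor (0) and outdoor/waterside (0.5) come from the text; the indoor/waterside value, the function names, and the rule of weighting pairwise matches by the product of the two groups' proportions are assumptions made for illustration only.

```python
from collections import Counter

# Pairwise degree of match between different scene types (0 to 1).
PAIR_MATCH = {
    frozenset(["indoor", "outdoor"]): 0.0,
    frozenset(["outdoor", "waterside"]): 0.5,
    frozenset(["indoor", "waterside"]): 0.0,  # assumption: not given in the text
}


def scene_match(a, b):
    """Degree of match between two scene labels; identical scenes match fully."""
    if a == b:
        return 1.0
    return PAIR_MATCH.get(frozenset([a, b]), 0.0)


def scene_distribution(scenes):
    """Normalize per-content scene labels into proportions for a content group."""
    counts = Counter(scenes)
    total = len(scenes)
    return {scene: count / total for scene, count in counts.items()}


def group_match(query_dist, target_dist):
    """Proportion-weighted degree of match between two scene distributions."""
    return sum(
        pq * pt * scene_match(sq, st)
        for sq, pq in query_dist.items()
        for st, pt in target_dist.items()
    )


# The 10-content group from the text: 7 indoor, 1 waterside, 2 outdoor.
target = scene_distribution(["indoor"] * 7 + ["waterside"] + ["outdoor"] * 2)
# A query consisting of a single outdoor content.
degree = group_match({"outdoor": 1.0}, target)
```

With these assumed values, an outdoor query matched against the example group gives 0.7 × 0 + 0.1 × 0.5 + 0.2 × 1 = 0.25.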

The calculation of the degree of match between metadata items is then repeated for every comparable attribute item subject to difference calculation, until all difference information has been calculated (step S14). The number and types of metadata items to calculate can be fixed in advance from the combination of metadata that the content information extraction means 3 is able to extract. Alternatively, they can be limited to the metadata items that the content groups actually hold and that are common to the query and the search target, or limited to items of high comparison priority among those that can be extracted from the metadata of a content set forming a fixed-unit content group.
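The loop of step S14, restricted to metadata items common to the query and the search target, might look like the sketch below. The dictionary representation of attribute information, the `matchers` table, and the optional `priority` filter are all hypothetical; they only illustrate the two restriction strategies described above.

```python
def comparable_items(query_attrs, target_attrs, priority=None):
    """Items present on both sides, optionally limited to high-priority items."""
    common = set(query_attrs) & set(target_attrs)
    if priority is not None:
        common &= set(priority)
    return sorted(common)


def all_match_degrees(query_attrs, target_attrs, matchers, priority=None):
    """Repeat the match calculation until every comparable item is processed."""
    degrees = {}
    for item in comparable_items(query_attrs, target_attrs, priority):
        if item in matchers:  # skip items with no defined comparison
            degrees[item] = matchers[item](query_attrs[item], target_attrs[item])
    return degrees


def exact(a, b):
    """Simplest possible matcher: 1 for equal values, 0 otherwise."""
    return 1.0 if a == b else 0.0


query = {"scene": "indoor", "event": "dinner", "iso": 400}
target = {"scene": "indoor", "event": "trip", "gps": (35.0, 135.0)}
degrees = all_match_degrees(query, target, {"scene": exact, "event": exact})
```

Here `iso` and `gps` are dropped because each appears on only one side, so only `scene` and `event` contribute difference information.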

Finally, the similarity determination method is decided from the calculated difference information (step S15).

FIG. 8 shows an example of calculated difference information. For the search target content group cID000001, the degrees of match for shooting time zone and scene are high. When, for example, the mean of all calculated degrees of match is high and their variance is low, the group is likely to contain contents that match as a whole. Therefore, low-dimensional features such as color, edge, and texture are calculated as global features, difference information is calculated per pixel or per fixed region of a pyramid hierarchy, and similarity is judged by closeness of distance using a distance measure such as Euclidean distance or N2 norm distance. Selecting this method means extracting contents that are similar as a whole.
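Judging similarity by closeness of distance can be sketched with the Euclidean distance from the Python standard library. The feature vectors below are made-up global features (for instance, coarse color proportions), and `rank_by_distance` is a hypothetical helper, not something named in the source.

```python
import math


def rank_by_distance(query_vec, candidates):
    """Return (content_id, distance) pairs sorted from nearest to farthest."""
    scored = [(cid, math.dist(query_vec, vec)) for cid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical global feature vectors for two candidate images.
candidates = {"img1": [0.7, 0.2, 0.1], "img2": [0.1, 0.1, 0.8]}
ranking = rank_by_distance([0.6, 0.3, 0.1], candidates)
```

The nearest candidate is treated as the most similar; `math.dist` requires Python 3.8 or later.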

For cID000002, the degrees of match for shooting location, composition, and so on are moderate. When, for example, both the mean and the variance of all calculated degrees of match are moderate, the group is likely to contain contents that match partially. Therefore, high-dimensional features capable of judging partial matches are calculated as local features, and similarity is judged by a method that evaluates partial similarity, such as a method that finds local matches using BoVW (Bag of Visual Words) or a matching method that determines whether the same subject is included.

For cID000003, the degree of match for person information is high but that for event name is low. When, for example, both the mean and the variance of all calculated degrees of match are low, the group is unlikely to contain contents that are similar as a whole, but is likely to contain contents in which only the people match. Therefore, features effective for face and human body detection are calculated, and similarity is judged by a special-subject method, such as match determination based on facial features or on clothing information. In this way, the method can be switched so that the higher the degree of match of the difference information, the more the similarity judgment is based on the content as a whole, and the lower the degree of match, the more it is based on local parts of the content.
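The three cases above amount to choosing a method from the mean and variance of the degrees of match. The sketch below shows one way this switch could be written; the numeric thresholds (`high`, `low`, `low_var`), the item names, and the returned labels are assumptions, since the text gives only qualitative levels.

```python
from statistics import mean, pvariance


def choose_criterion(match_degrees, high=0.7, low=0.4, low_var=0.05):
    """Pick a similarity method from per-item degrees of match (0 to 1)."""
    values = list(match_degrees.values())
    m, v = mean(values), pvariance(values)
    if m >= high and v <= low_var:
        # Uniformly high match: compare global low-dimensional features.
        return "global"
    if m < low and match_degrees.get("person", 0.0) >= high:
        # Little overall match but people agree: use face/person features.
        return "special_subject"
    # Otherwise look for partial matches with local features (e.g. BoVW).
    return "local"


c1 = choose_criterion({"time_zone": 0.9, "scene": 0.8, "place": 0.85})
c2 = choose_criterion({"place": 0.5, "composition": 0.5, "time_zone": 0.6})
c3 = choose_criterion({"person": 0.9, "event": 0.1, "scene": 0.1, "time_zone": 0.1})
```

The three calls loosely mirror cID000001, cID000002, and cID000003 with invented values.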

Once the similarity criterion determination means 4 has determined the similarity criterion, the similarity calculation means 5 extracts, according to that criterion, the features needed to measure the similarity of the target content data, and calculates the similarity between content data using the specified calculation method (step S4). That is, features are calculated from the contents by the method determined by the reference information determination means 44, and the similarity between contents is calculated from them.

When the similarity calculation means 5 outputs the calculated similarities, the matched content selection means 6 selects and extracts, from the search target content groups, the contents whose similarity satisfies a predetermined condition (step S5). For example, the target contents are selected by extracting all contents whose degree of match is at or above a threshold Th, or by extracting the N contents with the highest degree of match. The condition can also be changed for each fixed-unit content group with a different similarity criterion, so that an appropriate set of contents is extracted.
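The two selection conditions of step S5, a threshold Th and the top N, can be sketched as below. The function names and the mapping from content ID to similarity are illustrative assumptions.

```python
def select_by_threshold(similarities, th):
    """All content IDs whose similarity is at or above the threshold Th."""
    return [cid for cid, s in similarities.items() if s >= th]


def select_top_n(similarities, n):
    """The N content IDs with the highest similarity, best first."""
    return sorted(similarities, key=similarities.get, reverse=True)[:n]


# Hypothetical similarities output by the similarity calculation step.
similarities = {"a": 0.9, "b": 0.4, "c": 0.7}
above = select_by_threshold(similarities, 0.7)
best_two = select_top_n(similarities, 2)
```

Using a different `th` or `n` per content group corresponds to changing the condition for groups with different similarity criteria.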

As described above, rather than searching for similar content with a fixed similarity criterion based on specific features obtainable from the content, the device decides which features and similarity items to calculate after taking into account the differences in the various metadata that form the attribute information of the contents, and thereby dynamically changes the similarity criterion applied between contents. This enables similar content search control suited to the diverse content data held by a user or intended for a specific application, so that the user can effectively search the held data by similarity items that are more satisfying, and can explore and view it easily.

(Embodiment 2)
An embodiment of the present invention is described below with reference to the drawings. Embodiment 2 concerns a similar content search processing device that takes as its search target groups of data such as images, videos, documents, and music, that is, diverse content data held by a user or intended for a specific application, and that extracts and presents content data having a certain similarity to content data input as a search query. Specifically, it concerns a search control mechanism that extracts statistically dominant information from the attribute information of each content group, determines from that statistical information the similarity criterion to be calculated between contents, and extracts content data similar to each content group. In this embodiment, unless otherwise noted, components having the same functions as in Embodiment 1 are given the same reference numerals, and their description is omitted because the earlier description applies.

In this embodiment, so that the user can easily search by similarity, or explore and view, held content data such as photos, videos, documents, and music in the way the user intends, the attribute information of the query content and of the search target content groups is statistically analyzed and used. This allows the difference between contents to be calculated from appropriate attribute items with fewer errors. Search control for similar content is then performed adaptively, in a manner suited to the differences in shooting conditions between contents. The following describes in detail a method that, through this processing, enables search control for similar content using a similarity criterion more closely linked to the overall differences between content groups, for the diverse content groups held by a user or intended for a specific application.

Search control processing for similar content data starts when content data serving as search query information is input through the query information input means by the user or the system. Once the processing starts, in addition to the processing of Embodiment 1, statistically characteristic items are extracted from the attribute information of the query or of the search target contents; difference information between contents is then extracted according to those items, and a similarity criterion suited to the differences between contents is determined. The similarity between contents is calculated according to the determined criterion, and similar content satisfying a predetermined condition is selected. The various metadata that form the extracted attribute information are the same as in Embodiment 1. Item units that are calculated statistically include, in particular, shooting date and shooting time zone, shooting location and shooting range, people appearing and family members, shooting scene and composition, object tags relating to the subject, and motion information within the frame.

Once the content attribute information has been extracted, the similarity criterion determination means 4 takes the attribute information of the query and of the search target content groups, calculates the statistically characteristic items from that attribute information, and determines from the difference information based on those characteristic items the similarity criterion, that is, information specifying how the similarity between contents is to be calculated. The processing that determines the similarity criterion from the statistical characteristic items of the attribute information of the query content and the search target content groups is described in detail with reference to FIG. 9. FIG. 9 is a flowchart showing the procedure for calculating statistical characteristic items from the attribute information of the query content and the search target content groups, and for determining the similarity criterion from the difference information of those items.

When the query content with its attribute information and the search target content groups with their attribute information are input, the content information extraction means 3 extracts the attribute information of the contents (step S21).

Next, characteristic items are extracted from the attribute information obtained in step S21, based on a statistical analysis of the attribute information group (step S22).

An example of the functional configuration of the attribute information comparison means 43, and the specific processing it performs, are described here with reference to FIG. 10. FIG. 10 is a functional block diagram of the attribute information comparison means 43, which comprises a statistical information calculation means 431, a characteristic item determination means 432, and a characteristic item comparison means 433. The statistical information calculation means 431 calculates statistical information, in units that permit statistical analysis, from the multiple pieces of attribute information extracted from a fixed-unit set of contents, namely the set to which the input query content belongs or a set that is a search target. The statistical information consists of statistically computable values, such as the simple data count or distribution for each attribute item, or mean and variance values. The characteristic item determination means 432 determines the characteristic items from the attribute information, based on the statistical information calculated by the statistical information calculation means 431.

For example, once the content rate, that is, the proportion of all contents that contain each type of metadata, has been calculated, a metadata item can be used as a characteristic item if its content rate exceeds 10% and left unused otherwise. If there are 100 contents in total and faces are detected in 20 of them, face detection information is selected as a characteristic item. The principal items can also be extracted with a statistical analysis technique such as principal component analysis.
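The content-rate rule can be sketched as follows, using the 10% threshold and the 20-out-of-100 face example from the text. Representing each content as a dictionary of the metadata items it holds, and the name `characteristic_items`, are assumptions for illustration.

```python
from collections import Counter


def characteristic_items(contents, min_rate=0.10):
    """Metadata items held by more than min_rate of all contents."""
    total = len(contents)
    if total == 0:
        return []
    # Count, for each metadata item, how many contents contain it.
    counts = Counter(item for attrs in contents for item in attrs)
    return sorted(item for item, c in counts.items() if c / total > min_rate)


# 100 contents: 20 with detected faces, 5 with GPS data, 75 with neither.
contents = [{"face": 1}] * 20 + [{"gps": (35.0, 135.0)}] * 5 + [{}] * 75
items = characteristic_items(contents)
```

Face detection has a content rate of 20% and is kept, while GPS at 5% is dropped.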

The characteristic item comparison means 433 compares each characteristic item determined by the characteristic item determination means 432 in a form that yields a difference value, and thereby extracts computable difference information (step S23). The types of difference information are the same as in Embodiment 1, but are determined by the combination of items restricted to the characteristic items. The characteristic items may be taken only from the attribute information of the query, only from the attribute information of the search target groups, or restricted to the items common to both by combining them.

Next, the calculation of the degree of match between metadata items is repeated for every comparable characteristic item subject to difference calculation, until all difference information has been calculated (step S24).

The number and types of metadata items to calculate are the same as in Embodiment 1. Finally, the similarity determination method is decided from the calculated difference information (step S25). The features to calculate, and the distance calculation method based on differences between features that is used for similarity determination, may be the same as in Embodiment 1; alternatively, the similarity can be calculated using the degree of match of the characteristic items in each piece of attribute information.

Let the characteristic items be P_k (k is the index of the characteristic item), and let M_k be the degree of match for each characteristic item. Suppose that the feature amounts selected from the characteristic items are those for judging an overall match degree, a local match degree, and a special match degree (for example, faces or pets). Let V be the resulting similarity value, G the overall match degree calculated from the selected feature amounts, L the local match degree, S the special match degree, and α, β, and γ the weights of the respective terms. Then V can be calculated as V = α × G + β × L + γ × S. Here, α is calculated from ΣM_k, the sum of the degrees of match of the characteristic items P_k that belong to the overall match degree. β is calculated in the same way from the characteristic items that belong to the local match degree, and γ from those that belong to the special match degree. A characteristic item may belong to more than one match-degree category, or may be used only when its degree of match is high. The kinds of match degree are not limited as long as they can be determined from the characteristic items; a center match degree, a background match degree, or multi-level region-by-region partial match degrees are also conceivable.
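The weighted sum V = α × G + β × L + γ × S can be sketched as follows. This is a minimal illustration, not the patented implementation: the source only says that each weight is "calculated from" the sum of the relevant M_k, so normalizing the three sums so that they add up to 1 is an assumption, as are the category assignments in `CATEGORY_ITEMS` and all function names.

```python
from typing import Dict, Tuple

# Hypothetical assignment of characteristic items to match-degree categories.
CATEGORY_ITEMS = {
    "global": ("scene",),
    "local": ("object_tag", "time_of_day"),
    "special": ("person",),
}


def weights_from_match_degrees(
    match_degrees: Dict[str, float],
    category_items: Dict[str, Tuple[str, ...]] = CATEGORY_ITEMS,
) -> Tuple[float, float, float]:
    """Return (alpha, beta, gamma) from the per-category sums of M_k."""
    sums = {cat: sum(match_degrees.get(item, 0.0) for item in items)
            for cat, items in category_items.items()}
    total = sum(sums.values())
    if total == 0:
        # No usable characteristic item: fall back to equal weights (assumption).
        return (1 / 3, 1 / 3, 1 / 3)
    # Assumed normalization so that the three weights sum to 1.
    return (sums["global"] / total, sums["local"] / total, sums["special"] / total)


def similarity(g: float, l: float, s: float,
               weights: Tuple[float, float, float]) -> float:
    """V = alpha * G + beta * L + gamma * S."""
    alpha, beta, gamma = weights
    return alpha * g + beta * l + gamma * s
```

With only scene, object tag, and time of day available, the special match degree receives zero weight and does not affect V.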

Once the similarity criterion determination unit 4 has determined, on the basis of the characteristic items, the similarity criterion for calculating similarity, the similarity calculation unit 5 uses that criterion to extract the feature amounts for measuring the similarity of the target content data, and calculates the similarity between content data with the specified similarity calculation method. When the similarity calculation unit 5 outputs the calculated similarity, the matched content selection unit 6 selects and extracts, from the content group to be searched, content that satisfies a predetermined condition, in particular content that is similar with respect to the characteristic items. For example, if only "scene" information is extracted as a characteristic item for content shot at the same "sea bathing" event, content that is similar as a whole is extracted preferentially, because a scene describes the content as a whole. If only "object tag" and "shooting time of day" information is extracted as characteristic items for content shot at the same "dinner party" event, content with high partial or semantic similarity is extracted preferentially, because the time of day and the subject objects indicate local, partial similarity of the content. If only "person appearing" and "shooting composition" information is extracted as characteristic items for content shot at the same "travel" event, content with high similarity in faces, persons, or overall partial regions is extracted preferentially, because persons and composition indicate a special person match and an overall similarity of parts.

As described above, rather than searching for similar content with a fixed similarity criterion determined from specific feature amounts obtainable from the content, the feature amounts and similarity items to be calculated are determined in view of the differences in content that are statistically extracted from the various metadata serving as attribute information of the content. The similarity criterion can thus be changed to one that is more appropriate for the contents being compared, so search control for similar content that is better suited overall to the diverse content data held by a user or intended for a specific application becomes possible. The user can run effective similarity searches over the held data using similarity items that better match the user's intent, and can easily browse and view the data.

(Embodiment 3)
An embodiment of the present invention is described below with reference to the drawings. Embodiment 3 concerns a similar content search processing apparatus that takes as its search target a group of data such as images, videos, documents, and music, that is, diverse content data held by a user or intended for a specific application, and performs search control processing that extracts and presents content data having a certain degree of similarity to content data input as a search query. Specifically, it concerns a search control mechanism that, for each content group, determines an identity determination range in addition to the similarity criterion to be calculated between contents, based on the differences in attribute information between the contents, and extracts from each content group the content data identical to the search query. In this embodiment, unless otherwise noted, components having the same functions as in Embodiment 1 are given the same reference numerals, and their description is omitted because the earlier description applies.

In this embodiment, so that the user can easily run similarity searches over, or browse and view, the photos, videos, documents, music, and other content data that the user holds in the way the user intends, the attribute information of the query content and of the content group to be searched is used to determine an identity determination range in addition to the difference information between contents. This makes it possible to judge whether two pieces of content are the same. Search control processing is then performed to collate identical content whose copies have come to differ because of differences in shooting or usage conditions. The following describes in detail a method that, through this processing, enables search control for identical content in the diverse content groups held by a user or intended for a specific application, using identity criteria suited to each content group.

Collation-oriented search control processing starts when content data serving as search query information is input, by the user or by the system, through the query information input unit 2. Once it starts, in addition to the processing of Embodiment 1, attribute information is extracted from the query information and from the content to be searched, the difference information between contents is extracted using that attribute information, and a collation-oriented similarity criterion and an identity determination range are determined. The similarity between contents is then calculated based on the determined similarity criterion, and collation-oriented search control is performed to select identical content that satisfies a predetermined condition, here the condition given by the identity determination range. The various metadata extracted as content attribute information are the same as in Embodiment 1 or 2. FIG. 11 is a flowchart showing the procedure for determining the collation-oriented similarity criterion from the attribute information of the query information and of the content data to be searched, and then performing the collation-oriented search control processing.

When the content to be used as the collation source is input, by the user or by the system, through the query information input unit 2, the collation-oriented search control processing starts (step S31). The collation source content may be selected and input by the user with an input device, or the system may acquire content from multiple devices or services and input it automatically.

When the collation source content is input, the content to be collated is acquired from the content data storage unit 1, and the content information extraction unit 3 extracts attribute information for both the collation source content and the content to be collated (step S32). The content attribute information extracted is the same as in Embodiment 1.

Once the content attribute information has been extracted, the similarity criterion determination unit 4 uses the attribute information of the collation source and of the content group to be collated to determine a similarity criterion and an identity determination range, which together specify how the similarity between contents is calculated and how identity is judged (step S33). That is, given the collation source content with its attribute information and the content group to be collated with its attribute information, as extracted by the content information extraction unit 3, the identity determination settings for collation are calculated from the extracted attribute information.

Assuming that the functional configuration of the similarity criterion determination unit 4 is the same as in Embodiment 1, an example of the functional configuration of the reference information determination unit 44 and its specific processing are described with reference to FIG. 12. FIG. 12 is a functional block diagram of the reference information determination unit 44, which comprises a processing information extraction unit 441, an identity determination item calculation unit 442, and an identity determination information determination unit 443.

The processing information extraction unit 441 uses attribute information specific to the input collation source content to calculate processing content information, that is, information on how the content may have been altered from the original content data. For example, processing content information is information from which it can be determined, directly or indirectly, from the editing information or service information used, that the data has been changed by a simple change of encoding method or by resizing. The degree of processing can also be calculated by using analysis metadata to estimate the likelihood that the content has been edited in whole or in part. A configuration that calculates and uses processing content information for a set of multiple contents to be collated, taken as a unit, is also conceivable.

The identity determination item calculation unit 442 determines identity determination items from the difference information, based on the processing content information extracted by the processing information extraction unit 441. For example, starting with the metadata whose difference information is largest, it calculates, based on the processing content information, information indicating the likelihood that the content data has been processed. From usage metadata, for instance, it calculates information such as the data having been resized to a fixed image size on storage because a different service was used, or a region having been changed by a processing filter or by trimming. When the analysis metadata shows that the degree of processing differs from region to region of the content, region information that divides the content by degree of processing is calculated.

The identity determination information determination unit 443 determines, in accordance with the identity determination items calculated by the identity determination item calculation unit 442, the feature amounts to be used as the similarity criterion for judging identity, the similarity determination method, and the identity determination range. For example, when the stored image sizes differ because of resizing, feature amounts that calculate the overall match degree of the images are used, and the range within which content can be judged identical is specified from the difference in content size. The range can be set to widen as the size difference grows and to narrow as it shrinks. When region information has been changed by a filter or by trimming, feature amounts that calculate the local match degree of the images are used, and the range within which content can be judged identical can be specified from the difference in region size, or set as a parameter linked to the degree of match between contents.
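The rule that the identity determination range widens with the size difference can be sketched as below. The source gives no formula, so the linear relation, the constants `base` and `gain`, the use of pixel area as the size measure, and the function names are all assumptions chosen for illustration.

```python
from typing import Tuple

Size = Tuple[int, int]  # (width, height) in pixels


def identity_range(query_size: Size, target_size: Size,
                   base: float = 0.05, gain: float = 0.30) -> float:
    """Tolerance on the similarity score; grows with the relative size difference."""
    q_area = query_size[0] * query_size[1]
    t_area = target_size[0] * target_size[1]
    # 0.0 when both areas are equal, approaching 1.0 as they diverge.
    size_diff = 1.0 - min(q_area, t_area) / max(q_area, t_area)
    return base + gain * size_diff


def is_identical(similarity: float, tolerance: float) -> bool:
    """Judge identity when the similarity lies within the tolerance of a perfect 1.0."""
    return similarity >= 1.0 - tolerance
```

Under these assumed constants, a copy downscaled from 4000 × 3000 to 1000 × 750 is accepted at a similarity of 0.8, while a same-size candidate with that score is rejected.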

Once the similarity criterion determination unit 4 has determined the similarity criterion and identity determination range for collation, the similarity calculation unit 5 uses the determined criterion to extract the feature amounts for measuring the similarity of the target content data, and calculates the similarity between content data with the specified similarity calculation method (step S34). That is, feature amounts are calculated from the content according to the method determined by the reference information determination unit 44, and the similarity between contents is calculated from them.

When the similarity calculation unit 5 outputs the calculated similarity, the matched content selection unit 6 uses the identity determination range determined by the similarity criterion determination unit 4 to select and extract, from the content group to be collated, content whose similarity satisfies the identity determination condition (step S35). For example, when there is content whose match degree falls within the identity determination range, that content can be managed as identical content. When there is none, it can be concluded that no identical content exists. When there are several, the result can be narrowed to the content with the highest match degree, or the candidates can be presented to the user so that one is chosen. For a stricter identity determination, a multi-stage configuration is also possible: first extract multiple candidate contents whose similarity is at or above a threshold Th, then calculate a more detailed match degree by geometric matching between contents using local feature amounts such as SIFT or SURF, and judge identity from that. For a more robust determination, identity may also be judged using multiple similarity criteria calculated in advance.
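The multi-stage configuration described above can be sketched as a coarse filter followed by a detailed check. This is only an outline under stated assumptions: `coarse` and `fine` are hypothetical scoring callables supplied by the caller (the detailed one could, for instance, wrap SIFT-based geometric matching), and `identity_min` is an assumed acceptance threshold standing in for the identity determination range.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def find_identical(query: T,
                   candidates: Iterable[T],
                   coarse: Callable[[T, T], float],
                   fine: Callable[[T, T], float],
                   th: float,
                   identity_min: float) -> Optional[T]:
    """Return the candidate judged identical to the query, or None."""
    # Stage 1: keep only candidates whose coarse similarity is at least Th.
    shortlist = [c for c in candidates if coarse(query, c) >= th]
    if not shortlist:
        return None
    # Stage 2: score the shortlist with the detailed matcher, keep the best.
    scored = [(fine(query, c), c) for c in shortlist]
    best_score, best = max(scored, key=lambda pair: pair[0])
    # Accept only if the detailed score is inside the identity range.
    return best if best_score >= identity_min else None
```

Running the expensive matcher only on the shortlist is the point of the two stages; returning a single best candidate corresponds to the "narrow to the highest match degree" option in the text.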

For example, multiple feature amounts and similarity determination methods that can judge matches in stages, from overall match to local match, are held, and identity is judged under multiple similarity criteria. If the content matches under all similarity criteria, it is judged to be the same as the original, with almost no processing applied. If it matches under at least one similarity criterion, an identity determination result can be calculated that takes into account how much processing has been applied, based on the number of matching criteria and the degree of match. Furthermore, an updating configuration is possible: tendencies in the processing applied by specific services or editing operations are extracted statistically from the usage metadata, and when statistically significant information exists, the feature amount calculation method and the similarity calculation method are updated accordingly before identity is judged.
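A minimal sketch of judging identity under several similarity criteria follows. The labels, the per-criterion thresholds, and the idea of estimating the degree of processing as the fraction of criteria that fail are assumptions; the source only states that the number of matching criteria and the degree of match are taken into account.

```python
from typing import Dict, Optional, Tuple


def judge_identity(scores: Dict[str, float],
                   thresholds: Dict[str, float]) -> Tuple[str, Optional[float]]:
    """Classify a candidate from its scores under several similarity criteria.

    Returns (label, estimated_processing_degree), where the degree is the
    fraction of criteria that did not match (an assumed, simplistic estimate).
    """
    matched = [name for name, score in scores.items() if score >= thresholds[name]]
    if len(matched) == len(scores):
        return ("original", 0.0)   # matches under every criterion
    if matched:
        return ("processed", 1.0 - len(matched) / len(scores))
    return ("different", None)     # no criterion matched
```

For three criteria (overall, local, center), a candidate that fails only the overall criterion is reported as processed with an estimated degree of one third.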

As described above, rather than collating identical content with a fixed similarity criterion determined from specific feature amounts obtainable from the content, the feature amounts, identity determination items, and ranges to be calculated are determined in view of the differences in content extracted from the various metadata serving as attribute information of the content. The identity criterion can thus be changed to one that is more appropriate for the contents being collated, so collation-oriented search control for identical content that is better suited to the diverse content data held by a user or intended for a specific application becomes possible. The user can run effective searches over the held data with identity determination results that better match the user's intent, and can easily browse and view the data.

(Embodiment 4)
An embodiment of the present invention is described below with reference to the drawings. Embodiment 4 concerns a similar content search processing apparatus that takes as its search target a group of data such as images, videos, documents, and music, that is, diverse content data held by a user or intended for a specific application, and performs search control processing that extracts and presents content data having a certain degree of similarity to content data input as a search query. Specifically, it concerns a search control mechanism that, for each content group, changes and updates the similarity criterion to reflect the differences in attribute information between contents, and arranges and displays the extracted content data so that the change is apparent. In this embodiment, unless otherwise noted, components having the same functions as in Embodiment 1 are given the same reference numerals, and their description is omitted because the earlier description applies.

In this embodiment, so that the user can easily run similarity searches over, or browse and view, the photos, videos, documents, music, and other content data that the user holds in the way the user intends, the attribute information of the query content and of the content group to be searched is used to calculate, in stages, how the contents differ in their attributes, based on the difference information between contents. Search control processing and arrangement display processing for similar content are then performed, in which the similarity criterion is changed in stages according to the differences in shooting or usage conditions between contents. The following describes in detail a method that, through this processing, enables search control and arrangement display of similar content in the diverse content groups held by a user or intended for a specific application, using similarity criteria that follow the differences between contents in stages.

FIG. 13 is a block diagram showing the basic configuration of the similar content search processing apparatus of the present invention. In FIG. 13, the similar content search processing apparatus comprises a content data storage unit 1, a query information input unit 2, a content information extraction unit 3, a similarity criterion determination unit 4, a similarity calculation unit 5, a matched content selection unit 6, a display unit 7, and a user operation input unit 8.

The basic operations of the content data storage unit 1, the query information input unit 2, the content information extraction unit 3, the similarity criterion determination unit 4, the similarity calculation unit 5, and the matched content selection unit 6 are the same as described in Embodiment 1.

The display unit 7 controls the display of the content group selected by the matched content selection unit 6 on a display device in a predetermined display format. For example, the display is adapted to the display configuration in use, such as the touch panel display of a smartphone or tablet, or an ordinary display.

The user operation input unit 8 determines, from user operations, the input information for the similar content search processing apparatus. The input information is generated by user operations such as pressing, dragging, or releasing on the touch panel of the apparatus's display, and includes physical parameters such as operation start and end instructions and the coordinates of the pressed position on the display, although it is not limited to these.

When a user or the system runs a content similarity search and specifies and inputs the content data of the query information, which is the search source, for the content data group to be searched, a similarity criterion reflecting the differences from the attribute information of the content data to be searched is determined and search control processing for similar content is performed. The search results are then displayed to the user, and the similarity search control processing is repeated when query information is input again. When this user-feedback-driven repeated search control processing for similar content starts, difference information between contents is extracted, based on attribute information, from the content that is the input query information and from the content group to be searched. A similarity criterion suited to the attribute differences between contents is determined from the extracted difference information, the similarity between contents is calculated based on that criterion, and search control processing is performed to select similar content that satisfies a predetermined condition. The search results are then presented to the user on the display. The user selects, from the presented content, the content on which to run the next similarity search, and the similarity search control processing is repeated. FIG. 14 is a flowchart showing the procedure for repeating, based on user feedback, the search control processing for similar content in which the similarity criterion is determined from the attribute information of the query information and of the content data to be searched.
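The repeated, feedback-driven search described above can be outlined as a simple loop. The callables `search` and `choose_next` are hypothetical stand-ins for the search control processing and for the user's selection, and `max_rounds` is an assumed safety limit that is not part of the source.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def feedback_search(initial_query: T,
                    search: Callable[[T], Sequence[T]],
                    choose_next: Callable[[Sequence[T]], Optional[T]],
                    max_rounds: int = 10) -> List[Tuple[T, Sequence[T]]]:
    """Run similarity searches until the user stops selecting a new query."""
    history: List[Tuple[T, Sequence[T]]] = []
    query: Optional[T] = initial_query
    for _ in range(max_rounds):
        results = search(query)            # criterion determination + selection
        history.append((query, results))   # what was shown for this query
        query = choose_next(results)       # user picks the next query, or None
        if query is None:
            break
    return history
```

Each round re-runs the whole pipeline for the newly chosen query, so the similarity criterion can differ from round to round.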

Search control processing for similar content starts when content data serving as search query information is input, by the user or by the system, through the query information input unit 2. Once it starts, in addition to the processing of Embodiment 1, the extracted content that forms the search result is displayed, the user inputs feedback on the presented result, and search control processing for similar content is repeated based on that input.

When the search control processing for similar content starts, the attribute information of the query information and of the content group to be searched is extracted, as in Embodiment 1 (step S41).

Next, the similarity criterion determination unit 4 calculates difference information from the extracted attribute information and determines a similarity criterion that specifies how the similarity between contents is to be calculated (step S42). The similarity calculation unit 5 then uses the determined criterion to extract the feature amounts for measuring the similarity of the target content data, and calculates the similarity between content data with the specified similarity calculation method.

When the similarity calculation unit 5 outputs the calculated similarity, the matched content selection unit 6 selects and extracts, from the content group to be searched, content whose similarity satisfies a predetermined condition (step S43).

The display unit 7 presents the extracted content group to the user in a predetermined format (step S44). For example, the extracted content can be displayed in descending order of similarity, and information on the selected similarity criterion can be displayed with it in a form the user can understand. FIG. 15 shows an example of an effective display method when the search is performed while switching the similarity criterion one-dimensionally as the attribute information changes.

When the similarity criterion is changed, varying it according to specific attribute information allows the similar content selected under each criterion to be arranged in order, giving a display that is easier for the user to grasp. For example, the larger the difference in time of day or shooting location, the less likely two items are to show the same content. The similarity judgment can therefore be shifted from a whole-image match, to a local match, to a match of the central part only, and the search results can be arranged and displayed in the order in which the criterion changed.

FIG. 15(A) is a display example in which the similarity criterion changes in order from 1 to 4: results extracted under the same criterion are arranged horizontally, and results under successive criteria are arranged vertically. FIG. 15(B) is a display example in which the similarity criterion changes in order from 1 to 5: the query content is placed at the center, successive criteria are arranged clockwise around it, and results under the same criterion are placed progressively farther from the center in the direction assigned to that criterion.
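The two arrangements described for FIG. 15 can be sketched as coordinate calculations. This is an illustrative sketch only: the function names, the unit spacing, and the choice of the 12 o'clock direction for the first criterion are assumptions, and the coordinates use a y-up convention.

```python
import math


def grid_layout(results_per_criterion):
    """FIG. 15(A)-style: one row per criterion, results left to right.

    results_per_criterion: list of lists of item names, one list per criterion.
    Returns {item: (column, row)}.
    """
    return {
        item: (col, row)
        for row, items in enumerate(results_per_criterion)
        for col, item in enumerate(items)
    }


def radial_layout(results_per_criterion, step=1.0):
    """FIG. 15(B)-style: query at the origin, one spoke per criterion."""
    n = len(results_per_criterion)
    positions = {}
    for k, items in enumerate(results_per_criterion):
        # Spokes advance clockwise, starting from the 12 o'clock direction.
        angle = math.pi / 2 - 2 * math.pi * k / n
        for rank, item in enumerate(items, start=1):
            r = rank * step  # later results sit farther from the center
            positions[item] = (round(r * math.cos(angle), 6),
                               round(r * math.sin(angle), 6))
    return positions
```

In both functions the order of items within a criterion is taken as given, so the caller would sort each list by similarity before laying it out.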

The variation in attribute information can be extracted from various kinds of metadata, such as the type of shooting device used, time information such as the date, location information such as the landscape scene or store genre, and information on the service used for each processing type. Any kind of information may be used, provided the similarity criterion varies as that information varies. FIG. 16 shows an example of an effective display method for a search in which the similarity criterion is switched two-dimensionally as the attribute information changes.

The basic method of arranging content as the similarity criterion varies is the same as in the one-dimensional case. However, when several kinds of attribute information change together, as in time × location, location × service, or service × time, the similarity criterion also varies in several dimensions, so the content can be displayed multidimensionally. FIG. 16 is a display example in which the criterion varies from 1 to 3 along the horizontal axis and from A to C along the vertical axis, and the search results are arranged along each axis from whole match, to partial match, to center match. Where criteria intersect, content with high similarity under both criteria is placed.
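The rule that an intersection holds content scoring highly under both criteria can be sketched as follows. The score dictionary, the 0.7 threshold, and the use of the lower of the two scores for ordering are assumptions made for illustration, not details given in this specification.

```python
def grid_2d(items, h_criteria, v_criteria, threshold=0.7):
    """Place each item in every cell whose two criteria it both satisfies.

    items: {name: {criterion_label: similarity}} (hypothetical scores)
    Returns {(h_label, v_label): [names, best combined similarity first]}.
    """
    cells = {}
    for h in h_criteria:
        for v in v_criteria:
            matches = [
                (min(scores[h], scores[v]), name)  # limited by the weaker score
                for name, scores in items.items()
                if scores[h] >= threshold and scores[v] >= threshold
            ]
            cells[(h, v)] = [name for _, name in sorted(matches, reverse=True)]
    return cells
```

An item may appear in more than one cell, which matches a display where the same photo is a good match under several criterion pairs.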

When the user enters an operation such as reselection on the displayed content group through the user input means 8, the process returns to step S41, the similarity criterion is redetermined according to the operation, and the resulting content is arranged in the predetermined format and presented to the user. If there is no user operation, the similarity search process ends (step S45). For example, when the user reselects the content serving as query information, a similarity criterion optimized for that content is selected, and the similar search results based on that criterion are presented by the display means. By repeating this, the user can easily and repeatedly search for the similar content best suited to the selected content.

To search content effectively on the premise of user operation, the similar content search can also be structured. FIG. 17 shows an example of a display method for searching content while switching the similarity criterion on the premise of a folder structure. FIG. 17 is a display example of similar search results when content is structured and managed in folders by date. Date folders that contain many items similar to the selected query content are arranged along the vertical axis. When the selected query content is a sea bathing scene, the folder of sea bathing scenes is placed at the top, followed by the next most similar outdoor scenes, and finally by indoor scenes with low similarity. Within a folder, content similar to the query content is displayed larger, and the lower the similarity, the smaller the displayed size. Because each date folder has different scene information, similarity is judged by a criterion determined from the scene differences, so the display size of each item reflects a similarity judgment suited to its folder.
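The folder ordering and size scaling described for FIG. 17 can be sketched as below. The specification does not state how folders are ranked or how size depends on similarity, so counting items above a threshold and scaling linearly between two pixel sizes are assumptions made for illustration.

```python
def rank_folders(folders, threshold=0.7):
    """Order folders by how many of their items resemble the query.

    folders: {folder_name: [similarity of each item to the query]}
    The similarities are assumed to be computed per folder, under the
    criterion chosen for that folder.
    """
    def similar_count(name):
        return sum(1 for s in folders[name] if s >= threshold)

    return sorted(folders, key=similar_count, reverse=True)


def thumbnail_size(similarity, min_px=48, max_px=160):
    """Scale the displayed size linearly with similarity, clamped to [0, 1]."""
    s = min(max(similarity, 0.0), 1.0)
    return round(min_px + (max_px - min_px) * s)
```

For a staged search, `rank_folders` would run first to narrow the candidates, and the per-item search would then run only inside the folders the user opens.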

In this way, a staged processing configuration is possible: the search is first performed over a structure such as the folders in which content is managed, the target content is narrowed down, and the process then switches to a similarity search over individual content items. When content is managed in a hierarchical structure that can be built from various kinds of metadata, the unit of processing can also be switched step by step, following the hierarchy and the user's operations, to perform the similarity search for the target content.

As described above, similar content is not searched and arranged using a fixed-criterion similarity determined from a specific feature quantity obtainable from the content. Instead, the feature quantities and similarity items to be calculated are determined step by step, taking into account differences in what is extracted from the various metadata that constitute the attribute information between contents, so the criterion can be changed to a more appropriate one for each stage. This enables search control and arrangement control of similar content suited to changes in attribute differences, for diverse content data held by the user or intended for specific uses. The user can thus search held data effectively with more satisfying similarity items and arrangements, and can browse it easily.

The similar content search processing apparatus according to one aspect of the present invention has been described above on the basis of the embodiments, but the present invention is not limited to these embodiments. Forms obtained by applying modifications conceivable by those skilled in the art to the embodiments, and forms constructed by combining constituent elements of different embodiments, are also within the scope of the present invention, provided they do not depart from its spirit.

For example, some or all of the constituent elements of the similar content search processing apparatus in Embodiment 1 may be configured as a single system LSI (Large Scale Integration: large-scale integrated circuit).

A system LSI is an ultra-multifunctional LSI manufactured by integrating multiple components on a single chip. Specifically, it is a computer system that includes a microprocessor, ROM (Read Only Memory), RAM (Random Access Memory), and the like. A computer program is stored in the ROM. The system LSI achieves its functions through the microprocessor operating according to the computer program.

Although the term system LSI is used here, the circuit may also be called an IC, LSI, super LSI, or ultra LSI, depending on the degree of integration. The method of circuit integration is not limited to LSI; it may be realized with a dedicated circuit or a general-purpose processor. An FPGA (Field Programmable Gate Array), which can be programmed after the LSI is manufactured, or a reconfigurable processor, in which the connections and settings of circuit cells inside the LSI can be reconfigured, may also be used.

Furthermore, if circuit integration technology that replaces LSI emerges from advances in semiconductor technology or from another derived technology, the functional blocks may of course be integrated using that technology. The application of biotechnology is one possibility.

The present invention can be realized not only as a similar content search processing apparatus having such characteristic processing units, but also as a similar content search processing method whose steps are the characteristic processing units included in the apparatus. It can also be realized as a computer program that causes a computer to execute the characteristic steps of such a method. Needless to say, such a computer program can be distributed on a computer-readable non-transitory recording medium such as a CD-ROM, or over a communication network such as the Internet.

(Supplement)
The configuration of the moving image processing apparatus according to the embodiments of the present invention, its modifications, and their effects are described below.

(1) A similar content search processing apparatus according to the present invention comprises: content data acquisition means for acquiring a content group to be searched; query information input means for inputting at least one item of content data to serve as the query source for a search; content information extraction means for extracting attribute information of the content acquired by the content data acquisition means or of the content input by the query information input means; similarity criterion determination means for calculating difference information between the attribute information of the content input by the query information input means and the attribute information of at least one content group acquired by the content data acquisition means, both extracted by the content information extraction means, and for determining, based on the calculated difference information, a similarity criterion to be used as the basis for calculating the similarity between the contents; similarity calculation means for calculating the similarity between the contents based on the similarity criterion determined by the similarity criterion determination means; and matched content selection means for selecting content that satisfies a predetermined condition based on the similarity calculated by the similarity calculation means.

(2) In (1) above, the similarity criterion determination means may further comprise query attribute extraction means for extracting, as the attribute information of the content input to the query information input means, attribute information extracted from the input content itself, attribute information extracted from a content group of a given unit to which the input content belongs, or both, and the similarity criterion determination means may change the method of calculating the difference information according to the attribute information extracted by the query attribute extraction means.

(3) In (2) above, the similarity criterion determination means may further comprise target group attribute extraction means for extracting attribute information from the content group to be searched that is acquired by the content data acquisition means, wherein the content group of a given unit from which the attribute information is extracted consists of multiple items of content data and is formed under at least one of the following conditions: shooting time, shooting location, landscape scene, shooting composition, event content, or a unit in which the user has grouped the content.

(4) In (2) above, the similarity criterion determination means may further comprise attribute information comparison means for comparing the attribute information extracted by the query attribute extraction means with the attribute information extracted by the target group attribute extraction means and calculating the difference information needed for similarity determination, wherein the unit of comparison used in calculating the difference information is a unit formed under at least one of the following conditions: shooting time, shooting location, landscape scene, shooting composition, event content, or a unit in which the user has grouped the content, and the difference information is calculated from at least one content group formed under a condition of the same granularity.

With configurations (2) to (4) above, the granularity of the extracted attribute information can additionally be aligned when calculating the difference information between the contents being compared, so a similar content search can be performed with a similarity better suited to the content group being searched and to the content serving as the search query.

(5) In configuration (1) above, the attribute information extracted by the content information extraction means may include at least one of the following: metadata that can be extracted from the device that generated the content data, metadata that can be extracted by analyzing the content data, or metadata that can be extracted through use of the content data, and the similarity criterion determination means may calculate the difference information using statistical information obtained by applying a statistical analysis method to the attribute information of the content group extracted by the content information extraction means.

(6) In configuration (5) above, the similarity criterion determination means may further comprise reference information determination means for determining, based on the difference information, reference information such as the feature quantities and similarity calculation method used for similarity determination. The reference information determination means extracts statistical information on the content group using at least one of the following: information on time, such as the shooting date or shooting time of day; information on place, such as the shooting location or shooting range; information on people, such as the persons appearing, frequently appearing persons, or specific poses; information on what was shot, such as the scene, composition, or subject motion; and information on the photographed subject, such as the photographed objects. The similarity criterion determination means then determines the reference information according to the extracted statistical information.

(7) In configuration (5) above, the similarity criterion determination means may further comprise reference information determination means for determining, based on the difference information, reference information such as the feature quantities and similarity calculation method used for similarity determination, and the similarity criterion determination means may determine the reference information used for similarity determination using, from the extracted statistical information, the attribute information whose difference information shows a small difference as judged against a fixed value.

(8) In configuration (7) above, the similarity criterion determination means may determine the reference information by calculating, from the statistical information, the degree of agreement given by the difference information of each item of attribute information, and changing the weighting of the reference information used for similarity determination accordingly.

(9) In configuration (7) above, the similarity criterion determination means may select, as the feature quantities to be used (which constitute the reference information), at least one of the following: feature quantities of the content as a whole, feature quantities of local parts of the content, feature quantities of the foreground or background region of the content, feature quantities of the hierarchical structure of the content, and feature quantities of the subject of the content, and may then select a method capable of calculating the similarity from the selected feature quantities.

With configurations (5) to (9) above, calculating the difference information from statistics of the content attribute information additionally makes it possible to calculate a similarity that more accurately reflects differences in the attribute information, so a similar content search can be performed with a similarity that takes a broader view of the differences between contents.
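One way to read configurations (5), (7), and (8) together is sketched below: group statistics give a per-attribute difference, attributes that agree well are kept, and their degrees of agreement become weights. The use of the mean as the statistic, the `scales` normalization, and the `min_agreement` cutoff are assumptions introduced for this example; the specification does not fix any of them.

```python
from statistics import mean


def criterion_weights(query_stats, group_values, scales, min_agreement=0.5):
    """Weight each attribute by how well the query agrees with the group.

    query_stats:  {attribute: value for the query (or its own group)}
    group_values: {attribute: [values observed in the target group]}
    scales:       {attribute: difference treated as total disagreement}
    Returns normalized weights for the attributes that agree well enough.
    """
    agreement = {}
    for attr, values in group_values.items():
        diff = abs(query_stats[attr] - mean(values))  # difference information
        agreement[attr] = max(0.0, 1.0 - diff / scales[attr])
    # Keep only attributes whose difference is small (high agreement).
    kept = {a: g for a, g in agreement.items() if g >= min_agreement}
    total = sum(kept.values())
    return {a: g / total for a, g in kept.items()} if total else {}
```

The resulting weights could then be applied to whichever feature quantities are tied to each attribute when the similarity is calculated.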

(10) In configuration (1) above, when determining from the similarity between the query-source content data and the content data group to be searched whether they are the same content, the similarity criterion determination means may determine, based on the difference information, both the similarity criterion and an identity determination range, which represents the range of similarity that indicates the same content.

(11) In configuration (10) above, the attribute information extracted by the content information extraction means may include at least one of the following: metadata that can be extracted from the device that generated the content data, metadata that can be extracted by analyzing the content data, or metadata that can be extracted through use of the content data, and the similarity criterion determination means may determine, based on the type of metadata constituting the attribute information and its difference information, whether a target is one for which identity of content should be detected, and may determine the identity determination range by extracting, for such a target, trend information common to its content group.

(12) In configuration (10) above, the similarity criterion determination means may narrow down what is to be judged for identity, based on the similarity criterion, the identity determination range, and the content group selected by the matched content selection means, and may redetermine a similarity criterion and identity determination range capable of judging it, and, when the same content cannot be determined from the content group that matches the similarity criterion and identity determination range determined by the similarity criterion determination means, the matched content selection means may feed the matched content group back into the similarity criterion determination means, so that the same content is searched for and selected recursively.

(13) In configuration (10) above, the similarity criterion determination means may determine multiple similarity criteria and identity determination ranges for each content group, and the matched content selection means may select the same content by using the degree of change in similarity or in identity of each content item across the multiple similarity criteria and identity determination ranges determined by the similarity criterion determination means.

With configurations (10) to (13) above, similarity calculation and identity determination can additionally be performed in accordance with differences in the attribute information of diverse content data, so a similar content search capable of judging identity with the search query can be performed over a wide variety of content groups.
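The identity determination range of configuration (10) and the recursive narrowing of configuration (12) can be sketched as follows. The linear widening of the range, the default floor of 0.95, and the representation of criteria as scoring functions are assumptions for illustration only.

```python
def identity_range(attribute_diff, base_floor=0.95, slack=0.2):
    """Return the (low, high) similarity range treated as "same content".

    attribute_diff: normalized attribute difference in [0, 1]. A larger
    value (for example, a copy that was resized or re-encoded) lowers
    the floor so that processed copies can still be matched.
    """
    d = min(max(attribute_diff, 0.0), 1.0)
    return (base_floor - slack * d, 1.0)


def is_same_content(similarity, attribute_diff):
    """Check whether a similarity falls inside the identity range."""
    low, high = identity_range(attribute_diff)
    return low <= similarity <= high


def find_same(candidates, criteria, floor=0.9):
    """Recursively narrow candidates with successively finer criteria.

    candidates: list of item names
    criteria:   list of functions name -> similarity, coarse to fine
    Returns the single surviving name, or None if undecided.
    """
    if not candidates or not criteria:
        return None
    survivors = [c for c in candidates if criteria[0](c) >= floor]
    if len(survivors) == 1:
        return survivors[0]
    # Still ambiguous: feed the matched group back in with the next criterion.
    return find_same(survivors, criteria[1:], floor)
```

Returning `None` when the criteria run out leaves room for the caller to report that no single identical item could be determined.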

(14) Configuration (1) above may further comprise display means for displaying the content selected by the matched content selection means in an arrangement suited to the similarity criterion, wherein the similarity criterion determination means changes the similarity criterion, determined from the difference information of the attribute information, at least once for the content group being compared, and the matched content selection means selects a content group that meets a given condition with respect to the similarities calculated by the similarity calculation means using the multiple similarity criteria determined by the similarity criterion determination means.

(15) In configuration (14) above, the similarity criterion determination means may change the similarity criterion based on the difference information of one specific item, or one composite item, of attribute information, and the display means may arrange and display the content selected by the matched content selection means one-dimensionally, following the changed similarity criteria, in order of increasing or decreasing difference information for that item of attribute information.

(16) In configuration (14) above, the similarity criterion determination means may change the similarity criterion based on the difference information of two specific items, or two composite items, of attribute information, and the display means may arrange and display the content selected by the matched content selection means two-dimensionally, following the changed similarity criteria, in order of increasing or decreasing difference information for each of the two items of attribute information.

(17) Configuration (14) above may further comprise user operation input means for inputting selection information and the like when the user selects desired content, or a desired content group of a given unit, from the content or content groups displayed on the display means, wherein the similarity criterion determination means redetermines the similarity criterion by selecting, based on the selection information input through the user operation input means, the attribute information best suited to the contents between which the difference information is calculated.

(18) In configuration (17) above, the history information extracted by the content information extraction means may include metadata or hierarchical structure information of the content, such as a folder structure edited by the user; the similarity criterion determination means may determine the similarity criterion per content group at a specific level or in a specific structure of the structure information; and the display means may arrange the content according to the structure information for which the similarity criterion was determined.

With configurations (14) to (18) above, a variety of similarities can additionally be calculated for diverse content data in accordance with differences in attribute information, so a similar content search can be performed in which content related to the input search query is arranged and displayed by similarity from a variety of viewpoints.

The similar content search processing apparatus and method according to the present invention do not search for similar content groups using similarity determination under a fixed criterion based on specific feature quantities that represent at least the image characteristics of the content. Instead, for diverse content held by the user or intended for specific uses, they dynamically determine and extract the feature quantities and similarity items to be compared between contents, based on differences in attribute information between the query content and the content group to be searched. The similarity between contents is thus calculated under a criterion suited to what the content shows, enabling effective search control of similar content.

For example, when performing a similarity search of content data or collating identical content, the similarity content to be compared can be determined in consideration of the differences in attribute information between contents. Search control processing of similar content can therefore be performed after calculating a similarity that appropriately reflects the actual shooting conditions and shooting content of the contents. Because search control processing of similar content suited to the query content can be performed effectively, the invention is useful for various similar content search devices. It can also be applied to DVD/BD recorders, TVs, personal computer software, data servers, data services, and similar uses.
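The flow summarized above (compare the query's attribute information with that of the target group, choose a similarity criterion from the differences, score, then select) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the specification: content is modeled as a dict of numeric attributes, difference information is the query's distance from the group mean scaled by the group's value range, the criterion keeps attributes whose difference is below a hypothetical `threshold`, and the function names are invented.

```python
from statistics import mean

def value_range(group, key):
    """Spread of one attribute over the group (1.0 if all values are equal)."""
    values = [item[key] for item in group]
    return (max(values) - min(values)) or 1.0

def attribute_differences(query, group, keys):
    """Difference information: distance of the query from the group mean, per attribute."""
    return {
        k: abs(query[k] - mean(item[k] for item in group)) / value_range(group, k)
        for k in keys
    }

def determine_criterion(diffs, threshold=0.5):
    """Assumed rule: compare only on attributes whose difference is below the threshold."""
    chosen = [k for k, d in diffs.items() if d < threshold]
    return chosen or list(diffs)  # fall back to all attributes if none qualify

def similarity(query, item, criterion, group):
    """Similarity in (0, 1], computed only over the attributes in the criterion."""
    distance = mean(abs(query[k] - item[k]) / value_range(group, k) for k in criterion)
    return 1.0 / (1.0 + distance)

def search(query, group, keys, min_similarity=0.7):
    """Determine the criterion, score every item, and keep items above min_similarity."""
    criterion = determine_criterion(attribute_differences(query, group, keys))
    scored = [(similarity(query, item, criterion, group), item) for item in group]
    hits = [pair for pair in scored if pair[0] >= min_similarity]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return criterion, hits

# Hypothetical sample data: shooting hour and mean brightness per photo.
photos = [
    {"id": "a", "hour": 10, "brightness": 0.2},
    {"id": "b", "hour": 11, "brightness": 0.9},
    {"id": "c", "hour": 12, "brightness": 0.5},
]
query = {"hour": 11, "brightness": 0.95}
criterion, hits = search(query, photos, ["hour", "brightness"])
```

In this sample the query's brightness lies far from the group mean, so under the assumed rule only `hour` is used as the criterion. A real system would substitute its own difference measure and selection rule.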

DESCRIPTION OF SYMBOLS
1 Content data storage unit
2 Query information input means
3 Content information extraction means
4 Similarity criterion determination means
5 Similarity calculation means
6 Matching content selection means
7 Display means
8 User operation input means
41 Query attribute extraction means
42 Target group attribute means
43 Attribute information comparison means
44 Reference information determination means
431 Statistical information calculation means
432 Statistical item determination means
433 Characteristic item comparison means
441 Processing information extraction means
442 Identity determination item calculation means
443 Identity determination information determination means

Claims

A similar content search processing apparatus comprising:
content data acquisition means for acquiring a content group to be searched;
query information input means for inputting at least one piece of content data serving as the query source for a search;
content information extraction means for extracting attribute information of the content acquired by the content data acquisition means or of the content input by the query information input means;
similarity criterion determination means for calculating difference information between the attribute information of the content input by the query information input means and the attribute information of at least one content group acquired by the content data acquisition means, both extracted by the content information extraction means, and for determining, based on the calculated difference information, a similarity criterion that serves as the basis for calculating the similarity between contents;
similarity calculation means for calculating the similarity between contents based on the similarity criterion determined by the similarity criterion determination means; and
matching content selection means for selecting content that satisfies a predetermined condition based on the similarity calculated by the similarity calculation means.

The similar content search processing apparatus according to claim 1, wherein the similarity criterion determination means comprises query attribute extraction means for extracting, as the attribute information of the content input to the query information input means, attribute information extracted from the input content itself, attribute information extracted from a certain unit of content group to which the input content belongs, or both, and
the similarity criterion determination means changes the method of calculating the difference information based on the attribute information extracted by the query attribute extraction means.

The similar content search processing apparatus according to claim 2, wherein the similarity criterion determination means further comprises target group attribute extraction means for extracting attribute information from the content group to be searched acquired by the content data acquisition means, and
the certain unit of content group from which the attribute information is extracted consists of a plurality of pieces of content data and satisfies, as its composition condition, at least one of the following: the shooting time, shooting location, landscape scene, shooting composition, or event content of the content group, or a unit compiled by the user.

The similar content search processing apparatus according to claim 2, wherein the similarity criterion determination means further comprises attribute information comparison means for comparing the attribute information extracted by the query attribute extraction means with the attribute information extracted by the target group attribute extraction means and calculating the difference information necessary for similarity determination, and
the comparison unit used to calculate the difference information is a structural unit satisfying at least one of the conditions of shooting time, shooting location, landscape scene, shooting composition, or event content of the content group, or a unit compiled by the user, and the difference information is calculated from at least one content group configured under the same granularity condition.

The similar content search processing apparatus according to claim 1, wherein the attribute information extracted by the content information extraction means includes at least one of: metadata information that can be extracted from the device on which the content data was generated, metadata information that can be extracted by analyzing the content data, and metadata information that can be extracted by using the content data, and
the similarity criterion determination means calculates the difference information for the attribute information of the content group extracted by the content information extraction means, using statistical information that can be extracted by a statistical analysis method.

The similar content search processing apparatus according to claim 5, wherein the similarity criterion determination means further comprises reference information determination means for determining, based on the difference information, reference information such as the feature quantity and the similarity calculation method used for similarity determination,
the reference information determination means extracts the statistical information of the content group using at least one of: time-related information such as the shooting date and shooting time zone of the content group; location-related information such as the shooting location and shooting range; person-related information such as the persons appearing, frequently appearing persons, and specific poses; shooting information such as the shooting scene, composition, and subject movement; and subject information such as the photographed object, and
the similarity criterion determination means determines the reference information according to the extracted statistical information.

The similar content search processing apparatus according to claim 5, wherein the similarity criterion determination means further comprises reference information determination means for determining, based on the difference information, reference information such as the feature quantity and the similarity calculation method used for similarity determination, and
the similarity criterion determination means determines the reference information used for similarity determination by using, from among the extracted statistical information, attribute information for which the difference indicated by the difference information is smaller by at least a certain value.

The similar content search processing apparatus according to claim 7, wherein the similarity criterion determination means determines the reference information by calculating a degree of coincidence from the difference information of each piece of attribute information in the statistical information and changing the weighting of the reference information used for similarity determination accordingly.
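As an illustration of weighting reference information by a degree of coincidence, the following sketch converts per-attribute differences into normalized weights and combines per-attribute similarities with them. The mapping `1 - difference`, the assumption that differences are already scaled to [0, 1], and the function names are all hypothetical choices for this example; the claim does not prescribe a formula.

```python
def coincidence_weights(diffs):
    """Map per-attribute differences in [0, 1] to weights that sum to 1.

    Assumed rule: degree of coincidence = 1 - difference, clipped at 0.
    """
    raw = {k: max(0.0, 1.0 - d) for k, d in diffs.items()}
    total = sum(raw.values())
    if total == 0:
        # No attribute coincides at all: fall back to uniform weights.
        return {k: 1.0 / len(raw) for k in raw}
    return {k: v / total for k, v in raw.items()}

def weighted_similarity(per_attribute_similarity, weights):
    """Combine per-attribute similarities using the coincidence weights."""
    return sum(weights[k] * per_attribute_similarity[k] for k in weights)

# Hypothetical differences between the query and the target group.
weights = coincidence_weights({"time": 0.0, "place": 0.5, "scene": 1.0})
score = weighted_similarity({"time": 0.9, "place": 0.3, "scene": 0.1}, weights)
```

With these sample values, `time` receives twice the weight of `place`, and `scene` is ignored because its difference is maximal.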

The similar content search processing apparatus according to claim 7, wherein the similarity criterion determination means selects and determines, as the feature quantity used as the reference information, at least one of: a feature quantity relating to the entire content, a feature quantity relating to a local portion of the content, a feature quantity relating to the foreground or background region of the content, a feature quantity relating to the hierarchical structure of the content, and a feature quantity relating to the subject of the content, and selects and determines a method capable of calculating the similarity based on the selected feature quantity.

The similar content search processing apparatus according to claim 1, wherein, in order to determine from the similarity between the query-source content data and the content data group to be searched whether content is the same content, the similarity criterion determination means determines, as the similarity criterion based on the difference information, an identity determination range representing the range of similarity within which content is regarded as the same content.

The similar content search processing apparatus according to claim 10, wherein the attribute information extracted by the content information extraction means includes at least one of: metadata information that can be extracted from the device on which the content data was generated, metadata information that can be extracted by analyzing the content data, and metadata information that can be extracted by using the content data, and
the similarity criterion determination means determines the identity determination range used to determine whether content is the same content, based on the type of metadata information serving as the attribute information and on the difference information, by extracting tendency information common to the content group to be detected.

The similar content search processing apparatus according to claim 10, wherein the similarity criterion determination means narrows down the content to be determined based on the similarity criterion, the identity determination range, and the content group selected by the matching content selection means, and re-determines a similarity criterion and an identity determination range capable of discriminating the content to be determined, and
when the same content cannot be determined from the content group that matches the similarity criterion determined by the similarity criterion determination means, the matching content selection means inputs the matching content group to the similarity criterion determination means so that the same content is recursively searched for and selected.
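The recursive narrowing described in this claim can be sketched as follows. This is only an illustrative reading under stated assumptions: candidates are dicts of numeric attributes, the criterion is re-determined at each step by picking the not-yet-used attribute with the widest value range among the remaining candidates, and the identity determination range is a fixed per-attribute tolerance supplied by the caller. None of these rules or names come from the source.

```python
def pick_discriminating_attribute(candidates, keys, used):
    """Assumed rule: choose the unused attribute that varies most among the candidates."""
    remaining = [k for k in keys if k not in used]
    if not remaining:
        return None
    return max(
        remaining,
        key=lambda k: max(c[k] for c in candidates) - min(c[k] for c in candidates),
    )

def find_same_content(query, candidates, keys, tolerance, used=()):
    """Recursively narrow the candidates until at most one remains or no attribute is left."""
    if len(candidates) <= 1:
        return candidates
    key = pick_discriminating_attribute(candidates, keys, used)
    if key is None:
        return candidates  # cannot discriminate any further
    # Keep only candidates inside the identity determination range for this attribute.
    narrowed = [c for c in candidates if abs(c[key] - query[key]) <= tolerance[key]]
    return find_same_content(query, narrowed, keys, tolerance, used + (key,))

# Hypothetical sample: three video files compared against a query file.
query = {"width": 1920, "duration": 60, "size": 100}
candidates = [
    {"id": "a", "width": 1920, "duration": 60, "size": 100},
    {"id": "b", "width": 1920, "duration": 60, "size": 300},
    {"id": "c", "width": 640, "duration": 61, "size": 90},
]
tolerance = {"width": 0, "duration": 1, "size": 10}
same = find_same_content(query, candidates, ["width", "duration", "size"], tolerance)
```

In this sample the first pass narrows on `width`, and the second pass, re-determined on the remaining two candidates, narrows on `size`.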

The similar content search processing apparatus according to claim 10, wherein the similarity criterion determination means determines a plurality of similarity criteria and identity determination ranges for each content group, and
the matching content selection means selects the same content by using the degree of change in similarity or the degree of change in identity of each content across the plurality of similarity criteria and identity determination ranges determined by the similarity criterion determination means.

The similar content search processing apparatus according to claim 1, further comprising display means for displaying the content selected by the matching content selection means in an arrangement suited to the similarity criterion, wherein
the similarity criterion determination means changes, at least once for the comparison content group, the similarity criterion determined based on the difference information of the attribute information, and
the matching content selection means selects, under a certain condition, a content group that matches in terms of the similarity calculated by the similarity calculation means using the plurality of similarity criteria determined by the similarity criterion determination means.

The similar content search processing apparatus according to claim 14, wherein the similarity criterion determination means changes the similarity criterion based on the difference information of one specific piece of attribute information or one composite piece of attribute information, and
the display means arranges and displays the selected content one-dimensionally, in accordance with the changed similarity criterion, in increasing or decreasing order of the difference information of that one piece of attribute information.

The similar content search processing apparatus according to claim 14, wherein the similarity criterion determination means changes the similarity criterion based on the difference information of two specific pieces of attribute information or two composite pieces of attribute information, and
the display means arranges and displays the content selected by the matching content selection means two-dimensionally, in accordance with the changed similarity criterion, in increasing or decreasing order of the difference information of the two pieces of attribute information.
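A two-dimensional arrangement of this kind can be illustrated by assigning each selected item a grid position from its rank on two attribute differences. The sketch below is an assumption-laden example: items are dicts with an `id` and numeric attributes, the difference is the absolute distance from the query value, and the helper name `layout_2d` is invented here. Using only one of the two ranks gives the one-dimensional ordering of the preceding claim.

```python
def layout_2d(query, items, x_key, y_key):
    """Return {id: (column, row)}, where each coordinate is the item's rank
    by absolute difference from the query on the corresponding attribute."""
    def ranks(key):
        ordered = sorted(items, key=lambda it: abs(it[key] - query[key]))
        return {it["id"]: rank for rank, it in enumerate(ordered)}

    columns, rows = ranks(x_key), ranks(y_key)
    return {it["id"]: (columns[it["id"]], rows[it["id"]]) for it in items}

# Hypothetical sample: shooting hour on the x axis, latitude on the y axis.
query = {"hour": 12, "lat": 35.0}
items = [
    {"id": "a", "hour": 12, "lat": 36.0},
    {"id": "b", "hour": 15, "lat": 35.0},
    {"id": "c", "hour": 13, "lat": 35.5},
]
positions = layout_2d(query, items, "hour", "lat")
```

Items near the origin of the grid are close to the query on both attributes, and items far along one axis differ mainly on that attribute.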

The similar content search processing apparatus according to claim 14, further comprising user operation input means for inputting selection information and the like when the user selects desired content or a desired predetermined unit of content group from among the content or predetermined units of content group displayed on the display means, wherein
the similarity criterion determination means re-determines the similarity criterion by selecting, based on the selection information input via the user operation input means, the optimum attribute information between the contents for which the difference information is calculated.

The similar content search processing apparatus according to claim 17, wherein the history information extracted by the content information extraction means includes metadata information or hierarchical structure information of the content, such as a folder structure edited by the user,
the similarity criterion determination means determines a similarity criterion for each content group in a specific hierarchy or a specific structure in the structure information, and
the display means arranges the content according to the structure information for which the similarity criterion has been determined.

A similar content search processing method comprising:
a content data acquisition step of acquiring a content group to be searched;
a query information input step of inputting at least one piece of content data serving as the query source for a search;
a content information extraction step of extracting attribute information of the content acquired in the content data acquisition step or of the content input in the query information input step;
a similarity criterion determination step of calculating difference information between the attribute information of the content input in the query information input step and the attribute information of at least one content group acquired in the content data acquisition step, both extracted in the content information extraction step, and of determining, based on the calculated difference information, a similarity criterion that serves as the basis for calculating the similarity between contents;
a similarity calculation step of calculating the similarity between contents based on the similarity criterion determined in the similarity criterion determination step; and
a matching content selection step of selecting content that satisfies a predetermined condition based on the similarity calculated in the similarity calculation step.

A program causing a computer to execute:
a content data acquisition step of acquiring a content group to be searched;
a query information input step of inputting at least one piece of content data serving as the query source for a search;
a content information extraction step of extracting attribute information of the content acquired in the content data acquisition step or of the content input in the query information input step;
a similarity criterion determination step of calculating difference information between the attribute information of the content input in the query information input step and the attribute information of at least one content group acquired in the content data acquisition step, both extracted in the content information extraction step, and of determining, based on the calculated difference information, a similarity criterion that serves as the basis for calculating the similarity between contents;
a similarity calculation step of calculating the similarity between contents based on the similarity criterion determined in the similarity criterion determination step; and
a matching content selection step of selecting content that satisfies a predetermined condition based on the similarity calculated in the similarity calculation step.