JP2004341928A

JP2004341928A - Search decision tree generation method, search decision tree generation device, search decision tree generation program, and program recording medium

Info

Publication number: JP2004341928A
Application number: JP2003139101A
Authority: JP
Inventors: Teruo Hamano; 輝夫浜野
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 2003-05-16
Filing date: 2003-05-16
Publication date: 2004-12-02

Abstract

【課題】質問の難易度または容易度を考慮して質問を配置した検索決定木の生成を可能にする検索決定木生成方法、検索決定木生成装置、検索決定木生成プログラム、およびプログラム記録媒体を提供する。
【解決手段】各限定質問に対する回答の容易さを示す容易度情報に基づき、回答が最も容易な限定質問を抽出し、検索決定木の根に最も近い節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割し、各部分集合から空集合でない部分集合を抽出し、分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応する情報を割り当てて葉とすることによって前記根から延出した枝を終止し、抽出した各部分集合を質問集合として、部分集合毎に上述した処理を前記検索決定木の全ての枝を終止するまで再帰的に繰り返す。
【選択図】図１
[Problem] To provide a search decision tree generation method, a search decision tree generation device, a search decision tree generation program, and a program recording medium that enable generation of a search decision tree in which questions are arranged in consideration of the difficulty or ease of the questions.
[Solution] Based on ease information indicating how easy each limited question is to answer, the limited question that is easiest to answer is extracted and assigned to a vacant node among the nodes closest to the root of the search decision tree. The set of limited questions, composed of a plurality of limited questions, is divided into two subsets. Non-empty subsets are extracted from the resulting subsets; empty sets are also extracted from the resulting subsets, and information corresponding to each extracted empty set is assigned to it to make it a leaf, thereby terminating the branch extending from the root. Each extracted non-empty subset is then treated as a question set, and the above processing is repeated recursively for each subset until all branches of the search decision tree are terminated.
[Selected drawing] Fig. 1

Description

【０００１】
【発明の属する技術分野】
本発明は、複数の質問に回答して検索を行う際の、質問の配置を表す検索決定木を作成する検索決定木生成方法、検索決定木生成装置、検索決定木生成プログラム、およびプログラム記録媒体に関する。
【０００２】
【従来の技術】
ＦＡＱ（ＦｒｅｑｕｅｎｔｌｙＡｓｋｅｄＱｕｅｓｔｉｏｎｓ）とは、「よくある質問」という意味であって、例えば、パソコン、電子レンジ等の家電製品などを購入した消費者が、その製品の使い方が良くわからなかったり、製品が故障したりした場合に、しばしばなされる質問のことである。ＦＡＱがあった場合に、それに対する回答として、どうしたら良いかという解決方法等をまとめたデータベースがＦＡＱデータベースである。
【０００３】
一般に、ＦＡＱ等の回答を検索するのに検索決定木が用いられ、この検索決定木の節に配置された質問に回答することにより、上記のＦＡＱデータベースから所望の解決方法が検索されるようになっている。
【０００４】
このようなＦＡＱデータベースから所望の解決方法を検索しようとするユーザは、まず根に配置された質問に回答し、この回答に該当する枝の節に配置された質問に答える。この手順を繰り返すことで、ユーザは、最終的に所望の解決方法が配置された葉に到達することができる。
【０００５】
この検索決定木は、あくまでもＦＡＱデータベースから所望の解決方法を検索するための論理的な構造を表す一種の設計図であり、実際にはＷＷＷ（ＷｏｒｌｄＷｉｄｅＷｅｂ）で用いられるＨＴＭＬ（ＨｙｐｅｒＴｅｘｔＭａｒｋｕｐＬａｎｇｕａｇｅ）文書等として、コンピュータ上に実現されたり、あるいは紙の冊子上で実現されたりしていた。そして、検索決定木の節に配置される質問は、出現頻度の高い解決方法が早く検索されるように配置が決められていた（例えば、特許文献１を参照）。
【０００６】
【特許文献１】
特開平６−４２９２号公報
【０００７】
【発明が解決しようとする課題】
しかしながら、このような従来の検索決定木生成技術では、出現頻度の高い解決方法ほど早く検索されるように検索決定木の節に質問が配置され、質問の難易度は全く考慮されていないため、回答の難易度の高い質問が検索決定木の根元に近い冒頭付近に配列される場合があり、かかる場合には、その質問以降、検索決定木を進むことができず、回答を絞り込むことができなかった。
【０００８】
また、そもそも解決方法を検索する上で、難易度の高い質問には回答する必要すらない可能性があるにも関わらず、従来の技術ではどのような難易度の質問に対してもユーザが回答する必要があり、その結果、ユーザが所望の回答を得ることができない場合もあった。
【０００９】
本発明は、このような問題を解決するためになされたもので、その目的は、質問の難易度または容易度を考慮して質問を配置した検索決定木の生成を可能にする検索決定木生成方法、検索決定木生成装置、検索決定木生成プログラム、およびプログラム記録媒体を提供することにある。
【００１０】
【課題を解決するための手段】
上記目的を達成するために、請求項１記載の発明は、検索決定木の根から順次延出して枝分かれを生じる複数の節の各々に配置される限定質問を構成要素とする質問集合のうち、該当する質問集合につき、前記根および前記複数の節の各々に配置される限定質問に対して順次ユーザが入力する回答に基づいて辿り着く枝の終端である葉に、前記ユーザが希望する情報としてのユーザ希望情報を割り当てて成る検索決定木を生成する検索決定木生成方法であって、検索決定木を生成するために必要な情報を記憶する記憶部を備えたコンピュータが、（イ）各限定質問に対する回答の容易さを示す容易度情報を前記記憶部から読み出し、この読み出した容易度情報に基づいて、回答が最も容易な限定質問を抽出する抽出ステップと、（ロ）この抽出ステップで抽出した限定質問を検索決定木の根に最も近い節のうち空いている節のいずれかに割り当て、複数の限定質問から構成される質問集合を二つの部分集合に分割する分割ステップと、（ハ）この分割ステップで分割した二つの部分集合から空集合でない部分集合を抽出する部分集合抽出ステップと、（ニ）前記分割ステップで分割した二つの部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止ステップとを実行することを特徴とする。
【００１１】
請求項２記載の発明は、前記（ハ）の部分集合抽出ステップで抽出した部分集合のうち空集合でない部分集合を改めて質問集合として、この質問集合の各々に対して、前記（イ）の抽出ステップから前記（ニ）の終止ステップに至るまでの処理を再帰的に繰り返し、この繰り返しの過程において、前記（ロ）の分割ステップで分割した部分集合が全て空集合であるとき、繰り返しを終了することを特徴とする。
【００１２】
請求項３記載の発明は、検索決定木の根から順次延出して枝分かれを生じる複数の節の各々に配置される限定質問を構成要素とする質問集合のうち、該当する質問集合につき、前記根および前記複数の節の各々に配置される限定質問に対して順次ユーザが入力する回答に基づいて辿り着く枝の終端である葉に、前記ユーザが希望する情報としてのユーザ希望情報を割り当てて成る検索決定木を生成する検索決定木生成方法であって、検索決定木を生成するために必要な情報を記憶する記憶部を備えたコンピュータが、（イ）各限定質問に対する回答を組み合せて成る複数の訓練事例から構成される訓練事例情報と、当該訓練事例情報に係る質問集合とに基づいて、前記訓練事例全体の限定質問前と限定質問後の情報量の差を示す情報利得を算出する情報利得算出ステップと、（ロ）この情報利得算出ステップで算出した情報利得と各限定質問に対する回答の容易さを示す容易度情報を前記記憶部から読み出し、この読み出した情報利得と容易度情報に基づいて、各限定質問に対する評価値を算出する評価値算出ステップと、（ハ）この評価値算出ステップで算出した評価値が最も高い限定質問を前記記憶部から読み出して抽出する抽出ステップと、（ニ）この抽出ステップで抽出した限定質問を検索決定木の根に最も近接する節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割する分割ステップと、（ホ）この分割ステップで分割した各部分集合から空集合でない部分集合を抽出する部分集合抽出ステップと、（へ）前記分割ステップで分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止ステップとを実行することを特徴とする。
【００１３】
請求項４記載の発明は、前記（ホ）の部分集合抽出ステップで抽出した部分集合のうち空集合でない部分集合を改めて質問集合として、この質問集合の各々に対して、前記（イ）の抽出ステップから前記（ヘ）の終止ステップに至るまでの処理を再帰的に繰り返し、この繰り返しの過程において、前記（ニ）の分割ステップで分割した部分集合が全て空集合であるとき、繰り返しを終了することを特徴とする。
【００１４】
請求項５記載の発明は、前記評価値算出ステップで算出する評価値は、前記情報利得および前記容易度情報を所定の比率で加算した値であることを特徴とする。
【００１５】
請求項６記載の発明は、検索決定木の根から順次延出して枝分かれを生じる複数の節の各々に配置される限定質問を構成要素とする質問集合のうち、該当する質問集合につき、前記根および前記複数の節の各々に配置される限定質問に対して順次ユーザが入力する回答に基づいて辿り着く枝の終端である葉に、前記ユーザが希望する情報としてのユーザ希望情報を割り当てて成る検索決定木を生成する検索決定木生成装置であって、検索決定木を生成するために必要な情報を記憶する記憶手段と、各限定質問に対する回答の容易さを示す容易度情報を前記記憶手段から読み出し、この読み出した容易度情報に基づいて、回答が最も容易な限定質問を抽出する抽出手段と、この抽出手段で抽出した限定質問を検索決定木の根に最も近い節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割する分割手段と、この分割手段で分割した各部分集合から空集合でない部分集合を抽出する部分集合抽出手段と、前記分割手段で分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止手段とを備えたことを特徴とする。
【００１６】
請求項７記載の発明は、検索決定木の根から順次延出して枝分かれを生じる複数の節の各々に配置される限定質問を構成要素とする質問集合のうち、該当する質問集合につき、前記根および前記複数の節の各々に配置される限定質問に対して順次ユーザが入力する回答に基づいて辿り着く枝の終端である葉に、前記ユーザが希望する情報としてのユーザ希望情報を割り当てて成る検索決定木を生成する検索決定木生成装置であって、検索決定木を生成するために必要な情報を記憶する記憶手段と、各限定質問に対する回答を組み合せて成る複数の訓練事例から構成される訓練事例情報と、当該訓練事例情報に係る質問集合とに基づいて、前記訓練事例全体の限定質問前と限定質問後の情報量の差を示す情報利得を算出する情報利得算出手段と、この情報利得算出手段で算出した情報利得と各限定質問に対する回答の容易さを示す容易度情報に基づいて、各限定質問に対する評価値を算出する評価値算出手段と、この評価値算出手段で算出した評価値が最も高い限定質問を抽出する抽出手段と、この抽出手段で抽出した限定質問を検索決定木の根に最も近接する節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割する分割手段と、この分割手段で分割した各部分集合から空集合でない部分集合を抽出する部分集合抽出手段と、前記分割手段で分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止手段とを備えたことを特徴とする。
【００１７】
請求項８記載の発明は、前記評価値算出手段で算出する評価値は、前記情報利得および前記容易度情報を所定の比率で加算した値であることを特徴とする。
【００１８】
請求項９記載の発明は、検索決定木の根から順次延出して枝分かれを生じる複数の節の各々に配置される限定質問を構成要素とする質問集合のうち、該当する質問集合につき、前記根および前記複数の節の各々に配置される限定質問に対して順次ユーザが入力する回答に基づいて辿り着く枝の終端である葉に、前記ユーザが希望する情報としてのユーザ希望情報を割り当てて成る検索決定木を生成する検索決定木生成プログラムであって、検索決定木を生成するために必要な情報を記憶する記憶部を備えたコンピュータに、（イ）各限定質問に対する回答の容易さを示す容易度情報を前記記憶部から読み出し、この読み出した容易度情報に基づいて、回答が最も容易な限定質問を抽出する抽出ステップと、（ロ）この抽出ステップで抽出した限定質問を検索決定木の根に最も近い節のうち空いている節のいずれかに割り当て、複数の限定質問から構成される質問集合を二つの部分集合に分割する分割ステップと、（ハ）この分割ステップで分割した二つの部分集合から空集合でない部分集合を抽出する部分集合抽出ステップと、（ニ）前記分割ステップで分割した二つの部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止ステップとを実行させることを特徴とする。
【００１９】
請求項１０記載の発明は、前記（ハ）の部分集合抽出ステップで抽出した部分集合のうち空集合でない部分集合を改めて質問集合として、この質問集合の各々に対して、前記（イ）の抽出ステップから前記（ニ）の終止ステップに至るまでの処理を再帰的に繰り返し、この繰り返しの過程において、前記（ロ）の分割ステップで分割した部分集合が全て空集合であるとき、繰り返しを終了することを特徴とする。
【００２０】
請求項１１記載の発明は、検索決定木の根から順次延出して枝分かれを生じる複数の節の各々に配置される限定質問を構成要素とする質問集合のうち、該当する質問集合につき、前記根および前記複数の節の各々に配置される限定質問に対して順次ユーザが入力する回答に基づいて辿り着く枝の終端である葉に、前記ユーザが希望する情報としてのユーザ希望情報を割り当てて成る検索決定木を生成する検索決定木生成プログラムであって、検索決定木を生成するために必要な情報を記憶する記憶部を備えたコンピュータに、（イ）各限定質問に対する回答を組み合せて成る複数の訓練事例から構成される訓練事例情報と、当該訓練事例情報に係る質問集合とに基づいて、前記訓練事例全体の限定質問前と限定質問後の情報量の差を示す情報利得を算出する情報利得算出ステップと、（ロ）この情報利得算出ステップで算出した情報利得と各限定質問に対する回答の容易さを示す容易度情報を前記記憶部から読み出し、この読み出した情報利得と容易度情報に基づいて、各限定質問に対する評価値を算出する評価値算出ステップと、（ハ）この評価値算出ステップで算出した評価値が最も高い限定質問を前記記憶部から読み出して抽出する抽出ステップと、（ニ）この抽出ステップで抽出した限定質問を検索決定木の根に最も近接する節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割する分割ステップと、（ホ）この分割ステップで分割した各部分集合から空集合でない部分集合を抽出する部分集合抽出ステップと、（へ）前記分割ステップで分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止ステップとを実行させることを特徴とする。
【００２１】
請求項１２記載の発明は、前記（ハ）の部分集合抽出ステップで抽出した部分集合のうち空集合でない部分集合を改めて質問集合として、この質問集合の各々に対して、前記（イ）の抽出ステップから前記（ヘ）の終止ステップに至るまでの処理を再帰的に繰り返し、この繰り返しの過程において、前記（ニ）の分割ステップで分割した部分集合が全て空集合であるとき、繰り返しを終了することを特徴とする。
【００２２】
請求項１３記載の発明は、前記評価値算出ステップで算出する評価値は、前記情報利得および前記容易度情報を所定の比率で加算した値であることを特徴とする。
【００２３】
請求項１４記載の発明は、請求項９乃至１３のいずれか１項に記載の検索決定木生成プログラムを記録したことを特徴とする。
【００２４】
この発明におけるプログラム記録媒体とは、ハードディスク、フレキシブルディスク、ＣＤ−ＲＯＭ（ＣｏｍｐａｃｔＤｉｓｃＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ）、光磁気ディスク、ＰＣカード等のコンピュータ読み取り可能な記録媒体を意味する。
【００２５】
【発明の実施の形態】
以下、添付図面を参照して、本発明の実施の形態を説明する。
【００２６】
（第１の実施形態）
図１は、本発明の第１の実施形態に係る検索決定木生成装置の概略構成を示すブロック図である。同図に示す検索決定木生成装置１は、検索決定木の各節に配置される質問（以下、「限定質問」という）に対する回答の容易度の情報（以下、単に「容易度情報」という）を含む所定の情報の入力や操作を行うための操作入力部１１、操作入力部からの入力や操作に応じて検索決定木の生成を行うための制御および判断を行う制御判断部１２、所定の情報を記憶しておくための記憶部１３（記憶手段をなす）、制御判断部１２によって生成された検索決定木を外部に出力するための出力部１４を含むように構成される。
【００２７】
ここで、操作入力部１１は、例えば、キーボード、マウス、ＣＤ（ＣｏｍｐａｃｔＤｉｓｃ）ドライブ等によって構成され、記憶部１３に記憶される情報をユーザに入力させ、制御判断部１２に行わせる処理の指示を行わせるようになっている。
【００２８】
記憶部１３は、検索決定木の生成に必要な情報を記憶しておくようになっており、限定質問、検索決定木の葉に割り当てられる情報（以下、「ユーザ希望情報」という）、および容易度情報を含む情報を記憶するようになっている。なお、以下では、複数の限定質問から構成される限定質問の集合を、単に「質問集合」という。
【００２９】
このような記憶部１３は、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）等から構成される主記憶部と、ハードディスクドライブ、フレキシブルディスクドライブ、ＣＤ−ＲＯＭ（ＣｏｍｐａｃｔＤｉｓｃＲｅａｄＯｎｌｙＭｅｍｏｒｙ）ドライブ、ＤＶＤ（ＤｉｇｉｔａｌＶｅｒｓａｔｉｌｅＤｉｓｋ）ドライブ、光磁気ディスクドライブ、ＰＣ（ＰｅｒｓｏｎａｌＣｏｍｐｕｔｅｒ）カードドライブ等から構成される補助記憶部とを備えており、後述する計算結果を随時格納するために必要なメモリ領域が確保されている。
【００３０】
出力部１４は、制御判断部１２によって生成された検索決定木を外部に出力するものであり、液晶ディスプレイ等の出力装置から構成されている。また、記憶部１３に記憶された所定の情報を制御判断部１２経由で外部の装置に出力できるようになっているのでもよい。出力の指示は、操作入力部１１を介して入力され、制御判断部１２の制御の下に行われる。
【００３１】
制御判断部１２は、操作入力部１１からの操作に応じて記憶部１３に記憶された情報を読み出し、この読み出した情報に基づいて検索決定木を生成し、出力部１４を制御して生成した検索決定木を外部に出力するようになっている。また、制御判断部１２は、記憶部１３および出力部１４を制御するその他の制御動作および判断動作を行うのでもよい。ここで記載した動作からも明らかなように、制御判断部１２は、中央処理装置（ＣＰＵ：ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）から構成されている。
【００３２】
図２は、本発明の第１の実施形態に係る検索決定木生成処理の流れを示すフローチャート図である。同図を参照して、本実施形態に係る検索決定木生成方法を具体的に説明する。
【００３３】
以下では、検索決定木の一例として、図６に示す検索決定木１００を適宜参照する。この検索決定木１００の根Ｒ_１１と節Ｎ_ｍｎには、それぞれ限定質問Ｑ_１１とＱ_ｍｎが配置されており、根Ｒ_１１から延出する枝の終端である葉Ｌ_ｎには、解決方法Ａ_ｎが回答として与えられている。節Ｎ_ｍｎの添字のうち、添字ｍはレベルの値と一致し、添字ｎは各レベルごとに順序付けられた整数である。また、図６に示す場合、葉Ｌ_ｎの数はｐ個である（ｐは正の整数）。なお、同図において、ｉ、ｊおよびｋは正の整数である。
【００３４】
まず、検索決定木の根から出発して節から葉に至る方向に進んだ尺度であるレベルの値を１に設定する（ステップＳ２１１）。図６に示す検索決定木１００の場合には、根Ｒ_１１のレベルを「１」とし、各枝を葉に向かって１つ進むごとにレベルの値が１つ増加する。なお、図６では、根Ｒ_１１における限定質問をＱ_１１としている。ユーザが、根Ｒ_１１において限定質問Ｑ_１１に回答すると、レベルは「２」に進み、回答を行う毎にレベルの値が１ずつ増えるようになっている。レベルの値が小さいほど根Ｒ_１１に近いことはいうまでもない。
【００３５】
次に、容易度情報に基づいて、記憶部１３に記憶された質問集合のうち、最も回答が容易な限定質問を抽出する（ステップＳ２１２）。図６の場合には、レベル１において、限定質問Ｑ_１１が抽出される。
【００３６】
ステップＳ２２１からＳ２２６までの処理はループをなす。まず、ステップＳ２１２で抽出した限定質問に基づいて質問集合を２つの部分集合に分割する（ステップＳ２２１）。
【００３７】
次のステップＳ２２２では、全質問集合についてステップＳ２２１での処理が完了したか否かを判断する。１回目のループ（レベル１）での処理の場合、質問集合は１つであり、既に分割が完了しているので（Ｙｅｓ）、ステップＳ２２３に進む。他のレベルのループで質問集合の分割が完了していない場合（Ｎｏ）には、ステップＳ２２１に戻って同じ処理を繰り返す。
【００３８】
次に、上記ステップＳ２２１で分割した部分集合のうち、空集合でない部分集合（以下、「要素保有部分集合」という）を抽出する（ステップＳ２２３）。
【００３９】
ステップＳ２２４では、要素保有部分集合を抽出したか否かを判断し、要素保有部分集合を抽出したと判断した場合、ステップＳ２２５に移り、抽出しないと判断した場合、ステップＳ２３１に進む（ステップＳ２２４）。ここで、要素保有部分集合を抽出したか否かの判断は、空集合ではない部分集合が少なくとも一つ存在するときは「抽出した」と判断し、全ての部分集合が空集合のときは「抽出していない」と判断する。
【００４０】
例えば図６に示す場合、レベル２にある節Ｎ_２１（限定質問Ｑ_２１が配置されている）の場合、次のレベルに来るのはともに節なので、抽出した部分集合は空集合でないものを含む場合に相当する。したがって、この場合には要素保有部分集合を「抽出した」と判断して、ステップＳ２２５に進む。これに対して、レベル４にある節Ｎ_４２（限定質問Ｑ_４２が配置される）は、抽出した部分集合が全て空集合となる場合に相当しており、この場合、要素保有部分集合を「抽出していない」と判断してステップＳ２３１に進む。
【００４１】
ステップＳ２２４で要素保有部分集合を抽出したと判断した場合には、レベルの値を１増加させる（ステップＳ２２５）。１回目のループでは、この処理によってレベルの値が「２」となる。
【００４２】
レベルを増加させた後、ステップＳ２２３で抽出した各要素保有部分集合に対し、回答が最も容易な限定質問をそれぞれ抽出し（ステップＳ２２６）、ステップＳ２２１の処理に戻る。図６に示す例では、レベル２のステップＳ２２６において、限定質問Ｑ_２１とＱ_２２が抽出される。
【００４３】
この後、抽出した要素保有部分集合を改めて質問集合とみなし、上記のステップＳ２２１からＳ２２６までの処理を再帰的に繰り返す。
【００４４】
上記のステップＳ２２１からＳ２２６までのループ処理を繰り返して、全ての部分集合が空集合となった場合、ステップＳ２２４で「抽出していない」と判断することになるので、ステップＳ２３１に進む。この場合、各空集合に対応する情報を割り当てる（ステップＳ２３１）。このステップで枝の終端としての葉Ｌ_ｎに割り当てられる情報は、ユーザが順次回答して得ることになる情報、すなわちユーザが希望する情報なので、上述したように「ユーザ希望情報」と呼ぶ。
【００４５】
図６において、このユーザ希望情報に該当するのは、レベル４の葉に割り当てられたＡ_１、レベル５の葉Ｌ_２，Ｌ_３にそれぞれ割り当てられたＡ_２，Ａ_３，レベルｉ−１の葉Ｌ_ｐ−２に割り当てられたＡ_ｐ−２，レベルｉの葉Ｌ_ｐ−１，Ｌ_ｐにそれぞれ割り当てられたＡ_ｐ−１，Ａ_ｐ等である（ここでｉは正の整数）。
【００４６】
以上の処理が制御判断部１２によって実行されることはいうまでもない。この意味で、制御判断部１２は、各限定質問に対する回答の容易さを示す容易度情報を前記記憶手段から読み出し、この読み出した容易度情報に基づいて、回答が最も容易な限定質問を抽出する抽出手段、この抽出手段で抽出した限定質問を検索決定木の根に最も近い節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割する分割手段、この分割手段で分割した各部分集合から空集合でない部分集合を抽出する部分集合抽出手段、前記分割手段で分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止手段としての機能を兼備している。
【００４７】
以上説明した本発明の第１の実施形態によれば、限定質問の容易度情報に基づいて回答の容易な順番に検索決定木の根から葉の方向に限定質問を配置することにより、質問の容易度（または難易度）を考慮して限定質問を配置した検索決定木を生成することができる。
【００４８】
なお、本実施形態では、検索決定木生成装置を用いて上記のステップＳ２１１からステップＳ２３１に至る各ステップでの処理を行う検索決定木生成方法について説明したが、これらの各ステップを含む検索決定木生成処理を実行させるための検索決定木生成プログラムがインストールされた所定のコンピュータを用いて実施しても同様の効果を得ることができる。
【００４９】
また、本実施形態の各種処理は、一つの電子的な装置が実行する場合だけでなく、各ステップの実行を適宜分割して二つ以上の電子的な装置から構築されたシステムが全体で実行する場合も含む。この意味で、本実施形態に係る「検索決定木生成装置」は、一つまたは複数のコンピュータ（システム）によって構成される。この点については、本発明の全ての実施の形態に共通する事項である。
【００５０】
ところで、以上説明した検索決定木生成プログラムを記録したコンピュータ読み取り可能なプログラム記録媒体をコンピュータに装着し、そのプログラム記録媒体に格納されているプログラムを読み出すことによって、コンピュータが上述した処理を実行するようにしてもよい。ここで、「コンピュータ読み取り可能な」プログラム記録媒体としては、ハードディスク、フレキシブルディスク、ＣＤ−ＲＯＭ、ＤＶＤ、光磁気ディスク、ＰＣカード等を用いることができる。このようなプログラム記録媒体を提供することによって、本実施形態の検索決定木生成プログラムを広く流通させることができるようになる。この点についても、本発明の全ての実施の形態に共通する事項である。
【００５１】
（第２の実施形態）
図３は、本発明の第２の実施形態に係る検索決定木生成装置の概略構成を示すブロック図である。同図に示す検索決定木生成装置３は、本発明の第１の実施形態に係る検索決定木生成装置１に評価値算出部３５を付加した構成を有する。ここで、本実施形態に係る検索決定木生成装置３の構成部分のうち、検索決定木生成装置１と同じ構成を有する部分には同一の番号を付し、その説明を省略する。
【００５２】
図３において、操作入力部１１からは、第１の実施形態における容易度情報を含む所定の情報以外に、訓練事例情報も入力されるものとする。ここで、「訓練事例」とは、図４の表５０に示すように、質問集合中の個々の限定質問（１，２，３，・・・，Ｒ）に対する回答を組み合わせたものである。そしてこの表５０のように、複数の訓練事例（１，２，・・・，Ｐ）から構成される集合を「訓練事例情報」という。
【００５３】
操作入力部１１を介して入力された訓練事例情報は、制御判断部３２を介して記憶部１３に記憶される。
【００５４】
評価値算出部３５は、質問集合と訓練事例情報とに基づいて情報利得
Ｇ（Ｑ_ｋ，Ｔ）＝Ｉ（Ｔ）−Ｉ（Ｑ_ｋ，Ｔ）（１）
を算出する。ただし、Ｉ（Ｔ）は、訓練事例情報Ｔ全体の情報量であり、Ｉ（Ｑ_ｋ，Ｔ）は、Ｔが限定質問Ｑ_ｋによって部分集合Ｓ_ｉに分割された状態における情報量を示し、
【数１】

と表される。式（２）において、Ｃ_ｊ（ｊ＝１、・・・・、Ｍ；Ｍは正の整数）はユーザ希望情報であり、｜Ｃ_ｊ｜は、各ユーザ希望情報に含まれる事例の個数である。また、式（３）において、｜Ｔ｜は、訓練事例情報の個数であり、Ｓ_ｉ（ｉ＝１、・・・・、Ｎ；Ｎは正の整数）は、訓練事例情報Ｔが限定質問Ｑ_ｋによって分割された場合の各部分集合を示す。
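The equation images for expressions (2) and (3) are not reproduced in this text. As a sketch only, assuming they follow the standard entropy-based definitions used in decision-tree induction (an assumption, although it is consistent with the symbols |C_j|, |T| and S_i described above), they would read:

```latex
\begin{align}
I(T) &= -\sum_{j=1}^{M} \frac{|C_j|}{|T|} \log_2 \frac{|C_j|}{|T|} \tag{2} \\
I(Q_k, T) &= \sum_{i=1}^{N} \frac{|S_i|}{|T|} \, I(S_i) \tag{3}
\end{align}
```

Under this reading, I(T) is the entropy of the user-desired information over all training cases, and I(Q_k, T) is the size-weighted average entropy of the subsets S_i obtained by splitting T on the limited question Q_k, so that G(Q_k, T) in expression (1) is the reduction in entropy achieved by asking Q_k.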
【００５５】
この後、評価値算出部３５は、算出した情報利得Ｇ（Ｑ_ｋ，Ｔ）と、容易度情報に基づいて与えられる限定質問Ｑ_ｋに対する回答の容易度Ｆ（Ｑ_ｋ）とに基づいて、限定質問Ｑ_ｋに対する評価値Ｅ（Ｑ_ｋ）を算出する。したがって、評価値算出部３５は、情報利得算出手段と評価値算出手段の二つの機能を兼備するものである。
【００５６】
評価値Ｅ（Ｑ_ｋ）としては、例えば、以下の式に示すように、情報利得Ｇ（Ｑ_ｋ，Ｔ）と容易度Ｆ（Ｑ_ｋ）を所定の比率で加えたものを用いることができる：
Ｅ（Ｑ_ｋ）＝Ｇ（Ｑ_ｋ，Ｔ）＋λＦ（Ｑ_ｋ）（４）
ここで、λは、容易度Ｆ（Ｑ_ｋ）の寄与を調整するためのパラメータであり、正の値をとる。なお、容易度Ｆ（Ｑ_ｋ）は、回答が容易なほど高い値を示す指標であればどのようなものでもよく、例えば、予め限定質問について回答率の調査を行っておき、その結果得られた回答率の逆数等の所定の関数を用いて算出したものでもよい。
【００５７】
ところで、情報利得Ｇ（Ｑ_ｋ，Ｔ）は、式（１）の定義からもわかるように、訓練事例情報Ｔに対する限定質問Ｑ_ｋによる質問前と質問後の情報量の差を表している。そこで、この情報利得Ｇ（Ｑ_ｋ，Ｔ）の代わりに、貢献度として利得比ＧＲ（Ｑ_ｋ，Ｔ）を採用することもできる。利得比ＧＲ（Ｑ_ｋ，Ｔ）は、次のように定義される量である。
【数２】

【００５８】
実際、情報利得Ｇ（Ｑ_ｋ，Ｔ）のままでは、より多数の値（ユーザ希望情報）をとる限定質問Ｑ_ｋほど優先的に採用されてしまう恐れもあるため、利得比ＧＲ（Ｑ_ｋ，Ｔ）を用いることによって、出現頻度が相対的に高いユーザ希望情報に、より早く到達できる可能性が出てくる。なお、ユーザ希望情報の定義は、第１の実施形態と同じである。
【００５９】
また、利得比ＧＲ（Ｑ_ｋ，Ｔ）は、情報利得Ｇ（Ｑ_ｋ，Ｔ）をＳＩ（Ｑ_ｋ，Ｔ）によって正規化したものであり（式（５）を参照）、このような値を用いることにより、検索決定木生成のアルゴリズムが、分割を細かくし過ぎないように抑制することが可能となる。
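The equation image for the gain ratio is likewise not reproduced in this text. The description states only that GR(Q_k, T) is the information gain normalized by SI(Q_k, T) (expression (5)); the form of SI below is an assumption, taken from the usual "split information" of gain-ratio based tree induction:

```latex
\begin{align}
GR(Q_k, T) &= \frac{G(Q_k, T)}{SI(Q_k, T)} \tag{5} \\
SI(Q_k, T) &= -\sum_{i=1}^{N} \frac{|S_i|}{|T|} \log_2 \frac{|S_i|}{|T|}
\end{align}
```

With this assumed SI, a question that splits T into many small subsets has a large denominator, which matches the stated purpose of keeping the algorithm from dividing too finely.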
【００６０】
以上をふまえ、以後の説明では、情報利得Ｇ（Ｑ_ｋ，Ｔ）の代わりに利得比ＧＲ（Ｑ_ｋ，Ｔ）を用いてもよいものとする。
【００６１】
図５は、本発明の第２の実施形態に係る検索決定木生成処理の流れを示すフローチャート図である。同図を参照して、本実施形態に係る検索決定木生成方法を具体的に説明する。以下では、情報利得Ｇ（Ｑ_ｋ，Ｔ）と容易度Ｆ（Ｑ_ｋ）とを引数とする評価値Ｅ（Ｑ_ｋ）が、上記の式（４）によって表される場合を例にとって説明する。
【００６２】
まず、式（４）のパラメータλの値を決定し（ステップＳ５１１）、検索決定木の根から出発して葉の方向に進んだ尺度であるレベルの値を１に設定する（ステップＳ５１２）。ここでも、レベル１の具体例としては、図６に示す検索決定木１００の根Ｒ_１１のレベルを挙げることができる。
【００６３】
次に、記憶部１３に記憶された質問集合のうちの１つの限定質問を抽出する（ステップＳ５１３）。
【００６４】
ステップＳ５２１からＳ５３４までの処理はループをなす。まず、限定質問に対する情報利得Ｇ（Ｑ_ｋ，Ｔ）を算出し（ステップＳ５２１）、上記の式（４）に従って評価値Ｅ（Ｑ_ｋ）を算出する（ステップＳ５２２）。
【００６５】
ステップＳ５２１とＳ５２２での処理を全限定質問について完了するまで繰り返す（ステップＳ５２３）。すなわち、このステップで、全ての限定質問に対してステップＳ５２１、Ｓ５２２の処理が完了していない場合（Ｎｏ）にはステップＳ５２１に戻り、完了した場合（Ｙｅｓ）にはステップＳ５２４に進む。
【００６６】
ステップＳ５２４では、ステップＳ５２３で算出した評価値のうち、最大のものに対応する限定質問を抽出する。
【００６７】
このステップで抽出した限定質問に基づいて質問集合を２つの部分集合に分割する（ステップＳ５２５）。
【００６８】
次のステップＳ５２６での処理は、全質問集合についてステップＳ５２５での処理が完了したか否かを判断するものである。全質問集合について完了した場合（Ｙｅｓ）には、続くステップＳ５３１に進む。他方、完了していない場合（Ｎｏ）には、ステップＳ５２１に戻って、そのステップＳ５２１からの処理を繰り返す。１回目のループでの処理の場合には、質問集合は１つなので、処理はステップＳ５３１に進むことになる。
【００６９】
ステップＳ５３１では、上記で分割した部分集合のうち、要素保有部分集合を抽出する（ステップＳ５３１）。要素保有部分集合の定義は、第１の実施形態と同じである。
【００７０】
ステップＳ５３１で要素保有部分集合を抽出したか否かを判断し、要素保有部分集合を抽出したと判断した場合（Ｙｅｓ）、処理はステップＳ５３３に移り、抽出しないと判断した場合（Ｎｏ）、処理はステップＳ５４１に進む（ステップＳ５３２）。ここで、要素保有部分集合を抽出したか否かの判断の仕方は、本発明の第１の実施形態と同じである。
【００７１】
ステップＳ５３２で要素保有部分集合を抽出したと判断した場合、レベルを１増加させる（ステップＳ５３３）。１回目のループの場合、レベルは「２」になる。レベルの値を増加させた後、ステップＳ５３１で抽出した各要素保有部分集合に対して、各要素保有部分集合における回答が最も容易な限定質問を抽出し（ステップＳ５３４）、ステップＳ５２１の処理に戻る。これに際して、要素保有部分集合を新たに質問集合とみなし、上記のステップＳ５２１からＳ５３４までの処理を繰り返す。
【００７２】
図６においては、限定質問Ｑ_２１とＱ_２２が、ステップＳ５３４で抽出される。上記のステップＳ５２１からＳ５３４までのループ処理を繰り返して、全ての部分集合が空集合となった場合、ステップＳ５３２で処理はステップＳ５４１に進む。
【００７３】
全ての部分集合が空集合となった場合、すなわちステップＳ５３２で部分集合が「抽出していない」と判断した場合、各空集合に対してユーザ希望情報を割り当てる（ステップＳ５４１）。
【００７４】
以上の処理が制御判断部３２および評価値算出部３５によって実行されることはいうまでもない。このうち、制御判断部３２は、評価値算出部３５で算出した評価値が最も高い限定質問を抽出する抽出手段、この抽出手段で抽出した限定質問を検索決定木の根に最も近接する節のうち空いている節に割り当て、複数の限定質問から構成される限定質問の集合を２つの部分集合に分割する分割手段、この分割手段で分割した各部分集合から空集合でない部分集合を抽出する部分集合抽出手段、前記分割手段で分割した各部分集合から空集合を抽出し、この抽出した空集合の各々に対応するユーザ希望情報を割り当てて葉とすることにより、前記根から延出した枝を終止する終止手段としての機能を兼備している。
【００７５】
以上説明した本発明の第２の実施形態によれば、質問の容易度情報および情報利得に応じた評価値を算出し、その評価値の高い順番に検索決定木の根から葉の方向に質問を配置することによって、質問の容易度（または難易度）を考慮して質問を配置した検索決定木を生成することができる。
【００７６】
なお、本実施形態では、検索決定木生成装置を用いて上記のステップＳ５１１からステップＳ５４１に至る各ステップでの処理を行う検索決定木生成方法について説明したが、これらの各ステップを含む検索決定木生成処理を実行させるための検索決定木生成プログラムがインストールされた所定のコンピュータを用いて実施しても同様の効果を得ることができる。
【００７７】
【発明の効果】
以上の説明からも明らかなように、本発明によれば、質問の難易度または容易度を考慮して質問を配置した検索決定木の生成を可能にする検索決定木生成方法、検索決定木生成装置、検索決定木生成プログラム、およびプログラム記録媒体を提供することができる。
【図面の簡単な説明】
【図１】本発明の第１の実施形態に係る検索決定木生成装置の概略構成を示すブロック図である。
【図２】本発明の第１の実施形態に係る検索決定木生成における処理の流れを示すフローチャート図である。
【図３】本発明の第２の実施形態に係る検索決定木生成装置の概略構成を示すブロック図である。
【図４】訓練事例の一例を概念的に示す説明図である。
【図５】本発明の第２の実施形態に係る検索決定木生成における処理の流れを示すフローチャート図である。
【図６】検索決定木の一例を示す図である。
【符号の説明】
１、３検索決定木生成装置
１１操作入力部
１２、３２制御判断部
１３記憶部
１４出力部
３５評価値算出部
１００検索決定木
[0001]
[Technical field of the invention]
The present invention relates to a search decision tree generation method, a search decision tree generation device, a search decision tree generation program, and a program recording medium for creating a search decision tree that represents the arrangement of questions used when a search is performed by answering a plurality of questions.
[0002]
[Prior art]
FAQ (Frequently Asked Questions) refers to questions that are often asked when, for example, a consumer who has purchased a personal computer or a home appliance such as a microwave oven does not understand how to use the product, or when the product breaks down. An FAQ database is a database that compiles, as answers to such FAQs, solutions describing what should be done.
[0003]
In general, a search decision tree is used to search for answers to FAQs and the like: by answering the questions placed at the nodes of the search decision tree, the user retrieves the desired solution from the FAQ database.
[0004]
A user who attempts to search for a desired solution from such an FAQ database first answers a question placed at the root, and then answers a question placed at a branch node corresponding to the answer. By repeating this procedure, the user can finally reach the leaf where the desired solution is located.
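The lookup procedure described above can be illustrated with a minimal sketch. The `Node` and `Leaf` types, the yes/no form of the answers, and the sample questions are all hypothetical; the text only says that each answer selects a branch until a leaf holding a solution is reached.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    solution: str  # the solution placed at this leaf


@dataclass
class Node:
    question: str                # the question placed at this node
    yes: Union["Node", Leaf]     # branch followed when the answer is yes
    no: Union["Node", Leaf]      # branch followed when the answer is no


def search(tree: Union[Node, Leaf], answer: Callable[[str], bool]) -> str:
    """Answer questions from the root downward until a leaf is reached."""
    current = tree
    while isinstance(current, Node):
        current = current.yes if answer(current.question) else current.no
    return current.solution


# Hypothetical two-question FAQ tree (illustrative content only).
faq = Node(
    "Does the device power on?",
    yes=Node(
        "Is the display blank?",
        yes=Leaf("Check the display cable."),
        no=Leaf("Reinstall the driver."),
    ),
    no=Leaf("Check the power cord."),
)

answers = {"Does the device power on?": True, "Is the display blank?": False}
result = search(faq, lambda q: answers[q])
print(result)
```

If the user cannot answer the question at the root, the loop cannot advance at all, which is the situation the following paragraphs identify as the weakness of frequency-ordered trees.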
[0005]
This search decision tree is merely a kind of blueprint representing the logical structure for retrieving a desired solution from the FAQ database. In practice it has been realized on a computer, for example as HTML (HyperText Markup Language) documents used on the WWW (World Wide Web), or as a paper booklet. The questions placed at the nodes of the search decision tree have been arranged so that frequently occurring solutions are retrieved early (see, for example, Patent Document 1).
[0006]
[Patent Document 1]
JP-A-6-4292
[0007]
[Problems to be solved by the invention]
However, in such conventional search decision tree generation techniques, questions are placed at the nodes of the search decision tree so that solutions with a higher frequency of occurrence are retrieved earlier, and the difficulty of the questions is not considered at all. As a result, a question that is difficult to answer may be placed near the root of the search decision tree. In that case the user cannot proceed through the tree beyond that question and cannot narrow down the answers.
[0008]
Moreover, even though it may not be necessary to answer difficult questions at all in order to find a solution, the conventional techniques require the user to answer questions of any difficulty, and as a result the user is sometimes unable to obtain the desired answer.
[0009]
The present invention has been made to solve these problems, and its object is to provide a search decision tree generation method, a search decision tree generation device, a search decision tree generation program, and a program recording medium that enable generation of a search decision tree in which questions are arranged in consideration of the difficulty or ease of the questions.
[0010]
[Means for Solving the Problems]
To achieve the above object, the invention according to claim 1 is a search decision tree generation method for generating a search decision tree in which, for a given question set among question sets whose elements are the limited questions placed at each of a plurality of nodes that extend successively from the root of the search decision tree and branch, user-desired information (the information the user wants) is assigned to each leaf, a leaf being the end of a branch reached according to the answers the user enters in turn to the limited questions placed at the root and at each of the nodes. In this method, a computer having a storage unit that stores the information necessary for generating the search decision tree executes: (a) an extraction step of reading from the storage unit ease information indicating how easy each limited question is to answer, and extracting, based on the read ease information, the limited question that is easiest to answer; (b) a division step of assigning the limited question extracted in the extraction step to one of the vacant nodes among the nodes closest to the root of the search decision tree, and dividing the question set composed of a plurality of limited questions into two subsets; (c) a subset extraction step of extracting the non-empty subsets from the two subsets produced in the division step; and (d) a termination step of extracting the empty sets from the two subsets produced in the division step and assigning the corresponding user-desired information to each extracted empty set to make it a leaf, thereby terminating the branch extending from the root.
[0011]
The invention according to claim 2 is characterized in that each non-empty subset extracted in the subset extraction step (c) is treated anew as a question set, the processing from the extraction step (a) through the termination step (d) is repeated recursively for each such question set, and the repetition ends when, in the course of this repetition, all the subsets produced in the division step (b) are empty sets.
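The recursion of claims 1 and 2 can be sketched as follows. The claims do not say how a chosen question divides the question set, so this sketch makes an assumption: each candidate solution is described by hypothetical yes/no answers, and the question set of a branch consists of the questions whose answers still differ among the candidates left on that branch. The names `build_tree`, `cases`, and `ease` are illustrative, not from the source.

```python
def build_tree(cases, ease):
    """Build a tree by always asking the easiest remaining question.

    cases: {solution: {question: bool}}  (hypothetical answer table)
    ease:  {question: float}, where a higher value means easier to answer
    """
    # Question set for this node: questions that still separate the candidates.
    questions = [
        q for q in ease
        if len({answers[q] for answers in cases.values()}) > 1
    ]
    if not questions:
        # Empty question set: terminate the branch with a leaf that holds
        # the user-desired information (here, the remaining solutions).
        return sorted(cases)

    # Extraction step: pick the question that is easiest to answer.
    easiest = max(questions, key=lambda q: ease[q])

    # Division step: split the candidates by their answer to that question.
    yes = {s: a for s, a in cases.items() if a[easiest]}
    no = {s: a for s, a in cases.items() if not a[easiest]}

    # Recurse on each side; empty question sets end the recursion.
    return {
        "question": easiest,
        "yes": build_tree(yes, ease),
        "no": build_tree(no, ease),
    }


# Hypothetical data: three solutions, three questions of varying ease.
cases = {
    "Check the power cord.": {"powers_on": False, "blank": False, "error_code": False},
    "Check the display cable.": {"powers_on": True, "blank": True, "error_code": False},
    "Reinstall the driver.": {"powers_on": True, "blank": False, "error_code": True},
}
ease = {"powers_on": 0.9, "blank": 0.7, "error_code": 0.2}

tree = build_tree(cases, ease)
print(tree)
```

In this example the hard question `error_code` never appears in the tree, because the two easier questions already separate all three solutions.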
[0012]
The invention according to claim 3 is a search decision tree generation method for generating a search decision tree in which, for a given question set among question sets whose elements are the limited questions placed at each of a plurality of nodes that extend successively from the root of the search decision tree and branch, user-desired information (the information the user wants) is assigned to each leaf, a leaf being the end of a branch reached according to the answers the user enters in turn to the limited questions placed at the root and at each of the nodes. In this method, a computer having a storage unit that stores the information necessary for generating the search decision tree executes: (a) an information gain calculation step of calculating, based on training case information composed of a plurality of training cases, each formed by combining answers to the limited questions, and on the question set associated with that training case information, an information gain indicating the difference in the amount of information of the training cases as a whole before and after a limited question is asked; (b) an evaluation value calculation step of reading from the storage unit the information gain calculated in the information gain calculation step and ease information indicating how easy each limited question is to answer, and calculating an evaluation value for each limited question based on the read information gain and ease information; (c) an extraction step of reading from the storage unit and extracting the limited question with the highest evaluation value calculated in the evaluation value calculation step; (d) a division step of assigning the limited question extracted in the extraction step to a vacant node among the nodes closest to the root of the search decision tree, and dividing the set of limited questions composed of a plurality of limited questions into two subsets; (e) a subset extraction step of extracting the non-empty subsets from the subsets produced in the division step; and (f) a termination step of extracting the empty sets from the subsets produced in the division step and assigning the corresponding user-desired information to each extracted empty set to make it a leaf, thereby terminating the branch extending from the root.
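The information gain of claim 3 is described as the difference in the amount of information of the training cases before and after a question is asked. A minimal sketch follows, assuming that "amount of information" means Shannon entropy over the user-desired information, as in standard decision-tree induction; the function names and the four sample training cases are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())


def information_gain(cases, question):
    """Entropy before asking `question` minus the weighted entropy after.

    cases: non-empty list of (answers, solution) pairs, where `answers`
    maps each question to the answer given in that training case.
    """
    before = entropy([solution for _, solution in cases])
    after = 0.0
    for value in {answers[question] for answers, _ in cases}:
        subset = [sol for answers, sol in cases if answers[question] == value]
        after += len(subset) / len(cases) * entropy(subset)
    return before - after


# Hypothetical training cases: (answers to each question, desired solution).
training = [
    ({"powers_on": False, "blank": False}, "A"),
    ({"powers_on": False, "blank": True}, "A"),
    ({"powers_on": True, "blank": True}, "B"),
    ({"powers_on": True, "blank": False}, "C"),
]

gain_power = information_gain(training, "powers_on")
gain_blank = information_gain(training, "blank")
print(gain_power, gain_blank)
```

Here `powers_on` isolates solution A completely, so it removes more uncertainty than `blank`, which leaves two candidate solutions on each side.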
[0013]
The invention according to claim 4 is characterized in that each non-empty subset extracted in the subset extraction step (e) is treated anew as a question set, the processing from the extraction step (a) through the termination step (f) is repeated recursively for each such question set, and the repetition ends when, in the course of this repetition, all the subsets produced in the division step (d) are empty sets.
[0014]
The invention according to claim 5 is characterized in that the evaluation value calculated in the evaluation value calculating step is a value obtained by adding the information gain and the ease information at a predetermined ratio.
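The description gives E(Q_k) = G(Q_k, T) + λF(Q_k), with λ > 0, as one example of adding the information gain and the ease "at a predetermined ratio" (expression (4)). A minimal sketch of selecting the question with the highest evaluation value under that formula is shown below; the gain and ease figures are made-up inputs, and the function names are hypothetical.

```python
def evaluation_value(gain, ease, lam):
    """E = G + lambda * F, following expression (4) of the description."""
    return gain + lam * ease


def best_question(gains, ease, lam):
    """Return the question whose evaluation value is highest."""
    return max(gains, key=lambda q: evaluation_value(gains[q], ease[q], lam))


# Made-up inputs: an informative but hard question versus an easy one.
gains = {"error_code": 1.0, "powers_on": 0.8}
ease = {"error_code": 0.2, "powers_on": 0.9}

low_weight = best_question(gains, ease, lam=0.1)   # ease barely counts
high_weight = best_question(gains, ease, lam=1.0)  # ease counts fully
print(low_weight, high_weight)
```

With a small λ the more informative question wins, while a larger λ moves the easier question toward the root, which is the trade-off the parameter is meant to control.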
[0015]
The invention according to claim 6 is a search decision tree generation device for generating a search decision tree in which, for a given question set among question sets whose elements are the limited questions placed at each of a plurality of nodes that extend successively from the root of the search decision tree and branch, user-desired information (the information the user wants) is assigned to each leaf, a leaf being the end of a branch reached according to the answers the user enters in turn to the limited questions placed at the root and at each of the nodes. The device comprises: storage means for storing the information necessary for generating the search decision tree; extraction means for reading from the storage means ease information indicating how easy each limited question is to answer, and extracting, based on the read ease information, the limited question that is easiest to answer; division means for assigning the limited question extracted by the extraction means to a vacant node among the nodes closest to the root of the search decision tree, and dividing the set of limited questions composed of a plurality of limited questions into two subsets; subset extraction means for extracting the non-empty subsets from the subsets produced by the division means; and termination means for extracting the empty sets from the subsets produced by the division means and assigning the corresponding user-desired information to each extracted empty set to make it a leaf, thereby terminating the branch extending from the root.
[0016]
The invention according to claim 7 is a search decision tree generation device for generating a search decision tree in which, for a given question set among question sets whose elements are the limited questions placed at each of a plurality of nodes that extend successively from the root of the search decision tree and branch, user-desired information (the information the user wants) is assigned to each leaf, a leaf being the end of a branch reached according to the answers the user enters in turn to the limited questions placed at the root and at each of the nodes. The device comprises: storage means for storing the information necessary for generating the search decision tree; information gain calculation means for calculating, based on training case information composed of a plurality of training cases, each formed by combining answers to the limited questions, and on the question set associated with that training case information, an information gain indicating the difference in the amount of information of the training cases as a whole before and after a limited question is asked; evaluation value calculation means for calculating an evaluation value for each limited question based on the information gain calculated by the information gain calculation means and on ease information indicating how easy each limited question is to answer; extraction means for extracting the limited question with the highest evaluation value calculated by the evaluation value calculation means; division means for assigning the limited question extracted by the extraction means to a vacant node among the nodes closest to the root of the search decision tree, and dividing the set of limited questions composed of a plurality of limited questions into two subsets; subset extraction means for extracting the non-empty subsets from the subsets produced by the division means; and termination means for extracting the empty sets from the subsets produced by the division means and assigning the corresponding user-desired information to each extracted empty set to make it a leaf, thereby terminating the branch extending from the root.
[0017]
The invention according to claim 8 is characterized in that the evaluation value calculated by the evaluation value calculation means is a value obtained by adding the information gain and the ease information at a predetermined ratio.
[0018]
The invention according to claim 9 is a search decision tree generation program for generating a search decision tree in which, for a given question set among question sets whose elements are the limited questions placed at each of a plurality of nodes that extend successively from the root of the search decision tree and branch, user-desired information (the information the user wants) is assigned to each leaf, a leaf being the end of a branch reached according to the answers the user enters in turn to the limited questions placed at the root and at each of the nodes. The program causes a computer having a storage unit that stores the information necessary for generating the search decision tree to execute: (a) an extraction step of reading from the storage unit ease information indicating how easy each limited question is to answer, and extracting, based on the read ease information, the limited question that is easiest to answer; (b) a division step of assigning the limited question extracted in the extraction step to one of the vacant nodes among the nodes closest to the root of the search decision tree, and dividing the question set composed of a plurality of limited questions into two subsets; (c) a subset extraction step of extracting the non-empty subsets from the two subsets produced in the division step; and (d) a termination step of extracting the empty sets from the two subsets produced in the division step and assigning the corresponding user-desired information to each extracted empty set to make it a leaf, thereby terminating the branch extending from the root.
[0019]
The invention according to claim 10 is characterized in that each non-empty subset extracted in the subset extraction step (c) is treated anew as a question set, the processing from the extraction step (a) through the termination step (d) is repeated recursively for each such question set, and the repetition ends when, in the course of this repetition, all the subsets produced in the division step (b) are empty sets.
[0020]
The invention according to claim 11 is a search decision tree generation program for generating a search decision tree in which, for a given question set among question sets whose elements are the limited questions placed at each of a plurality of nodes that extend successively from the root of the search decision tree and branch, user-desired information (the information the user wants) is assigned to each leaf, a leaf being the end of a branch reached according to the answers the user enters in turn to the limited questions placed at the root and at each of the nodes. The program causes a computer having a storage unit that stores the information necessary for generating the search decision tree to execute: (a) an information gain calculation step of calculating, based on training case information composed of a plurality of training cases, each formed by combining answers to the limited questions, and on the question set associated with that training case information, an information gain indicating the difference in the amount of information of the training cases as a whole before and after a limited question is asked; (b) an evaluation value calculation step of reading from the storage unit the information gain calculated in the information gain calculation step and ease information indicating how easy each limited question is to answer, and calculating an evaluation value for each limited question based on the read information gain and ease information; (c) an extraction step of reading from the storage unit and extracting the limited question with the highest evaluation value calculated in the evaluation value calculation step; (d) a division step of assigning the limited question extracted in the extraction step to a vacant node among the nodes closest to the root of the search decision tree, and dividing the set of limited questions composed of a plurality of limited questions into two subsets; (e) a subset extraction step of extracting the non-empty subsets from the subsets produced in the division step; and (f) a termination step of extracting the empty sets from the subsets produced in the division step and assigning the corresponding user-desired information to each extracted empty set to make it a leaf, thereby terminating the branch extending from the root.
[0021]
The invention according to claim 12 is characterized in that, among the subsets extracted in the subset extraction step (e), each non-empty subset is newly set as a question set, the processing from step (a) to the termination step (f) is recursively repeated for each such question set, and the repetition ends when, in the course of this repetition, all the subsets divided in the division step (d) are empty sets.
[0022]
The invention according to claim 13 is characterized in that the evaluation value calculated in the evaluation value calculation step is a value obtained by adding the information gain and the ease information at a predetermined ratio.
[0023]
The invention according to claim 14 is a program recording medium on which the search decision tree generation program according to any one of claims 9 to 13 is recorded.
[0024]
The program recording medium in the present invention means a computer-readable recording medium such as a hard disk, a flexible disk, a CD-ROM (Compact Disc Read-Only Memory), a DVD (Digital Versatile Disk), a magneto-optical disk, or a PC card.
[0025]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings.
[0026]
(First embodiment)
FIG. 1 is a block diagram showing a schematic configuration of the search decision tree generation device according to the first embodiment of the present invention. The search decision tree generation device 1 shown in FIG. 1 includes: an operation input unit 11 for inputting predetermined information, including information on the ease of answering the questions arranged at the nodes of the search decision tree (the questions are hereinafter referred to as "limited questions" and the information simply as "ease information"), and for performing operations; a control determination unit 12 that performs the control and determination needed to generate a search decision tree in accordance with input or operations from the operation input unit 11; a storage unit 13 (serving as storage means) for storing information; and an output unit 14 for outputting the search decision tree generated by the control determination unit 12 to the outside.
[0027]
Here, the operation input unit 11 includes, for example, a keyboard, a mouse, and a CD (Compact Disc) drive. Through it, the user inputs the information to be stored in the storage unit 13 and instructs the control determination unit 12 to perform processing.
[0028]
The storage unit 13 stores the information necessary for generating a search decision tree: the limited questions, the information assigned to the leaves of the search decision tree (hereinafter referred to as "user desired information"), and the ease information. In the following, a set of limited questions composed of a plurality of limited questions is simply referred to as a "question set".
[0029]
The storage unit 13 includes a main storage unit composed of random access memory (RAM) and the like, and an auxiliary storage unit composed of a hard disk drive, a flexible disk drive, a CD-ROM (Compact Disc Read-Only Memory) drive, a DVD (Digital Versatile Disk) drive, a magneto-optical disk drive, a PC (Personal Computer) card drive, and the like. The memory area necessary for storing the calculation results described later is secured as needed.
[0030]
The output unit 14 outputs the search decision tree generated by the control determination unit 12 to the outside, and includes an output device such as a liquid crystal display. Further, the predetermined information stored in the storage unit 13 may be output to an external device via the control determination unit 12. The output instruction is input through the operation input unit 11 and is performed under the control of the control determination unit 12.
[0031]
The control determination unit 12 reads the information stored in the storage unit 13 in response to an operation from the operation input unit 11, generates a search decision tree based on the read information, and controls the output unit 14 so that the generated search decision tree is output to the outside. The control determination unit 12 may also perform other control and determination operations for controlling the storage unit 13 and the output unit 14. As is clear from the operation described here, the control determination unit 12 is implemented by a central processing unit (CPU).
[0032]
FIG. 2 is a flowchart illustrating the flow of the search decision tree generation process according to the first embodiment of the present invention. The search decision tree generation method according to the present embodiment will be specifically described with reference to FIG. 2.
[0033]
Hereinafter, the search decision tree 100 shown in FIG. 6 will be referred to as appropriate as an example of a search decision tree. Limited questions Q_11 and Q_mn are arranged at the root R_11 and at the nodes N_mn of this search decision tree 100, respectively, and an answer A_n is assigned to each leaf L_n, which is the end of a branch extending from the root R_11. For a node N_mn, the subscript m matches the level value, and the subscript n is an integer assigned in order within each level. In the case shown in FIG. 6, the number of leaves L_n is p (p is a positive integer). Note that, in the figure, i, j, and k are positive integers.
[0034]
First, the level value, a measure of how far one has advanced from the root of the search decision tree through the nodes toward the leaves, is set to 1 (step S211). In the search decision tree 100 shown in FIG. 6, the level of the root R_11 is "1", and the level value increases by one for each branch traversed toward the leaves. In FIG. 6, the limited question at the root R_11 is Q_11. When the user answers the limited question Q_11 at the root R_11, the level advances to "2", and thereafter the level value increases by 1 with each answer. Needless to say, the smaller the level value, the closer the node is to the root R_11.
[0035]
Next, based on the ease information, the limited question that is easiest to answer is extracted from the question set stored in the storage unit 13 (step S212). In the case of FIG. 6, the limited question Q_11 is extracted.
[0036]
The processing from steps S221 to S226 forms a loop. First, the question set is divided into two subsets based on the limited question extracted in step S212 (step S221).
[0037]
In the next step S222, it is determined whether or not the processing in step S221 has been completed for all question sets. In the case of the processing in the first loop (level 1), the number of question sets is one, and since the division has already been completed (Yes), the process proceeds to step S223. If the division of the question set has not been completed in another level loop (No), the process returns to step S221 and the same processing is repeated.
[0038]
Next, a subset that is not an empty set (hereinafter, referred to as an “element holding subset”) is extracted from the subsets divided in step S221 (step S223).
[0039]
In step S224, it is determined whether or not an element holding subset has been extracted. If one has been extracted, the process proceeds to step S225; if not, the process proceeds to step S231. The determination is made as follows: if at least one subset that is not an empty set exists, the result is "extracted"; if all subsets are empty sets, the result is "not extracted".
[0040]
For example, in the case shown in FIG. 6, for the node N_21 at level 2 (where the limited question Q_21 is arranged), the next level contains nodes, so this corresponds to the case where the extracted subsets include a non-empty set. In this case, it is therefore determined that an element holding subset is "extracted", and the process proceeds to step S225. On the other hand, the node N_42 at level 4 (where the limited question Q_42 is arranged) corresponds to the case where all the extracted subsets are empty sets. In this case, it is determined that an element holding subset is "not extracted", and the process proceeds to step S231.
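The determination in steps S223 and S224 can be sketched as follows. This is only an illustrative sketch: the function names and the representation of a subset as a `frozenset` of question names are assumptions and do not appear in the original text.

```python
from typing import FrozenSet, List, Tuple


def extract_element_holding(subsets: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Step S223: keep only the subsets that are not empty sets."""
    return [s for s in subsets if s]


def next_step(subsets: List[FrozenSet[str]]) -> Tuple[str, List[FrozenSet[str]]]:
    """Step S224: go to S225 if any subset is non-empty, otherwise to S231."""
    holding = extract_element_holding(subsets)
    return ("S225" if holding else "S231", holding)
```

With one non-empty subset the sketch reports step S225 as the next step; with only empty sets it reports step S231, where the leaves are assigned.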
[0041]
If it is determined in step S224 that the element holding subset has been extracted, the level value is increased by 1 (step S225). In the first loop, the level value becomes “2” by this processing.
[0042]
After the level is increased, for each of the element holding subsets extracted in step S223, the limited question that is easiest to answer is extracted (step S226), and the process returns to step S221. In the example shown in FIG. 6, the limited questions Q_21 and Q_22 are extracted in step S226 at level 2.
[0043]
Thereafter, the extracted element holding subset is regarded as a question set again, and the above-described steps S221 to S226 are recursively repeated.
[0044]
When the loop processing of steps S221 to S226 has been repeated and all the subsets have become empty sets, it is determined in step S224 that no element holding subset has been extracted, and the process proceeds to step S231. In this case, information corresponding to each empty set is assigned (step S231). The information assigned in this step to each leaf L_n, the end of a branch, is the information the user obtains as a result of answering the limited questions, that is, the information the user desires; as described above, it is referred to as "user desired information".
[0045]
In FIG. 6, the user desired information corresponds to A_1 assigned to the leaf L_1 at level 4, A_2 and A_3 assigned to the leaves L_2 and L_3 at level 5, A_(p-2) assigned to the leaf L_(p-2) at level i-1, and A_(p-1) and A_p assigned to the leaves L_(p-1) and L_p at level i (where i is a positive integer).
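The flow of steps S212 to S231 can be illustrated with a minimal sketch. Several points are assumptions made only for illustration: the text does not specify how a question set is divided into two subsets, so the division rule is passed in as a hypothetical `split` function; `leaf_info` is a hypothetical lookup from the answer path to the user desired information; and the sketch uses depth-first recursion rather than the level-by-level loop of FIG. 2.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Tuple, Union


@dataclass
class Leaf:
    info: str  # user desired information assigned to this leaf


@dataclass
class Node:
    question: str  # limited question arranged at this node
    yes: "Tree"    # subtree followed after one answer
    no: "Tree"     # subtree followed after the other answer


Tree = Union[Leaf, Node]
Path = Tuple[Tuple[str, bool], ...]
Split = Callable[[str, FrozenSet[str]], Tuple[FrozenSet[str], FrozenSet[str]]]


def build_tree(questions: FrozenSet[str], ease: Dict[str, float], split: Split,
               leaf_info: Callable[[Path], str], path: Path = ()) -> Tree:
    # Empty question set: terminate the branch with a leaf (cf. step S231).
    if not questions:
        return Leaf(leaf_info(path))
    # Extract the limited question that is easiest to answer (cf. S212, S226).
    q = max(sorted(questions), key=lambda name: ease[name])
    # Divide the remaining questions into two subsets (cf. S221).
    yes_set, no_set = split(q, questions - {q})
    return Node(q,
                build_tree(yes_set, ease, split, leaf_info, path + ((q, True),)),
                build_tree(no_set, ease, split, leaf_info, path + ((q, False),)))


# Hypothetical ease scores for three limited questions.
ease = {"Q_a": 0.9, "Q_b": 0.4, "Q_c": 0.7}


def split_all_to_yes(q: str, rest: FrozenSet[str]):
    # Assumed rule: every remaining question follows one answer; the other ends the search.
    return rest, frozenset()


tree = build_tree(frozenset(ease), ease, split_all_to_yes,
                  lambda path: "A[" + ",".join(name for name, _ in path) + "]")
```

In this example the root holds `Q_a`, the easiest question, followed by `Q_c` and then `Q_b`, so the ease of answering decreases from the root toward the leaves.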
[0046]
Needless to say, the above processing is executed by the control determination unit 12. In this sense, the control determination unit 12 also functions as: extraction means that reads from the storage unit the ease information indicating the ease of answering each limited question and extracts the limited question that is easiest to answer based on the read ease information; division means that assigns the limited question extracted by the extraction means to a vacant node among the nodes closest to the root of the search decision tree and divides the question set composed of a plurality of limited questions into two subsets; subset extraction means that extracts the non-empty subsets from the subsets divided by the division means; and termination means that terminates the branches extending from the root by extracting the empty sets from the subsets divided by the division means and assigning the user desired information corresponding to each extracted empty set to a leaf.
[0047]
According to the first embodiment of the present invention described above, the limited questions are arranged from the root of the search decision tree toward the leaves in order of ease of answering, based on the ease information of the limited questions. A search decision tree in which limited questions are arranged in consideration of their ease (or difficulty) can thus be generated.
[0048]
In the present embodiment, the search decision tree generation method that performs the processing of steps S211 to S231 using the search decision tree generation device has been described. The same effects can also be obtained by using a computer on which a search decision tree generation program that executes the search decision tree generation process including these steps is installed.
[0049]
In addition, the various processes according to the present embodiment may be executed not only by a single electronic device but also by a system built from two or more electronic devices, with the execution of the steps divided among them as appropriate. In this sense, the "search decision tree generation device" according to the present embodiment is configured by one or more computers (a system). This point is common to all embodiments of the present invention.
[0050]
Incidentally, a computer-readable program recording medium on which the above search decision tree generation program is recorded may be loaded into a computer, and the computer may execute the above processing by reading the program stored on the program recording medium. Here, a hard disk, a flexible disk, a CD-ROM, a DVD, a magneto-optical disk, a PC card, or the like can be used as the computer-readable program recording medium. Providing such a program recording medium allows the search decision tree generation program of the present embodiment to be widely distributed. This point is also common to all embodiments of the present invention.
[0051]
(Second embodiment)
FIG. 3 is a block diagram showing a schematic configuration of the search decision tree generation device according to the second embodiment of the present invention. The search decision tree generation device 3 shown in the figure has a configuration in which an evaluation value calculation unit 35 is added to the search decision tree generation device 1 according to the first embodiment of the present invention. Here, among the components of the search decision tree generation device 3 according to the present embodiment, portions having the same configuration as the search decision tree generation device 1 are assigned the same numbers, and descriptions thereof are omitted.
[0052]
In FIG. 3, it is assumed that training case information is input from the operation input unit 11 in addition to the predetermined information, including the ease information, of the first embodiment. Here, a "training case" is a combination of the answers to the individual limited questions (1, 2, 3, ..., R) in the question set, as shown in Table 50 of FIG. 4. As shown in Table 50, a set composed of a plurality of training cases (1, 2, ..., P) is referred to as "training case information".
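One possible in-memory representation of the training case information is sketched below. The type name, the field names, and the sample data are hypothetical. Storing the user desired information alongside each case is also an assumption, made because the later expressions count the cases belonging to each piece of user desired information.

```python
from typing import Dict, List, NamedTuple


class TrainingCase(NamedTuple):
    answers: Dict[str, bool]  # answer to each limited question 1..R
    desired_info: str         # associated user desired information (assumed field)


# Hypothetical training case information: P = 3 cases over R = 2 limited questions.
training_cases: List[TrainingCase] = [
    TrainingCase({"Q1": True, "Q2": False}, "A1"),
    TrainingCase({"Q1": True, "Q2": True}, "A2"),
    TrainingCase({"Q1": False, "Q2": True}, "A3"),
]


def answers_to(question: str, cases: List[TrainingCase]) -> List[bool]:
    """Return one column of the table: every case's answer to one limited question."""
    return [case.answers[question] for case in cases]
```

Each element of `training_cases` corresponds to one row of a table like Table 50, and `answers_to` reads out one column.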
[0053]
The training case information input via the operation input unit 11 is stored in the storage unit 13 via the control determination unit 32.
[0054]
Based on the question set and the training case information, the evaluation value calculation unit 35 calculates the information gain
G(Q_k, T) = I(T) - I(Q_k, T)   (1)
Here, I(T) is the information amount of the entire training case information T, and I(Q_k, T) is the information amount in the state where T has been divided into subsets S_i by the limited question Q_k. These are expressed as follows:
(Equation 1: expressions (2) and (3); the equation image is not reproduced in this text)
In expression (2), C_j (j = 1, ..., M; M is a positive integer) is user desired information, and |C_j| is the number of cases included in each piece of user desired information. In expression (3), |T| is the number of training cases in the training case information, and S_i (i = 1, ..., N; N is a positive integer) denotes each subset obtained when the training case information T is divided by the limited question Q_k.
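Expressions (2) and (3) are not reproduced in this text. As a hedged reconstruction, the symbols defined here are consistent with the standard entropy-based forms used in decision tree induction, shown below. The logarithm base 2 and the exact layout are assumptions, not taken from the original.

```latex
\begin{align}
I(T) &= -\sum_{j=1}^{M} \frac{|C_j|}{|T|} \log_2 \frac{|C_j|}{|T|} \tag{2} \\
I(Q_k, T) &= \sum_{i=1}^{N} \frac{|S_i|}{|T|}\, I(S_i) \tag{3}
\end{align}
```

Under these forms, I(T) is zero when all training cases share the same user desired information, and the information gain of expression (1) is largest for a limited question whose subsets S_i are each dominated by a single piece of user desired information.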
[0055]
Thereafter, the evaluation value calculation unit 35 calculates the evaluation value E(Q_k) of the limited question Q_k using the information gain G(Q_k, T) and the ease of answering F(Q_k) of the limited question Q_k, which is given by the ease information. The evaluation value calculation unit 35 therefore serves as both information gain calculation means and evaluation value calculation means.
[0056]
The evaluation value E(Q_k) can be, for example, the sum of the information gain G(Q_k, T) and the ease F(Q_k) at a predetermined ratio, as in the following expression:
E(Q_k) = G(Q_k, T) + λF(Q_k)   (4)
Here, λ is a parameter that adjusts the contribution of the ease F(Q_k) and takes a positive value. The ease F(Q_k) may be any index whose value is higher the easier the answer is. For example, a survey of response rates may be conducted on the limited questions in advance, and F(Q_k) may be calculated from the result using a predetermined function, such as the reciprocal of the response rate.
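Expressions (1) and (4) can be sketched in code as follows. The sketch assumes the standard entropy form for the information amount, yes/no answers, and made-up data; the function names and the `Case` type are illustrative only.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# (answers to each limited question, user desired information); an assumed layout
Case = Tuple[Dict[str, bool], str]


def information(cases: List[Case]) -> float:
    """I(T): entropy of the user desired information over the cases (assumed form)."""
    total = len(cases)
    counts = Counter(desired for _, desired in cases)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def information_gain(question: str, cases: List[Case]) -> float:
    """G(Q_k, T) = I(T) - I(Q_k, T), with T divided by the answer to Q_k."""
    subsets: Dict[bool, List[Case]] = {}
    for case in cases:
        subsets.setdefault(case[0][question], []).append(case)
    after = sum(len(s) / len(cases) * information(s) for s in subsets.values())
    return information(cases) - after


def evaluation_value(question: str, cases: List[Case],
                     ease: Dict[str, float], lam: float) -> float:
    """E(Q_k) = G(Q_k, T) + lambda * F(Q_k), as in expression (4)."""
    return information_gain(question, cases) + lam * ease[question]


# Hypothetical data: Q1 is informative but hard, Q2 is uninformative but easy.
cases: List[Case] = [
    ({"Q1": True, "Q2": True}, "A1"),
    ({"Q1": True, "Q2": False}, "A1"),
    ({"Q1": False, "Q2": True}, "A2"),
    ({"Q1": False, "Q2": False}, "A2"),
]
ease = {"Q1": 0.1, "Q2": 0.9}


def best_question(lam: float) -> str:
    """Return the limited question with the highest evaluation value."""
    return max(sorted(ease), key=lambda q: evaluation_value(q, cases, ease, lam))
```

With this data, a small λ (for example 0.5) selects `Q1`, the question with the larger information gain, while a large λ (for example 2.0) selects `Q2`, the easier question. This is the trade-off that the parameter λ adjusts.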
[0057]
Incidentally, as can be seen from the definition in expression (1), the information gain G(Q_k, T) represents the difference in the information amount of the training case information T before and after the limited question Q_k. The gain ratio GR(Q_k, T) can also be employed in place of the information gain G(Q_k, T). The gain ratio GR(Q_k, T) is a quantity defined as follows:
(Equation 2: definition of the gain ratio, including expression (5); the equation image is not reproduced in this text)
[0058]
In fact, when the information gain G(Q_k, T) is used, limited questions Q_k that take a larger number of values (user desired information) tend to be selected, whereas when the gain ratio GR(Q_k, T) is used, there is a possibility that user desired information with a relatively high frequency of appearance can be reached earlier. Note that the definition of the user desired information is the same as in the first embodiment.
[0059]
The gain ratio GR(Q_k, T) is the information gain G(Q_k, T) normalized by SI(Q_k, T) (see expression (5)). Using such a value keeps the algorithm that generates the search decision tree from dividing too finely.
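The definition of the gain ratio is not reproduced in this text. A hedged reconstruction, assuming the usual gain-ratio definition from decision tree induction in which SI(Q_k, T) is the information of the split itself, would read as follows. The equation numbers are omitted because the original numbering cannot be confirmed.

```latex
\begin{align}
GR(Q_k, T) &= \frac{G(Q_k, T)}{SI(Q_k, T)} \\
SI(Q_k, T) &= -\sum_{i=1}^{N} \frac{|S_i|}{|T|} \log_2 \frac{|S_i|}{|T|}
\end{align}
```

Under this assumed form, SI(Q_k, T) grows as a limited question divides T into more and smaller subsets S_i, so dividing by it penalizes overly fine divisions.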
[0060]
Based on the above, in the following description, the gain ratio GR(Q_k, T) may be used in place of the information gain G(Q_k, T).
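A gain ratio calculation can be sketched as follows, assuming the usual definition in which the information gain is divided by the entropy of the split sizes. The function names, the handling of a zero denominator, and the sample data are assumptions for illustration only.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# (answers to each limited question, user desired information); an assumed layout
Case = Tuple[Dict[str, bool], str]


def entropy(labels: List) -> float:
    """Entropy (base 2) of a list of labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(question: str, cases: List[Case]) -> float:
    """GR(Q_k, T) = G(Q_k, T) / SI(Q_k, T) under the assumed definition."""
    answers = [a[question] for a, _ in cases]
    infos = [info for _, info in cases]
    after = 0.0
    for value in set(answers):
        part = [info for ans, info in zip(answers, infos) if ans == value]
        after += len(part) / len(cases) * entropy(part)
    gain = entropy(infos) - after
    split_info = entropy(answers)  # SI: entropy of the subset sizes
    # Assumed convention: a question that does not divide T gets a ratio of 0.
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical data: Q3 receives the same answer in every case.
cases: List[Case] = [
    ({"Q1": True, "Q2": True, "Q3": True}, "A1"),
    ({"Q1": True, "Q2": False, "Q3": True}, "A1"),
    ({"Q1": False, "Q2": True, "Q3": True}, "A2"),
    ({"Q1": False, "Q2": False, "Q3": True}, "A2"),
]
```

To use the gain ratio in the evaluation value, `gain_ratio` would simply take the place of the information gain term in expression (4).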
[0061]
FIG. 5 is a flowchart illustrating the flow of the search decision tree generation process according to the second embodiment of the present invention. The search decision tree generation method according to the present embodiment will be specifically described with reference to FIG. 5. In the following, as an example, the evaluation value E(Q_k) is given by the above expression (4) using the information gain G(Q_k, T) and the ease F(Q_k).
[0062]
First, the value of the parameter λ in expression (4) is determined (step S511), and the level value, a measure of how far one has advanced from the root of the search decision tree toward the leaves, is set to 1 (step S512). Here again, a specific example of level 1 is the level of the root R_11 of the search decision tree 100 shown in FIG. 6.
[0063]
Next, one limited question is extracted from the question set stored in the storage unit 13 (step S513).
[0064]
The processing from steps S521 to S534 forms a loop. First, the information gain G(Q_k, T) is calculated (step S521), and then the evaluation value E(Q_k) is calculated (step S522).
[0065]
The processes in steps S521 and S522 are repeated until all the limited questions are completed (step S523). That is, in this step, if the processes of steps S521 and S522 have not been completed for all the limited questions (No), the process returns to step S521, and if completed (Yes), the process proceeds to step S524.
[0066]
In step S524, the limited question corresponding to the largest of the evaluation values calculated in step S522 is extracted.
[0067]
The question set is divided into two subsets based on the limited questions extracted in this step (step S525).
[0068]
The process in the next step S526 is for determining whether or not the process in step S525 has been completed for all question sets. If all question sets have been completed (Yes), the process proceeds to the subsequent step S531. On the other hand, if it has not been completed (No), the process returns to step S521, and the processing from step S521 is repeated. In the case of the processing in the first loop, since there is one question set, the processing proceeds to step S531.
[0069]
In step S531, an element holding subset is extracted from among the subsets divided above. The definition of the element holding subset is the same as in the first embodiment.
[0070]
In step S532, it is determined whether or not an element holding subset has been extracted. If it is determined that one has been extracted (Yes), the process proceeds to step S533; if not (No), the process proceeds to step S541. The method of determining whether or not an element holding subset has been extracted is the same as in the first embodiment of the present invention.
[0071]
If it is determined in step S532 that an element holding subset has been extracted, the level value is increased by 1 (step S533). In the case of the first loop, the level becomes "2". After the level value is increased, for each of the element holding subsets extracted in step S531, the limited question that is easiest to answer within that subset is extracted (step S534), and the process returns to step S521. At this time, each element holding subset is regarded as a new question set, and the above processing of steps S521 to S534 is repeated.
[0072]
In FIG. 6, the limited questions Q_21 and Q_22 are extracted in step S534. When the loop processing of steps S521 to S534 has been repeated and all subsets have become empty sets, the process proceeds from step S532 to step S541.
[0073]
If all the subsets are empty sets, that is, if it is determined in step S532 that no element holding subset has been extracted, user desired information is assigned to each empty set (step S541).
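The overall flow of the second embodiment can be sketched as a recursive tree builder that places the limited question with the highest evaluation value at each node. This is an interpretation and not the procedure of FIG. 5 itself: the sketch divides the training cases by each yes/no answer in the manner of ordinary decision tree induction, and its stopping rule (no questions left, or all remaining cases sharing one piece of user desired information) is an assumption. All names and data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# (answers to each limited question, user desired information); an assumed layout
Case = Tuple[Dict[str, bool], str]


def entropy(cases: List[Case]) -> float:
    n = len(cases)
    counts = Counter(info for _, info in cases)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def gain(q: str, cases: List[Case]) -> float:
    parts = [[c for c in cases if c[0][q] == ans] for ans in (True, False)]
    return entropy(cases) - sum(len(p) / len(cases) * entropy(p) for p in parts if p)


def build(cases: List[Case], questions: List[str], ease: Dict[str, float], lam: float):
    """Return a leaf (user desired information) or a (question, yes, no) tuple."""
    infos = Counter(info for _, info in cases)
    # Assumed stopping rule: nothing left to ask, or the cases already agree.
    if not questions or len(infos) <= 1:
        return infos.most_common(1)[0][0] if infos else None
    # Highest evaluation value E = G + lambda * F (cf. steps S521 to S524).
    best = max(sorted(questions), key=lambda q: gain(q, cases) + lam * ease[q])
    rest = [q for q in questions if q != best]
    yes = [c for c in cases if c[0][best]]
    no = [c for c in cases if not c[0][best]]
    return (best, build(yes, rest, ease, lam), build(no, rest, ease, lam))


# Hypothetical data: Q1 is informative but hard, Q2 is uninformative but easy.
cases: List[Case] = [
    ({"Q1": True, "Q2": True}, "A1"),
    ({"Q1": True, "Q2": False}, "A1"),
    ({"Q1": False, "Q2": True}, "A2"),
    ({"Q1": False, "Q2": False}, "A2"),
]
ease = {"Q1": 0.1, "Q2": 0.9}

shallow = build(cases, ["Q1", "Q2"], ease, lam=0.5)
easy_first = build(cases, ["Q1", "Q2"], ease, lam=2.0)
```

With λ = 0.5 the informative question `Q1` is placed at the root and the tree ends after one answer. With λ = 2.0 the easy question `Q2` is placed at the root and `Q1` follows on both branches, so the tree is deeper but begins with the question that is easier to answer.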
[0074]
Needless to say, the above processing is executed by the control determination unit 32 and the evaluation value calculation unit 35. Of these, the control determination unit 32 also functions as: extraction means that extracts the limited question having the highest evaluation value calculated by the evaluation value calculation unit 35; division means that assigns the limited question extracted by the extraction means to a vacant node among the nodes closest to the root of the search decision tree and divides the question set composed of a plurality of limited questions into two subsets; subset extraction means that extracts the non-empty subsets from the subsets divided by the division means; and termination means that terminates the branches extending from the root by extracting the empty sets from the subsets divided by the division means and assigning the user desired information corresponding to each extracted empty set to a leaf.
[0075]
According to the second embodiment of the present invention described above, an evaluation value is calculated in accordance with the ease information and information gain of a question, and the questions are arranged in the descending order of the evaluation values in the direction from the root of the search decision tree to the leaves. By doing so, it is possible to generate a search decision tree in which questions are arranged in consideration of the ease (or difficulty) of the questions.
[0076]
In the present embodiment, the search decision tree generation method that performs the processing of steps S511 to S541 using the search decision tree generation device has been described. The same effects can also be obtained by using a computer on which a search decision tree generation program that executes the search decision tree generation process including these steps is installed.
[0077]
[Effect of the invention]
As is apparent from the above description, the present invention provides a search decision tree generation method, a search decision tree generation device, a search decision tree generation program, and a program recording medium capable of generating a search decision tree in which questions are arranged in consideration of the difficulty or ease of the questions.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a schematic configuration of a search decision tree generation device according to a first embodiment of the present invention.
FIG. 2 is a flowchart illustrating a flow of processing in generating a search decision tree according to the first embodiment of the present invention.
FIG. 3 is a block diagram illustrating a schematic configuration of a search decision tree generation device according to a second embodiment of the present invention.
FIG. 4 is an explanatory diagram conceptually showing an example of a training case.
FIG. 5 is a flowchart illustrating a processing flow in search decision tree generation according to a second embodiment of the present invention.
FIG. 6 is a diagram illustrating an example of a search decision tree.
[Explanation of symbols]
1, 3 Search decision tree generation device
11 Operation input unit
12, 32 Control determination unit
13 Storage unit
14 Output unit
35 Evaluation value calculation unit
100 Search decision tree

Claims

1. A search decision tree generation method for generating a search decision tree in which, among question sets each having as elements limited questions arranged at each of a plurality of nodes that extend sequentially from the root of the search decision tree and form branches, the corresponding question set is assigned to the root and to each of the plurality of nodes, and user desired information, which is information desired by the user, is assigned to each leaf at the end of a branch reached based on the answers sequentially input by the user to the limited questions, wherein
a computer including a storage unit that stores information necessary for generating the search decision tree executes:
(a) an extraction step of reading from the storage unit ease information indicating the ease of answering each limited question, and extracting the limited question that is easiest to answer based on the read ease information;
(b) a division step of assigning the limited question extracted in the extraction step to a vacant node among the nodes closest to the root of the search decision tree, and dividing the question set composed of a plurality of limited questions into two subsets;
(c) a subset extraction step of extracting the non-empty subsets from the two subsets divided in the division step; and
(d) a termination step of terminating the branches extending from the root by extracting the empty sets from the two subsets divided in the division step and assigning the user desired information corresponding to each extracted empty set to a leaf.

2. The search decision tree generation method according to claim 1, wherein, among the subsets extracted in the subset extraction step (c), each non-empty subset is newly set as a question set, the processing from the extraction step (a) to the termination step (d) is recursively repeated for each such question set, and the repetition ends when, in the course of this repetition, all the subsets divided in the division step (b) are empty sets.

3. A search decision tree generation method for generating a search decision tree in which, among question sets each having as elements limited questions arranged at each of a plurality of nodes that extend sequentially from the root of the search decision tree and form branches, the corresponding question set is assigned to the root and to each of the plurality of nodes, and user desired information, which is information desired by the user, is assigned to each leaf at the end of a branch reached based on the answers sequentially input by the user to the limited questions, wherein
a computer including a storage unit that stores information necessary for generating the search decision tree executes:
(a) an information gain calculation step of calculating, based on training case information composed of a plurality of training cases, each a combination of answers to the individual limited questions, and on the question set related to the training case information, an information gain indicating the difference between the information amounts of the entire training case information before and after the limited question;
(b) an evaluation value calculation step of reading from the storage unit the information gain calculated in the information gain calculation step and ease information indicating the ease of answering each limited question, and calculating an evaluation value for each limited question based on the read information gain and ease information;
(c) an extraction step of reading out from the storage unit and extracting the limited question having the highest evaluation value calculated in the evaluation value calculation step;
(d) a division step of assigning the limited question extracted in the extraction step to a vacant node among the nodes closest to the root of the search decision tree, and dividing the question set composed of a plurality of limited questions into two subsets;
(e) a subset extraction step of extracting the non-empty subsets from the subsets divided in the division step; and
(f) a termination step of terminating the branches extending from the root by extracting the empty sets from the subsets divided in the division step and assigning the user desired information corresponding to each extracted empty set to a leaf.

4. The search decision tree generation method according to claim 3, wherein, among the subsets extracted in the subset extraction step (e), each non-empty subset is newly set as a question set, the processing from step (a) to the termination step (f) is recursively repeated for each such question set, and the repetition ends when, in the course of this repetition, all the subsets divided in the division step (d) are empty sets.

5. The search decision tree generation method according to claim 3, wherein the evaluation value calculated in the evaluation value calculation step is a value obtained by adding the information gain and the ease information at a predetermined ratio.

6. A search decision tree generation device for generating a search decision tree in which, among question sets each having as elements limited questions arranged at each of a plurality of nodes that extend sequentially from the root of the search decision tree and form branches, the corresponding question set is assigned to the root and to each of the plurality of nodes, and user desired information, which is information desired by the user, is assigned to each leaf at the end of a branch reached based on the answers sequentially input by the user to the limited questions, the device comprising:
storage means for storing information necessary for generating the search decision tree;
extraction means for reading from the storage means ease information indicating the ease of answering each limited question, and extracting the limited question that is easiest to answer based on the read ease information;
division means for assigning the limited question extracted by the extraction means to a vacant node among the nodes closest to the root of the search decision tree, and dividing the question set composed of a plurality of limited questions into two subsets;
subset extraction means for extracting the non-empty subsets from the subsets divided by the division means; and
termination means for terminating the branches extending from the root by extracting the empty sets from the subsets divided by the division means and assigning the user desired information corresponding to each extracted empty set to a leaf.

7. A search decision tree generation device for generating a search decision tree in which, among question sets each having as elements limited questions arranged at each of a plurality of nodes that extend sequentially from the root of the search decision tree and form branches, the corresponding question set is assigned to the root and to each of the plurality of nodes, and user desired information, which is information desired by the user, is assigned to each leaf at the end of a branch reached based on the answers sequentially input by the user to the limited questions, the device comprising:
storage means for storing information necessary for generating the search decision tree;
information gain calculation means for calculating, based on training case information composed of a plurality of training cases, each a combination of answers to the individual limited questions, and on the question set related to the training case information, an information gain indicating the difference between the information amounts of the entire training case information before and after the limited question;
evaluation value calculation means for calculating an evaluation value for each limited question based on the information gain calculated by the information gain calculation means and on ease information indicating the ease of answering each limited question;
extraction means for extracting the limited question having the highest evaluation value calculated by the evaluation value calculation means;
division means for assigning the limited question extracted by the extraction means to a vacant node among the nodes closest to the root of the search decision tree, and dividing the question set composed of a plurality of limited questions into two subsets;
subset extraction means for extracting the non-empty subsets from the subsets divided by the division means; and
termination means for terminating the branches extending from the root by extracting the empty sets from the subsets divided by the division means and assigning the user desired information corresponding to each extracted empty set to a leaf.

8. The search decision tree generation device according to claim 7, wherein the evaluation value calculated by the evaluation value calculation means is a value obtained by adding the information gain and the ease information at a predetermined ratio.
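
The evaluation value described above can be illustrated with a minimal Python sketch. It rests on several assumptions that the claims do not state: the "information amount" is taken to be Shannon entropy over the user-desired information labels of the training cases, answers are yes/no, ease scores lie in [0, 1], and "adding at a predetermined ratio" is read as a weighted sum controlled by a hypothetical `ratio` parameter. The function names, the `cases` layout, and the sample values are all illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels; 0.0 for an empty list."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(cases, question):
    """Entropy of all cases minus the weighted entropy after splitting on `question`.

    `cases` is a list of (answers, label) pairs, where `answers` maps a
    question name to True/False (an assumed data layout).
    """
    before = entropy([label for _, label in cases])
    after = 0.0
    for answer in (True, False):
        subset = [label for answers, label in cases if answers[question] == answer]
        after += len(subset) / len(cases) * entropy(subset)
    return before - after


def evaluation_value(gain, ease, ratio=0.5):
    """Blend gain and ease; `ratio` is the assumed weight placed on the gain."""
    return ratio * gain + (1.0 - ratio) * ease


# Hypothetical training cases: four distinct items, two yes/no questions.
cases = [
    ({"q1": True, "q2": True}, "A"),
    ({"q1": True, "q2": False}, "B"),
    ({"q1": False, "q2": False}, "C"),
    ({"q1": False, "q2": False}, "D"),
]
ease = {"q1": 0.2, "q2": 0.9}  # hypothetical ease scores in [0, 1]

scores = {q: evaluation_value(information_gain(cases, q), ease[q]) for q in ease}
best = max(scores, key=scores.get)  # question with the highest evaluation value
```

In this sample, `q1` splits the cases evenly and has the larger information gain, but `q2` is much easier to answer, so with `ratio=0.5` the blended score selects `q2`. Setting `ratio=1.0` reduces the ranking to information gain alone, and `q1` is selected instead. Because entropy in bits can exceed 1, a practical implementation would probably need to normalize the two quantities before mixing them; the claims leave that choice open.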

A search decision tree generation program that generates a search decision tree by assigning, to the root of the search decision tree and to each of a plurality of nodes that extend successively from the root to form branches, a corresponding question set from among question sets whose elements are limited questions, and by assigning user-desired information, which is the information the user wants, to the leaf at the end of the branch reached according to the answers that the user sequentially inputs in response to the limited questions,
the program causing a computer equipped with a storage unit for storing information necessary for generating the search decision tree to execute:
(A) an extraction step of reading, from the storage unit, ease information indicating how easy each limited question is to answer, and extracting the limited question that is easiest to answer based on the read ease information;
(B) a dividing step of assigning the limited question extracted in the extraction step to a vacant node among the nodes closest to the root of the search decision tree, and dividing the question set, which is composed of a plurality of limited questions, into two subsets;
(C) a subset extraction step of extracting the non-empty subsets from the two subsets produced in the dividing step; and
(D) a terminating step of terminating a branch extending from the root by extracting the empty sets from the two subsets produced in the dividing step and assigning the user-desired information corresponding to each extracted empty set to a leaf.

10. The search decision tree generation program according to claim 9, wherein each non-empty subset extracted in the subset extraction step (C) is set as a new question set, and the processing from the extraction step (A) through the terminating step (D) is recursively repeated for each such question set,
and wherein the repetition ends when all of the subsets produced in the dividing step (B) are empty sets.
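
The recursive procedure of steps (A) to (D) can be sketched in Python as follows. This is an interpretation, not the claimed implementation: the claims divide a question set into two subsets, whereas this sketch, as a stand-in, partitions the candidate items by their yes/no answer to the chosen question and passes the remaining questions down each branch. The names `build_tree`, `items`, and `ease`, the dictionary layout of a node, and the sample data are all hypothetical.

```python
from typing import Dict, List, Union

# A leaf is a list of item names (possibly empty); an inner node is a dict.
Tree = Union[List[str], dict]


def build_tree(items: Dict[str, Dict[str, bool]],
               ease: Dict[str, float],
               questions: List[str]) -> Tree:
    """Place the easiest remaining question at the node nearest the root, then recurse."""
    # Terminate the branch when nothing remains to distinguish or to ask.
    if len(items) <= 1 or not questions:
        return sorted(items)

    # (A) Extract the question that is easiest to answer.
    question = max(questions, key=lambda name: ease[name])
    remaining = [name for name in questions if name != question]

    # (B) Assign it to this node and split into two subsets by answer.
    yes = {name: ans for name, ans in items.items() if ans[question]}
    no = {name: ans for name, ans in items.items() if not ans[question]}

    # (C)/(D) Recurse into each subset; an empty or single-item subset becomes a leaf.
    return {
        "question": question,
        "yes": build_tree(yes, ease, remaining),
        "no": build_tree(no, ease, remaining),
    }


# Hypothetical user-desired information with yes/no answers per question.
items = {
    "sushi": {"is_japanese": True, "is_cheap": False},
    "ramen": {"is_japanese": True, "is_cheap": True},
    "pizza": {"is_japanese": False, "is_cheap": True},
}
ease = {"is_cheap": 0.9, "is_japanese": 0.6}  # higher means easier to answer

tree = build_tree(items, ease, list(ease))
```

With these sample values the easier question `is_cheap` lands at the root, and `is_japanese` is asked only on the branch where it is still needed to separate `ramen` from `pizza`.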

A search decision tree generation program that generates a search decision tree by assigning, to the root of the search decision tree and to each of a plurality of nodes that extend successively from the root to form branches, a corresponding question set from among question sets whose elements are limited questions, and by assigning user-desired information, which is the information the user wants, to the leaf at the end of the branch reached according to the answers that the user sequentially inputs in response to the limited questions,
the program causing a computer equipped with a storage unit for storing information necessary for generating the search decision tree to execute:
(A) an information gain calculation step of calculating, based on training case information composed of a plurality of training cases that each combine answers to the limited questions and on the question set relating to the training case information, an information gain indicating the difference between the amount of information of the training cases as a whole before and after a limited question;
(B) an evaluation value calculation step of reading, from the storage unit, the information gain calculated in the information gain calculation step and ease information indicating how easy each limited question is to answer, and calculating an evaluation value for each limited question based on the read information gain and ease information;
(C) an extraction step of reading from the storage unit, and extracting, the limited question having the highest evaluation value calculated in the evaluation value calculation step;
(D) a dividing step of assigning the limited question extracted in the extraction step to a vacant node among the nodes closest to the root of the search decision tree, and dividing the question set, which is composed of a plurality of limited questions, into two subsets;
(E) a subset extraction step of extracting the non-empty subsets from the subsets produced in the dividing step; and
(F) a terminating step of terminating a branch extending from the root by extracting the empty sets from the subsets produced in the dividing step and assigning the user-desired information corresponding to each extracted empty set to a leaf.

12. The search decision tree generation program according to claim 11, wherein each non-empty subset extracted in the subset extraction step (E) is set as a new question set, and the processing from the information gain calculation step (A) through the terminating step (F) is recursively repeated for each such question set,
and wherein the repetition ends when all of the subsets produced in the dividing step (D) are empty sets.

13. The search decision tree generation program according to claim 11, wherein the evaluation value calculated in the evaluation value calculation step is a value obtained by adding the information gain and the ease information at a predetermined ratio.

A program recording medium on which the search decision tree generation program according to any one of claims 9 to 13 is recorded.