JP2000010981A - Link identifying method and electronic document system

Info

Publication number: JP2000010981A
Application number: JP11164589A
Authority: JP
Inventors: William N Schilit; Morgan N Price; Gene Golovchinsky; Mark D Weiser
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1998-06-19
Filing date: 1999-06-11
Publication date: 2000-01-14

Abstract

PROBLEM TO BE SOLVED: To provide a method and a system that passively present relevant documents to a reader without interfering with the reading process, by inferring the user's interest from the annotations the user makes on a source document while reading it. SOLUTION: The user's interest is inferred from the annotations the user makes on the source document while reading it. The identifying method distinguishes two kinds of target portions related to passages of the source document 16: a target portion 22 related to a specific annotated passage of the source document, and a target portion 22 related to the source document as a whole. Once the relationship between a passage of the source document 16 and a target portion 22 is established, the target portion 22 is displayed by clicking a selectable link on the display of the source document 16.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

FIELD OF THE INVENTION The present invention relates generally to electronic document reading systems. In particular, the present invention relates to an electronic document reading system that suggests other related documents while displaying a first document.

[0002]

DESCRIPTION OF THE RELATED ART Retrieving documents similar to documents that a user has deemed relevant is known as relevance feedback. Relevance feedback is described in G. Salton et al., "Introduction to Modern Information Retrieval" (McGraw Hill, 1983), which is incorporated herein by reference in its entirety. Interfaces that support relevance feedback conventionally require explicit action by the reader and do not suggest related documents on their own initiative. Information-seeking interfaces designed for window-based computing environments typically present the search results for other related documents through a list in a separate window, or by replacing the document being viewed with the search results. These systems are highly intrusive and disrupt the reading process.

[0003] Hypertext interfaces display links to documents related to a source document either by providing a margin that contains the links, or by embedding the links in the text of the source document in the manner pioneered by "Hyperties". This system is described in Shneiderman, "User Interface Design for the Hyperties Electronic Encyclopedia" (Proceedings of Hypertext '87, November 1987, Chapel Hill, NC), which is incorporated herein by reference in its entirety. However, these links are static and are created by the hypertext author together with the source document. Some systems, such as Trellis, display links dynamically, but only from a fixed set of previously defined links. Trellis is described in R. Furuta et al., "Programmable Browsing Semantics and Trellis" (Proceedings of Hypertext '89, November 1989, Pittsburgh, PA, ACM Press), which is incorporated herein by reference in its entirety.

[0004] The HieNet system uses an inter-node similarity measure to create hypertext links based on links already created by the hypertext author. This system is described in D. T. Chang, "HieNet: A User-Centered Approach for Automatic Link Generation" (Proceedings of Hypertext '93, November 1993, Seattle, WA, ACM Press), which is incorporated herein by reference in its entirety. When the author creates a link from document A to document B, the system automatically adds links from all documents similar to document A to all documents similar to document B. The anchors of these automatically generated links are shown as icons in the margins of the various documents. Clicking an icon displays a pop-up menu containing a list of candidate destination documents ranked with respect to the query. This system, too, depends on links already created by the author.
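The link propagation described for HieNet can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Jaccard term-overlap measure, the 0.5 threshold, and the names `jaccard` and `propagate_link` are hypothetical and are not taken from the HieNet paper, which defines its own inter-node similarity measure.

```python
from itertools import product


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def propagate_link(docs: dict[str, set[str]], src: str, dst: str,
                   threshold: float = 0.5) -> set[tuple[str, str]]:
    """Return links from every document similar to `src` to every document similar to `dst`."""
    like_src = [d for d in docs if jaccard(docs[d], docs[src]) >= threshold]
    like_dst = [d for d in docs if jaccard(docs[d], docs[dst]) >= threshold]
    # A document is always similar to itself, so the authored link is included too.
    return {(a, b) for a, b in product(like_src, like_dst) if a != b}


# Hypothetical corpus: each document is reduced to a set of terms.
docs = {
    "A": {"ink", "pen", "tablet"},
    "A2": {"ink", "pen", "stylus"},
    "B": {"query", "ranking", "retrieval"},
    "B2": {"query", "ranking", "index"},
}
links = propagate_link(docs, "A", "B")
```

Here a single authored link from A to B yields four links, because A2 resembles A and B2 resembles B. The sketch also shows the limitation noted above: without at least one authored link there is nothing to propagate.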

[0005] Other conventional systems relate to hypertext-like methods of displaying search results. HieNet displays automatic links in the margin, but a margin anchor is not related to the content of the passage adjacent to it. HieNet does not distinguish between document-to-document links and passage-to-document links. Furthermore, HieNet does not indicate the number or the nature of the documents that can be retrieved through a margin link.

[0006] Visualization of Information Retrieval System (hereinafter VOIR) is described in G. Golovchinsky, "Queries? Links? Is There a Difference?" (Proceedings of CHI '97, March 1997, Atlanta, GA, ACM Press) and "What the Query Told the Link: The Integration of Hypertext and Information Retrieval" (Proceedings of Hypertext '97, April 1997, Southampton, UK, ACM Press), each of which is incorporated herein by reference in its entirety. VOIR is a mechanism for dynamically creating and resolving hypertext links using queries computed from the text surrounding a selected anchor. VOIR uses the query to retrieve a set of documents related to the passage containing the selected anchor. VOIR does not show the user links that have pre-established relationships. Instead, for a query to be submitted and a relationship to be established, the user must select an anchor and wait. VOIR was designed specifically to support interactive information seeking rather than to facilitate the reading process. Its focus is therefore on supporting navigation between documents, and users are expected to spend more cognitive effort on browsing. Furthermore, VOIR does not allow users to annotate or tag documents. Nor does VOIR indicate which link was selected to produce a particular display.

[0007] A background information retrieval process called the Remembrance Agent (hereinafter RA) is described in B. J. Rhodes et al., "A Continuously Running Automated Information Retrieval System" (Proceedings of The First International Conference on the Practical Application of Intelligent Agents in Multi-Agent Technology, PAAM '96, April, 1997, London, UK), which is incorporated herein by reference in its entirety. RA runs in an EMACS window and suggests documents related to the last few lines of text typed by the user. RA is designed to search the user's private data in order to suggest documents related to the text being typed. However, these suggestions are transient and relate only to the text currently being written. Because RA continually replaces its suggestions as the user edits the document, it does not support the task of reading.

[0008] QRL is a query-based information-seeking interface that uses ink-like marks on text to specify Boolean queries. This system is described in G. Golovchinsky et al., "Queries-R-Links: Graphical Markup for Text Navigation" (Proceedings of INTERCHI '93, April 1993, Amsterdam, The Netherlands, ACM Press), which is incorporated herein by reference in its entirety. Query terms are selected with rectangles. Lines connecting the rectangles represent the Boolean AND operator.
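As a rough illustration of the kind of Boolean AND query described for QRL, the sketch below intersects posting sets for the selected terms. The inverted index layout and the function name `boolean_and` are assumptions for illustration only; the QRL paper is not quoted here and may evaluate queries differently.

```python
def boolean_and(index: dict[str, set[str]], terms: list[str]) -> set[str]:
    """Return the ids of documents that contain every selected term."""
    if not terms:
        return set()
    # A term missing from the index contributes an empty posting set,
    # which makes the whole conjunction empty.
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical inverted index: term -> ids of documents containing it.
index = {"ink": {"d1", "d2"}, "pen": {"d2", "d3"}}
hits = boolean_and(index, ["ink", "pen"])
```

Each rectangle drawn on the text would contribute one entry to `terms`, and each connecting line corresponds to one conjunction.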

[0009]

PROBLEMS TO BE SOLVED BY THE INVENTION All of these systems either require extensive user interaction to generate links to related documents or support only writing. What is needed is an electronic document reading system that supports reading by generating links to related documents passively and unobtrusively.

[0010]

SUMMARY OF THE INVENTION The present invention provides a method and system for passively presenting relevant documents to a reader without interfering with the reading process.

[0011] The present invention supports reading intuitively by automatically detecting portions of other documents, or other portions of the same document, that may interest the reader, based on the reader's interaction with the portion of the source document being read. When people read text, they often annotate it to emphasize passages and terms that are interesting or controversial. The presence of such marks and scribbles, or their relative density, can be used as an indicator of the reader's relative interest in a particular passage. When many documents related to the document being read are available, the reader may be interested, as part of the reading process, in finding relevant portions of other documents or other portions of the same document.
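The idea of using the relative density of marks as an interest indicator can be sketched as follows. This is only an illustration under stated assumptions: measuring density as annotated characters per passage character, normalising to the densest passage, and the names `Passage` and `interest_scores` are all hypothetical and are not specified by the source.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    annotated_chars: int  # characters covered by highlights, underlines, or scribbles


def interest_scores(passages: list[Passage]) -> list[float]:
    """Score each passage by annotation density, scaled so the densest passage scores 1.0."""
    densities = [p.annotated_chars / len(p.text) if p.text else 0.0 for p in passages]
    peak = max(densities, default=0.0)
    # With no annotations anywhere, every passage scores 0.0.
    return [d / peak if peak else 0.0 for d in densities]


# Hypothetical passages: half annotated, untouched, and lightly annotated.
passages = [Passage("a" * 100, 50), Passage("b" * 100, 0), Passage("c" * 50, 10)]
scores = interest_scores(passages)
```

Scaling to the densest passage makes the score relative, so a reader who annotates sparingly and one who annotates heavily both get a usable ranking of passages.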

[0012] References to portions of other documents, or to other portions of the same document, that relate to a particular passage of interest to the user are placed in the margin of the source document. References to other documents that are similar to the source document as a whole are inserted as endnotes. The systems and methods of the present invention retain links once they have been identified, in order to facilitate non-linear reading and skimming.

[0013] The systems and methods of the present invention infer the user's interest from the annotations the user makes on the source document while reading it. The systems and methods of the present invention therefore minimize cognitive overhead in at least two ways: 1) no explicit query is needed to identify portions of other documents, or other portions of the same document, that relate to the portion of the source document the user has annotated; and 2) selectable links to the other related document portions, or to other portions of the same document, are provided unobtrusively in the margin and at the end of the source document. Examples are shown in FIGS. 2 and 3.

[0014] In addition, rather than troubling the user with formal dialogs, the systems and methods of the present invention present suggestions to the reader in a manner that accommodates the user's other interactions. The portions of other documents, or other portions of the same document, suggested by the methods and systems of the present invention are accessible by following selectable links. However, the systems and methods of the present invention do not force the user to follow a suggestion when it is made. Rather, the user can follow a suggestion when, or if, the user sees fit. The systems and methods of the present invention display an icon indicating the type of the suggested, referenced portion, and accompany the icon with some text to give the user a better understanding of the target of the link.

[0015] Specific aspects of the present invention are set out below. A first aspect of the present invention is a link identifying method for identifying, for a source document, at least one link to a target portion, each target portion being related to the source document, the method comprising identifying at least one user-annotated passage of the source document, and identifying at least one target portion related to the at least one annotated passage of the source document. In a second aspect, the method of the first aspect further comprises displaying a selectable link to each identified target portion on a display of the source document. In a third aspect, in the second aspect, the selectable link is displayed as an endnote of the source document. In a fourth aspect, in the second aspect, the selectable link is displayed near the identified at least one annotated passage. In a fifth aspect, in the fourth aspect, the identifying step is responsive to annotation of at least one passage of the source document. In a sixth aspect, in the fifth aspect, the step of identifying at least one target portion is responsive to the annotation of the at least one annotated passage of the source document. In a seventh aspect, in the fifth aspect, the selectable link is displayed in a margin of the display of the source document adjacent to the at least one annotated passage. In an eighth aspect, in the second aspect, an icon representing the type of the identified at least one target portion is displayed. In a ninth aspect, in the second aspect, the title of the document in which the identified at least one target portion occurs is displayed. In a tenth aspect, in the second aspect, a summary of the document in which the identified at least one target portion occurs is displayed. In an eleventh aspect, in the first aspect, the step of identifying at least one user-annotated passage comprises segmenting the source document into passages and identifying at least one of the passages as having an annotation. In a twelfth aspect, in the first aspect, the step of identifying at least one target portion comprises determining a degree of relevance based on terms identified by the user and terms identified using a relevance feedback technique. In a thirteenth aspect, in the twelfth aspect, the identifying step comprises using a weighted-sum query. In a fourteenth aspect, the method of the first aspect further comprises determining whether a selectable link has been selected, and displaying the identified at least one target portion in response to selection of the selectable link. In a fifteenth aspect, in the first aspect, the identifying step further comprises identifying a plurality of target portions as related to the source document, clustering the identified plurality of target portions, and selecting, for each cluster, at least one of the identified target portions that is representative of all of the identified target documents in that cluster, the selectable link referencing the selected at least one of the identified plurality of target portions. In a sixteenth aspect, in the first aspect, the degree of relevance is determined based on the similarity between the at least one target portion and the source document. In a seventeenth aspect, in the thirteenth aspect, the identifying step further comprises identifying at least one target portion that exceeds a predetermined similarity threshold.
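The twelfth, thirteenth, and seventeenth aspects (user-identified terms plus relevance-feedback terms, a weighted-sum query, and a similarity threshold) can be illustrated with a minimal sketch. The source does not give weights, a scoring formula, or a feedback procedure, so everything concrete here is an assumption: the weights 2.0 and 1.0, the normalised weighted sum, the 0.5 threshold, and the stand-in for relevance feedback (simply taking the most frequent remaining terms of the annotated passage). The function names are hypothetical.

```python
from collections import Counter


def build_query(user_terms: list[str], passage_terms: list[str],
                user_weight: float = 2.0, feedback_weight: float = 1.0,
                n_feedback: int = 3) -> dict[str, float]:
    """Weight user-marked terms highly and add frequent passage terms at a lower weight."""
    query = {term: user_weight for term in user_terms}
    # Stand-in for relevance feedback (an assumption): frequent terms of the annotated passage.
    others = Counter(t for t in passage_terms if t not in query)
    for term, _count in others.most_common(n_feedback):
        query[term] = feedback_weight
    return query


def weighted_sum_score(query: dict[str, float], target_terms: list[str]) -> float:
    """Sum the weights of query terms found in the target, normalised to [0, 1]."""
    total = sum(query.values())
    present = set(target_terms)
    matched = sum(weight for term, weight in query.items() if term in present)
    return matched / total if total else 0.0


def related_targets(query: dict[str, float], targets: dict[str, list[str]],
                    threshold: float = 0.5) -> list[str]:
    """Return target ids whose score exceeds the threshold, best first."""
    scored = [(weighted_sum_score(query, terms), name) for name, terms in targets.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > threshold]


# Hypothetical annotated passage: the reader circled "annotation".
query = build_query(["annotation"],
                    ["annotation", "ink", "ink", "ink", "pen", "pen", "reader"],
                    n_feedback=2)
targets = {"t1": ["annotation", "ink"], "t2": ["pen"], "t3": ["annotation", "ink", "pen"]}
ranked = related_targets(query, targets)
```

Giving user-marked terms a higher weight than feedback terms reflects the intuition that an explicit mark is stronger evidence of interest than a term that merely co-occurs with it. The actual weighting scheme is a design choice the source leaves open.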

[0016] An eighteenth aspect of the present invention is an electronic document system that suggests, on a display of a source document, at least one target portion related to the source document, the system comprising a processor that identifies at least one user-annotated passage of the source document and identifies at least one target portion as related to the annotated passage of the source document. In a nineteenth aspect, the system of the eighteenth aspect further comprises a display that displays, on the display of the source document, a selectable link referencing the identified at least one target portion. In a twentieth aspect, in the nineteenth aspect, the processor identifies the at least one target portion based on a degree of relevance between the at least one target portion and the source document, and the selectable link references the identified at least one target portion. In a twenty-first aspect, in the twentieth aspect, the processor identifies at least one annotated passage of the source document and identifies at least one target portion as related to the identified at least one annotated passage, and the selectable link is displayed near the identified at least one annotated passage. In a twenty-second aspect, in the nineteenth aspect, the selectable link is displayed as an endnote of the source document. In a twenty-third aspect, in the nineteenth aspect, the selectable link is displayed in a margin adjacent to the at least one annotated passage. In a twenty-fourth aspect, the system of the nineteenth aspect further comprises a user interface, wherein the display displays the identified at least one target portion in response to the user's selection of the selectable link. In a twenty-fifth aspect, in the nineteenth aspect, the processor identifies a plurality of target portions based on their degree of relevance to the source document, clusters the identified plurality of target portions, and selects at least one of the identified target portions of each cluster that is representative of all of the identified target portions in that cluster, and the selectable link references the selected at least one of the plurality of second portions. In a twenty-sixth aspect, in the nineteenth aspect, the display displays an icon representing the type of the identified at least one target portion. In a twenty-seventh aspect, in the nineteenth aspect, the display displays the title of the document in which the identified at least one target portion occurs. In a twenty-eighth aspect, in the nineteenth aspect, the display displays a summary of the document in which the identified at least one target portion occurs. In a twenty-ninth aspect, the system of the eighteenth aspect further comprises a user input interface, wherein the processor identifies the at least one target portion in response to the user's annotation of a passage of the source document. In a thirtieth aspect, in the eighteenth aspect, the processor identifies the at least one target portion based on terms identified by the user and terms identified based on a relevance feedback technique. In a thirty-first aspect, in the thirtieth aspect, the processor identifies the at least one target portion based on a weighted-sum query. In a thirty-second aspect, in the eighteenth aspect, the processor determines the degree of relevance between the source document and the at least one target portion based on the similarity between the context of the target portion and the source document. In a thirty-third aspect, in the thirty-second aspect, the processor identifies at least one of a plurality of target portions as exceeding a predetermined similarity threshold.
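The clustering aspects (the fifteenth and twenty-fifth) group the identified target portions and link only to a representative of each cluster. The source names no clustering algorithm, so the sketch below is an assumption throughout: it uses a greedy single-pass grouping against each cluster's first member, Jaccard term overlap, a 0.4 threshold, and picks as representative the member most similar to the rest of its cluster. All names are hypothetical.

```python
def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_targets(targets: dict[str, frozenset[str]],
                    threshold: float = 0.4) -> list[list[str]]:
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters: list[list[str]] = []
    for name, terms in targets.items():
        for cluster in clusters:
            if jaccard(terms, targets[cluster[0]]) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([name])
    return clusters


def representative(cluster: list[str], targets: dict[str, frozenset[str]]) -> str:
    """Pick the member with the highest total similarity to the other members."""
    return max(cluster, key=lambda n: sum(
        jaccard(targets[n], targets[m]) for m in cluster if m != n))


# Hypothetical target portions, each reduced to a set of terms.
targets = {
    "p1": frozenset({"ink", "pen", "note"}),
    "p2": frozenset({"ink", "pen", "margin"}),
    "p3": frozenset({"ink", "pen", "note", "margin"}),
    "q1": frozenset({"cluster", "rank"}),
}
clusters = cluster_targets(targets)
link_targets = [representative(c, targets) for c in clusters]
```

Showing one link per cluster instead of one per target portion keeps the margin uncluttered when several retrieved portions say much the same thing.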

[0017] These and other features and advantages of the present invention are described in, or are apparent from, the following detailed description of the preferred embodiments.

[0018]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Preferred embodiments of the present invention are described in detail below with reference to the drawings.

[0019] FIG. 1 shows a block diagram of one embodiment of a document reading system 10 of the present invention. The document reading system 10 includes a processor 12 that communicates with a first memory 14, which stores a source document 16 currently being read by the user on a display 18. The processor 12 also communicates with a second memory 20, which stores potentially related target portions 22. The target portions may optionally include portions of the source document 16. A target portion 22 may comprise an entire document.

[0020] The user interacts with and controls the document reading system 10 through any number of conventional input/output devices 24, such as a mouse 26, a keyboard 28, or a pen-based interface 30. The input/output devices 24 communicate with an input/output interface (I/O) 31, which in turn communicates with the processor 12.

[0021] As shown in FIG. 1, the system 10 is preferably implemented on a programmed general-purpose computer. However, the system 10 can also be implemented using a special-purpose computer, a programmed microprocessor or microcontroller and any necessary peripheral integrated circuit elements, an ASIC or other integrated circuit, a hardwired electronic or logic circuit such as a discrete element circuit, or a programmable logic device such as a PLD, PLA, FPGA, or PAL. In general, any device capable of implementing a finite state machine that can carry out the flowcharts shown in FIGS. 4 and 5 can be used to implement the system 10.

[0022] Further, as shown in FIG. 1, the storage devices or memories 14 and 20 are preferably implemented using static or dynamic RAM. However, the devices 14 and 20 can also be implemented using a floppy disk and disk drive, a writable optical disk and disk drive, a hard drive, flash memory, or the like. It should also be understood that the devices 14 and 20 can be either separate portions of a single memory or physically separate memories.

[0023] It should further be understood that the links 15 and 17 connecting the devices 14 and 20 to the processor 12 can be wired or wireless links to a network (not shown). The network can be a local area network, a wide area network, an intranet, the Internet, or any other distributed processing and storage network. In this case, the electronic document 16 is retrieved from the physically remote memory device 14 over the link 15 and processed by the processor 12, and the target portions 22 are accessed remotely over the link 17 by the method outlined below. In this case, the electronic document 16 and the target portions 22 can also be stored locally in a portion of another memory device (not shown) of the system 10.

[0024] The method of the present invention identifies two types of target portions related to passages of each source document 16: 1) target portions 22 related to a specific annotated passage of the source document; and 2) target portions 22 related to the source document as a whole. Once the relationship between a passage of the source document and a target portion 22 is established, the target portion 22 is displayed by clicking a selectable link on the display of the source document 16. The systems and methods of the present invention may present, in a single display, all of the target portions 22 identified as related to one or more passages of the source document 16.

【００２５】二つのタイプのターゲット部分２２の参照
の例が図２及び３に示される。ソースドキュメント１６
の特定のパッセージ３２に関連するターゲット部分２２
は、関連パッセージ３２付近のソースドキュメント１６
のマージンに配置されたマージン表示３４によって識別
される。図３に示されるように、ソースドキュメント１
６に全体的に関連するターゲット部分２２は注釈が付け
られ、ソースドキュメントのエンドノート３６として示
される。エンドノート３６は、ターゲット部分が生じた
ドキュメントのタイプ、タイトル及び概要情報を含み得
る。Examples of references to the two types of target portions 22 are shown in FIGS. 2 and 3. A target portion 22 related to a particular passage 32 of the source document 16 is identified by a margin display 34 placed in the margin of the source document 16 near the related passage 32. As shown in FIG. 3, target portions 22 related to the source document 16 as a whole are annotated and shown as end notes 36 of the source document. An end note 36 may include the type, title, and summary information of the document in which the target portion occurs.

【００２６】本発明は、注釈のタイプ又は注釈の文脈を
決定する方法又はシステムによって限定されない。注釈
は構造化されていても、されていなくてもよい。The present invention is not limited by a method or system for determining the type of annotation or the context of the annotation. Annotations may be structured or unstructured.

【００２７】図４及び５は、本発明の方法の一つの実施
の形態の制御ルーチンの概要を示すフローチャートを示
す。制御ルーチンはステップＳ１００からスタートしス
テップＳ１１０に続く。ステップＳ１１０では、制御ル
ーチンは入力としてソースドキュメント１６を受け取
り、ステップＳ１２０に続く。ステップＳ１２０では、
制御ルーチンはソースドキュメントを一連のパッセージ
にセグメント化し、制御ルーチンはステップＳ１３０に
続く。ステップＳ１２０のセグメンテーションは、任意
の従来のセグメンテーションシステム又は方法によって
実行され得る。ステップＳ１３０では、制御ルーチンは
ソースドキュメントのセグメントを付加ターゲット部分
２２として格納し、ステップＳ１４０に続く。このよう
にして、ソースドキュメントのセグメント化されたパッ
セージは、参照されるべき関連している可能性のある部
分として利用可能となる。FIGS. 4 and 5 are flowcharts outlining the control routine of one embodiment of the method of the present invention. The control routine starts at step S100 and continues to step S110. In step S110, the control routine receives the source document 16 as input and continues to step S120. In step S120, the control routine segments the source document into a series of passages and continues to step S130. The segmentation of step S120 may be performed by any conventional segmentation system or method. In step S130, the control routine stores the segments of the source document as additional target portions 22 and continues to step S140. In this way, the segmented passages of the source document themselves become available as potentially related portions to be referenced.
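Steps S110 to S130 can be sketched as follows. This is only a minimal illustration: it assumes that blank-line paragraph boundaries stand in for "any conventional segmentation system or method", and the names `TargetStore`, `segment_into_passages`, and `ingest_source_document` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class TargetStore:
    """Hypothetical stand-in for the database of target portions 22."""
    portions: list = field(default_factory=list)

    def add(self, doc_id, index, text):
        self.portions.append({"doc": doc_id, "index": index, "text": text})


def segment_into_passages(text):
    """S120: split a document into passages (assumption: blank-line boundaries)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def ingest_source_document(doc_id, text, store):
    """S110-S130: receive the document, segment it, store segments as targets."""
    passages = segment_into_passages(text)
    for i, passage in enumerate(passages):
        store.add(doc_id, i, passage)
    return passages
```

Storing the source document's own passages in the same store as every other target portion is what lets a later query point back into the document being read.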

【００２８】ステップＳ１４０では、制御ルーチンは注
釈がシステムに入力されたか否かを決定する。注釈は任
意の数の異なるシステムを使用して任意の数の異なる方
法で入力され得る。注釈を入力する一つの好ましい方法
は、ペンベースの入力デバイスのスタイラスを使用して
ソースドキュメントのディスプレイに直接マーキングす
る方法である。ステップＳ１４０では、注釈がシステム
に入力されていないと制御ルーチンが決定すると、制御
ルーチンはステップＳ１４０に戻る。ステップＳ１４０
で注釈が入力されたと制御ルーチンが決定すると、制御
ルーチンはステップＳ１５０に続く。In step S140, the control routine determines whether an annotation has been entered into the system. Annotations may be entered in any number of different ways using any number of different systems. One preferred way of entering annotations is to mark directly on the display of the source document using the stylus of a pen-based input device. If the control routine determines in step S140 that no annotation has been entered into the system, the control routine returns to step S140. If the control routine determines in step S140 that an annotation has been entered, the control routine continues to step S150.

【００２９】ステップＳ１５０では、制御ルーチンは注
釈の文脈を決定する。文脈は、あらゆる従来のシステム
及び方法によってステップＳ１５０で決定され得る。注
釈の文脈は、バウンディング（境界）ボックスを使用し
て決定されるのが好ましい。このシステムは、注釈マー
クを囲むバウンディングボックスを生成し、ボックスの
境界を拡張して文脈を定める。水平方向の境界はテキス
トのエッジまで水平方向に延び、垂直方向の境界は垂直
方向に延びて完全な単語、センテンス又はパラグラフを
含む。バウンディングボックスによって囲われたテキス
トの量は、ユーザに合うように調節され得る所定の優先
を使用して最初に確立される。さらに、ユーザはバウン
ディングボックスを直接操作して注釈の文脈を決定して
もよい。制御ルーチンがステップＳ１５０で注釈の文脈
を決定した後、制御ルーチンはステップＳ１６０に続
く。In step S150, the control routine determines the context of the annotation. The context may be determined at step S150 by any conventional system and method. The context of the annotation is preferably determined using a bounding box. The system creates a bounding box surrounding the annotation mark and extends the bounds of the box to define the context. Horizontal boundaries extend horizontally to the edges of the text, and vertical boundaries extend vertically to include a complete word, sentence, or paragraph. The amount of text surrounded by the bounding box is initially established using a predetermined preference that can be adjusted to suit the user. Further, the user may directly manipulate the bounding box to determine the context of the annotation. After the control routine determines the context of the annotation in step S150, the control routine continues to step S160.
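The bounding-box expansion of step S150 might look like the sketch below. It assumes the page is already available as a list of text lines with vertical extents and paragraph numbers; the `TextLine` type, the `annotation_context` function, and the `unit` preference are illustrative names, and only line-level and paragraph-level expansion are modeled (not word or sentence level).

```python
from dataclasses import dataclass


@dataclass
class TextLine:
    top: float       # y coordinate of the top of the line
    bottom: float    # y coordinate of the bottom of the line
    paragraph: int   # index of the paragraph the line belongs to
    text: str


def annotation_context(lines, mark_top, mark_bottom, unit="line"):
    """Return the text selected by expanding a mark's bounding box.

    Horizontally the box is extended to the text edges, so whole lines are
    taken. Vertically it is extended to whole lines or whole paragraphs,
    depending on the adjustable `unit` preference.
    """
    # Lines whose vertical extent overlaps the mark's bounding box.
    hit = [ln for ln in lines if ln.bottom > mark_top and ln.top < mark_bottom]
    if unit == "paragraph":
        paragraphs = {ln.paragraph for ln in hit}
        hit = [ln for ln in lines if ln.paragraph in paragraphs]
    return " ".join(ln.text for ln in hit)
```

For example, a short underline inside the second line of a paragraph yields just that line with `unit="line"` and the whole paragraph with `unit="paragraph"`.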

【００３０】ステップＳ１６０では、制御ルーチンは注
釈の文脈を分析し、ターゲット部分２２のデータベース
を検索するために使用される照会を生成する。本発明の
システム及び方法は、注釈付きパッセージのテキスト及
び注釈の性質から照会を導出する。そして制御ルーチン
はステップＳ１７０に続き、ここで関連するターゲット
部分２２を識別するために照会が使用される。関連する
ターゲット部分２２は、所定の閾値を越えるベストマッ
チングターゲット部分を決定することによって、ステッ
プＳ１６０で生成された照会を使用して識別される。任
意の数の従来の方法又はシステムがベストマッチングタ
ーゲット部分を決定するために使用されてよく、本発明
は特定の検索方法又はシステムに限定されないことが理
解されるであろう。さらに、所定の閾値はユーザの優先
によってユーザによって調節されてもよい。In step S160, the control routine analyzes the annotation context and generates a query that is used to search the target portion 22 database. The system and method of the present invention derive a query from the text of the annotated passage and the nature of the annotation. The control routine then continues to step S170, where a query is used to identify the relevant target portion 22. The relevant target portion 22 is identified using the query generated in step S160 by determining the best matching target portion that exceeds a predetermined threshold. It will be appreciated that any number of conventional methods or systems may be used to determine the best matching target portion, and the present invention is not limited to a particular search method or system. Further, the predetermined threshold may be adjusted by the user according to the user's preference.
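Steps S160 and S170 can be illustrated with a simple bag-of-words matcher. The specification does not prescribe a retrieval method, so cosine similarity over term counts is used here purely as an assumed example; the function names and the default values of `threshold` and `max_links` are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lower-cased bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_related_targets(context_text, targets, threshold=0.3, max_links=5):
    """S160-S170: build a query from the annotation context and return the
    best-matching target portions whose score exceeds the threshold."""
    query = term_vector(context_text)
    scored = [(cosine(query, term_vector(t)), t) for t in targets]
    matches = [(s, t) for s, t in scored if s > threshold]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return matches[:max_links]
```

Both `threshold` and `max_links` are exposed as parameters because the text describes the threshold as adjustable to the user's preference.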

【００３１】次に、制御ルーチンはステップＳ１８０に
続き、ここで識別されたターゲット部分の各々のリンク
が表示される。リンクが特定の注釈の文脈に関連するタ
ーゲット部分に対応する場合、そのリンクは図２に示さ
れるようにマージン表示３４として注釈に隣接するマー
ジンに表示される。或いは、制御ルーチンが全体的にソ
ースドキュメントに関連するターゲット部分をステップ
Ｓ１７０で識別する場合、制御ルーチンは図３に示され
るようにエンドノート３６としてターゲット部分のリン
クを表示する。次に制御ルーチンはステップＳ１９０に
続く。Next, the control routine continues to step S180, where a link to each of the identified target portions is displayed. If a link corresponds to a target portion related to the context of a particular annotation, the link is displayed as a margin display 34 in the margin adjacent to the annotation, as shown in FIG. 2. Alternatively, if the control routine identifies in step S170 a target portion related to the source document as a whole, the control routine displays the link to that target portion as an end note 36, as shown in FIG. 3. The control routine then continues to step S190.

【００３２】ステップＳ１９０では、制御ルーチンは、
表示されたリンクをユーザが選択したか否かを決定す
る。ステップＳ１９０で制御ルーチンがリンクは選択さ
れたと決定すると、制御ルーチンはステップＳ２００に
続く。ステップＳ２００では、制御ルーチンは選択され
たリンクに対応するターゲット部分を表示する。次に制
御ルーチンはステップＳ２１０に続き、ここで制御ルー
チンは停止する。In step S190, the control routine determines whether the user has selected a displayed link. If the control routine determines in step S190 that a link has been selected, the control routine continues to step S200. In step S200, the control routine displays the target portion corresponding to the selected link. The control routine then continues to step S210, where it stops.

【００３３】別の実施の形態では、示されないが、制御
ルーチンは表示されたターゲット部分を入力ソースドキ
ュメントとして処理し、ステップＳ１１０に戻り、ここ
でターゲット部分が処理されてもよい。別の実施の形態
では、ステップＳ１９０でリンクが選択された場合、選
択されたリンクに関連する識別されたターゲット部分が
生じたドキュメント全体が表示され、ターゲット部分が
生じたドキュメント全体がステップＳ１１０でソースド
キュメントとして入力される。しかし、ステップＳ１９
０でリンクが選択されない場合、制御ルーチンはステッ
プＳ２２０に続く。In another embodiment, not shown, the control routine treats the displayed target portion as an input source document and returns to step S110, where the target portion is processed. In yet another embodiment, if a link is selected in step S190, the entire document in which the identified target portion associated with the selected link occurs is displayed, and that entire document is input as the source document in step S110. If, however, no link is selected in step S190, the control routine continues to step S220.

【００３４】ステップＳ２２０では、制御ルーチンはユ
ーザがエンドルーチンコマンドを入力したか否かを決定
する。ステップＳ２２０でユーザがエンドルーチンコマ
ンドを入力していなければ、制御ルーチンはステップＳ
１４０に戻る。或いは、ステップＳ２２０で、ユーザが
エンドルーチンコマンドを入力したと制御ルーチンが決
定すると、制御ルーチンはステップＳ２１０に続き、こ
こで制御ルーチンが停止する。In step S220, the control routine determines whether the user has entered an end routine command. If the user has not entered an end routine command in step S220, the control routine returns to step S140. Alternatively, if the control routine determines in step S220 that the user has entered an end routine command, the control routine continues to step S210, where it stops.
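The loop formed by steps S140 through S220 can be sketched over a scripted list of user events. This is a simplification under stated assumptions: the event tuples, the `run_control_routine` name, and the `find_targets` callback are all hypothetical, and the sketch checks for a link selection or end command once per event rather than polling as a flowchart implementation might.

```python
def run_control_routine(events, find_targets):
    """Walk through steps S140-S220 for a scripted list of user events.

    `events` is a list of tuples: ("annotate", text), ("select", index) or
    ("end",). `find_targets` maps annotation text to a list of target
    portions. Returns (displayed_links, shown_target).
    """
    links = []
    for event in events:
        kind = event[0]
        if kind == "annotate":                    # S140: annotation entered
            links.extend(find_targets(event[1]))  # S150-S180: display links
        elif kind == "select" and links:          # S190: a link was selected
            return links, links[event[1]]         # S200, then S210: show, stop
        elif kind == "end":                       # S220: end routine command
            break                                 # S210: stop
    return links, None
```

Any event that is neither a selection nor an end command sends the routine back to waiting for the next annotation, which mirrors the return from step S220 to step S140.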

【００３５】本発明の方法及びシステムの制御ルーチン
は、ドキュメント読み取りシステム１０がオンであるか
又はユーザ入力コマンド（例えば、注釈）の受信の際に
作動するように設定されてもよい。The control routine of the method and system of the present invention may be set to run whenever the document reading system 10 is on, or upon receipt of a user input command (e.g., an annotation).

【００３６】さらに、識別及び／又は表示されるターゲ
ット部分の数は所定の閾値を使用して調節され得る。こ
の閾値は、システムが有用であることの妨げとなり得る
多すぎるリンクの表示を防止することができる。Further, the number of target portions identified and/or displayed may be adjusted using a predetermined threshold. This threshold can prevent the display of too many links, which could make the system less useful.

【００３７】ターゲット部分は、一つ以上の従来の方法
を使用して識別され得る。例えば、ターゲット部分の関
連度はテキストベースの統計的類似方法を使用して決定
されてもよいし、フルテキストブール又は確率的方法を
使用してもよい。他の例として、言語又は論理的分析方
法、話者識別又は認識及び画像類似アルゴリズムが挙げ
られる。本発明はターゲット部分とソースドキュメント
との関連度を決定する任意の公知の又は将来の方法又は
システムに限定されない。The target portions can be identified using one or more conventional methods. For example, the relevance of a target portion may be determined using text-based statistical similarity methods, or using full-text Boolean or probabilistic methods. Other examples include linguistic or logical analysis methods, speaker identification or recognition, and image similarity algorithms. The invention is not limited to any particular known or later-developed method or system for determining the relevance of a target portion to the source document.

【００３８】本発明の方法及びシステムは検索の単位と
してドキュメント全体ではなくドキュメントのパッセー
ジ又は一部分を使用する。ターゲット部分２２は、ドキ
ュメントを統計的に類似するテキストの範囲を含むタイ
ルにセグメント化することによってインデックス（索引
付け）される。多くのトピックについて述べている長い
ドキュメントは照会にあまりマッチしない傾向にあり、
一方これらのドキュメントの特定の部分が非常にマッチ
するため、ドキュメントよりもタイルが検索の単位とし
て使用される。The method and system of the present invention use a passage or portion of a document, rather than the entire document, as the unit of retrieval. The target portions 22 are indexed by segmenting documents into tiles, each containing a range of statistically similar text. Tiles rather than documents are used as the unit of retrieval because a long document that covers many topics tends to match a query poorly as a whole, even though specific parts of that document may match it very well.
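The motivation for tile-level retrieval can be seen in a small numeric illustration. The example below assumes cosine similarity over term counts as the matching measure (the specification does not fix one) and uses made-up tile texts; `cosine_text` is a hypothetical helper.

```python
import math
import re
from collections import Counter


def cosine_text(a, b):
    """Cosine similarity between the term-count vectors of two strings."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Three single-topic tiles that together make up one long document.
tiles = [
    "pen input and digital ink annotation",
    "network storage and memory devices",
    "display hardware and processors",
]
document = " ".join(tiles)
query = "digital ink annotation"

score_doc = cosine_text(query, document)                 # whole-document match
score_tile = max(cosine_text(query, t) for t in tiles)   # best tile match
```

Here the off-topic tiles dilute the whole-document score, so the best tile scores higher than the document that contains it.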

【００３９】本発明の方法及びシステムは幾つかの明確
なインクパターンを認識し、これから照会が演算され
る。インクパターン又はマークとして、下線が引かれた
単語、丸で囲まれた単語、丸で囲まれたパッセージ及び
マージン注釈が挙げられる。注釈の各タイプによって検
索エンジンの照会が異なることが好ましい。例えば、特
定の単語を選択するマークは、同じセンテンスにおいて
これらの単語を他の単語から強調する照会に変換し、長
いパッセージを選択するマークは類似のフレーズを検索
する照会を生成する。或いは、各ストロークに対して別
個の照会が演算されてもよい。The method and system of the present invention recognize several distinct ink patterns, from which queries are computed. The ink patterns, or marks, include underlined words, circled words, circled passages, and margin annotations. Preferably, each type of annotation produces a different search engine query. For example, marks that select particular words are converted into queries that emphasize those words over the other words in the same sentence, while marks that select long passages generate queries that search for similar phrases. Alternatively, a separate query may be computed for each stroke.
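One way to make the query depend on the type of mark is sketched below. The mark-type labels, the weight of 3.0 for selected words, and the use of two-word phrases are all assumptions chosen for illustration; the specification only states that word-selecting marks emphasize those words and passage-selecting marks search for similar phrases.

```python
import re


def tokenize(text):
    """Lower-cased word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_for_mark(mark_type, marked_text, sentence=""):
    """Build a query whose form depends on the kind of ink mark."""
    if mark_type in ("underline", "circled_words"):
        # Selected words are emphasized over the rest of their sentence.
        terms = {t: 1.0 for t in tokenize(sentence)}
        for t in tokenize(marked_text):
            terms[t] = 3.0  # assumed emphasis weight
        return {"kind": "weighted_terms", "terms": terms}
    # Longer selections (circled passages, margin marks): look for phrases.
    tokens = tokenize(marked_text)
    phrases = [" ".join(tokens[i:i + 2]) for i in range(len(tokens) - 1)]
    return {"kind": "phrases", "phrases": phrases}
```

A search engine that accepts term weights could consume the first form directly, while the second form suits phrase or proximity matching.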

【００４０】インクストロークは、時間及び注釈のタイ
プによってグループ化されてもよい。例えば、注釈のタ
イプは色によって識別されてもよいし、同一の又は類似
の色を有する注釈がグループ化されて照会を生成しても
よい。The ink strokes may be grouped by time and type of annotation. For example, annotation types may be identified by color, or annotations having the same or similar colors may be grouped to generate a query.
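Grouping strokes by time and color might be sketched as follows. The `Stroke` fields, the `group_strokes` function, and the two-second `max_gap` are hypothetical; the sketch only merges a stroke into the most recent group, so strokes of the same color separated by a stroke of another color start a new group.

```python
from dataclasses import dataclass


@dataclass
class Stroke:
    time: float   # seconds since the reading session started
    color: str    # ink color, used here as the annotation type
    text: str     # text covered by the stroke


def group_strokes(strokes, max_gap=2.0):
    """Group strokes that share a color and are close together in time."""
    groups = []
    for stroke in sorted(strokes, key=lambda s: s.time):
        last = groups[-1] if groups else None
        if (last and last[-1].color == stroke.color
                and stroke.time - last[-1].time <= max_gap):
            last.append(stroke)
        else:
            groups.append([stroke])
    return groups
```

Each resulting group could then be turned into a single query by concatenating the text its strokes cover.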

【００４１】本明細書中で使用される注釈という用語は
テキスト、デジタルインク、オーディオ、ビデオ又はド
キュメントに関連する任意の他の入力を含むことを意図
していることが理解されるであろう。また、ドキュメン
トという用語は、テキスト、ビデオ、オーディオ及び任
意の他の媒体又は媒体の任意の組み合わせを含むことを
意図していることも理解されるであろう。さらに、テキ
ストという用語は、テキスト、ストローク又はビットマ
ップフォーマットのデジタルインク、オーディオ、画
像、ビデオ又はドキュメントの任意の他の構造又は内容
を含むことを意図することが理解されるであろう。ま
た、「注釈」という用語は、テキスト、デジタルイン
ク、オーディオ、画像、ビデオ又はドキュメントに関連
する任意の他の入力を含むことを意図することが理解さ
れるであろう。It will be understood that the term "annotation" as used herein is intended to include text, digital ink, audio, video, or any other input associated with a document. It will also be understood that the term "document" is intended to include text, video, audio, and any other medium or any combination of media. Furthermore, it will be understood that the term "text" is intended to include text, digital ink in stroke or bitmap format, audio, images, video, or any other structure or content of a document. It will also be understood that the term "annotation" is intended to include text, digital ink, audio, images, video, or any other input associated with a document.

【００４２】「類似度」又は「関連度」という用語は、
ドキュメント又はドキュメントの関連部分又は別のドキ
ュメントの一部の関連部分又は同じドキュメントの別の
部分を含むドキュメント部分の任意の測度を含むことを
意図することが理解されるであろう。It will be understood that the terms "similarity" and "relevance" are intended to include any measure of the relatedness of a document or a portion of a document to another document, to a portion of another document, or to another portion of the same document.

【００４３】本明細書中は上述の特定の実施の形態によ
って説明されてきたが、多くの代替物、変更及びバリエ
ーションが当業者には明らかであろう。したがって、上
述の好ましい実施の形態は例示的であり制限するもので
はない。本明細書中の精神及び範囲から逸脱しない限
り、種々の変更がなされ得る。Although the invention has been described with reference to the specific embodiments set out above, many alternatives, modifications, and variations will be apparent to those skilled in the art. Accordingly, the preferred embodiments described above are illustrative and not restrictive. Various changes may be made without departing from the spirit and scope of the invention.

[Brief description of the drawings]

【図１】本発明の電子ドキュメント読み取りシステムの
一つの実施の形態のブロック図である。FIG. 1 is a block diagram of one embodiment of an electronic document reading system according to the present invention.

【図２】注釈付きパッセージに隣接するマージンにアイ
コンを有するソースドキュメントを示した図である。FIG. 2 illustrates a source document having an icon in a margin adjacent to an annotated passage.

【図３】エンドノートを有する別のソースドキュメント
を示した図である。FIG. 3 illustrates another source document having an end note.

【図４】本発明の一つの実施の形態の制御ルーチンの概
要を示したフローチャートである。FIG. 4 is a flowchart showing an outline of a control routine according to one embodiment of the present invention.

【図５】本発明の一つの実施の形態の制御ルーチンの概
要を示したフローチャートである。FIG. 5 is a flowchart showing an outline of a control routine according to one embodiment of the present invention.

[Explanation of symbols]

１０ドキュメント読み取りシステム１２プロセッサ１４第１のメモリ１６ソースドキュメント１８ディスプレイ２０第２のメモリ２２ターゲット部分２４入力／出力デバイス３２関連パッセージ３４マージン表示
10 Document reading system
12 Processor
14 First memory
16 Source document
18 Display
20 Second memory
22 Target portion
24 Input/output device
32 Related passage
34 Margin display

フロントページの続き (72)発明者ジーンゴロブチンスキーアメリカ合衆国 94306 カリフォルニア州パロアルトエルカミノリアル 4250 ナンバーシー327 (72)発明者マークディー．ワイザーアメリカ合衆国 94301 カリフォルニア州パロアルトグリーンウッドアベニュー 1144
Continued from the front page
(72) Inventor: Gene Golovchinsky, 4250 El Camino Real, No. C327, Palo Alto, California 94306, United States
(72) Inventor: Mark D. Weiser, 1144 Greenwood Avenue, Palo Alto, California 94301, United States

Claims


1. A method for identifying at least one link to a target portion for a source document, each target portion being related to the source document, the method comprising: identifying at least one user-annotated passage of the source document; and identifying at least one target portion related to the at least one annotated passage of the source document.

2. The method of claim 1, further comprising displaying a selectable link for each of the identified target portions on a display of the source document.

3. The method of claim 2, wherein an icon representing the type of at least one identified target portion is displayed.

4. The method of claim 2, wherein the title of the document in which the identified at least one target portion occurred is displayed.

5. The method of claim 2, wherein a summary of the document in which the at least one identified target portion occurred is displayed.

6. The method of claim 1, wherein the step of identifying at least one user-annotated passage includes segmenting the source document into passages and identifying at least one of the passages as having an annotation.

7. The method of claim 1, wherein identifying at least one target portion comprises determining a relevance based on terms identified by the user and terms identified using relevance feedback techniques.

8. The method of claim 1, further comprising: determining whether a selectable link has been selected; and displaying the at least one identified target portion in response to the selection of the selectable link.

9. The method of claim 1, wherein the identifying step comprises: identifying a plurality of target portions as being related to the source document; clustering the identified plurality of target portions; and selecting, from each cluster, at least one of the identified target portions that is representative of all of the identified target portions within that cluster, wherein the selectable link references the selected at least one of the identified plurality of target portions.

10. The method of claim 1, wherein the relevance is determined based on a similarity between the at least one target portion and the source document.

11. An electronic document system for suggesting, on a display of a source document, at least one target portion related to the source document, the electronic document system comprising a processor that identifies at least one user-annotated passage of the source document and identifies at least one target portion as being related to the annotated passage of the source document.

12. The system of claim 11, further comprising a display that displays, on the display of the source document, a selectable link referencing the at least one identified target portion.

13. The system of claim 12, wherein the processor identifies at least one target portion based on a relevance of the at least one target portion to the source document, and wherein the selectable link references the identified at least one target portion.

14. The system of claim 13, wherein the processor identifies at least one annotated passage of the source document and identifies at least one target portion as being related to the identified at least one annotated passage, and wherein the selectable link is displayed near the at least one identified annotated passage.

15. The system of claim 12, further comprising a user interface, wherein the display displays the at least one identified target portion in response to a selection of the selectable link by the user.

16. The system of claim 12, wherein the processor identifies a plurality of target portions based on relevance to the source document, clusters the identified plurality of target portions, and selects at least one of the identified target portions of each cluster that is representative of all of the identified target portions within that cluster, and wherein the selectable link references the selected at least one of the plurality of target portions.

17. The system of claim 11, further comprising a user input interface, wherein the processor identifies at least one target portion in response to an annotation of a passage of the source document by the user.

18. The system of claim 11, wherein the processor identifies at least one target portion based on terms identified by the user and terms identified using relevance feedback techniques.

19. The system of claim 11, wherein the processor determines a relevancy between the source document and the at least one target portion based on a similarity between the context of the target portion and the source document.