JPH04199262A

JPH04199262A - Picture displaying method for multimedia document processing system

Info

Publication number: JPH04199262A
Application number: JP2317787A
Authority: JP
Inventors: Katsuyuki Takahashi; 克幸高橋
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1990-11-26
Filing date: 1990-11-26
Publication date: 1992-07-20

Abstract

PURPOSE: To let a user easily understand the content of a multimedia document by automatically showing, on a display, the image information related to the document's text information while that text is read aloud in a synthesized voice. CONSTITUTION: Linking and display of text information and related media in a multimedia document are handled by a document editing processing unit 103, a display information analysis unit 104, a display information control unit 109, and a link information table 102. When the text information of the multimedia document is read aloud in a synthesized voice, figures, tables, and images are displayed on the screen in succession in step with the content of the document. The user can therefore listen to the reading while watching the automatically displayed figures, tables, and images, and can easily understand the content of the multimedia document.

Description

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a multimedia document processing system that handles multimedia documents, and in particular to an image display method for such a system that automatically shows image information related to the text on a display when the text information of a multimedia document is read aloud by speech synthesis.

[Conventional technology]

Known text-to-speech synthesis systems include MITalk, developed at MIT in the United States, and the Japanese text-to-speech synthesis system developed by NTT Basic Research Laboratories. The underlying technology is introduced in "Speech Information Engineering" (co-authored by Kochi, Kei, et al.), published by NTEC (NTT Technology Transfer Co., Ltd.).

Commercially available systems include "Onjibu", an electronic voice output device for the PC-9800 from MICRONICS, which reads aloud text created with a Japanese word processor by means of text-to-speech synthesis.

Hypertext technology is known as a way of linking various kinds of multimedia data to text information. A well-known product is "Guide" from M.P. Technology. It links a word in the text to other passages, figures, video, and so on; clicking the displayed word with the mouse replaces the displayed content with the linked document, or shows the linked figure or video alongside it (see, for example, Nikkei Byte, April 1990 issue, pp. 220-230).

[Problem to be solved by the invention]

When a document created with a word processor is a multimedia document, that is, one that contains multimedia data such as figures, tables, and images, its text is generally written with reference to those figures, tables, and images. Such a document is hard to understand from the text alone. When the text in a multimedia document is read aloud by a text-to-speech synthesis system, the listener needs to consult the relevant figures, tables, and images as the reading proceeds.

Hypertext is a system that links related information to parts of a document so that the user can consult it as needed. In conventional hypertext, however, the user follows the document by eye and clicks the relevant place in the text with the mouse to bring up the related information. Reading aloud, by contrast, unfolds over time, so the user needs to see the information related to what is being read at that moment.

An object of the present invention is to provide an image display method that, when text information in a multimedia document is read aloud by a text-to-speech synthesis system, automatically shows on the display the related information, such as figures, tables, and images, that the user should consult for the content being read.

[Means to solve the problem]

To achieve this object, the present invention links a figure, table, or image (hereinafter "related media") to a range of text information in a multimedia document, and stores link information for displaying the related media immediately before and after that range in the text. As the character string of the text is processed from the beginning for speech synthesis, the related media is displayed when its link information is detected.

[Effect]

When a figure, table, or image is linked to a range of text information, link information is embedded before and after the designated range. This link information is detected as the text is processed in order from the beginning by the speech synthesis system, and is sent to the display control unit for figures, tables, and images. The system determines whether the text in the linked range is currently being processed and read aloud by the speech synthesis system, and the linked figure, table, or image is shown on the display for as long as that range is being read. The display of related figures, tables, and images is thus timed to the content being read, and the user can listen to the reading while looking at the figures, tables, and images displayed to match it.

[Example]

An embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 1 shows the overall configuration of a system that provides multimedia document editing, linking and display of text content and related media, and text-to-speech synthesis.

Multimedia documents are edited in the document editing processing unit 103. The user enters editing commands to the document editing processing unit 103 through the mouse 110, keyboard 111, and display 112 to create and edit a multimedia document. The created or edited document is stored in the multimedia document database 101. To edit or rework it again, the document is retrieved from the multimedia document database 101 and processed in the document editing processing unit 103.

Linking and display of text content and related media are handled by the document editing processing unit 103, the display information analysis unit 104, the display information control unit 109, and the link information table 102. The user designates text information and related media to the document editing processing unit 103 and issues a link instruction. On receiving the instruction, the document editing processing unit 103 records control information for displaying the related media during speech synthesis in the text information of the document it is currently editing. It also records link information, indicating which related media was linked, in the link information table 102. During speech synthesis, the display information analysis unit 104 passes text information to the text analysis unit 105 while checking for the control information that the document editing processing unit 103 recorded in the text. When this control information (the display information) is detected, it stops passing text to the text analysis unit 105, temporarily suspends speech synthesis, and sends the control information to the display information control unit 109.

Based on the control information it receives, the display information control unit 109 consults the link information table 102 to identify the related media linked to that part of the text, extracts that media from the document held in the document editing processing unit 103, and shows it on the display 112.

Here the detailed link information is stored in the link information table, but the information identifying the media may instead be stored in the text of the document itself.

Alternatively, the link information table may also hold address information indicating the text range, and the address of the text currently being spoken may be checked against the address information in the table.
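The address-based variant can be sketched as follows. This is only an illustration under stated assumptions: the names `RangeLink` and `media_for_address` are hypothetical, and an "address" is taken to be a character offset into the text, which the source does not specify.

```python
from dataclasses import dataclass


@dataclass
class RangeLink:
    """One hypothetical row of a link information table that stores addresses."""
    start: int        # offset of the first character of the linked range
    end: int          # offset one past the last character of the linked range
    area_number: int  # area number of the linked figure, table, or image


def media_for_address(table, address):
    """Return the area numbers whose text range contains the given address."""
    return [row.area_number for row in table if row.start <= address < row.end]
```

With this layout no markers need to be embedded in the text; the system compares the address of the character being spoken with the table and shows whatever `media_for_address` returns.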

Text-to-speech synthesis, which converts text information into audio output, is performed by the text analysis unit 105, the voice control information generation unit 106, the speech synthesis control parameter file 107, and the speech synthesis unit 108, as follows. The text analysis unit 105 performs syntactic and semantic analysis of the text and generates information such as readings, word and clause boundaries, grammar, and basic accents. The voice control information generation unit 106 receives this information, generates speech synthesis control parameters for phonological and prosodic information, and sends them to the speech synthesis unit 108. The speech synthesis unit 108 converts the received parameters into an audio signal and outputs it from the speaker 113. Parameters that must wait for the speech synthesis unit 108 because of the speaking rate are first accumulated in the speech synthesis control parameter file 107 and are passed to the speech synthesis unit 108 once it is ready for the next parameters. So that speech synthesis can be coordinated with the display of related media, the display information analysis unit 104 can monitor how many parameters are waiting in the speech synthesis control parameter file 107.
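The buffering role of the parameter file can be pictured as a first-in, first-out queue that another component is able to inspect. The sketch below is a minimal model under that assumption; `ParameterFile` and `synthesize_all` are hypothetical names, and the real parameter format is not described in the source.

```python
from collections import deque


class ParameterFile:
    """Stand-in for the speech synthesis control parameter file 107."""

    def __init__(self):
        self._pending = deque()

    def put(self, params):
        """Queue parameters that the synthesizer cannot accept yet."""
        self._pending.append(params)

    def take(self):
        """Hand the oldest queued parameters to the synthesizer, or None."""
        return self._pending.popleft() if self._pending else None

    def is_empty(self):
        """Let a monitor (here, unit 104) see whether anything is waiting."""
        return not self._pending


def synthesize_all(param_file, synthesize):
    """Feed queued parameters to the synthesizer one at a time, in order."""
    while not param_file.is_empty():
        synthesize(param_file.take())
```

The `is_empty` check is the part that matters for display timing: related media is shown only after everything queued before the marker has reached the synthesizer.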

Next, the linking and display of text content and related media are described in detail, focusing on the operations of the document editing processing unit 103, the display information analysis unit 104, the display information control unit 109, and the link information table 102: how text content is linked to related media, and how the related media is displayed automatically in step with text-to-speech synthesis.

FIG. 2 is an example of a screen showing a multimedia document on the display 112. Reference numeral 201 denotes a command menu, and a window 202 shows a multimedia document made up of a figure 203, text 204, and an image 205. Each area has a unique area number and an attribute giving its media type: figure, table, text, or image. FIG. 4 shows an example of the attribute information held by the area of the figure 203.

In this setting, the user links a range of the text 204 to the figure 203 as follows.

First, the user designates where in the text 204 the range related to the figure 203 begins and ends.

In FIG. 2, the shaded portion 206 is the designated range.

Next, when the user selects "Display attribute" from the command menu 201, a message appears asking the user to select the related media to link, and the user picks the figure 203 with the mouse accordingly. A display target attribute setting window 301, shown in FIG. 3, then appears. Because the figure 203 has been designated as the display target, "Figure" is selected automatically in the field 303 that indicates the display target type.

The user enters a name for the figure 203 in the field 302. When fields 302 and 303 have been set, the user selects "Complete" from the command menu 201. This completes the link between the figure 203 and the shaded portion 206 of the text 204.

During this link-setting operation the system behaves as follows.

The window 202 displaying the document in FIG. 2 is controlled by the document editing processing unit 103 of FIG. 1. When the target range is designated in the text 204 and "Display attribute" is selected from the command menu 201, the document editing processing unit 103 displays a message asking the user to designate the link target. When the user picks the figure 203, the document editing processing unit 103 shows the window 301 of FIG. 3 on the display. At the same time it consults the attribute information of the picked area, shown in FIG. 4, to determine the area's media type. It also generates control information 501 and 505 and link information 502 and 506, as shown in FIG. 5, immediately before and after the designated range of text.

FIG. 5 shows the sequence of character information making up the designated range of text: the character "図" at 503 is the first character of the range, and the character "。" at 504 is the last. The document editing processing unit 103 recognizes the hexadecimal character code FFF1 as marking the start of the designated range and FFF2 as marking its end, and recognizes the two bytes that follow FFF1 and FFF2 (link information 502 and 506) as a code that uniquely identifies the link to the related media. The system assigns this code uniquely when the link information is generated. When it shows the document on the display 112, the document editing processing unit 103 neither displays the control information and link information nor treats them as characters to be edited.

The media type 402 determined from the area attributes is shown in the field 303, which indicates the display target type, when the document editing processing unit 103 generates the display target attribute setting window 301. When the user enters a display target name in the field 302 and selects "Complete" from the command menu 201, the document editing processing unit 103 closes the display target attribute setting window 301 and stores the generated link information, the designated area number, the media type, and the display target name in the link information table 102. FIG. 6 shows an example of the stored data.
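The embedding step can be illustrated with a short sketch. It rests on several assumptions that are not in the source: the text is modelled as a Python string, the marker codes FFF1 and FFF2 are represented by the code points of the same value, the two-byte link code is represented by a single character, and the names `LinkEntry`, `embed_link`, and `visible_text` are hypothetical.

```python
from dataclasses import dataclass

START = chr(0xFFF1)  # stand-in for the start-of-range code FFF1
END = chr(0xFFF2)    # stand-in for the end-of-range code FFF2


@dataclass
class LinkEntry:
    """One hypothetical row of the link information table 102."""
    code: int         # code that uniquely identifies the link, e.g. 0x0001
    area_number: int  # area number of the linked figure, table, or image
    media_type: str   # "figure", "table", or "image"
    name: str         # display target name entered by the user


def embed_link(text, start, end, entry, table):
    """Wrap text[start:end] in start/end markers and register the link."""
    code_char = chr(entry.code)  # one character standing in for the 2-byte code
    table[entry.code] = entry
    return (text[:start] + START + code_char + text[start:end]
            + END + code_char + text[end:])


def visible_text(text):
    """Return the text as an editor would show it, without markers or codes."""
    out, skip = [], False
    for ch in text:
        if skip:                    # this character is a link code
            skip = False
        elif ch in (START, END):    # marker: hide it and the code after it
            skip = True
        else:
            out.append(ch)
    return "".join(out)
```

`visible_text` mirrors the statement that markers are neither displayed nor editable: the stored text carries them, while the view the user sees is unchanged.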

Next, the mechanism for displaying related media during text-to-speech synthesis is described along the processing flow of FIG. 7.

First, the display information analysis unit 104 scans the text information that the document editing processing unit 103 has designated for reading, checking each character in order from the beginning for punctuation. When it finds a punctuation mark, or reaches the last character designated for reading, it groups the characters up to that point and treats them as one clause (701). It then checks whether the clause contains character information representing display information, that is, FFF1 or FFF2 in the example of FIG. 5 (702). If the clause contains no display information, its data is sent to the text analysis unit 105 and audio output proceeds (711). If it does, the display information analysis unit 104 holds the text back from the text analysis unit 105 and handles the on-screen display of the related media first. On detecting display information in a clause, the display information analysis unit 104 waits until every parameter in the speech synthesis control parameter file 107 has been sent to the speech synthesis unit 108. Once it observes that all parameters have been sent, it passes the detected display information to the display information control unit 109 (703).

The display information control unit 109 determines whether the received display information marks the start of display, FFF1 in the example of FIG. 5 (704). If so, it consults the link information table 102, searches the display information code column 601 for the value equal to the display information code it received, 0001 in the example of FIG. 5, and accesses the entry with that code (705). It then accesses, within the document held in the document editing processing unit 103, the area whose area attribute matches the area number 603 found there (706). The display information control unit 109 then creates a window for the related media and shows the information of the accessed area in it (707). The display information is deleted from the clause's character information (710), and the clause is sent to the text analysis unit 105 for speech synthesis (711).

If, on the other hand, the display information received by the display information control unit 109 marks the end of display, FFF2 in the example of FIG. 5 (708), the unit ends the display of the related media and closes the display window (709). Speech synthesis then proceeds in the same way as after a start-of-display marker (710, 711). This processing continues until the end of the range of text designated for reading is reached (712).
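The flow of FIG. 7 can be approximated by the sketch below. It is an illustration, not the patented implementation: the text is modelled as a Python string with FFF1 and FFF2 represented by code points of the same value, the link code is one character, and `split_clauses`, `read_aloud`, and the callback parameters `speak`, `show`, `hide`, and `wait_for_queue` are hypothetical names. The step numbers in the comments refer to FIG. 7.

```python
START = chr(0xFFF1)  # stand-in for the start-of-display code FFF1
END = chr(0xFFF2)    # stand-in for the end-of-display code FFF2


def split_clauses(text, punctuation="、。,."):
    """Step 701: cut the text after each punctuation mark."""
    clauses, current, skip = [], "", False
    for ch in text:
        current += ch
        if skip:                    # link code after a marker, never punctuation
            skip = False
        elif ch in (START, END):
            skip = True
        elif ch in punctuation:
            clauses.append(current)
            current = ""
    if current:                     # text after the last punctuation mark
        clauses.append(current)
    return clauses


def read_aloud(text, table, speak, show, hide, wait_for_queue):
    """Speak the text clause by clause, showing linked media at the markers."""
    for clause in split_clauses(text):
        spoken, i = [], 0
        while i < len(clause):
            ch = clause[i]
            if ch in (START, END):          # 702: display information found
                code = ord(clause[i + 1])
                wait_for_queue()            # let already queued speech finish
                if ch == START:             # 704
                    show(table[code])       # 705 to 707: look up and display
                else:                       # 708
                    hide(code)              # 709: close the display window
                i += 2                      # 710: drop the marker and its code
            else:
                spoken.append(ch)
                i += 1
        if spoken:
            speak("".join(spoken))          # 711: pass the clause to synthesis
```

Because a marker is handled before its clause is spoken, the media appears just before the first clause of the linked range is read and disappears just before the first clause after it, which matches the clause-level timing described above.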

According to this embodiment, text information and related media can be linked in the same environment used for document editing, which gives the user an easy-to-understand environment for setting links. In addition, because text-to-speech synthesis is performed clause by clause, any temporary pause for displaying related media falls at a reasonably natural break in the reading, so the pause does not strike the user as unnatural.

[Effects of the Invention]

According to the present invention, when the text information of a multimedia document is read aloud by speech synthesis, figures, tables, and images are displayed on the screen one after another to match the content. The user can listen to the reading while looking at the automatically displayed figures, tables, and images related to the text, and can therefore understand the multimedia document easily. Furthermore, because the user can freely set the range of text over which each related medium is displayed, the related media can be displayed in step with the reading speed.

[Brief Description of the Drawings]

FIG. 1 is an overall configuration diagram of a multimedia document processing system according to an embodiment of the present invention. FIG. 2 shows an example screen display of a multimedia document. FIG. 3 shows the display target attribute setting window. FIG. 4 shows the attribute information held by each area in a document. FIG. 5 shows text information in which display information marking the start and end of related-media display has been embedded. FIG. 6 is a configuration diagram of the link information table. FIG. 7 is a flowchart of the processing for displaying related media during text-to-speech synthesis.

101: multimedia document database; 102: link information table; 103: document editing processing unit; 104: display information analysis unit; 105: text analysis unit; 106: voice control information generation unit; 108: speech synthesis unit.

Claims

[Claims]

1. An image display method for a multimedia document processing system that stores multimedia documents including text information and image information and has an editing function, a display function, and an audio output function for the text information, characterized in that a certain range of the text information is associated with related image information via link information, and, when that range of text is output as audio, the image information linked to the text is displayed on the basis of the link information.

2. The image display method for a multimedia document processing system according to claim 1, wherein the link information is recorded before and after the certain range of the text.