JP2003241799A

JP2003241799A - Acoustic encoding method, decoding method, encoding device, decoding device, encoding program, decoding program

Info

Publication number: JP2003241799A
Application number: JP2002039203A
Authority: JP
Inventors: Akio Jin; 明夫神; Takehiro Moriya; 健弘守谷; Kazunaga Ikeda; 和永池田; Takeshi Mori; 岳至森
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 2002-02-15
Filing date: 2002-02-15
Publication date: 2003-08-29

Abstract

(57) [Abstract]
[Problem] To provide an acoustic encoding method that applies scalable coding and suppresses degradation of sound quality even when packet loss occurs.
[Solution] A waveform is cut out of the input acoustic signal at predetermined time intervals (step 101), and scalable coding is performed (step 102). This yields code bit string A from a first band (main layer A), code bit string B from a second band (auxiliary layer B) that includes the coding error of the first band, and code bit string C from a third band (auxiliary layer C) that includes the coding error up to the second band. These code bit strings are packetized so that each layer forms an independent packet (step 103). If necessary, error protection is applied so that the first-band side has higher priority (step 104), and the packets are then transmitted (step 105).

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates to an acoustic encoding method, a decoding method, an encoding device, and a decoding device, and in particular to a scalable acoustic encoding method and acoustic encoding device that take an acoustic signal as input, and to the corresponding acoustic decoding method and acoustic decoding device.

[0002]

[Prior art] In the coding of music or speech signals, no scalable coding technique existed in the past, so when a code bit string of speech or music was distributed over a network, it was distributed using non-scalable coding. In distribution by non-scalable coding, when the code bit string was delivered in packets, packetization was performed by putting one coded frame in one packet, several coded frames in one packet, or one coded frame in several packets. Methods have also been devised to improve robustness against packet loss, such as interleaving packets or retransmitting lost packets. However, because these conventional methods use non-scalable coding, the information in each packet covers the entire band of the signal sequence obtained by dividing the input signal into predetermined time segments. Consequently, if a packet is lost, the sound of the entire band contained in that packet is lost, and this appears as loud noise or distortion. In other words, transmission of acoustic signals by non-scalable coding has not necessarily been desirable in terms of perceived sound quality.

[0003] For example, consider packetizing and distributing a code bit string of speech or music over the Internet or a similar network. If congestion occurs because traffic concentrates on the network, or if jitter absorption of received packets fails, packet loss degrades the sound quality. This degradation appears as dropouts and burst noise and is very unpleasant to the ear. In particular, when acoustic signals such as speech or music are distributed using VoIP (Voice over IP), Internet radio, Internet television, and the like, live transmission is essential and it is difficult to adopt measures such as packet recovery processing, so such packet loss is a serious problem. If the sound degradation caused by packet loss could be suppressed, acoustic signals could be delivered comfortably.

[0004] The present inventors have proposed, in Japanese Patent No. 3139602 (or Japanese Unexamined Patent Publication No. H8-263096), a method of encoding an acoustic signal by scalable coding that allows the decoding quality and the compression ratio to be selected. In this method, the sound of a certain band (first band) is encoded, then the sound of a second band wider than the first band is encoded together with the coding residual of the first band, and this series of steps is repeated. With this method, even when the lower layer (first-band side) and the upper layers are encoded with different compression techniques, the coding quality of the decoded signal up to the upper layer does not deteriorate, and the perceived decoding quality is optimal whichever layer decoding stops at.

[0005] However, even with such scalable coding, conventional methods can suffer quite severe perceptual degradation of sound quality when packet loss or the like occurs.

[0006]

[Problem to be solved by the invention] As described above, when an acoustic signal is encoded by a non-scalable coding method and the encoded signal is then packetized, packet loss during transmission causes dropouts and burst noise. Even when a scalable coding method is used, packet loss can degrade the sound quality. There is therefore a demand for coding schemes and packet transmission schemes in which the sound quality degrades little even when packets are lost.

[0007] An object of the present invention is to provide an acoustic encoding method and acoustic encoding device, and a corresponding acoustic decoding method and acoustic decoding device, that apply scalable coding and either suppress sound quality degradation even when packet loss occurs, or can control which transmitted packets receive error protection and how strong that protection is so that sound quality degradation is suppressed.

[0008]

[Means for solving the problem] A first acoustic encoding method of the present invention is an acoustic encoding method for encoding and transmitting an acoustic signal, in which scalable coding is applied to the input acoustic code, the codes obtained for each layer of the scalable coding are packetized, and the packets of each layer are distributed. Here, it is preferable to distribute the packets of each layer after applying error protection such that packets of more important layers receive higher priority.

[0009] A second acoustic encoding method of the present invention is an acoustic encoding method that encodes an acoustic signal in each of N bands (N is an integer of 2 or more), where i is each integer from 2 to N and the i-th band includes the (i-1)-th band. The method comprises: a first extraction step of extracting the first-band component signal of the acoustic signal; a first encoding step of encoding the first-band component signal to obtain a first code; an (i-1)-th decoding step of obtaining a decoded signal of the (i-1)-th band from the first to (i-1)-th codes; an i-th extraction step of extracting the i-th band component signal of the acoustic signal; an i-th encoding step of encoding the residual signal between the i-th band component signal and the decoded signal of the (i-1)-th band to obtain an i-th code; and a step of transmitting the first to N-th codes separately for each band. Here, it is preferable to carry out the transmitting step after applying error protection so that transmission errors or losses in the (i-1)-th band are fewer than those in the i-th band.

[0010] An acoustic decoding method of the present invention is an acoustic decoding method that receives packets storing codes produced by acoustic encoding and decodes the codes, wherein the acoustic encoding is scalable coding having a plurality of layers and a packet is set up for each layer. The method comprises: a step of checking for packet loss in the packet transmission; a step of, for each layer in which no packet loss was detected, extracting the code from the packet of that layer and decoding it to obtain a decoded signal; a step of adding the decoded signals, treating any layer in which packet loss was detected as silence; and a step of reproducing sound based on the result of the addition.
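The decoding steps of paragraph [0010] can be sketched roughly as follows. This is only an illustration under stated assumptions: `decode_layer` is a hypothetical placeholder (a real layer decoder would run the codec for that layer), a lost packet is represented by `None`, and each layer is assumed to decode to a frame of samples that can be summed directly.

```python
from typing import Dict, List, Optional


def decode_layer(payload: List[float]) -> List[float]:
    """Hypothetical placeholder decoder: the payload already holds samples."""
    return list(payload)


def reconstruct_frame(received: Dict[str, Optional[List[float]]],
                      frame_len: int) -> List[float]:
    """Sum the decoded signals of all layers; a lost layer counts as silence."""
    output = [0.0] * frame_len
    for layer, payload in received.items():
        if payload is None:
            # Packet loss detected for this layer: contribute nothing (silence).
            continue
        decoded = decode_layer(payload)
        for k, sample in enumerate(decoded[:frame_len]):
            output[k] += sample
    return output
```

With this structure the loss of one auxiliary layer only removes that layer's contribution; the remaining layers are still added and reproduced.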

[0011] A first acoustic encoding device of the present invention is an acoustic encoding device that encodes and transmits an acoustic signal, comprising: means for applying scalable coding to the input acoustic code; means for packetizing the codes obtained for each layer of the scalable coding; and means for distributing the packets of each layer. Here, it is preferable to further provide means for applying error protection before transmission such that packets of more important layers receive higher priority.

[0012] A second acoustic encoding device of the present invention is an acoustic encoding device that encodes an acoustic signal in each of N bands (N is an integer of 2 or more), where i is each integer from 2 to N and the i-th band includes the (i-1)-th band. The device comprises: a first extractor that extracts the first-band component signal of the acoustic signal; a first encoder that encodes the first-band component signal to obtain a first code; an (i-1)-th decoder that obtains a decoded signal of the (i-1)-th band from the first to (i-1)-th codes; an i-th extractor that extracts the i-th band component signal of the acoustic signal; an i-th encoder that encodes the residual signal between the i-th band component signal and the decoded signal of the (i-1)-th band to obtain an i-th code; and means for transmitting the first to N-th codes separately for each band.

[0013] An acoustic decoding device of the present invention is an acoustic decoding device that receives packets storing codes produced by acoustic encoding and decodes the codes, wherein the acoustic encoding is scalable coding having a plurality of layers and a packet is set up for each layer. The device comprises: means for checking for packet loss in the packet transmission; means for, for each layer in which no packet loss was detected, extracting the code from the packet of that layer and decoding it to obtain a decoded signal; means for adding the decoded signals, treating any layer in which packet loss was detected as silence; and means for reproducing sound based on the result of the addition.

[0014] That is, in the present invention, when a code bit string of speech or music is packetized and distributed, the code information is layered by scalable coding, and the code bit string of each layer is packetized and distributed independently. With this method, even if some of the distributed packets are lost, dropouts and large distortion or noise almost never occur. Furthermore, when the scalable code bit string is sent in this way, sound quality degradation on packet loss can be reduced still further by concentrating error protection on the packets of the layer containing the main sounds that are most audible to human hearing, or by ranking the layers and applying strong error protection to the packets of high-importance layers.

[0015] In the present invention, packet loss also includes the case where a packet does not reach the receiving side within a predetermined delay time.
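The definition of packet loss in paragraph [0015] could be expressed as a simple check like the one below. The record type, its field names, and the `max_delay` parameter are hypothetical and introduced only for illustration; the text does not prescribe how arrival times are measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PacketRecord:
    layer: str                    # e.g. "A", "B" or "C"
    sent_at: float                # send time in seconds
    arrived_at: Optional[float]   # arrival time in seconds, None if never received


def is_lost(record: PacketRecord, max_delay: float) -> bool:
    """A packet is lost if it never arrived or arrived after the allowed delay."""
    if record.arrived_at is None:
        return True
    return (record.arrived_at - record.sent_at) > max_delay
```

Treating late packets as lost lets the receiver apply the same per-layer handling to both cases instead of stalling playback while it waits.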

[0016]

[Embodiments of the invention] Next, preferred embodiments of the present invention will be described with reference to the drawings.

[0017] The acoustic encoding method of the present invention adopts scalable coding and packetizes each layer of the scalable coding separately. As a result, even if packet loss occurs, it is considered rare for several packets belonging to the same coded frame to be lost simultaneously, so the waveform of at least one band remains and severe degradation of sound quality can be prevented. Furthermore, if the packet corresponding to the band containing the most important acoustic components is transmitted with high priority against transmission errors and packet loss, transmission errors and packet loss in that packet are prevented, and even if transmission errors or packet loss occur in the other packets, the listener perceives almost no degradation in sound quality.
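The claim that simultaneous loss of all packets of one coded frame is rare can be illustrated with a back-of-the-envelope calculation. The sketch below assumes that every packet is lost independently with the same probability, which the text does not state; real networks often show bursty loss, so the figure is optimistic. The function name is hypothetical.

```python
def prob_frame_wiped_out(loss_rate: float, num_packets: int) -> float:
    """Probability that every packet of one frame is lost.

    Assumes independent, identically distributed packet loss (an
    illustrative simplification, not a property guaranteed by the text).
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if num_packets < 1:
        raise ValueError("num_packets must be at least 1")
    return loss_rate ** num_packets
```

Under that assumption, a single full-band packet is wiped out with probability equal to the loss rate itself, whereas a frame split into three layer packets is wiped out only with the cube of the loss rate.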

[0018] Any suitable value can be used for the number of layers in the scalable coding. Let the number of layers be N (N ≥ 2). Specifically, in the acoustic encoding method according to the present invention, with i being each integer satisfying 2 ≤ i ≤ N, the input acoustic signal is divided into N bands such that the i-th band includes the (i-1)-th band, and the following steps are executed: a first extraction step of extracting the first-band component signal of the acoustic signal; a first encoding step of encoding the first-band component signal to obtain a first code; an (i-1)-th decoding step of obtaining the decoded signal of the (i-1)-th band from the first to (i-1)-th codes; an i-th extraction step of extracting the i-th band component signal of the acoustic signal; and an i-th encoding step of encoding the residual signal between the i-th band component signal and the decoded signal of the (i-1)-th band to obtain an i-th code. The N codes obtained in this way (the first to N-th codes) are then transmitted, if necessary in such a way that transmission errors or losses in the (i-1)-th band are fewer than those in the i-th band. When the (i-1)-th decoding step is executed with i > 2, if the decoded signal of the (i-2)-th band has already been obtained at that point, the layered nature of the codes in scalable coding means that the decoded signal of the (i-1)-th band can be obtained by adding the decoded signal of the (i-2)-th band to the signal obtained by decoding only the (i-1)-th code. The compression techniques used in the first encoding step, the second encoding step, ..., and the N-th encoding step may differ from one another. From the standpoint of layered coding, the coding and compression method best suited to each layer should be adopted.
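The layered residual structure of paragraph [0018] can be sketched with a toy example. The following is not the codec of the cited patent: it assumes the frame is already given as a list of spectral coefficients, models the i-th band as the first `band_edges[i]` coefficients (so the bands are nested), and uses a plain uniform quantizer with a per-layer step size as a stand-in for each layer's compression technique. All names are hypothetical.

```python
from typing import List, Sequence


def quantize(values: Sequence[float], step: float) -> List[int]:
    """Stand-in encoder: uniform quantization with the given step size."""
    return [round(v / step) for v in values]


def dequantize(codes: Sequence[int], step: float) -> List[float]:
    """Stand-in decoder matching quantize()."""
    return [c * step for c in codes]


def encode_scalable(spectrum: Sequence[float],
                    band_edges: Sequence[int],
                    steps: Sequence[float]) -> List[List[int]]:
    """Return one code per layer; band_edges must be non-decreasing (nested bands)."""
    codes: List[List[int]] = []
    decoded: List[float] = []          # decoded signal of the previous band
    for edge, step in zip(band_edges, steps):
        component = list(spectrum[:edge])                 # i-th extraction step
        padded = decoded + [0.0] * (edge - len(decoded))  # previous band, widened
        residual = [c - d for c, d in zip(component, padded)]
        code = quantize(residual, step)                   # i-th encoding step
        codes.append(code)
        # Decoded signal of band i = decoded band (i-1) + decoded i-th code.
        decoded = [d + q for d, q in zip(padded, dequantize(code, step))]
    return codes
```

For the first layer the previous decoded signal is empty, so the residual equals the first-band component itself. Each later layer encodes both the new part of its band and whatever error the lower layers left behind, and the running `decoded` value is updated by adding the newly decoded code, as the paragraph describes.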

[0019] FIG. 1 is a diagram illustrating the acoustic encoding method according to the present invention; here the scalable coding is shown with three layers as an example. The layer covering the main band of the sound is main layer A, and its auxiliary layers are B and C. The main band of the sound is, for example, about 0 to 4 kHz for a source consisting mainly of speech and about 0 to 8 kHz for a source consisting mainly of music. Auxiliary layer B completely contains the band of main layer A, and auxiliary layer C completely contains the band of auxiliary layer B (so auxiliary layer C also completely contains the band of main layer A). Of course, the scalable coding may have any number of layers as long as there are at least two.
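The containment relation between the layers can be checked mechanically. In the sketch below only the 0 to 4 kHz main band for speech comes from the text; the upper edges chosen for layers B and C are assumed values for illustration, and the function name is hypothetical.

```python
from typing import List, Tuple

Band = Tuple[float, float]  # (lower edge, upper edge) in Hz


def bands_are_nested(bands: List[Band]) -> bool:
    """True if every band fully contains the band that precedes it."""
    for (lo_prev, hi_prev), (lo, hi) in zip(bands, bands[1:]):
        if not (lo <= lo_prev and hi_prev <= hi):
            return False
    return True


# Hypothetical three-layer layout for a speech source. The 0-4 kHz main
# band follows the text; the upper edges of layers B and C are assumptions.
speech_layers: List[Band] = [(0.0, 4000.0), (0.0, 8000.0), (0.0, 16000.0)]
```

A layout with disjoint bands, such as 0 to 4 kHz followed by 4 to 8 kHz, would fail this check, because the upper layer could then not carry the coding residual of the lower one.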

[0020] A waveform is cut out of the input signal (acoustic signal) at predetermined time intervals (in units of coded frames) (step 101), and scalable coding is performed (step 102). The processing disclosed in the above-mentioned Japanese Patent No. 3139602 can be used for the scalable coding. Here, the power spectrum of the input signal is assumed to be as shown at 200 in the figure, corresponding to layers A to C. In this power spectrum, the thin strip of layer B lying within the region of layer A corresponds to the coding residual after layer A has been encoded.
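Step 101 (cutting the input into coded frames) might look like the following minimal sketch. It assumes non-overlapping frames of a fixed number of samples and zero-pads the final frame; the text does not specify windowing or overlap, so those choices, and the function name, are assumptions.

```python
from typing import List


def cut_frames(samples: List[float], frame_len: int) -> List[List[float]]:
    """Split a signal into consecutive frames of frame_len samples each."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    frames: List[List[float]] = []
    for start in range(0, len(samples), frame_len):
        frame = list(samples[start:start + frame_len])
        frame += [0.0] * (frame_len - len(frame))  # zero-pad the final frame
        frames.append(frame)
    return frames
```

Each resulting frame would then be handed to the scalable coder of step 102.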

[0021] Through the scalable coding, the waveform cut out of the original sound in units of coded frames is encoded, producing code bit string A from main layer A, code bit string B from auxiliary layer B, and code bit string C from auxiliary layer C. These form scalable bit strings.

[0022] The code bit strings obtained in this way are next packetized (step 103). When these scalable code bit strings are packetized, each layer is given its own independent packet. As a result, packets A to C are obtained corresponding to code bit strings A to C. After error protection is applied as necessary, as described later (step 104), packets A to C are sent out onto transmission path 110 and transmitted (step 105). Because layers A to C each have an independent packet, packets A to C are also carried independently, layer by layer, on transmission path 110. Such packets reach the receiving side via a line such as the Internet.
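Per-layer packetization (step 103) can be sketched as follows. The packet structure and its fields are hypothetical; the text only requires that each layer of a coded frame ends up in its own packet.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LayerPacket:
    frame_index: int   # which coded frame the payload belongs to
    layer: str         # "A", "B", "C", ...
    payload: bytes     # code bit string of that layer


def packetize_frame(frame_index: int,
                    layer_codes: Dict[str, bytes]) -> List[LayerPacket]:
    """Create one independent packet per layer for a single coded frame."""
    return [LayerPacket(frame_index, layer, code)
            for layer, code in layer_codes.items()]
```

Carrying the frame index and layer name in every packet lets the receiver tell which layers of which frame arrived, which it needs in order to detect loss per layer.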

[0023] Although the packets of all layers (main layer A and auxiliary layers B and C) are transmitted here, it is also possible to transmit only the packets from the main layer up to a predetermined layer rather than those of all layers. Processing on the receiving side is described later, but the receiving side likewise need not decode using all received packets; depending on the required sound quality, it can decode and reproduce sound using the packets from the main layer up to a predetermined layer. In the above description packetization is performed per coded frame, but it may instead be performed over two or three coded frames. Also, although packets A to C are each transmitted independently, one packet may be generated from the codes of several layers, for example by combining the codes of main layer A and auxiliary layer B into one packet and putting the code of auxiliary layer C into another. When one packet is generated from the codes of several layers, putting the codes of all layers into a single packet would cause severe sound quality degradation when that packet is lost. Therefore, although the number of packets generated may be smaller than the number of layers, more than one packet should be generated for each time segment of the input acoustic signal.
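The grouping rule at the end of paragraph [0023] (several layers may share a packet, but each time segment must still yield more than one packet) can be sketched as below. The function and its validation are illustrative assumptions; payloads are simply concatenated, whereas a real packet format would need length fields or similar framing to separate the layers again.

```python
from typing import Dict, List


def group_layers(layer_codes: Dict[str, bytes],
                 groups: List[List[str]]) -> List[bytes]:
    """Build one payload per group of layers for a single time segment."""
    if len(groups) < 2:
        # All layers in one packet would lose the whole band on a single loss.
        raise ValueError("at least two packets are required per time segment")
    used = [layer for group in groups for layer in group]
    if sorted(used) != sorted(layer_codes):
        raise ValueError("every layer must appear in exactly one group")
    return [b"".join(layer_codes[layer] for layer in group) for group in groups]
```

For example, grouping `[["A", "B"], ["C"]]` yields two packets, while `[["A", "B", "C"]]` is rejected.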

[0024] During packet delivery, packets may be lost, for example when congestion occurs because traffic concentrates on the network or when jitter absorption of received packets fails. With ordinary non-scalable coding, one coded frame or several coded frames are packetized into one packet and each packet contains information for the entire band, so when such packet loss occurs the entire band disappears in the affected coded frames and the sound quality degrades severely. In this embodiment, however, it is considered rare for packet A, packet B, and packet C making up one coded frame all to be lost at the same time, so the waveform of some band survives, as shown in FIGS. 2(a) to 2(c). The sound quality therefore almost never degrades severely. Assuming that the acoustic signal whose power spectrum is shown in FIG. 1 is scalable-coded and transmitted, FIG. 2(a) shows the power spectrum when the packet of main layer A is lost, FIG. 2(b) shows the power spectrum when the packet of auxiliary layer B is lost, and FIG. 2(c) shows the power spectrum when the packet of auxiliary layer C is lost.

[0025] In the example described here, because of the characteristics of scalable coding, as long as packet A corresponding to main layer A survives, the sound quality hardly degrades even if the packets of the other layers (packets B and C) are lost. It is therefore advisable to perform the error protection of step 104 in FIG. 1 and, in doing so, to assign a priority to the packets of each layer so that packets of important layers are less likely to be lost on the transmission path. In this example, packet A, which carries the main elements of the acoustic signal, is given high priority; packet B, the next most important, medium priority; and packet C, the most auxiliary, low priority. Based on these priorities, strong error protection is applied to the high-priority packet A so that the packet loss rate at the receiving side is at or below a very small predetermined value. Moderate error protection is applied to the medium-priority packet B so that its packet loss rate at the receiving side is lower than it would be without error protection but higher than that of packet A. The low-priority packet C receives light error protection or none. Alternatively, QoS (Quality of Service) technology or the like can be used to raise the priority of packet A so that it reaches its destination quickly, while packet C is allowed to arrive slowly.
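The priority ordering of paragraph [0025] can be written down as a small sketch. The numeric priority scale and the target loss rates used in the check are hypothetical; the text only requires that more important layers be protected at least as well as less important ones.

```python
from typing import Dict, List


def priorities(layers: List[str]) -> Dict[str, int]:
    """Map layers (main layer first) to priorities; larger means more important."""
    n = len(layers)
    return {layer: n - i for i, layer in enumerate(layers)}


def is_valid_protection(layers: List[str],
                        target_loss: Dict[str, float]) -> bool:
    """True if the target loss rate never decreases from the main layer outward."""
    rates = [target_loss[layer] for layer in layers]
    return all(a <= b for a, b in zip(rates, rates[1:]))
```

A plan such as A: 0.001, B: 0.01, C: 0.05 (illustrative numbers only) satisfies the check, whereas giving layer A a higher target loss rate than layer B does not.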

[0026] Various error protection techniques are known for preventing packet loss at the receiving side when errors occur on the transmission path, such as redundant codes and retransmission, and any suitable technique can be used in the present invention. For example, if a "lost packet" is defined as "a packet that failed to reach the receiving side within a certain delay time (for example, 500 milliseconds)", one technique is for the receiving side to send the transmitting side a retransmission request for the first layer, which is the most important layer (main layer A here), so that the packet is retransmitted; another is to transmit the packet two or three times. Another technique is to transmit only the first layer using TCP (Transmission Control Protocol), a protocol that incorporates retransmission on error and error correction processing, and to transmit the other layers (auxiliary layers B and C here) using UDP (User Datagram Protocol), a protocol that includes neither retransmission on error nor error correction processing. Furthermore, as described for example in Japanese Patent No. 3212123 (or Japanese Unexamined Patent Publication No. H5-281998), several error-correcting encoders and decoders with different correction and detection capabilities can be used: the packets of the first layer are protected with an error-correcting code of high correction and detection capability, such as a Reed-Solomon code; the second layer (auxiliary layer B here) is protected with a code of lower correction and detection capability than that used for the first layer, such as a BCH (Bose-Chaudhuri-Hocquenghem) code; and the third layer (auxiliary layer C here) is not error-correction coded. There is also the method of transmitting only packet A over a fast, reliable leased line or the like and transmitting packets B and C over the Internet or the like, where some delay and error are tolerated.
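For the technique of sending a packet two or three times, the number of copies needed to reach a target loss rate can be estimated as below. This is a rough sketch assuming each copy is lost independently with the same probability, which the text does not claim; correlated (burst) loss would require more copies or spacing between them. The function name and parameters are hypothetical.

```python
def copies_needed(loss_rate: float, target: float) -> int:
    """Smallest k such that loss_rate**k <= target, assuming independent losses."""
    if not 0.0 < loss_rate < 1.0:
        raise ValueError("loss_rate must be strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    k = 1
    while loss_rate ** k > target:
        k += 1
    return k
```

Applying a calculation like this only to the main layer keeps the added traffic small, since the auxiliary layers are sent once or with lighter protection.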

[0027] FIGS. 3(a) to 3(c) show the power spectra obtained on the receiving side when the acoustic signal whose power spectrum is shown in FIG. 1 is scalably encoded and transmitted with this kind of error protection. FIG. 3(a) shows the case where packets B and C are lost, FIG. 3(b) the case where packet C is lost, and FIG. 3(c) the case where packet B is lost. Because packet A, which carries the main band, is protected so that it is lost as rarely as possible, the waveform of the main band survives even when packet loss occurs, and the sound quality therefore hardly deteriorates.

[0028] Next, this error protection is described. The following explains a configuration that keeps sound quality from deteriorating much even if packet loss occurs when a code bit string with a scalable layered structure, produced by three-layer scalable encoding, is packetized layer by layer and delivered. As shown in FIG. 4(a), a transmitting side (encoding device) 11 and a receiving side (decoding device) 12 are connected to a network 10.

[0029] As shown in FIG. 4(b), before the acoustic signal code packets are delivered, the congestion of the network 10 is first investigated and predicted by some method at the transmitting side, at the receiving side, or between the two (step 131), and an estimated packet loss rate is obtained (step 132). Network congestion is a parameter indicating how much of the network is occupied by traffic. When the TCP protocol is used, for example, it can be predicted by monitoring packets between the transmitting side and the receiving side, or by delivering test packets before the actual transmission and observing what fraction of them is lost before reaching the receiving side. When congestion is investigated at the receiving side, the receiving side notifies the transmitting side of the result. Then, according to the estimated packet loss rate, error protection is set for each layer of the scalable encoding so that the probability of successful reception at the receiving side is guaranteed to be at least a specified value (step 133), and the packets of each layer are delivered after error protection information based on these settings has been added (step 134).

[0030] As a concrete numerical example, assume for simplicity that all packets have the same size (before error protection processing). When the estimated average packet loss rate is 15% or less, the probability of successful reception at the receiving side is guaranteed to be at least 99% for packets of main layer A, at least 95% for packets of auxiliary layer B, and at least 91% for packets of auxiliary layer C. Alternatively, when the estimated average packet loss rate is between 16% and 55% inclusive, the probability of successful reception is guaranteed to be at least 95% for packets of main layer A, at least 80% for packets of auxiliary layer B, and at least 70% for packets of auxiliary layer C.
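The numerical targets in this example can be turned into a concrete protection setting. The following is a minimal sketch and not part of the patent: it assumes that protection is implemented by sending each packet k times (the double or triple transmission mentioned among the error protection methods) and that losses are independent with the estimated rate p, so that at least one copy arrives with probability 1 - p^k. The names `TARGETS`, `copies_needed`, and `protection_plan` are hypothetical, and loss rates between 15% and 16%, which the text does not cover, are simply assigned to the second range.

```python
# Per-layer reception targets from the numerical example in the text,
# keyed by the upper bound of the estimated average packet loss rate.
TARGETS = [
    (0.15, {"A": 0.99, "B": 0.95, "C": 0.91}),
    (0.55, {"A": 0.95, "B": 0.80, "C": 0.70}),
]


def copies_needed(loss_rate: float, target: float) -> int:
    """Smallest k with 1 - loss_rate**k >= target.

    Assumption (not from the patent): copies are lost independently.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    k = 1
    # The small tolerance avoids an extra copy caused by rounding error.
    while 1.0 - loss_rate ** k < target - 1e-12:
        k += 1
    return k


def protection_plan(loss_rate: float) -> dict:
    """Number of copies to send per layer for an estimated loss rate."""
    for upper, targets in TARGETS:
        if loss_rate <= upper:
            return {layer: copies_needed(loss_rate, t)
                    for layer, t in targets.items()}
    raise ValueError("loss rate outside the ranges given in the example")
```

Under these assumptions, the main layer always receives at least as many copies as the auxiliary layers, which mirrors the priority ordering described in the text. A real system would more likely mix retransmission and error correction codes rather than rely on plain duplication.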

[0031] It is then determined whether transmission has finished (step 135). If it has, the process ends; if not, the process returns to step 134 to add error protection information to the next packet and send it.
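The procedure of steps 131 to 135 can be sketched as a short loop. This is only an illustration under stated assumptions: congestion is estimated from test packets sent before the real transmission, each frame is represented as a mapping from layer name to payload, and the protection policy and the transport are passed in as callables. All function and parameter names are hypothetical and do not come from the patent.

```python
from typing import Callable, Iterable, Mapping, Sequence


def estimate_loss_rate(probe_arrived: Sequence[bool]) -> float:
    """Steps 131-132: fraction of test packets that did not arrive."""
    if not probe_arrived:
        raise ValueError("need at least one probe result")
    lost = sum(1 for ok in probe_arrived if not ok)
    return lost / len(probe_arrived)


def deliver(frames: Iterable[Mapping[str, bytes]],
            probe_arrived: Sequence[bool],
            choose_protection: Callable[[float], Mapping[str, int]],
            send: Callable[[str, bytes, int], None]) -> float:
    """Send every layer of every frame with its per-layer protection level."""
    loss_rate = estimate_loss_rate(probe_arrived)      # steps 131-132
    protection = choose_protection(loss_rate)          # step 133
    for frame in frames:                               # step 135: loop until done
        for layer, payload in frame.items():
            send(layer, payload, protection[layer])    # step 134
    return loss_rate
```

Keeping the policy (`choose_protection`) separate from the transport (`send`) reflects the text's point that any suitable error protection method may be plugged in.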

[0032] This error protection processing gives preferential protection to the sound of the main layer. Consequently, even if packet loss of roughly the estimated packet loss rate occurs, the degradation in sound quality when decoding on the receiving side can be kept small.

[0033] When priorities are assigned to packet reception success rates, suppose that a communication protocol is used whose error protection information can guarantee a reception success probability that can be approximated as nearly 100% within a given delay constraint on the receiving side. A packet given such error protection information can then be regarded as one that is, in practice, always received successfully. In this case too, the estimated average packet loss rate is obtained before packet delivery by the same procedure as in the error protection example above. If packet loss is expected, error protection information is given to the packets of main layer A so that their reception success probability can be approximated as 100%, and the loss state is controlled so that packet loss occurs in the other packets B and C. For example, when the estimated average packet loss rate is 15% or less, the packets of main layer A are guaranteed to be essentially loss-free, and error protection information is given to the delivered packets so that the reception success probability is at least 95% for auxiliary layer B and at least 90% for auxiliary layer C. Alternatively, when the estimated average packet loss rate is between 16% and 55% inclusive, the packets of main layer A are guaranteed not to be lost, and error protection information is given to the delivered packets so that the reception success probability is at least 85% for auxiliary layer B and at least 70% for auxiliary layer C.

[0034] With this error protection, the packets of main layer A are reliably protected. Even if packet loss occurs on the transmission line, the sound quality of the main part (band) escapes degradation, as shown in FIGS. 3(a) to 3(c) above, so good quality is maintained.

[0035] Next, an acoustic encoding device that performs the acoustic encoding described above is described. Specifically, let N be the number of layers in the scalable encoding (N ≥ 2), let i be each integer with 2 ≤ i ≤ N, and suppose the input acoustic signal is divided into N bands such that the i-th band includes the (i-1)-th band. The device then comprises: a first extractor that extracts the component signal of the first band of the acoustic signal; a first encoder that encodes the component signal of the first band to obtain a first code; an (i-1)-th decoder that obtains a decoded signal of the (i-1)-th band from the first to (i-1)-th codes; an i-th extractor that extracts the component signal of the i-th band of the acoustic signal; an i-th encoder that encodes the residual signal between the component signal of the i-th band and the decoded signal of the (i-1)-th band to obtain an i-th code; packetizers that packetize the codes of the respective layers (the first to N-th codes); and a multiplexing unit that sends these packets to the transmission line. A total of N-1 (i-1)-th decoders, N-1 i-th extractors, N-1 i-th encoders, and N packetizers are provided. If the error protection described above is performed, error protection units (N in total) are also provided, one per layer, to apply error protection processing to the corresponding packets, and each unit performs error protection so that transmission errors or losses in the (i-1)-th band are fewer than those in the i-th band.
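In symbols, the layered structure of this encoder can be summarized as follows. The notation is introduced here for illustration only and does not appear in the patent: x_i denotes the i-th band component of the input frame, E_i the i-th encoder, c_i the i-th code, and D_{i-1} the decoder that reconstructs the (i-1)-th band signal from the codes c_1 to c_{i-1}.

```latex
\begin{aligned}
c_1 &= E_1(x_1), \\
\hat{x}_{i-1} &= D_{i-1}(c_1, \dots, c_{i-1}), & 2 \le i \le N, \\
c_i &= E_i\bigl(x_i - \hat{x}_{i-1}\bigr), & 2 \le i \le N .
\end{aligned}
```

Each code c_i therefore carries both the part of band i that lies outside band i-1 and the coding error left over from the lower layers.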

[0036] FIG. 5 shows a concrete configuration of the acoustic encoding device for N = 3, that is, for the three-layer scalable encoding of the example above. Here, a first band (the band of main layer A) is set to correspond to the main band of the sound, a second band (the band of auxiliary layer B) wider than the first band is set so as to include the first band, and a third band (the band of auxiliary layer C) wider than the second band is set so as to include the second band.

[0037] This acoustic encoding device comprises: a clipping unit 21 that cuts a waveform out of the input signal (acoustic signal) to obtain an encoding frame; a first extractor 22 that extracts the component signal of the first band (the band of main layer A); a first encoder 23 that encodes the component signal of the first band to obtain a first code; a first decoder 24 that decodes the first code to obtain a decoded signal of the first band; a second extractor 25 that extracts the component signal of the second band (the band of auxiliary layer B); a subtractor 26 that generates a residual signal by subtracting the decoded signal of the first band from the component signal of the second band; a second encoder 27 that encodes the residual signal from the subtractor 26 to obtain a second code; a second decoder 28 that decodes the second code output from the second encoder 27; an adder 29 that adds the decoded signal of the first band and the output signal of the second decoder 28 to form a decoded signal of the second band; a third extractor 30 that extracts the component signal of the third band (the band of auxiliary layer C); a subtractor 31 that generates a residual signal by subtracting the decoded signal of the second band from the component signal of the third band; a third encoder 32 that encodes the residual signal from the subtractor 31 to obtain a third code; a first packetizer 33 that packetizes the first code; a second packetizer 34 that packetizes the second code; a third packetizer 35 that packetizes the third code; a first error protection unit 36 that applies error protection processing to the output packets of the first packetizer 33; a second error protection unit 37 that applies error protection processing to the output packets of the second packetizer 34; a third error protection unit 38 that applies error protection processing to the output packets of the third packetizer 35; an error protection control unit 39 that sets the level of error protection processing in each of the error protection units 36 to 38; and a multiplexing unit 40 that multiplexes the error-protected packets of each layer and sends them onto the transmission line. The error protection control unit 39 sets the protection levels of the error protection units 36 to 38 so that packets of the first band receive the highest-priority error protection, packets of the second band the next highest, and packets of the third band the lowest (including the case where no error protection is applied).
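The signal path from the extractors to the three codes can be illustrated with a toy model. The sketch below is not the patent's codec: it assumes the frame is already given as a list of spectral coefficients, models each band as the lowest `width` bins (so the bands are nested), and uses a uniform quantizer as a stand-in for encoders 23, 27, and 32. All function names, band widths, and step sizes are hypothetical.

```python
from typing import List, Sequence


def extract_band(spectrum: Sequence[float], width: int) -> List[float]:
    """Keep the lowest `width` bins, zero the rest (extractors 22, 25, 30)."""
    return [v if k < width else 0.0 for k, v in enumerate(spectrum)]


def quantize(values: Sequence[float], step: float) -> List[int]:
    """Toy encoder: uniform scalar quantization."""
    return [round(v / step) for v in values]


def dequantize(code: Sequence[int], step: float) -> List[float]:
    """Toy decoder matching `quantize`."""
    return [c * step for c in code]


def encode_layers(spectrum: Sequence[float],
                  widths: Sequence[int],
                  steps: Sequence[float]) -> List[List[int]]:
    """Return one code per layer; each layer codes the residual so far."""
    codes = []
    decoded = [0.0] * len(spectrum)   # decoded signal of the previous band
    for width, step in zip(widths, steps):
        component = extract_band(spectrum, width)
        # Subtractors 26 and 31 (for layer 1 the previous signal is zero).
        residual = [x - d for x, d in zip(component, decoded)]
        code = quantize(residual, step)
        codes.append(code)
        # Local decoding and accumulation (decoders 24, 28 and adder 29).
        decoded = [d + r for d, r in zip(decoded, dequantize(code, step))]
    return codes
```

For example, `encode_layers([1.0, 0.75, 0.5, 0.25], widths=(2, 3, 4), steps=(1.0, 0.25, 0.125))` produces three codes in which the second and third layers contain both the newly added bins and corrections to the bins already coded by lower layers.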

[0038] Next, the decoding process on the receiving side is described, assuming that acoustic encoding has been performed on the transmitting side as described above.

[0039] On receiving packets from the transmission line, the receiving side first sorts them by layer. It then inspects the packets of each layer, performs error detection and error correction where error protection information has been added, and obtains the code of that layer. The code of each layer is decoded, and the decoded signals are added to reproduce the acoustic signal, yielding the output acoustic signal. If there is packet loss (a packet that has not arrived at the receiving side by the time limit, or a packet with errors that error correction cannot repair), only the affected layer is treated as having no packet, so that layer is excluded when the decoded signals are added. In other words, that layer is treated as silent. Even with this processing, as explained with reference to FIG. 2 or FIG. 3, the acoustic signal is reproduced from the waveforms of the remaining layers, so severe degradation of the output acoustic signal is avoided. In particular, when the packets of the main layer are sufficiently protected against errors, they are practically never lost, and the degradation in sound quality caused by packet loss is greatly reduced. Although a layer with packet loss is silenced in the description above, the packet of the immediately preceding encoding frame of that layer may be reused instead, or the arithmetic mean of the packets of the immediately preceding and immediately following encoding frames of that layer may be computed and the resulting signal sequence used.
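The handling of lost layers at the decoder can be sketched in a few lines. This is an illustration under assumptions rather than the patent's implementation: each layer's code is a list of integers decoded by multiplying with a per-layer step size (a toy stand-in for the real decoders), a lost or uncorrectable packet is represented by `None`, and only two of the concealment options from the text are shown (silence, and reuse of the same layer from the preceding frame). The name `decode_frame` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence

Code = Optional[Sequence[int]]   # None marks a lost or uncorrectable packet


def decode_frame(layer_codes: Sequence[Code],
                 steps: Sequence[float],
                 length: int,
                 previous: Optional[Sequence[Code]] = None) -> List[float]:
    """Sum the decoded layers of one frame, concealing lost layers."""
    output = [0.0] * length
    for index, (code, step) in enumerate(zip(layer_codes, steps)):
        if code is None and previous is not None:
            # Reuse the packet of the same layer from the preceding frame.
            code = previous[index]
        if code is None:
            # Treated as silence: the layer is excluded from the sum.
            continue
        for k, c in enumerate(code):
            output[k] += c * step
    return output
```

Because every layer is decoded and added independently, losing an auxiliary layer removes only that layer's contribution, while the main layer still yields a usable signal on its own.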

[0040] Assuming that the number of layers in the scalable encoding is N (N ≥ 2), that the acoustic signal is divided into N bands as described above, and that j is each integer with 1 ≤ j ≤ N, such an acoustic decoding device comprises: a j-th code checking unit that detects errors or loss in the j-th code; a j-th decoder that decodes the j-th code to obtain a decoded signal of the j-th band when neither error nor loss is detected; an adder that adds, across the bands, the decoded signals of the bands in which neither error nor loss was detected; and an acoustic signal reproduction unit that reproduces the acoustic signal based on the sum obtained by the adder.

[0041] For N = 3, as shown in FIG. 6, such an acoustic decoding device comprises: a code separator 51 that separates the packets in the received signal by layer; a first code checking unit 52 that checks the first-layer packets for errors and loss and extracts the first code; a first decoder 53 that, when the first code checking unit 52 detects neither error nor loss, decodes the first code to obtain a decoded signal of the first band; a second code checking unit 54 that checks the second-layer packets for errors and loss and extracts the second code; a second decoder 55 that, when the second code checking unit 54 detects neither error nor loss, decodes the second code to obtain a decoded signal of the second band; a third code checking unit 56 that checks the third-layer packets for errors and loss and extracts the third code; a third decoder 57 that, when the third code checking unit 56 detects neither error nor loss, decodes the third code to obtain a decoded signal of the third band; an adder 58 that adds the decoded signals of the first to third bands across the bands; and an acoustic signal reproduction unit 59 that performs acoustic signal reproduction processing based on the sum from the adder 58. Depending on the user's needs, the adder 58 may use only the decoded sound of layer A, the decoded sound obtained by adding layers A and B, or the decoded sound obtained by adding all of layers A, B, and C.

[0042] A preferred embodiment of the present invention has been described above. The acoustic encoding device and acoustic decoding device described above may be implemented in hardware, or in software using a general-purpose computer, microprocessor, or the like.

[0043] That is, both the acoustic encoding device and the acoustic decoding device can also be realized by loading a program for executing the acoustic encoding method or acoustic decoding method described above into a computer, such as a single-board computer containing a microprocessor, and having it execute the program. The program is read into the computer from a recording medium such as a CD-ROM or nonvolatile memory, or via a network. The computer comprises, for example, a CPU such as a microprocessor, a memory for storing programs and data, an input/output device through which acoustic signals and the like are input and output, a network interface for connecting to a network, and a reading device for reading the recording medium. The memory, input/output device, network interface, and reading device are all connected to the CPU. In this computer, the program for acoustic encoding or acoustic decoding is read from the recording medium or the network and loaded into memory, and the CPU executes it, thereby carrying out the acoustic encoding or acoustic decoding described above.

【００４４】[0044]

[Effects of the Invention] As described above, the present invention encodes an acoustic signal by scalable encoding and delivers the packets corresponding to each layer of the scalable encoding separately for each layer, which prevents severe sound quality degradation when packet loss occurs. In addition, by applying sufficient error protection to the packets of the important layer, the drop in sound quality can be kept to a minimum even if packet loss occurs. The present invention can therefore be expected to enable transmission without perceptible dropouts or degradation when packetized music or speech data is delivered over various lines such as the Internet. In particular, in communication systems that require real-time transmission, such as live programs in Internet radio and Internet television broadcasting, Internet telephony, and Internet conferencing systems, lost packets cannot be retransmitted. The present invention can be expected to realize live communication whose quality does not degrade, or degrades almost imperceptibly, even without retransmitting lost packets.

[Brief description of drawings]

FIG. 1 is a diagram illustrating an acoustic encoding method according to an embodiment of the present invention.

FIGS. 2(a) to 2(c) are diagrams showing the reproduced sound spectrum when packets of each layer are lost.

FIGS. 3(a) to 3(c) are diagrams showing the reproduced sound spectrum when packets of other layers are lost in the case where a high degree of error protection is applied to the packets corresponding to main layer A.

FIG. 4(a) is a block diagram showing a network configuration, and FIG. 4(b) is a flowchart showing the procedure for setting error protection information.

FIG. 5 is a block diagram illustrating an acoustic encoding device according to an embodiment of the present invention.

FIG. 6 is a block diagram illustrating an acoustic decoding device according to an embodiment of the present invention.

[Explanation of symbols]

10 Network
11 Transmitting side
12 Receiving side
21 Clipping unit
22, 25, 30 Extractor
23, 27, 32 Encoder
24, 28, 53, 55, 57 Decoder
26, 31 Subtractor
29, 58 Adder
33 to 35 Packetizer
36 to 38 Error protection unit
39 Error protection control unit
40 Multiplexing unit
51 Code separator
52, 54, 56 Code checking unit
59 Acoustic signal reproduction unit
110 Transmission line

Continued from front page
(72) Inventor: Kazunaga Ikeda, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
(72) Inventor: Takeshi Mori, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo
F-term (reference): 5D045 DA01 DA11; 5K028 AA12 EE08 KK32 MM09 SS05 SS15

Claims

[Claims]

1. An acoustic encoding method for encoding and transmitting an acoustic signal, comprising: performing scalable encoding on an input acoustic signal; packetizing the code obtained for each layer of the scalable encoding; and delivering the packets of each layer.

2. The acoustic encoding method according to claim 1, wherein the packets of each layer are delivered after error protection processing is applied such that packets of layers of higher importance are given higher priority.

3. An acoustic encoding method for encoding an acoustic signal in each of N bands (N being an integer of 2 or more), where i is each integer from 2 to N and the i-th band includes the (i-1)-th band, the method comprising: a first extraction step of extracting a component signal of the first band of the acoustic signal; a first encoding step of encoding the component signal of the first band to obtain a first code; an (i-1)-th decoding step of obtaining a decoded signal of the (i-1)-th band based on the first to (i-1)-th codes; an i-th extraction step of extracting a component signal of the i-th band of the acoustic signal; an i-th encoding step of encoding a residual signal between the component signal of the i-th band and the decoded signal of the (i-1)-th band to obtain an i-th code; and a step of transmitting the first to N-th codes separately for each of the bands.

4. The acoustic encoding method according to claim 3, wherein the transmitting step is performed after error protection processing is applied such that transmission errors or losses in the (i-1)-th band are fewer than transmission errors or losses in the i-th band.

5. An acoustic decoding method for receiving packets storing codes produced by acoustic encoding and decoding the codes, wherein the acoustic encoding is scalable encoding having a plurality of layers and the packets are provided for each layer, the method comprising: a step of checking for packet loss in packet transmission; a step of, for each layer in which no packet loss is detected, extracting the code from the packet of that layer and decoding it to obtain a decoded signal; a step of adding the decoded signals, treating each layer in which packet loss is detected as silent; and a step of reproducing sound based on the result of the addition.

6. An acoustic decoding method for receiving packets storing codes produced by acoustic encoding and decoding the codes, wherein the acoustic encoding is scalable encoding having a plurality of layers and the packets are provided for each layer, the method comprising: a step of checking for packet loss in packet transmission; a step of, for each layer in which no packet loss is detected, extracting the code from the packet of that layer and decoding it to obtain a decoded signal; a step of, for a layer in which packet loss is detected, obtaining a decoded signal using the decoding result of the packet of the same layer in at least one of the immediately preceding and immediately following encoding frames; a step of adding the decoded signals; and a step of reproducing sound based on the result of the addition.

7. An acoustic decoding method, where N is an integer of 2 or more and j is each integer from 1 to N, comprising: a j-th code checking step of detecting an error or loss in the j-th code; a j-th decoding step of, when neither error nor loss is detected in the j-th code, decoding the j-th code to obtain a decoded signal of the j-th band; an addition step of adding the decoded signals of the bands in which neither error nor loss was detected; and an acoustic signal reproduction step of reproducing an acoustic signal based on the sum obtained in the addition step, wherein the j-th code is the code for the j-th band in scalable encoding, and the i-th band includes the (i-1)-th band, where i is each integer from 2 to N.

8. An acoustic encoding device for encoding and transmitting an acoustic signal, comprising: means for performing scalable encoding on an input acoustic signal; means for packetizing the code obtained for each layer of the scalable encoding; and means for delivering the packets of each layer.

9. The acoustic encoding device according to claim 8, further comprising means for applying, before delivery, error protection such that packets of layers of higher importance are given higher priority.

10. An acoustic encoding device for encoding an acoustic signal in each of N bands (N being an integer of 2 or more), where i is each integer from 2 to N and the i-th band includes the (i-1)-th band, the device comprising: a first extractor that extracts a component signal of the first band of the acoustic signal; a first encoder that encodes the component signal of the first band to obtain a first code; an (i-1)-th decoder that obtains a decoded signal of the (i-1)-th band based on the first to (i-1)-th codes; an i-th extractor that extracts a component signal of the i-th band of the acoustic signal; an i-th encoder that encodes a residual signal between the component signal of the i-th band and the decoded signal of the (i-1)-th band to obtain an i-th code; and a unit that transmits the first to N-th codes separately for each of the bands.

11. An acoustic decoding device for receiving packets storing codes produced by acoustic encoding and decoding the codes, wherein the acoustic encoding is scalable encoding having a plurality of layers and the packets are provided for each layer, the device comprising: means for checking for packet loss in packet transmission; means for, for each layer in which no packet loss is detected, extracting the code from the packet of that layer and decoding it to obtain a decoded signal; means for adding the decoded signals, treating each layer in which packet loss is detected as silent; and means for reproducing sound based on the result of the addition.

12. A program for causing a computer to perform acoustic encoding in which an acoustic signal is encoded and transmitted, the program causing the computer to execute: a process of performing scalable encoding on an input acoustic signal; a process of packetizing the code obtained for each layer of the scalable encoding; a process of applying error protection such that packets of layers of higher importance are given higher priority; and a process of delivering the packets of each layer.

13. A program for causing a computer to perform acoustic decoding in which packets storing codes produced by acoustic encoding are received and the codes are decoded, wherein the acoustic encoding is scalable encoding having a plurality of layers and the packets are provided for each layer, the program causing the computer to execute: a process of checking for packet loss in packet transmission; a process of, for each layer in which no packet loss is detected, extracting the code from the packet of that layer and decoding it to obtain a decoded signal; a process of adding the decoded signals, treating each layer in which packet loss is detected as silent; and a process of reproducing sound based on the result of the addition.