[go: up one dir, main page]

TW201126508A - Audio encoder, audio decoder, method for encoding an input audio information, method for decoding an input audio information and computer program using improved coding tables - Google Patents

Audio encoder, audio decoder, method for encoding an input audio information, method for decoding an input audio information and computer program using improved coding tables Download PDF

Info

Publication number
TW201126508A
TW201126508A TW099102412A TW99102412A TW201126508A TW 201126508 A TW201126508 A TW 201126508A TW 099102412 A TW099102412 A TW 099102412A TW 99102412 A TW99102412 A TW 99102412A TW 201126508 A TW201126508 A TW 201126508A
Authority
TW
Taiwan
Prior art keywords
value
index
tuple
audio
spectral
Prior art date
Application number
TW099102412A
Other languages
Chinese (zh)
Inventor
Guillaume Fuchs
Markus Multrus
Ralf Geiger
Jeremie Lecomte
Frederik Nagel
Julien Robilliard
Arne Borsum
Original Assignee
Fraunhofer Ges Forschung
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fraunhofer Ges Forschung filed Critical Fraunhofer Ges Forschung
Publication of TW201126508A publication Critical patent/TW201126508A/en

Links

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0017Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

An audio decoder comprises an arithmetic decoder for providing a plurality of decoded spectral values on the basis of an arithmetically-encoded representation of the spectral values and a frequency-domain to time-domain converter for providing a time-domain audio representation using the decoded spectral values, in order to obtain a decoded audio information. The arithmetic decoder is configured to derive a group index from a variable-length-codeword representing the group index in dependence on a state index. The arithmetic decoder is configured to derive the values of a most-significant bit-plane of a tuple of spectral values using the group index and an element index, and to provide a tuple of decoded spectral values using the values of the most-significant bit-plane of the tuple of spectral values. The arithmetic decoder is configured to select a cumulative-frequencies-table out of a set of 32 cumulative-frequencies-tables in dependence on the state index, and to apply the selected cumulative-frequencies-table to derive the group index from the variable-length codeword representing the group index.

Description

201126508 六、發明說明: 【發明所屬^技術領域】 依據本發明之實施例係關於—種基於一編碼音訊資訊 提供一解碼音訊資訊的音訊解碼器,一種基於一輸入音訊 貝讯提供一編碼音訊資訊的音訊編碼器,一種基於一編碼 音訊資訊提供一解碼音訊資訊的方法,一種基於一輸入音 汛資訊提供一編碼音訊資訊的方法及一電腦程式。 依據本發明之實施例係關於在一音訊編碼器,諸如例 如一所謂的統一語言及音訊編碼器(USAC)中使用算術編 碼器表格的概念。 發明背景 在下文中,本發明之背景將被簡要解釋以幫助理解本 發明及其優勢。在過去的十年中,在建立數位儲存及以良 好的位元效率散佈音訊内容的可能性上投入巨大的努力。 此方式的一個重要成就是國際標準ISCVIEC 14496-3的定 義。此標準的第3部份係有關於音訊内容的編碼及解碼,而 第3部份的第4子部份係有關於一般音訊編碼。IS〇/IEC 1^96第3部份、第4子部份定義一般音訊内容的編碼及解碼 的一概念。另外,進一步的改進被提出以改進品質及/或減 少所需位元率。 依據該標準之描述的定義’一時域音訊信號被轉換成 一時頻表示。該從時域到時頻域的轉換典型地使用轉換塊 被執行’該等轉換塊也稱為時域樣本的「訊框」。已發現使 3 201126508 用被移位例如-訊框的-半的重疊訊框是有利的,因為重 疊允許有效地避免(歧少減少)人為时^另外,已發現一 視窗化應被執行以避免源自時間有限訊框過程的人為因 素0 藉由將輸入音訊資訊的一視窗化部份從時域轉換至時 頻域’在許多情況巾獲得—能量集巾,使得某些頻譜值包 含一顯著大於複數個其他頻譜值的量級。因此,在許多情 况中-相對小數目的頻譜值具有顯著地高於_平均頻譜值 里級的里級。-導致〜能量集中的時域到時頻域轉換的一 典型範例是所謂的修改型離散餘弦轉換(MDCT)。 該等頻譜值經常依據一心理聲學模型縮放及量化,使 付罝化誤差對心理聲學上較重要的頻譜值相對較小,而對 心理聲學上較*重要的頻譜值相對較A。縮放及量化的頻 譜值被編碼以提供其之一位元有效表示。 例如,使用量化頻譜係數的一所謂霍夫曼編碼在國際 標準ISO/IEC 14496-3:鳩(E),第3部份,第仔部份中被描 述0 然而,已發現頻譜值編碼之品質在所需位元率上具有 一顯著影響。並且,已發現通常在—可攜式消費者裝置中 被實施,且因此價廉且具有低電力消耗之音訊解碼器的複 雜性取決於用於編碼頻譜值之編碼。 鐘於此情況,針對-音訊内容之編碼及解碼概念存在 種可在高位元率效能與資源效率之間提供—改 需求。 < 衷的 201126508 t發明内容;j 發明概要 依據本發明之-實施例建立一種基於一編碼音訊資^ 提供一解碼音訊資訊的音訊解碼器。該音訊解碼器包含;_ 基於該等頻譜值之—算術編碼表示提供複數贿碼頻辨值 :算術解碼器。該音訊解碼器也包含一頻域到時心換 益’其使用解碼義譜值提供—時域音絲*,以獲得解 碼音訊賴。該算鑛碼器倾g絲據描述算術解碼器 之一狀態的一狀態索引而從表示一群索引的可變長度碼字 中導出該群索引。該算術解碼器被組態成使㈣群索引及 一兀素索引導出一頻譜值元組的一最顯著位元平面的值, 該兀素索引描述(或指定,或選擇)群索引選擇的—群中的— 7L素。該算術解碼器被組態成使用該頻譜值元組的最顯著 的位疋平面值提供一解碼頻譜值元組。該算術解碼器被組 態成依賴描述算術解碼器之狀態的狀態索引從—組個累 積頻率表格選擇一累積頻率表格,且將選擇的累積頻率表 格用於從表示群索引的可變長度碼字導出群索引。 依據本發明之實施例是基於發現使用一組32個累積頻 率表格提供一可達位元率與一音訊編碼器或音訊解碼器之 複雜性之間的一最佳折衷。詳言之,已發現32個不同累積 頻率表格適於(由於它們導致一合理低位元率)—音訊内办 的任一相關時頻域表示。已發現一組32個累積頻率表格是 最佳的,因為使用一較小數目的累積頻率表格會導致—顯 著增加的位元率,且因為使用一較大數目的累積頻率表格 201126508 僅帶來位元率無足輕重的改良,但卻引起在編碼 器端及解 碼器端的δ己憶體消耗顯著增加。 綜上所述,一深入研究提出依賴一狀態索引被選擇且 由32個累積頻率表格表示的32個機率模型提供位元率有效 與算術編碼實施努力成果之間的一最佳折衷。 在-較佳實施例中,該算術解碼器被組態成從狀態索 引中導出-7位元散列表索引,且從—散列表獲得—散列表 項值’散列表包含128個散列表索引值在對應散列表項值上 的映射纟此情況中,鼻術解碼器被組態成決定散列表項 值(即在由散列表索引值指㈣—記憶體位置的散列表内 谷)是一溢出值,與一狀態索引值(例如散列表索引值以其為 2礎被導出的-索引值)相襲的有效累積頻率表格識別 符值’或-無效累積頻率表格識別符值(例如,不適合狀態 索引值的-累積辦表格識別符值,散麻索引值基於該 狀匕、索;丨值被導⑴。該算術解碼器被組態成掃描散列表之 項直到找&卜溢出值或—有效累積頻率表格識別符值。該 算術解媽^ 成如果所獲得的散列表項值是溢出值, 則依賴一其中包含狀態索引值之值區隔的識別提供一累積 頻率索?丨值,且如果散列表項值是一與狀態索引值相關聯 的累積頻率表格識別符值,則從獲得的散列表項值導出累 積頻率索引值。 本發明之此實施例是基於發現解碼器僅有少數的音訊 之「顯著」狀態,其使用一由一特別累積頻率表格(與少數 僅大約1到10個狀態相關聯)表示的特別機率模型(與少數僅 201126508 大約1到ίο個狀態相關聯)很重 =多數狀態而言,最好板據:其;==解碼器 測疋值間隔將該等狀態映射至一 < L索5丨之一 將-完整的狀態索引值間隔(=率料值上’藉以 狀癌索引值的範圍)映射至夕於_個不同 由於發現-位S率有心累積頻率索弓1值。 分佈相關聯的「顯著」狀=_得’即使特定機^ 少至僅⑶則是可、,將㈣表之尺寸減 此,散咐之料。。 透狀⑽倍的散列表)。因 H …叫及編Μ端實施需要-極小量^ 源岐有助於使音訊解碼器價麻日L 的貝 之電力消耗適度地小,$進而德持此等音訊解石馬器 裝置的可能性。^而改進了貫施價廉及行動消費 67個2較佳實施例巾,散列表被組態麟7位元散列表的 ,、引映射至有效累積頻率表格制符值,且將7位元 散列表索引的61個值映射至溢出值。因此,有一僅67個相 對J數目的「顯著」狀態存在。另外,不同於「顯著」狀 的任何「非顯著」狀態(非顯著狀態的數目顯著大於顯著 狀態的數目,勤大至少_ — @數,但是最好大於甚至 多於1000的因數)被映射至溢出值,因此使其易區別一「顯 著」狀態與一「非顯著」狀態。因此,顯著與非顯著狀態 之間的差別可以低記憶體消耗被很快地測定。 在一較佳實施例中,算術編碼器被組態成將狀態索引 的67個不同值映射至67個不同累積頻率表格識別符值,使 得26個不同累積頻率索引值與由狀態索引值描述的67個不 201126508 同顯著狀態相關聯。 已發現即使對於相對低數目的67個不同顯著狀態而 言,僅具有26個與其相關的不同累積頻率分佈已足夠。另 外,僅選擇與67個顯著狀態相關聯的26個不同累積頻率表 格顯示帶來位元率需求與編碼/解碼複雜性需求之間的一 最佳折衷。 在一較佳實施例中,算術解碼器被組態成將不同非顯 著狀態映射至九個不同累積頻率索引值,使得總計有九個 不同累積頻率表格可利用為與非顯著狀態一起使用,一間 隔類型映射被執行以導出累積頻率索引值。已發現一極小 數目較佳地是九個不同累積頻率表格在大部份狀態下足以 獲得一位元率有效算術編碼。 也發現在一相對小數目的顯著狀態與一大數目非顯著 狀態之間存在一很大差別。最好透過散列表的有效累積頻 率表格識別符值域使之與累積頻率表格相關聯的6 7個顯著 狀態與頻譜值的特定特徵相關聯,該特定特徵諸如例如具 有一特定寬度及方向的時頻表示中的軌跡。相反,被視作 非顯著狀態,且一累積頻率表格使用一基於範圍的演算法 與其相關聯的所有其他特徵僅表示較少特性頻譜值分佈。 特別也發現與非顯著狀態相關聯的一些累積頻率表格 也非常適合與一些顯著狀態一起使用。因此,已發現使用 應用於一或一個以上顯著狀態或一或一個以上非顯著狀態 的三個累積頻率表格(例如具有與其相關聯的索引05、26及 30的累積頻率表格)是有利的。例如,其被發現就位元率效 201126508 率與計算複雜性之間的折衷而言,如下是特別有利的:具 有23個不同累積頻率表格,該等23個不同累積頻率表格僅 使用散列表的有效累積頻率表格識別符值與顯著狀態相關 聯,具有六個累積頻率表格,該等六個累積頻率表格僅使 用一溢出機構與非顯著狀態相關聯,溢出機構是基於儲存 於散列表及一基於間隔的映射的溢出值。並且,已發現具 有三個累積頻率表格是有利的,該三個累積頻率表格與一 或一個以上的顯著狀態以及一或一個以上的非顯著狀態相 關聯。 依據本發明的另一實施例建立一種基於一輸入音訊資 訊提供一編碼音訊資訊的音訊編碼器。該音訊編碼器包含 一時域到頻域轉換器,其基於輸入音訊資訊的一時域表示 提供一頻域音訊表示,使得頻域音訊表示包含一組頻譜 值。該音訊編碼器也包含一算術編碼器,其被組態成使用 一可變長度碼字編碼一相鄰頻譜值元組,或其之一預處理 版本。該算術編碼器被組態成將一頻譜值元組的一最顯著 位元平面之值映射至一群索引及一元素索引,該元素索引 描述在由群索引選擇的一群中的一元素。該算術編碼器進 一步被組態成依賴表述算術編碼區的一狀態的一狀態索引 從一組32個累積頻率表格中選擇一累積頻率表格,且使用 一選擇的累積頻率表格算術地編碼該群索引,以獲得一被 算術編碼的可變長度碼字。 依據此實施例的音訊編碼器是基於與上述音訊解碼器 相同的觀念。特別地,該音訊編碼器基於發現32個累積頻 201126508 率表格之數目引起位元率效率與編碼/解碼複雜性之間的 一最佳折衷。 依據本發明之另一實施例建立一種基於一編碼音訊表 示提供一解碼音訊表示的方法。 依據本發明之又一實施例建立一種基於一輸入音訊表 示提供一編碼音訊表示的方法。 依據本發明之再一實施例建立一種用於執行該等發明 的方法之電腦程式。 圖式簡單說明 依據本發明之實施例將在下文中參考揭露的圖式被描 述,其中: 第1 a - b圖繪示依據本發明之一實施例的一音訊編碼器 之方塊不意圖, 第2a-b圖繪示依據本發明之一實施例的一音訊解碼器 之方塊示意圖; 第3圖繪示一用於解碼一頻譜值元組的一演算法 「tuples_decode()」的一爲程式碼表示; 第4圖繪示一狀態計算之上下文的一示意表示; 第5a圖繪示重設一上下文的一演算法 「arith_reset__context()」的一偽程式碼表示; 第5b圖繪示映射一上下文的一演算法 「arith_map_context〇」的一偽程式碼表示; 第5c圖繪示獲得一上下文狀態值的一演算法 「arith_get_context〇」的一偽程式碼表示; 10 201126508 第5d圖繪示從一狀態變量導出一累積頻率表格索引值 pki的一演算法「arith—get_pk(s)」的一偽程式碼表示; 第5 e圖繪示從一可變長度碼字算術解碼一符號的一演 算法「arith_decode()」的一偽程式碼表示; 弟5f圖繪示從一群索引nq導出一元素數目值mm及一 群偏移值og的一演算法的一偽程式碼表示; 第5g圖繪示基於群偏移值〇g及一元素索引值此獲得一 頻譜值元組的一最顯著位元平面的頻譜值a、b、c、d的一 演算法的一偽程式碼表示; 第5h圖繪示將一元組a、b、c、d的頻譜值與一較不顯 著位元平面的值結合,以獲得該元組&、1)、(:、(1頻譜值的 一更新版本的演算法之一偽程式碼表示; 「第5i圖繪示更新上下文的一演算法 「adth_update_c〇mext()」的一偽程式碼表示; 第5j圖繪示概念與變量之圖例; 第6a圖繪不—統—語言及音訊編碼器⑽从)原始資料 塊的一語法表示; 第6b圖繪示—單通道元素的—語法表示; 第_綠示—通道對元翻語法表示; 第6d圖繪示_「ics」控制資訊的語法表示; 第6e圖綠示_頻域通道_流的_語法表示;’ 第6f圖繪示被算術編侧頻譜資料之—語法表示 第6g圖繪示解碼触元㈣—語法表示丁 第6h圖繪示資料元素及變量之圖例;/ n 201126508 第7圖繪示一先前使用的算術編碼器的記憶體需求之 一表格表示; 第8圖繪示依據本發明一算術編碼器的記憶體需求之 一表格表示; 第9圖繪示一評估透過依據本發明的算術編碼器獲得 的性能改良之設備的方塊不意圖; 第10圖繪示用於使用一先前使用的算術編碼器編碼不 同音訊資訊所要求的位元率之一表格表示; 第11圖繪示使用發明的概念編碼不同音訊資訊要求的 位元率之表格表示; 第12圖以一表格表示的形式繪示一先前使用的音訊編 碼器與依據本發明的一音訊編碼器產生的平均位元率之間 的一比較; 第13圖以一表格表示的形式繪示當較之於一先前使用 的概念,使用本發明的概念獲得的位元率減少與位元率增 加的一比較; 第14圖繪示一表格「arith_cf_ng_hash[]」之項的一表格 表示; 第15(1)圖到第15(10)圖繪示一表格「arith_cf_ne[]」之 項的一表格表示; 第16(1)圖到第16(32)圖繪示索引pki之32個不同值0到 31的一表格「arith_cf_ng[pki]」之項的一表格表示; 第17(1)圖到第17(2)圖繪示一表格「dgroups[]」之項的 一表格表示; 12 201126508 第18(1)圖到第18(11)圖綠示一表格「dvectors[]」之項 的一表格表示; 第19(1)圖到第19(32)圖繪示一表格 「egroups[a][b][c][d]」之項的一表格表示; 第2 0圖繪示一提供一音訊資訊的一解碼表示的方法之 流程圖;及 第21圖繪示一提供一音訊資訊的一編碼表示的方法之 流程圖。 I:實施方式3 較佳實施例之詳細說明 1.音訊編碼器 在下文中,依據本發明之一實施例的一音訊編碼器將 被描述。第1圖繪示此一音訊編碼器1〇〇的一方塊示意圖。 音訊編碼器100被組態成接收一輸入音訊資訊110,且 以其為基礎提供一構成一編碼音訊資訊的位元流112。音訊 編碼器100可任選地包含一預處理器12〇,預處理器120被組 態成接收輸入音訊資訊11〇且以其為基礎提供一預處理的 輸入音訊資訊110 a。音訊編碼器1 〇 〇也包含一能量集中時域 到頻域信號轉換器(transformer) 13 0,其也稱作信號轉換器 (converter) 〇信號轉換器13〇被組態成接收輸入音訊資訊 110、110a,且以其為基礎提供一較佳地採用一組頻譜值形 式的頻域音訊資訊132。例如,信號轉換器130可被組態成 接收輸入音訊資訊11〇、l10a(例如,一塊時域樣本)的一訊 框’且提供一組表示該相關音訊訊框之音訊内容的頻譜 13 201126508 值。另外’信號轉換器130可被組態成接收輸入音訊資訊 110、110a的複數個後續、重疊或非重疊、音訊訊框,且以 其為基礎提供一時頻域音訊表示,該時頻域音訊表示包含 一序列的後續頻譜元組,一組頻譜值與每一訊框相關聯。 能量集中時域到頻域信號轉換器13〇可包含—能量集 中慮波器組,其提供與不同、重疊或非重疊、頻率範圍相 關聯的頻譜值。例如,信號轉換器13〇可包含—視窗化 MDCT轉換器i3〇a’其被組態成使用一轉換視窗視窗化該输 入音sfl資訊11〇、ii〇a(或其之一訊框),且執行視窗化的輸 入音訊資訊110、ll〇a(或其之視窗化訊框)的—改良型離散 餘弦轉換。因此,頻域音訊表示132可包含與輸入音訊資訊 的一訊框相關聯、呈MDCT係數形式的一組例如1〇24個頻 譜值。 音訊編碼器100進一步可任選地包含一頻譜後處理器 140,其被組態成接收頻域音訊表示132,且以其為基礎提 供一後處理的頻域音訊表示142。頻譜後處理器丨4〇例如可 被組悲成執行一時間雜訊整形及/或一長期預測及/或該記 憶體中習知的任一其他頻谱後處理。該音訊編碼器進一步 可任選地包含一縮放器/量化器15〇,其被組態成接收頻域 音訊表示132或其之後處理版本142,且提供一縮放及量化 的頻域音訊表示152。 音訊編碼器100進一步可任選地包含一心理聲學模型 處理器16〇,其被組態成接收輸入音訊資訊11〇(或其之後處 理版本1 l〇a) ’且以其為基礎提供一可被用於能量集中時域 14 201126508 到頻域信號轉換器130之控制,可任選頻譜後處理器140之 控制,及/或可任選縮放器/量化器150之控制的可任選控制 資訊。例如心理聲學模型處理器160可被組態成分析輸入音 訊資訊,以判斷輸入音訊資訊110' ll〇a對音訊内容之人類 知覺特別顯著的部份,及輸入音訊資訊110、ll〇a對音訊内 容之知覺較不顯著的部份。因此,心理聲學模型處理器160 可提供控制資訊,該控制資訊被音訊編碼器100用於透過縮 放器/量化器150及/或縮放器/量化器150施加的量化解析度 調整頻域音訊表示132、142之縮放。因此,感知上重要的 縮放因數頻帶(即’對音訊内容之人類感知特別重要的諸群 相鄰頻譜值)以一大縮放因數被縮放,且以相對高的解析度 被量化,而感知上較不重要的縮放因數頻帶(即諸群相鄰頻 譜值)以一相對較小的縮放因數被縮放,且以一相對較低的 量化解析度被量化。因此,感知上較重要的頻率之縮放的 頻譜值典型地相當大於感知上較不重要頻率的頻譜值。 該音訊編碼器也包含一算術編碼器170,其被組態成接 收頻域音訊表示132的縮放及量化版本(或可供選擇地,頻 域音訊表示132之後處理版本142,或甚至頻域音訊表示132 其本身),且以其為基礎提供算術碼字資訊172a、172b,使 得該算術碼字資訊表示頻域音訊表示15 2。 音訊編碼器100也包含一位元流負載格式器19〇,其被 組癌成接收其術碼子負§fl 17 2 a、17 2b。位元流負載格式5| 190也典型地被組態成接收附加資訊,諸如例如描述由縮放 器/量化器150施加的縮放因數之縮放因數資訊。另外,位 15 201126508 元流負載格式器19 0可被組態成接收其他控制資訊。位元流 負載格式器190被組態成基於依據一所需位元流語法組合 位元流接收的資訊提供位元流112,這將在下文中被描述。 在下文中,算術編碼器170之細節將被描述。算術編碼 器170被組態成接收頻域音訊表示132的複數個元組,例如 四個後處理及縮放及量化的頻譜值。該算術編碼器包含_ 最顯著位元平面萃取器174 ’其被組態成從一頻譜元組摘取 一最顯著位元平面。在這裡應注意最顯著位元平面可包含 一或甚至更多的位元(例如,兩個或三個位元),它們是該頻 譜值元組的頻譜值之最顯著的位元。因此,最顯著位元平 面萃取器174提供一頻譜值元組(該等頻譜值較佳地在頻率 上相鄰)的一最顯著位元平面176。算術編碼器17〇也包含群 索引決疋子/疋素索引決定子178,其被組態成將最顯著位 元平面176映射至一群索引值ng及一元素索引值此。此映射 可使用-查找表,例如,下文詳細討論的查找表「啡—s」 被執行。該群索引決定子/元素索引決定子178可被組配成 將最顯著位元平面176之值的某些組合映射至僅包含一個 元素的一群的一群索引ng,且可被組配成將最顯著位元平 面176之值的其他組合映射至一包含值之複數個組合的群 的一。因此,群索引決定子/元素索引決定子可被組配成將 包含一相對高機率的最顯著位元平面176之值的此等組合 映射至僅包含一個或僅一些元素的群,且將包含一相對低 機率的最顯著位元平面176之值的組合映射至包含較多元 素的群。因此,一被映射至僅包含_單一元素的群的值之 16 201126508 組合的元素索引ne可僅採用一單值,且可因此被忽略。相 反’一被映射至包含複數個元素之組的值之組合可採用複 數個值。因此,群索引決定子/元素索引決定子178提供一 群索引值ng(也稱為180a),且如果需要,提供元素索引值 ne(也稱為18〇1)),其中元素索引值狀可被設定成一預設值、 或如果.最顯著位元平面176所映射的群ng僅包含一單一元 素時省略之。 算術編碼區170也包含一第一碼字決定子18〇,其被組 配成決定表示群索引ng的一算術碼字acod_ng[pki][ng]。另 外’如果群ng之元素mm之數目大於卜第一碼字決定子18〇 可提供一表示元素索引狀的一算術碼字acod—ne[ne]。表示 兀素索引ne的算術碼字ac〇d_ne[ne]t提供可被忽略。可任 選地,碼字決定子180也可提供一或一個以上溢出碼字(在 本文也稱為「ARmi_ESCAPE」),指示例如可利用的較不 顯著位元平面有多少(且,因此,指示最顯著位元平面之數 字加權)的。第—碼字決定子⑽可被組配成使用—具有(或 被參考的)—累積頻率表格索引Pki的已選擇累積頻率表 格’提供與一群索引ng相關聯的碼字。 爲了確定應選擇的累積頻率表格,該算術編碼器較佳 地包含-狀態追㈣182,其被組態成透過,例如觀測在先 前被編碼的頻譜值元組來追縱算術編碼器的狀態。狀 縱器182因此提供—狀態資訊184,例如-稱為「s」或 的狀態值。算術編碼區17〇也包含一累積頻率表格選擇器 186,其被组態成接收狀態資訊⑻,且向碼字決定子⑽提 17 201126508 供一描述選擇的累積頻率表格的資訊。例如,累積頻率 表格選擇器186可提供—累積頻率表格索引pki,其描述一 組32個累計頻率表格中由碼字決定子選擇使用的累積頻率 表格。可供選擇地,累積頻率表格選擇器186可向碼字決定 子提供整個選擇的累積頻率表格。因此,碼字決定子180可 將選擇的累積頻率表格用於提供群索引ng的碼字 acod_ng[pki][ng] ’使得編碼群索引ng的實際碼字 acod一ng[pki][ng]取決於ng值及累積頻率表格索引pki,且因 此取決於目前狀態資訊184。相反,第一碼字決定子180可 使用一預設(狀態無關)累積頻率表格用於提供碼字 acod_ne[ne],然而,碼字acod_ne[ne]可取決於選擇的群ng 中的元素數目。關於編碼過程及所獲得的碼字格式的進一 步細節將在下文中描述。 算術編碼器170進一步包含一較不顯著的位元平面萃 取器189a,其可被組態成如果欲被編碼的一頻譜值元組的 一或一個以上值僅使用最顯著的位元平面超出可編碼值之 範圍,則從縮放及量化的頻域音訊表示152擷取一或一個以 上較不顯著位元平面。該等較不顯著位元平面每頻譜值可 依需要包含一或一個以上位元。因此,較不顯著位元平面 萃取器189a提供一較不顯著位元平面資訊189b。算術編碼 器170也包含一第二碼字決定子189c,其被組配成接收較不 顯著位元平面資訊189d ’且以其為基礎提供表示〇、1或更 多較不顯著位元平面之内容的〇、1或更多碼字「acod 第二碼字決定子189c可被組配成應用一算術編碼演算、、去戈 18 201126508 任意其他編碼演算法,以從較不顯著位元平面資訊189b導 出較不顯著位元平面碼字「acod_r」。 在本文應注意較不顯著位元平面之數目可依賴縮放及 量化的頻譜值152之值而變化,使得如果目前元組的縮放及 量化頻譜值相對小時,可能根本沒有較不顯著位元平面, 使得如果目前元組的縮放及量化頻譜值在一中範圍,則可 能有一個較不顯著位元平面,且使得如果縮放及量化頻譜 值採用一相對大值,則可能有多於一個的較不顯著位元平 面。 綜上所述,算術編碼器170被組態成使用一階層編碼過 程編碼一由資訊152描述的元組縮放及量化的頻譜值。最顯 著位元平面(包含,例如,一個、兩個或三個位元每頻譜值) 被編碼以獲得一群索引ng的一算術碼字 「acod_ng[pki][ng]」,在一些情況中為一元素索引ne的一碼 字「acod_ne[ne]」。一或一個以上較不顯著位元平面(各該 較不顯著位元平面包含,例如,一個、兩個或三個位元)被 編碼以獲得一或一個以上碼字「acod_r」。當編碼該最顯著 位元平面時,最顯著位元平面的值之組合被映射至複數個 群中的一群ng,其中該等群中的一些僅包含一個元素,而 其中該群中的其他各包含複數個元素。因此,諸值之不同 組合的機率被考慮。隨後,群索引ng及元素索引ne(如果需 要)被編碼,其中32個不同累積頻率表格可用於依賴算術編 碼器170的一狀態,即靠先前編碼的諸頻譜值元組來編碼該 群索引 ng。因此,碼字「acod_ng[pki][ng]」及「acod_ne[ne]」 19 201126508 被獲得,其中如果群索引ng指定為包含多於一個元素的 群,則後者碼字僅被包括在位元流112中。另外,如果一或 一個以上較不顯著位元平面存在,則一或一個以上碼字 「aC〇d—r」被提供且被包括進該位元流中。 重设描述 音訊編碼器1 〇 〇可任選地被組態成決定位元率上的一 改進是否可藉由重設上下文,例如,藉由將狀態索引設定 成—預設值而被獲得。因此,音訊編碼器100可被組態成提 供扎示算術編碼的上下文是否被重設且亦指示在一對應 解碼器中的异術解碼之上下文是否應被重設的一重設資訊 (例如’稱為「arith_reset—flag」)。 關於位元流格式與應用的累積頻率表格之細節將在下 文中被描述。 2.音訊解碼器 在下文中,依據本發明之一實施例的一音訊解碼器將 被也述。第2圖繪示此一音訊解竭器的一方塊示意圖。 音訊解碼器200被組態成接收一位元流21〇,其表示一 編碼音訊資訊,且可等於音訊編碼器1〇〇提供位元流112。 音訊解碼器200基於位元流210提供一解碼音訊資訊212。 音说解碼器200包含一可任選位元流負載變形項220, 其被組態成接收位元流210,且從位元流210擷取一編碼頻 域音訊表示222。例如,位元流負載變形項220可被組配成 处位元210操取算術編碼的頻譜資料’諸如例如,一表示 —元素索引ne的算術碼字「acod_ne[ne]」’及一表示頻域音 20 201126508 況表不的—較不顯著位元平面之一内容的碼字「acod_r」。 因此,編碼頻域音訊表示222組成(或包含)頻譜值的一算術 編碼表示。位元流負載變形項220進一步被組配成從位元流 擷取附加控制資訊,如第2圖所示。另外,該位元流負載變 形項可任選地被組配成從位元流2丨〇擷取一狀態重設資訊 224,其也被指定為一算術重設旗標或「 arith_reset_flag」。 音訊解碼器200包含一算術解碼器230,且也被指定為 「頻譜雜訊解碼器」。算術解碼器23〇被組態成接收編碼頻 域音訊表示220,且可任選地,接收狀態重設資訊224。算 術解碼器230也被組態成提供一解碼頻域音訊表示232,其 可包含頻譜值的一解碼表示。例如,解碼頻域音訊表示232 可包含數頻譜值元組的一解碼表示’它們由編碼頻域音訊 表示220描述。 音訊解碼器200也包含一可任選反向量化器/重新縮放 器240 ’其被組態成接收解碼頻域音訊表示232,且以其為 基礎提供一反向量化的及重新縮放的頻域音訊表示242。 音訊解碼器200進一步包含一可任選頻譜預處理器 250 ’其被組態成接收反向量化且重新縮放的頻域音訊表示 242 ’且以其為基礎提供反向量化及重新縮放的頻域音訊表 不242的一預處理版本252。音訊解碼器2〇〇也包含一頻域到 時域信號轉換器(transf〇rmer)260,其也被指定為一「信號 轉換器(converter)」。信號轉換器26〇被組態成接收反向量化 及重新縮放的頻域音訊表示242(或可供選擇地,反向量化 及重新縮放的頻域音訊表示242,或解碼頻域音訊表示232) 21 201126508 的預處理版本252 ’且以其為基礎提供音訊資訊的一時域表 示262。頻域到時域信號轉換器26〇可例如,包含執行一修 改型離散餘弦反轉換(IMDCT)及一適當視窗化(及其他輔助 功能,諸如例如一疊加)的轉換器。 音訊解碼器200可進一步包含一可任選時域後處理器 270,其被組態成接收音訊資訊的時域表示262,且使用一 時域後處理獲得解碼音訊資訊212。然而,如果後處理被省 略’時域表示262可等於解碼音訊資訊212。 這裡應注意反向量化器/重新縮放器240、頻譜預處理 器250、頻域到時域信號轉換器260及時域後處理器270可賴 控制資訊被控制’該控制資訊由位元流負載變形項220從位 元流210被擷取。 總結上述音訊解碼器200之功能,一解碼音訊表示 232,例如一組與編碼音訊資訊之一音訊訊框相關聯的頻譜 值可基於編碼頻域表示222使用算術解碼器230被獲得。隨 後’該組’例如可能是MDCT係數的1024個頻譜值倍反向 量化、重新縮放及預處理。因此,一組反向量化、重新縮 放及頻譜預處理的頻譜值(例如,1024個MDCT係數)被獲 得。之後,一音訊訊框的一時域表示該組反向量化、重新 縮放且頻譜預處理的頻域值(例如,MDCT係數)中導出。因 此’一音訊訊框的一時域表示被獲得。一給定音訊訊框的 時域表示可與先前及/或後續音訊訊框的時域表示結合。例 如’後續音框的時域表不之間的一疊加可被執行以使 相鄰音訊訊框的時域表示之間的過渡平滑,且獲得一反摺 22 201126508 二、#對關於基於解碼時頻域音訊表示232的解瑪音 Λ貝Λ212的重建,參考例如國際標準岱㈤π 14496 3,第 3部份’第4子部份’其中提出詳細討論。 在下文中’關於算術解碼器230之某些細節將被描述。 算術解碼态230包含—群索引決定子/元素索引決定子 280 ’其被組配成接收描料t引ng的算術碼字 acod_ng[pki][ng;]」’以及如果碼字「⑽可用, 也接收το素索弓丨ne的碼字「aeGd』e_」。群索引決定子彻 被’、且配成提供-解碼群索引值%,且如果有群索引值叩描 述的群包3夕於-個元素時,也提供―解碼元素索引值 ne然而’群索引決定子/元素索引決定子珊可被組配成如 果群索弓丨值ng描述的群僅包含—個元素,則提供預設元素 索引值ne’例如—個。群索引決定子/元素索引決定子· 可被組配成㈣—料含魏個3 2㈣_率表格中的一 累積頻率表格’用於從算術碼字「aeGd—ng[pki][ng]」導出 群索引值ng。算術解碼器進—步包含—最顯著的位元平 面決定子284’其被組配成基於__群索引值叩及_元素索引 值ne’導出一2位元組(或3位元組)頻譜值的一最顯著位元平 面的值286。算術解碼器23〇進一步包含—較不顯著位元平 面決定子288’其被組配成接收表示一頻譜值元組之一或一 個以上較不顯著位元平面的一或一個以上碼字「ac〇d_r」。 因此,較不顯著位元平面決定子288被組配成提供一或一個 以上較不顯著位元平面的解碼值290。音訊解碼器2〇〇也包 S位元平面組合器292 ’其被組態成接收該頻譜值元組之 23 201126508 最顯著位元平面的解媽值286,且如 > ㈣t晚日值70組,則也接收該頻譜值元 ΓΓ 顯著位元平面之解碼值因此,位元平面 2器供-編碼頻譜值元組,該頻譜值元組是解瑪頻 1絲不232的—部份。自'然地,算術解碼器230典髮地 ,組錢提供複數個值_解碼賴值,叫得與音訊内 容的-目前訊框相關聯的-組完整的解碼頻譜值。 鼻術解碼器230進-步包含一累積頻率表格選擇器 2%,其被組態成依賴描述算術解瑪器的—狀態的一狀態索 引298選擇32個累積頻率表格其中之一。算術解碼器2纖 -步包含-狀態追蹤器299,其被組態成依賴該等值組的先 前解碼的頻譜值追縱算術解碼器的一狀態。該狀態資訊可 響應狀態重設資訊224可任選地被重設成—預設狀態資 訊。因此’累積頻率表格選擇器2%被組態成向群索引決定 子/元素索引決定子280提供一已選擇累積頻率表格的一索 引(例如pki)’或一已選擇累積頻率表袼本身,應用於依賴 群索引碼字「acod_ng」編碼群索引ng。 概括音訊解碼器200之功能,音訊解碼器2〇〇被組態成 接收一位元率高效編碼頻域音訊表示222,且以其為基礎獲 得-解碼賊音訊表示。在㈣於基於編碼頻域f訊表示 222獲知解碼頻域音訊表示232的算術解碼器230中,最顯著 位元平面之值的不同組合的一機率使用一算術解碼器28〇 被利用,算術解碼器280被組態成應用一累積頻率表格。另 外,不同頻譜值元組之間的統計相依性透過依賴一狀態索 24 201126508 引298從一組包含32個不同累積頻率表格中選擇不同累積 頻率表格而被利用,該狀態索引298透過觀測先前計算的解 碼頻譜值元組被獲得。 3.頻譜低雜訊編碼工具之概觀 在下文中,關於例如由算術編碼器170及算術解碼器 230執行的編碼及解碼演算法的細節將被描述。 重點放在解碼演算法的描述。然而,應注意,一對應 編碼演算法可依據解碼演算法之教示被執行,其中映射被 反向。 應注意將在下文中討論的解碼被使用以允許經典型後 處理、縮放及量化的頻譜值之一所謂的「頻譜低雜訊編 碼」。頻譜低雜訊編碼被用於一音訊編碼/解碼概念中,以 進一步減少量化頻譜之冗餘,該冗餘例如,藉由一能量集 中時域到頻域轉換器被獲得。 被用於本發明之實施例中的頻譜低雜訊方案是基於一 算術編碼及一動態適應上下文。在較佳實施例及下文中, 頻譜值由將4個連續頻譜值在頻率上組合進而成為4元組的 元組編碼的頻譜低雜訊編碼而被處理。低雜訊編碼由量化 的頻譜值被饋送,且使用例如,從四個先前解碼的臨近4元 組導出的上下文相關累積頻率表格。在本文中,時間與頻 率上的臨近被考慮,如第4圖所示。累積頻率表格(將在下 文中描述)進而被算術編碼器用來產生一可變長度二進制 編碼,且被算術解碼器使用以從一可變長度二進制編碼導 出解碼值。 25 201126508 例如,算術編碼器170依賴各自機率產 4.解碼過程 4.1解碼過程概觀 <一概觀將參考 元組之過程的一 在下文中,解碼一頻譜值元組的過程 第3圖提出,第3圖繪示解碼複數個頻譜值 儀矛王式碼表示。 上下文的初始化 包含使用函數 ’或使用函數 文導出目前上下 目前上下文將在 解碼複數個頻譜值元組的過程包含〜 310。上下文之初始化31〇選擇性地 「anth_reset_context()」重設該上下文 「anth_map_context(lg/4)」從一先前上下 文。重設上下文以及從一先前上下文導出 下文中被描述。 複數個頻譜值元_解碼也包含—元組解碼31泣一 f下文更新314之迭代,該上下文更新由一函數 「adth—update_C〇ntext(a,b,c,d,I,lg/4)」執行,如下文所述。 s玄元組解碼312及上下文更新314被重複lgM次,其中ig/4表 示將被解碼的頻譜值元組之數目◎元組解碼312包含一上下 文值計算312a、一群索引解碼312b、一元素索引解碼312c、 一最顯著位元平面測定312d及一較不顯著位元平面相加 312e 〇 狀態值計算312a包含使用函數「arith_get_context(i)」 計算一第一狀態值’其中該函數返回第一狀態值s。狀態值 26 201126508 計算312a也包含計算一位準值lev,該位準值藉由以24位元 移動第一狀態值s到右側而被獲得。狀態值計算312a也包含 依據第3圖所示之公式計算一第二值t。 群索引解碼312b包含一解碼演算法312ba的一迭代執 行,其中一菱里j在凟异法312的第一执行之前被初始化為 0 ° /與算法312ba包含使用一上述函數「arith_get—pk()」依 賴第二狀態值t計算一狀態索引pki (其也用作一累積頻率表 格索引)。演算法312ba也包含依賴狀態索引pki選擇一累積 頻率表格,其中一變量「cum_freq」可依賴狀態索引pki被 設定成32個累積頻率表格之一的一初始位元址。並且,一 變量「cfl」可被初始化為選擇的累積頻率表格的一長度, 其等於字母中符號之數目,即,可被解碼的不同值之數目。 從「arith_cf一ng[pki=0][545]」到「arith—cf_ng[pki=31][545]」 可用於解碼群索引的全部累積頻率表格之長度是545 ,因為 545個不同群索引及一溢出符號可比解碼。隨後,一群索引 ng可計入選擇累積頻率表格藉由執行一函數 「arith-decode()」而被獲得。當導出群索引ng時,位元流 210名為「acod一ng」的位元可被評估(間第6g)圖。 演算法312ba包含檢查群索引ng是否等於一溢出符號 「ARITH_ESCAPE」。如果群索引不等於算術溢出符號,則 演算法312ba中止(「間斷」-情況)且演算法3nba之剩餘指 令因此被跳過。因此,該過程之執行以原始索引解碼 312c(如果需要)或以最顯著位元平面測定312d繼續。相反, 27 201126508 如果解碼群索引ng等於算術溢出符號「Arith_ESCAPE」, 位準值lev被增加二。並且,如果演算法312ba第一次被執 打’即,如果j=0,則第二狀態值被增加4194304,否則第 二狀態值t被設定成〇»並且,變量』在演算法3121^重複之前 被設定成1 °如上所述’演算法312ba被重複直到解碼群索 引ng與算術溢出符號不同。 群索引解碼312b—經完成,即,一不同於算術溢出符 號之群索引值被解碼’如果需要,元素索引解碼312c被執 订。為此’具有群索引ng的群之一基數(元素數目)被決定’ 其中由群索引叩指定的群之基數mm在表格位置ng由表格 dgroups」的一表格項「dgr〇ups[ng]」之八個最不顯著位 凡(位元0-7)描述。如果群索引%指定的群之基數mm大於 一,則凡素索引ne藉由執行一演算法312ca被獲得。元素索 引ne可任選地被設定成〇,或一不同預設值。例如,運算狀=〇 可在條件敘述「如果(mm>1)」之前被執行。演算法312“ 包含決定一適當累積頻率表格或一累積頻率子表格的一初 始位元址cum_freq」。例如,可變r cum—红叫」可被設定 成累積頻率表格「arith_cf_ne」之初始位元址與值 (mm)*(mm-l)/2之總和,如第3圖所示。並且,變量cfl可被 初始化為各自累積頻率表格或累積頻率子表格的一適當長 度’其等於群索引ng中的元素mm之數目。隨後,元素索引 ne可藉由執行函數rarith一dec〇de()」被獲得,其中與元素 索引編碼相關聯的選擇的累積頻率表格(例如,表格 「arith_cf〜ne」之子表格)被使用。 28 201126508 隨後’最顯著位元平面值測定312d被執行。為此,表 格「dgvectors」的—項被評估,其之索引以元素索引ng藉 由表格「dgr〇ups」之值』的最顯著位元平面(例如位元8-15), 以及藉由凡素索引ne被決定,如第3圖所示。具體而言,第 -頻譜值「a」(屬於-頻譜值元組)之最㈣位元平面之值 藉由表格「dgVeCtors」以—元素索引,>>8+狀)的一項而 被决定。類似地第二頻譜值「b」(屬於—頻譜值元組) 的最顯著位S平面值藉由評估以索引,>>8+ne)+1評估表 格「dgvectors」之項而被獲得。類似地,一第三頻譜值「c」 及一第四頻譜值「d」(屬於該頻譜值元組)的最顯著位元平 面值被獲得,如第3圖中參考數字312d所示。 ik後’較不顯著位元平面被獲得’例如第3圖所示參考 數予312e。對於該元組的每—較不顯著位元平面而言,16 個一進制組合的一個被解碼。然而,應注意獲得較不顯著 位元平面之值與本發明不是特別相關。 4·2解碼順序(第4圖) 在下文中,頻譜值之解碼順序將被描述。 4位元組量化頻譜係數被低雜訊編碼,且從最低頻率係 數及行進波到最高頻率係數被發送(例如,在位元流中)。 來自一高級音訊編碼(例如使用一修改型離散餘弦反 轉換’如在ISO/IEC 14496,第3部份,第4子部份所述)被儲 存於-稱為「x_ac_auant[g][win][sfb;|[bin]」的陣列中,且 低雜況編碼碼字(例如,ac〇d—ng、ac〇d_ne、ac〇d_r)之傳送 順序使得當它們以被接收且儲存在該陣列中的順序編碼 29 201126508 時’「bin」(頻率索弓丨)是最快的增量索引,而「§」是最慢 的增量索引。在一碼字中,解碼之順序是a、b、c、d。換 句話說,值a、b、c、d是相鄰頻率的頻譜值,其中頻譜值a 與比頻譜值b更低的一頻率相關聯,頻譜值!^與弊頻譜值c更 低的一頻率相關聯,而頻譜值c與弊頻譜值d更低的一頻率 相關聯。 來自轉換編碼激勵(tcx)的係數被直接儲存在一陣列 x_tcx—mvquant[win][bin]中,且低雜訊編碼碼字之傳送如 此,以時當它們以接收且儲存在陣列中的順序被編碼時, 「bin」是最快的增量索引,而「win」是最慢的增量索引。 在碼子中’解碼順序是&、b、c、d。換句話說如果頻 譜值描述-語音編碼器的線性預測了核之__轉換編碼激 勵’則頻譜值a、b、c、d與轉換編碼激勵的相鄰及增加頻 率相關聯。 被組態成應用解碼頻域音訊 表不232 ’其由算術解碼器23〇提供,用於使用 域信號轉換「直接甚^ + 接」產生一時域音訊信號表示,以及用於 :用-頻域到時域解碼器及一由頻域之 輪出激發的線性預㈣… ㈣換益之 慮波益間接」提供一音訊信號表示。 a #H其魏在本謂細討論的算術解碼器200非 常適於解碼在頻域中 非 ^ 破編碼的—音㈣容之-時頻诚本- 的頻譜值,且杯對—1域表不 — ㊣ 碼在線性制域中被編Μ 亍二的—線性遽波器提供-刺激信號的—時頻J 不。因此,該算術解碼哭韭a ώ 時頻域表 非㊉適於在-能夠處理頻域編碼 30 201126508 音訊内容及線性預測.頻域編碼音訊内容的音訊解碼器中使 用。 4.3.上下文初始化 在下文中,在一步驟310被執行的上下文初始化將被描 述。首先,可能是位元流一部份的旗標「arith_reset flag」 決定上下文是否應被重設。如果該旗標是真(TRUE),則第 5a圖所示的函數「arith_reset一context()」被呼叫。如果旗標 「arith_reset_flag」是假(FALSE),則依據演算法 「arith—map_context〇」在上一上下文與目前上下文之間執 行一映射,如第5b圖所示。 如圖所示,函數「arith一reset_context()」執行的上下文 之重設包含陣列q及qs(指定為,例如qs[i].a,q[〇][i].a及 q[l][i].a)到零的項「a」、「b」、「c」、「d」之初始化。另外, 陣列q及qs(指定為qs[i].v,q[0][i].v,q[l][i].v)之項 rv」被 初始化為-1。並且,變量「previous_lg」被初始化為1〇24。 然而’如果決定不重設上下文,則上下文一映射可依 據演算法「arith_map_context〇」被執行。如圖所示,該映 射依賴於核心模式,其中「c〇re_mode==l」表示將被解碼 的頻譜值與一現象預測頻域編碼音訊訊框相關聯,且其中 「core_in〇cie==0」表示將被解碼的頻譜值與一頻域編碼音 訊訊框相關聯。應注意’如果與目前頻域編碼音訊訊框相 關聯的頻譜值之數目等於與i=〇到i=lg/4+l的先前訊框相關 聯的頻譜值之數目’則函數「arith_map_context()」將目前 上下文陣列q的項q[〇][i]設定成上一上下文陣列qS的值 31 201126508 qs[i]。 然而’如果與目前音訊訊框相關聯的頻譜值之數目不 同於與先前音訊訊框相關聯的頻譜值之數目,則一較複雜 的映射被執行。然而,關於此情況中之映射的細節與本發 明之關鍵概念並不特別相關,所以詳細參考第5b圖之偽程 式碼。 4-4狀態值計算(第5c圖) 在下文中,狀態值計算312a將被更詳細地描述。 應注意第一狀態值s(如第3圖所示)可以函數 「arith_get一context(i)」之一返回值而獲得,該函數的一偽 程式碼如第5c圖所示。 關於狀態值之計算也參考第4圖,其繪示被用於一狀態 s十算的上下文。第4圖繪示諸頻譜值元組的一二維表示,在 時間以及頻率上。一橫座標41〇描述時間,而一縱座標412 描述頻率。如第4圖所示,要編碼的一元組420與一時間索 引t0及一頻率索引i相關聯(記住要解碼的元組420之頻譜值 與四個不同頻率相關聯)。如圖所示,對於時間索引to而言, 具有頻率索引i-Ι及i-2的元組在具有頻率索弓丨i的元組420要 被解碼時已被解碼。如第4圖所示,具有一時間索引t0及一 頻率索引i-Ι的元組430在元組420被解碼前已被解碼,且元 組430被考慮用於供解碼元組420之用的上下文。類似地, 具有一時間索引t-Ι及一頻率索引i-丨的元組440、具有一時 間索引t-Ι及一頻率索引i的元組444,及具有一時間索引t-1 及一頻率索引i+1的元組448在元組420被解碼前已被解 32 201126508 碼,且被考慮用於供解碼元組420之用的上下文之決定。相 ,一1其他元組已被解碼,它們由具有虛線的方形表示, 且其他逛未被解碼的及由具有虛線的圓圈繪示的元組倍用 於決定供解碼元組420之用的上下文。 現在參考第5c圖,其繪示函數「arith_get_c〇ntext()」 <功能,關於第一上下文值「s」之計算細節將被描述。 函數「arith—get_context()」包含一變量初始化53〇a, 在其期間變量t〇、tl、t2及13依賴陣列q之項「V」在索引位 复(0,i)、(1^)、(O’W)及(〇i+1)被初始化。因此,變量t〇 到t3以項「v」之值被初始化,該等值分別與第4圖所示的 元組444、430、440、448相關聯。 也應注意到函數「arith—get_context()」執行複數個情 /兄的一後續檢查,其中函數「arith_get_context()」在一「返 回」指令達到時被終止,其中該返回指令(或運算子)用於以 片大態值s返回其運算子(接隨返回指令或運算子)。 函數「arith_get_context()」之執行包含一第一條件檢 查530b。如果發現全部變量t0、tl、t2及t3(之值)小於1〇, 則返回值如參考數字530b所示被計算,且函數 「arith_get__context()」以該返回值之返回被終止。 函數「arith_get_context()」之執行也包含一第二條件 檢查530c。如果發現在第二條件檢查530c中,全部變量t〇、 tl、t2及t3小於34,則變量t2及t3被有條件地修改,如參考 數字530c所示,且返回值如參考數字530c所示被計算。特 別地’如果變量t2大於1且小於10,變量t2被設定成2。相反, 33 201126508 如果變量t2大於或等於10 ,則變量〇被設定成3。類似地, 如果變量t3大於1且小於10,則變量【3被設定成2。相反,如 果變里t3大於或等於1〇,則變量13被設定成3。因此,變量 t2值之範圍被限制於一最大正值3。 然而,如果第一條件檢查530b之條件及第二條件檢查 53〇〇之條件都未滿足,則一第三條件檢查530d被執行。如 果發見在第三條件檢查530d中變量t0及tl都小於90,則返回 值如 > 考數字53〇d被計算,其中變量t2及t3之值被考慮。 ^而,如果第一條件檢查530b、第二條件檢查530c及 第—條件檢查530d之條件都未滿足,則一第四條件檢查 53〇e被執行,其中決定變量t0及tl是否都小於544。在這樣 的情况下,返回值如參考數字53〇e被計算,且函數 adth-get^contextO」被終止。 然而’如果條件檢查530b、530c、530d、530e都不引 起函數「arith_get_context〇」之終止,則一上下文計算53〇f 執行上下文計算53〇f包含一變量初始化530fa、一變量 重新縮放53〇fb、一基於表格值適應53〇fc及一返回值計算 53〇fd °在變量初始化530fa中,如果變量t0採用一大於1的 值’變量a〇、b〇、c〇、d〇被設定成陣列q在陣列位置(〇 i)的 項 aj、「b」、「c」、「d」之值。其對應於第4圖的4元組444 之值°相反’如果變量t0之值不大於1,則變量a〇、b0、c0、 d〇被初始化為0。類似地,如果變量tl之值大於1,則變量al、 bl、el、dl被初始化為陣列q在位置的項「a」、rb」、 ^e」、「d」之值,其相對應有第4圖之4元組430之值。 34 201126508 因此,如果變量to之值大於1,變量a〇、b0、c0、do被 設定成時間索弓丨t-1及頻率索引i的一先前解碼頻謹值元組 之頻譜值a、b、c、d。類似地,變量al、bl、cl、dl被設 定成先前解碼頻譜值元組及時間索引tO及頻率索引i-1的頻 謹值a、b、c、d ° 隨後,變量aO、bO、cO、dO、al、bl、cl、dl被迭代 重新縮放,因為數字表示被迭代移向右側一個位元,直到 全部變量aO、bO、cO、dO、al、bl、cl、dl在-4到+3的範圍 内,包括邊界-4及+3。在變量重新縮放53〇fb之後,變量1 表示該組變量aO、bO、cO、dO、al、bl、cl、dl多久被移 向右側,其中至少一個右移運算被執行。因此,適當變量 aO、bO、cO ' dO、al、bl、cl、dl被獲得,它們都在_4與+3 之間的範圍。 隨後,基於表格之值適應530fc被執行。為此,變量t〇 被設定成一值,如果變量tO大於1 ’該值由表格(或陣列) 「egroups」之一項決定。如圖所示,位置 (4+a0,4+b0,4+c0,4+d0)的項被用於此目的。類似地,如果值 tl大於1,則變量tl被設定成一值,該值由一表格位置 (4+&1,4+1)1,4+(;1,4+(!1)的表格「6名1>〇叩8」之_項決定。 最後’一返回值依賴變量1(表示一右移運算多久被應 用),以及依賴變量to與tl被計算,如參考數字530fd所示。 因此’大體可以說函數「arith_get一c〇ntext()」之返回 值由第4圖的元組444、430、440及448之最相關位元平面決 定0 35 201126508 並且,應注意如果變量to大於或等於544,或者變量U 大於或等於544,一表查找被執行,而返回值使用乘法及加 法的一數值計算被使用。因此,如果變量t〇&tl其中之—大 於或專於544,則函數arith__get一context()之返回值之—較自 由(liberate)及較詳細的計算被執行。 應注意在參考數字312a的第3圖中,變量「lev」由函數 「arith_get_c〇ntext⑴」的返回值導出。變量^透過將值s 移向右側24個位元而從值s導出。狀態變量t也透過以值$與 十六進值「OxFFFFFF」之間的一「與」運算以及對最終運 算之結果加-值「1」來執行H料運算,而從值s被 導出。 4.5群索引解碼(第5d圖、第&圖) 在下文中,群索引解碼的過程312b將被討論,過程3i2b 基於上述狀態值t的一先前計算。並且,演算法M2a的一迭 代執订包含函數「amh—get〜pk()」的一呼叫’狀態值t(如第 3圖所示)為一參數。 4.5.1 函數「arith_get_pk()」(第5圖) 函數「arith一get一Pk()」將隨後參考第5d圖被描述。函 數「adth—get—pkO」的執行包含具有以個值的一陣列psd 之初始化,如參考數字54〇a所示。另外,函數 anth_get_pk〇」包含—指標p及變量卜』的初始化如參 考數字5働所示。演算法「adth_get_pk()」也包含變量⑼ 一值的一初始化,該值等於63*t,其中【是當函數 「anlget—Pk〇」被呼叫時,被交給函數「_—⑽―pk()」 36 201126508 的參數。因此,函數「arith_get_pk()」的項值s可等於第3 圖所示演算法「tuples_decode()」之變量t。變量i之初始化 如參考數字540c所示。 函數「arith_get_pk()」也包含一散列表接取540之迭代 運算,其中散列表接取540d被重複,直到一「中斷」條件 到達,或直到一「返回」運算子到達。如果「中斷」條件 到達,則一返回值的一基於範圍提供540e被執行。然而, 如果返回運算子到達,則返回運算子的運算子被返回,且 函數「arith_get_pk()」被終止。 散列表接取540d包含一第一步驟540da、一第二步驟 540db、一第三步驟540dc及一第四步驟540d的一迭代執 行。在第一步驟540da,變量j被設定成表「ari—pk_hash」之 一項的值,其中項之索引由變量i的七個最不顯著位元決 定。在第二步驟540db,決定在第一步驟540da獲得的變量j 之值是否採用OxFFFFFFFF的十六進值。在這樣的情況下, 散列表接取540d的迭代執行被中止,且演算法 「arith_get_pk()」之執行以基於範圍提供一返回值被繼 續。換句話說,如果變量i之七個最不顯著的位元定址的表 格「ari_pk_hash」之項採用OxFFFFFFFF的溢出值,則假定 函數「arith—get_pk()」之輸入變量t定義的狀態是—所謂的 「非顯著」狀態,一返回值應使用返回值的基於範圍提供 540e被分配至該狀態。在散列表接取540d的第三步驟54〇dc 中,檢查變量j之值的最顯著位元(例如位元8到31)是否等於 函數「arith_get_pk()」的輸入變量t之值。在這樣的情況下 37 201126508 變量j的八個最不顯著位元(位元〇到7)被返回成函數 「arith__get_pk()」的一返回值’且函數「arith_get—pk()」被 終止。然而,如果發現第三步驟54〇dc之條件未達到,則變 量i增加1(步驟540dd),且散列表接取540d從第一步驟54〇da 開始被重複。 一返回值的基於範圍提供540e包含指標p到陣列psci中 的一起始點的初始化540ea。起始點由函數「arith_get_pk() _( 值輸入變量t之位元23及24決定,其對應於已被解碼供要解 碼的目前元組之用的溢出符號「ARITH—ESCAPE」之數目。 如果輸入變量t的位元23及24採用值「00」,指標p被初始化 為陣列psci之第一項「24」的點,如果輸入變量s的位元23 及24採用值「01」,指標p被初始化為陣列psci的第八項 「30」,如果輸入變量s的第23及第24位元採用值「10」,指 標P被初始化為陣列psci的第15項「5」,且如果輸入變量t 的第23及第24位元採用值「11」,指標p被初始化為陣列pSci 的第22項「5」。在一後續步驟540eb中,變量j被設定成採用 輸入變量t之22個最不顯著位元(位元1到位元22)表示的 值,如參考數字54〇eb所示。隨後,做出陣列psci之被返回 成演算法「arith_get__pk()」指返回值的項之決定。做出如 下決定: •如果j之值小於436961,且如果j之值也小於252001, 且如果j之值也小於243001,則在步驟540ea決定的 起始點之項被返回; 38 201126508 •如果j之值小於436961,且如果j之值也小於252001, 且如果j之值不小於24300,則在決定步驟540ea的 起始點之後的第一項被返回; •如果j之值小於436961,且如果j之值不小於252001, 且如果j之值小於288993,則在步驟540ea決定的起 始點之後的第二項被返回; •如果j之值小於436961,且如果j之值不小於252001, 且如果j之值不小於288993,則在步驟540ea決定的 起始點之後的第三項被返回; •如果j之值不小於436961,且如果j之值小於 1609865,且如果j之值也小於880865,則在步驟540ea 決定的起始點之後的弟四項被返回, •如果j之值不小於436961,且如果j之值小於 1609865,且如果j之值不小於880865,則在步驟540ea 決定的起始點之後的第五項被返回; •如果j之值不小於436961,且如果j之值部小於 1609865,則在步驟540ea決定的起始點之後的第六項 被返回。 更多詳情參考第5圖的參考數字540ec之演算法。 綜上所述,以狀態值t呼叫的函數「arith_get_pk〇」提 供值pki作為一返回值,如第3圖參考數字312ba所示。 變量pki之值被用於選擇一累積頻率表格以供函數 「arith_decode〇」之執行之用,如參考第3圖所述。因此, 變量「cum_freq[]」被適當初始化以制定選擇的累積頻率表 39 201126508 格。 4.5_2函數「arith_decode()」(第5e圖) 在下文中,函數「arith_decode()」之功能將參考第5e 圖被詳細描述。應注意函數「arith_decode()」使用輔助函 數「arith一first_symbol(void)」,如果其是序列之第一符號, 則返回真(TRUE),否則返回假(FALSE)。函數 「arith_decode()」也使用輔助函數「arith_get_next_bit()」, 其得到並提供位元流的下一位元。 另外’函數「arith_decode()」使用總體變量「i〇w(低)」、 「high(高)」及「value(值)」。另外,函數「arith_decode()」 接收變量「cum_freq[]」作為一輸入變量,其指向選擇的累 積頻率表格之元素(具有元素索引或項索引〇)。並且,函數 「arith 一 decode〇」使用輸入變量cfl,其表示變量 「cum一freq[]」制定的選擇的累積頻率表格之長度。 函數「arith_decode〇」包含一變量初始化550a作為第 一步驟,如果輔助函數「arith_first_symbol」指示一序列符 號的第一符號正被解碼,則其被執行。值初始化55〇a依賴 從位元流使用輔助函數「arith__get_next_bit」獲得的複數 個,例如20個位元初始化變量「vaiue」,使得變量r value」 採用該等位元表示的值。並且,變量「1〇w」被初始化成採 用值0 ’且變量「high」被初始化成採用值1〇48575。 在一第二步驟550b,變量「range(範圍)」被設定成比 變量「high」與「low」之差大一的值。變量rcum」被設 定成表示變量「low」之值與變量rhigh」之值之間的變量 40 201126508 广-」之值的一相對位置。因此,變量「_」依賴變 重「value」之值例如,採用〇與2]6之間的一值。 指標P被初始化為一比選擇的累積頻率之起始位址小 一的值。 演算法「arith_dec〇de()」也包含一迭代累積頻率表格 查找550c迭代累積頻率表格查找被重複,直到變量小 於或等於1。在迭代累積頻率表格查找550c中,指標變量q 被8又疋成專於指標變量p之目前值與變量cfl之值的二分之 一之綜合的—值。如果選擇的累積頻率表格的由指標變量q 定址的項%之值大於變量「cum」之值,則指標變量p被設 定成指標變量q之值,且變量Cfl被增加。最終,變量cfl被移 向右側一位元,因此有效地將變量Cfl之值除以2且忽略模數 部份。 因此’迭代累積頻率表格查找550c有效地將變量r cum」 之值與選擇的累積頻率表格的複數個項比較,以識別選擇 的累積頻率表格中的一間隔,其以累積頻率表格之項為 界,使得值cum落於識別的間隔中。因此,選擇的累積頻率 表格之項界定間隔,其中一各別符號值與選擇的累積頻率 表格的各該間隔相關聯。並且,累積頻率表格之兩個相鄰 至值間的間隔寬度界定與該等間隔相關聯的符號之機率, 使得選擇的累積頻率表格完整地定義不同符號(或符號值) 的一機率分佈。關於可用累積頻率表格之細節將在下文中 參考第16圖被描述。 再次參考第5e圖,符號值由指標變量p的值導出,其中 41 201126508 符號值如參考數字550d被導出。因此,指標變量p與起始位 址「cum_freq」之差可被評估,以獲得符號值,其由變量 「symbol」表示。 演算法「arith_decode」也包含變量「high」與「low」 的一適應550e。如果變量「symbol」表示的符號值不同於〇, 變量「high」被更新,如參考數字550d所示。並且,變量 「low」之值被更新,如參考數字550e所示。變量「high」 被設定成由變量「low」之值決定的一值,變量「range」及 項具有選擇的累積頻率表格之索引「symbol -1」。變量「low」 被增加,其中增加量由變量「range」及具有索引「symbol」 的選擇的累積頻率表格之項決定。因此,變量「low」與 「h i g h」值間之差依賴選擇的累積頻率表格的兩個相鄰項 之數值差被調整。 因此,如果一具有低機率的符號值被檢測到,變量 「low」與「high」之值之間的間隔被減少至一狹窄寬度。 相反,如果檢測到的符號值包含相對大的機率,則變量 「low」與「high」之值之間的間隔的寬度被設定成一相對 大值。另外,變量「low」與「high」之值之間的間隔的寬 度取決於檢測到的符號及累積頻率表格之對應項。 演算法「arith_decode()」也包含一間隔重整550f ’其 中在步驟550e中決定的間隔被迭代地移動且縮放,直到「中 斷」條件達到。在間隔重整550f中,一選擇的下移運算550fa 被執行。如果變量「high」小於524286,不採取措施,且 將重整以一間隔尺寸增加運算560fb繼續。然而’如果變量 42 201126508 「high」不小於524286,且變量「1〇w」大於或等於524286, 則變量「vaiues」、「i〇w」及「high」被減少524286,使得 變量「low」及「high」定義的—間隔下移,且使得變量「^丨狀」 之值也被下移。然而,如果發現變量rhigh」之值不小於 524286,且變量「low」不大於或等於524286,且變量「1〇w」 大於或等於262143,且變量「high」小於786429,則變量 「value」、「low」及「high」被減少262143,藉此下移變量 「high」與「low」以及變量「value」之值之間的值之間的 間隔。然而,如果上述條件都未滿足,間隔重整被中止。 然而,如果上述任一在步驟55〇fa中被評估的條件被滿 足,間隔增加運异550fb被執行。在間隔增加運算55〇fb中, 變量「low」之值被加倍。並且,變量rhigh」之值被加倍, 且加倍之結果疋增加1。並且,變量「value」之值被加倍(被 移向左側一位元)’且輔助函數「arith_get_next_bit」所獲 得位元流之一位元被用作最不顯著位元。 因此,變量「low」與「high」之值之間的間隔尺寸近 乎被加倍,且變量「value」之精度使用位元流的一新位元 被增加。如上所述,步驟550fa及550fb被重複,直到「中斷」 條件到達,即’直到變量「low」與「high」之值之間的間 隔足夠大。 關於演算法「arith_decode()」之功能,應注意「i〇w」 與「high」之值之間的間隔在步驟550e中依賴參考變量 「cum_freq」的累積頻率表格之兩個相鄰項被減少。如果 選擇的累積頻率表格兩個相鄰值之間的間隔小,即,如果 43 201126508 相鄰值相對接近,則在步驟550e獲得的變量「low」與「high」 之值之間的間隔將相對小。相反,如果累積頻率表格的兩 個相鄰項間隔較遠,則在步驟550e獲得的變量「low」與 「high」之值之間的間隔將相對大。 因此,如果步驟550e獲得的變量「low」與「high」之 值之間的間隔相對小,則一大數目的間隔重整步驟將被執 行以將該間隔重新縮放成一「足夠」尺寸(使得條件評估 550fa的條件都未達到)。因此,一來自位元流的相對大數目 的位元將被使用以增加變量「value」之精度。相反,如果 在步驟550e獲得的間隔尺寸相對大,僅需要一較小的間隔 重整步驟550fa及550fb重複數目以將變量「low」與「high」 值之間的間隔重整成一「足夠」尺寸。因此,僅一相對小 數目的來自位元流的位元將被用以增加變量「value」之精 度,且準備下一符號之解碼。 綜上所述,如果一包含一符號被解碼,該符號包含一 相對高機率,且一大間隔透過選擇的累積頻率表格之項與 其相關聯,則僅一相對小數目的位元將從該位元流被讀 取,以允許一後續符號的解碼。相反,如果一符號被解碼, 其包含一相對小的機率,且一小間隔透過選擇的累積頻率 表格與其相關聯,則一相對大數目的位元將從該位元流被 採用,以準備下一符號的解碼。 因此,累積頻率表格之項反映許多被要求用於解碼一 序列符號的位元。藉由依賴一上下文,即,依賴先前解碼 的符號改變累積頻率表格,例如藉由依賴上下文選擇不同 44 201126508 的累積頻率表格,不同符號之間的隨機相依可被利用,其 允δ午後續(相鄰)符號的一特定位元率有效編碼。 综上所述’參考第5e圖被描述的函數「arith_dec〇de()」 以累積頻率表格「arith—cf_ng[pki]口」被呼叫,對應於函數 「arith_get_pk()」返回的索引pki,以決定群索引 4.5.3溢出機制 雖然解碼群索引ng是溢出符號「ARITH_ESCAPE」,一 附加群索引被解碼,且變量lev被增加2。因此,一關於最顯 著位το平面之數值顯著性及關於許多欲被解碼的較不顯著 位元平面的資訊被獲得。如果一溢出符號 「ARITH—ESCAPE」第欠被解碼聽目前元組解碼,狀 %變11進而被增加值41943〇4,這對應於將變量【的第23位 X設為1。如果一溢出符號被解嗎第二次及多次,則狀態變 量t被。又疋為零。在這兩個情況中,當一溢出符號 厂ARITH—ESCAPE」被解碼時’則更新的狀態變量1被用於 群索引解碼312b的一新迭代。 4·6.元素索引解碼(第5f圖) 一旦解碼群索引不是溢出符號r Arith_ESCAPE」,則 在群ng及群偏移叩中的元素爪瓜之數目透過依據第5f圖所 不演算法查找表格「dgroups[]」而被減少。 換句話說,變量mm被設定成一值,該值由群索引叫決 疋的一表格位置的表格「dgroups[]」之項的最不顯著位元 (<丨如位元〇-7)決定。類似地,群偏移og由表格「dgroups[]」 之項的較顯著位元(位元8及以上)決定,該項由變量ng定義 45 201126508 的位置偏移決定。 ng決定的群包 °乎叫函 如果變量mm大於1 ’即,如果由群索弓丨 含多於一個元素,元素索引ne透過以累積頰率表= 數「arith_decode()」被解石馬。 之長 adth_cf_ne+((mm*(mm_1))>>1)[]A 累積頻率表格 度等於mm。 換句話說,表格「adth_cf_ne[]」的一分段被選擇复 中累積頻率表格「arith_Cf_ne[]」全面描述被群索引叫選^ 的一群的複數個不同數目的元素的機率分佈。應注惫累 頻率表格的「arith—cf_ne[]」不同分段(或子表袼)之偏移由 公式(mm*(mm-l))»l)描述。 應注意被用作演算法「arith_decode()」之—輪入變量 的變量「cum_freq」較佳地被初始化成表格或陣列 「arith_cf_ne[]」的分段(或子表格)之起始位元址,其餘群 索引ng描述的目前群之元素mm數目相關聯。並且,演算法 「arith_decode()」的一輸入變量,變量Cfl被初始化為值 mm。隨後,函數「arith—decode〇」被呼叫,其運算已在上 文詳細描述。然而,對於元素索引之解碼而言,函數 「arith_decode()」使用表格或陣列「arith_cf_ne[]」的一子 表格,而不是累積頻率表格「arith_cf_ng[pki=0][545]」到 「arith_cf_ng[pki=31][545]」之一。因此,元素索引ne被提 供成函數「arith_decode()」的一返回值,其使用位元流的 許多位元獲得元素索引ne。 4.7.最顯著位元平面決定(第5g圖) 46 201126508 一旦元素索引ne被解碼且被返回成函數 「arith_decode()」的一返回值,則4元組的之最顯著被發信 按2-位元平面可使用表格「dgveCt〇r[]」依據第5§圖之演算 法被導出。 例如,該頻譜值元組的一第一頻譜值「a」可被設定成 表格或陣列「dgvectorsn」的一項,其中該陣列元素索引(或 表格元素貪引’或簡稱「元素索引」或「項索引」)被決定 為4*(og+ne)。類似地’該頻譜值元組的第二頻譜值「b」可 被設定成陣列「dgvectorst]」的一項,其中該陣列元素索引 由4*(〇g+ne)+l決定。該頻譜值元組的—第三頻譜值「c」可 被設定成陣列「dgvectors[]」的一項,意击 _ *>=,,, 丹Τ — το素索引由 4*(〇g+ne)+2決定。該頻譜值元組的一第四頻譜值可 被設定成陣列「dgvectorsU」的一項,其中該元素索引由 4*(〇g+ne)+3決^。因此,表示該頻譜值元組的最顯著位元 平面的頻譜值「a」、「b」、「C」、「d」從陣列「―咖[]」 導出’其中決定頻譜值「a」、「b」、「c e」' d」的項依據群 索引ng及元素索引ne(如果可用)被選擇。 4.8.較不顯著位元平面決定(第北圖) 剩餘位元平面進而藉由以累積頻率表格「㈣㈣」 呼令v次函數「adth—decodeO」從最顯著被解碼為最不顯 著位準。為此,函數「adth—decodeO」的輪入變量「cum_freq」 可被初始化為陣列「adth_Cf_r[]」的起始位址。並且,函數 「arith一decode〇」的輸入變量cfl可被初始化為表示表格 「a論乂r[]」之長度的-適當值,該長度在—元組維度4 47 201126508 的情況中等於16。 函數「adth-decodeO」可返回變量r,其表示解碼元組 的較不顯著位元平面之-的二進制值,碼位元平面r允許 依據第5h圖所示演算法精化解碼4元組。 換句話說,當「加入」一較不顯著位元平面時,第一 頻譜值「a」乘2(或等效地,被左移一個位元),且值r的最 不顯著位元(位元0)被加成新的最不顯著位元(這可使用一 OR運算完成)。第二頻譜值「b」乘2,且值Γ的第二位元(位 元1)被加成頻譜值「b」的一最不顯著值。第三頻譜值「c」 乘2(或等效地,左移一個位元),且值Γ的第三位元(位元2) 被加入為最不顯著的位元。第四頻譜值「d」乘2(或等效地, 左移一個位元),且值r的第四位元(位元3)被加入為一最不 顯著的位元。詳細參考第5h圖所示演算法。 4.9.上下文更新(第5i圖) 一旦4元組(a,b,c,d)被完全編碼,即,全部較不顯著位 元平面被加入,上下文表格q及qs藉由函數 「arith—update一context〇」被更新。在下文中,函數 「adth_update一contextO」值細節將參考第5i圖被描述,第 5 i圖繪不該函數的^一偽程式碼表示。 函數「arith_update_conteXt()」接收解碼4元組的頻譜 值「a」、「b」、「c」、「d」,欲解碼的4元組(或解碼4元組)之 索引「i」及與目前音訊訊框相關聯的4元組之數字lg/4作為 輸入變量。 函數「arith_update_context()」包含將頻譜值「 、「 、 a |、 b |、 48 201126508 「c」、「d」複製至陣列0的—步驟58(^例如,在位置 (l,i)(也指定為「q[l][i].a」)的陣列q之項「a」被設定成採 用第-頻譜值「a」。在位置(U)的陣⑼之項%被設定成 第二頻譜值「b」。在位置(U)的陣列q之項%」被設定成第 三頻譜值「c」,且在位置(l,i)的陣列q之項「d」被設定成第 三頻譜值「d」。因此,頻譜值「a」、「b」、「 「d」被儲 b」、「c」、「d」。 存於在位置(l,i)的陣列q之項「a 函數「arith_Update_context〇」也包含設定在位置(u) 的陣列q之項「V」的步驟580b。如果該目前編碼元組的頻 譜值的頻譜值「a」、「b」、「c」、「d」其中之一小於_4或大於 等於4,則在位置(l,i)的陣列q之項rv」被設定成值1〇24。 相反,即,如果全部頻譜值「a」、「b」、「c」、「d」在範圍-4 與+3之間,包括邊界’在位置(i,i)的陣列q之項rv」被設 定成表格或陣列「egroups[]」在位置(4+a,4+b,4+c,4+d^ — 項。因此,如果頻譜值「a」、「b」、「c」、「d」其中之一相 對大,則陣列q之項「V」被典型地設定成一標準值1〇24, 藉此使函數「arith_get__context()」執行一相鄰頻譜值元組 之解碼期間的530f之過程。 函數「arith_update—context()」也包含〆第—映射580c, 如果一目前訊框的最後頻譜值元組被解碼’且核心模式是 線性預測頻域核心模式(對於可在一頻域核心模式與一線 性預測頻域核心模式之間切換的一音訊編碼器的情況而 言)。函數「arith_update_context()」也包含一第二映射 580d, 當目前音訊訊框的最後元組的頻譜值被解碼且當核心模式 49 201126508 是頻域核心模4時被執行。 i ;弟映射580c ’參考第5i圖的偽程式碼。關於第二 映射580d,也參考第义圖。 4.10解碼過程之摘要 在下文中,解碼過程將被簡要概述。 4元組的罝化頻譜係數被低雜訊編碼,且從最低頻率係 數被傳送行it高頻率係數。來自冑級*訊編碼的係數 皮儲存在陣列「x-ac_quant[g][win][sfb][bin]」中,低雜 讯編碼碼字料的順序係使得當它們讀接收及儲存在陣 列中的順序被解碼時,「bin」是最快增量的索引,且「g」 疋最慢增量的索引。在一碼字中,解碼順序是a、b、c、d。 來自轉換編碼激勵的係數被直接儲存在一陣列 x__tcx一invquant[win][bin]」中,且低雜訊編碼碼字之傳送 的順序係使得當它們以被接收及儲存在陣列中的順序被解 石馬’「bin」是最快增量的索引,而rwjn」是最慢增量的索 引。在一碼字中,解碼順序是a、b、c、d。 首先,旗標「arith_reset_flag」決定上下文是否必須被 重设。如果旗標是真,則函數「arith_reset_context」被呼 叫’該函數之一偽程式碼表示在第5圖_繪示。相反,當旗 標「arith—reset_flag」是假時,一映射在上一上下文與目前 上下文之間依據函數「arith_map_context()」被完成,該函 數的一偽程式碼表示示於第5b圖中。 低雜訊解碼器輸出4元組經發信量化的頻譜係數。首 先,上下文狀態基於圍繞要解碼的4元組的四個先前解碼群 50 201126508 而被計算。上下文之狀態由函數「arith_get—context〇」提 出,該函數的一偽程式碼表示示於第5c圖。 一旦狀態已知,4元組所屬之最顯著被發信的按2位元 平面之群係使用函數「arith一decodeO」被解碼,饋以對應 上下文狀態的適當累積頻率表格。 該對應由函數「arith_get_pk()」完成,該函數的一偽 程式碼示於第5d圖。 接著’爲了決定群索引ng,函數「arith—dec〇de()」以 與函數數「arith_get_pk〇」返回的索引對應的累積頻率表 格「arith—cf-ng[pki][]」被呼叫。算術編碼器(或解碼器)是 一使用以縮放的標籤產生方法的整數實施。詳細參考,例 如K.Sayood的著作「Introduction to Data Compression」,201126508 VI. Description of the Invention: [Technical Field] According to an embodiment of the present invention, an audio decoder for providing decoded audio information based on a coded audio message is provided, and an encoded audio message is provided based on an input audio message. An audio encoder, a method for providing decoded audio information based on a coded audio message, a method for providing a coded audio message based on an input audio message, and a computer program. Embodiments in accordance with the present invention are directed to the use of an arithmetic encoder table in an audio encoder such as, for example, a so-called Unified Language and Audio Encoder (USAC). BACKGROUND OF THE INVENTION The background of the invention will be briefly explained below to assist in understanding the invention and its advantages. In the past decade, great efforts have been made to establish digital storage and the possibility of distributing audio content with good bit efficiency. An important achievement of this approach is the definition of the international standard ISCVIEC 14496-3. The third part of the standard is about the encoding and decoding of audio content, while the fourth subsection of Part 3 is about general audio coding. Sections 3 and 4 of IS〇/IEC 1^96 define a concept of encoding and decoding of general audio content. In addition, further improvements have been made to improve quality and/or reduce the required bit rate. The definition of the description of the standard 'one time domain audio signal is converted to a time-frequency representation. This transition from the time domain to the time-frequency domain is typically performed using a transform block. These transform blocks are also referred to as "frames" of time domain samples. It has been found to be advantageous to have 3 201126508 being shifted by, for example, a frame-half overlap frame, since the overlap allows for effective avoidance (disparity reduction) of manpower. Additionally, it has been found that a windowing should be performed to avoid The artifact factor 0 derived from the time finite frame process is obtained by converting a windowed portion of the input audio information from the time domain to the time-frequency domain 'in many cases' to obtain an energy towel, so that certain spectral values contain a significant Greater than the magnitude of a plurality of other spectral values. Thus, in many cases - a relatively small number of spectral values have a level that is significantly higher than the _ average spectral value. A typical example of a time domain to time frequency domain conversion that results in ~ energy concentration is the so-called modified discrete cosine transform (MDCT). These spectral values are often scaled and quantized according to a psychoacoustic model, so that the psychoacoustically important spectral values are relatively small, while the psychoacoustically more important spectral values are relatively A. The scaled and quantized spectral values are encoded to provide one of their bit effective representations. For example, a so-called Huffman coding using quantized spectral coefficients is described in the international standard ISO/IEC 14496-3: 鸠(E), part 3, the first part. However, the quality of the spectral value coding has been found. There is a significant impact on the required bit rate. Moreover, it has been found that the complexity of an audio decoder that is typically implemented in a portable consumer device and therefore inexpensive and has low power consumption depends on the encoding used to encode the spectral values. In this case, there is a need for a coding and decoding concept for audio content that can provide a high demand between high bit rate performance and resource efficiency. < Sincerely, the present invention provides an audio decoder that provides a decoded audio message based on a coded audio resource. The audio decoder comprises: _ based on the spectral values - the arithmetically encoded representation provides a complex bribe coded value: an arithmetic decoder. The audio decoder also includes a frequency domain to time center trade' which provides a time domain soundwire* using the decoded spectral value to obtain a decoded audio signal. The calculation code derives the group index from a variable length code word representing a group of indices based on a state index describing a state of one of the arithmetic decoders. The arithmetic decoder is configured to derive a (four) group index and a pixel index from a value of a most significant bit plane of a spectral value tuple that describes (or specifies, or selects) a group index selection - - 7L in the group. The arithmetic decoder is configured to provide a decoded spectral value tuple using the most significant bit plane value of the spectral value tuple. The arithmetic decoder is configured to select a cumulative frequency table from the set of cumulative frequency tables dependent on a state index describing the state of the arithmetic decoder, and to use the selected cumulative frequency table for the variable length codeword from the representative group index Export the group index. Embodiments in accordance with the present invention are based on the discovery of an optimal compromise between providing a reachable bit rate and the complexity of an audio encoder or audio decoder using a set of 32 cumulative frequency tables. In particular, 32 different cumulative frequency tables have been found to be suitable (because they result in a reasonable low bit rate) - any associated time-frequency domain representation of the audio. A set of 32 cumulative frequency tables has been found to be optimal because the use of a smaller number of cumulative frequency tables results in a significantly increased bit rate and because only a larger number of cumulative frequency tables 201126508 are used. The element rate is insignificantly improved, but it causes a significant increase in the consumption of delta-resonance at the encoder end and the decoder end. In summary, an in-depth study proposes that the 32 probability models selected by a state index and represented by 32 cumulative frequency tables provide an optimal compromise between the bit rate efficiency and the results of the arithmetic coding implementation effort. In a preferred embodiment, the arithmetic decoder is configured to derive a -7-bit hash table index from the state index and obtain from the - hash table - the hash table item value - the hash table contains 128 hash table index values Mapping on the corresponding hash table item value In this case, the nasal decoder is configured to determine the hash table item value (ie, in the hash table index value (4) - the valley in the memory location) is an overflow Value, a valid cumulative frequency table identifier value or a null cumulative frequency table identifier value (eg, an unsuitable state) that is inconsistent with a state index value (eg, a hash index value derived from the base-index value) The value of the index value - the cumulative table identifier value, based on the condition, the value is based on (1). The arithmetic decoder is configured to scan the items of the hash table until the & overflow value or - An effective cumulative frequency table identifier value. If the obtained hash table item value is an overflow value, a cumulative frequency is provided depending on an identification of a value interval in which the state index value is included, and If the hash table item Is a cumulative frequency table identifier value associated with the status index value, and the cumulative frequency index value is derived from the obtained hash table item value. This embodiment of the invention is based on the discovery that only a small amount of audio is "significant" by the decoder. State, which uses a special probability model represented by a special cumulative frequency table (associated with a few only about 1 to 10 states) (associated with a few only 201126508 about 1 to ίο states) is heavy = most states , the best board according to: its; == decoder measured value interval maps the states to one < One of the L-sounds 5丨 will be mapped to the complete state index value interval (= the range of the rate value of the cancer cell index value) to the other day due to the discovery-bit S rate heart rate accumulation frequency cable bow 1 value. The distribution of the "significant" shape = _ get 'even if the specific machine ^ is less than only (3) is OK, the size of the (4) table is reduced, the material of the divergence. . Translucent (10) times the hash table). Due to the need for H...calling and editing, the minimum amount of source 岐 helps to make the audio power of the audio decoder priced at a small amount, and the possibility of holding these audio smashing device Sex. ^ Improved the implementation of inexpensive and action consumption 67 2 preferred embodiment, the hash table is configured with the 7-bit hash table, mapped to the effective cumulative frequency table operator value, and 7-bit The 61 values of the hash table index are mapped to the overflow value. Therefore, there is a "significant" state of only 67 relative J numbers. In addition, any "non-significant" state that is different from the "significant" state (the number of non-significant states is significantly greater than the number of significant states, the at least _-@number, but preferably greater than or even more than 1000) is mapped to The overflow value makes it easy to distinguish between a "significant" state and a "non-significant" state. Therefore, the difference between significant and non-significant states can be quickly determined with low memory consumption. In a preferred embodiment, the arithmetic coder is configured to map 67 different values of the state index to 67 different cumulative frequency table identifier values such that 26 different cumulative frequency index values are described by the state index value 67 not 201126508 are associated with significant status. It has been found that even for a relatively low number of 67 different saliency states, it is sufficient to have only 26 different cumulative frequency distributions associated therewith. In addition, selecting only 26 different cumulative frequency tables associated with 67 salient states shows an optimal compromise between the bit rate requirement and the encoding/decoding complexity requirements. In a preferred embodiment, the arithmetic decoder is configured to map different non-significant states to nine different cumulative frequency index values such that a total of nine different cumulative frequency tables are available for use with the non-significant state, one The interval type mapping is performed to derive a cumulative frequency index value. It has been found that a very small number is preferably nine different cumulative frequency tables sufficient to obtain one bit rate effective arithmetic coding in most of the states. It has also been found that there is a large difference between a relatively small number of salient states and a large number of non-significant states. Preferably, the 67 cumulative states associated with the cumulative frequency table are associated with a particular feature of the spectral value by the effective cumulative frequency table identifier range of the hash table, such as for example having a particular width and direction The trajectory in the frequency representation. Instead, it is considered to be a non-significant state, and a cumulative frequency table uses only a range-based algorithm with all other features associated with it representing only a less characteristic spectral value distribution. In particular, it has also been found that some cumulative frequency tables associated with non-significant states are also well suited for use with some salient states. Accordingly, it has been found to be advantageous to use three cumulative frequency tables (e.g., cumulative frequency tables having indices 05, 26, and 30 associated therewith) that are applied to one or more significant states or one or more non-significant states. For example, it was found to be particularly advantageous in terms of the trade-off between bit rate efficiency and computational complexity: having 23 different cumulative frequency tables that use only hash tables The effective cumulative frequency table identifier value is associated with the salient state, having six cumulative frequency tables that are associated with the non-significant state using only one overflow mechanism, the overflow mechanism being based on the stored in the hash table and a The overflow value of the interval map. Also, it has been found to be advantageous to have three cumulative frequency tables associated with one or more significant states and one or more non-significant states. In accordance with another embodiment of the present invention, an audio encoder is provided that provides an encoded audio message based on an input audio message. The audio encoder includes a time domain to frequency domain converter that provides a frequency domain audio representation based on a time domain representation of the input audio information such that the frequency domain audio representation comprises a set of spectral values. The audio encoder also includes an arithmetic coder configured to encode a neighboring spectral value tuple using a variable length codeword, or a preprocessed version thereof. The arithmetic coder is configured to map a value of a most significant bit plane of a spectral value tuple to a group of indices and an element index that describes an element in a group selected by the group index. The arithmetic coder is further configured to select a cumulative frequency table from a set of 32 cumulative frequency tables in dependence on a state index representing a state of the arithmetic coding region, and to mathematically encode the group index using a selected cumulative frequency table. To obtain an arithmetically encoded variable length codeword. The audio encoder according to this embodiment is based on the same concept as the above-described audio decoder. In particular, the audio encoder caused an optimal compromise between bit rate efficiency and encoding/decoding complexity based on the discovery of the number of 32 cumulative frequency 201126508 rate tables. In accordance with another embodiment of the present invention, a method of providing a decoded audio representation based on a coded audio representation is established. In accordance with yet another embodiment of the present invention, a method of providing an encoded audio representation based on an input audio representation is established. In accordance with still another embodiment of the present invention, a computer program for performing the methods of the inventions is created. BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1a - b is a block diagram of an audio encoder according to an embodiment of the present invention, a second embodiment of the present invention. FIG. 3 is a block diagram showing an audio decoder according to an embodiment of the present invention; FIG. 3 is a code representation of an algorithm "tuples_decode()" for decoding a spectral value tuple. Figure 4 shows a schematic representation of the context of a state calculation; Figure 5a shows a pseudo-code representation of an algorithm "arith_reset__context()" that resets a context; Figure 5b shows a context-mapped representation A pseudo-code representation of an algorithm "arith_map_context〇"; Figure 5c shows a pseudo-code representation of an algorithm "arith_get_context〇" that obtains a context state value; 10 201126508 Figure 5d shows a state variable from Deriving a pseudo-code representation of an algorithm "arith_get_pk(s)" of a cumulative frequency table index value pki; Figure 5 e shows an arithmetic decoding of a symbol from a variable length codeword A pseudo-code representation of the method "arith_decode()"; a 5f diagram showing a pseudo-code representation of an algorithm for deriving an element number value mm and a group of offset values og from a group of indices nq; A pseudo-code representation of an algorithm for obtaining a spectral value a, b, c, d of a most significant bit plane of a spectral value tuple based on the group offset value 〇 g and an element index value; Plotting the spectral values of the tuples a, b, c, d with the values of a less significant bit plane to obtain the tuple &, 1), (:, (1) an updated version of the spectral value One of the algorithms is a pseudo-code representation; "5th is a pseudo-code representation of an algorithm for updating the context "adth_update_c〇mext()"; Figure 5j is a diagram showing concepts and variables; Figure 6a Not-sentence-language and audio encoder (10) from a grammatical representation of the original data block; Figure 6b shows - single-channel element - grammatical representation; _ green display - channel-to-metagram grammatical representation; Show _"ics" control information syntax representation; 6e green display _ frequency domain channel _ stream _ syntax table ; Figure 6f shows the arithmetic side of the spectrum data - the grammatical representation of the 6th figure shows the decoding of the touch element (four) - the grammatical representation of the figure 6h shows the data elements and variables of the legend; / n 201126508 Figure 7 A table representation of a memory requirement of a previously used arithmetic coder; FIG. 8 is a table representation of one of the memory requirements of an arithmetic coder in accordance with the present invention; and FIG. 9 is an illustration of an evaluation through arithmetic according to the present invention. The block diagram of the performance improved device obtained by the encoder is not intended; FIG. 10 is a table representation of one of the bit rates required for encoding different audio information using a previously used arithmetic coder; FIG. 11 illustrates the use of the invention The concept encodes a table representation of the bit rate required for different audio information; Figure 12 illustrates, in the form of a table, the average bit rate produced by a previously used audio encoder and an audio encoder in accordance with the present invention. a comparison of Figure 13 in a table representation showing a reduction in bit rate and an increase in bit rate using the concept of the present invention compared to a previously used concept a comparison; Figure 14 shows a table representation of a table "arith_cf_ng_hash[]"; Figures 15(1) through 15(10) show a table representation of a table "arith_cf_ne[]" Figure 16(1) to Figure 16(32) show a table representation of the items of the table "arith_cf_ng[pki]" of 32 different values 0 to 31 of the index pki; 17(1) to 17(2) shows a table representation of the item "dgroups[]"; 12 201126508 Figure 18(1) to Figure 18(11) shows a table of the table "dvectors[]" Figure 19 (1) to 19 (32) show a table representation of a table "egroups[a][b][c][d]"; Figure 20 shows a A flowchart of a method of decoding representation of audio information; and FIG. 21 is a flow chart showing a method of providing an encoded representation of audio information. I: Embodiment 3 Detailed Description of Preferred Embodiments 1. Audio Encoder In the following, an audio encoder according to an embodiment of the present invention will be described. FIG. 1 is a block diagram showing the audio encoder 1〇〇. The audio encoder 100 is configured to receive an input audio message 110 and, based thereon, provide a bit stream 112 that constitutes an encoded audio message. The audio encoder 100 can optionally include a pre-processor 12 that is configured to receive input audio information 11 and provide a pre-processed input audio information 110a based thereon. The audio encoder 1 〇〇 also includes an energy concentrated time domain to frequency domain signal converter 130, which is also referred to as a signal converter. The signal converter 13 is configured to receive input audio information 110. And 110a, and based thereon, provide a frequency domain audio information 132 preferably in the form of a set of spectral values. For example, signal converter 130 can be configured to receive a frame of input audio information 11 〇, l10a (eg, a time domain sample) and provide a set of spectra indicative of the audio content of the associated audio frame 13 201126508 . In addition, the 'signal converter 130 can be configured to receive a plurality of subsequent, overlapping or non-overlapping, audio frames of the input audio information 110, 110a, and provide a time-frequency domain audio representation based thereon, the time-frequency domain audio representation A sequence of subsequent spectral tuples is included, a set of spectral values associated with each frame. The energy-concentrated time-domain to frequency-domain signal converter 13A may include an energy-concentrated wave filter set that provides spectral values associated with different, overlapping or non-overlapping, frequency ranges. For example, the signal converter 13A can include a windowed MDCT converter i3〇a' configured to window the input sound sfl information 11〇, ii〇a (or one of its frames) using a conversion window. And performing a modified discrete cosine transform of the input audio information 110, ll〇a (or its windowed frame). Thus, the frequency domain audio representation 132 can include a set of, for example, 1 to 24 spectral values in the form of MDCT coefficients associated with a frame of input audio information. The audio encoder 100 can further optionally include a spectrum post processor 140 configured to receive the frequency domain audio representation 132 and provide a post processed frequency domain audio representation 142 thereon. The spectrum post processor 〇 4 〇 may, for example, be grouped to perform a time noise shaping and/or a long term prediction and/or any other spectral post processing conventionally known in the memory. The audio encoder can further optionally include a scaler/quantizer 15A configured to receive the frequency domain audio representation 132 or a subsequent processed version 142 and to provide a scaled and quantized frequency domain audio representation 152. The audio encoder 100 can further optionally include a psychoacoustic model processor 16〇 configured to receive input audio information 11 (or a subsequent processing version 1 l〇a) 'and provide a basis thereon Used for control of the energy concentration time domain 14 201126508 to the frequency domain signal converter 130, optionally controlled by the spectrum post processor 140, and/or optionally with optional control of the scaler/quantizer 150 . For example, the psychoacoustic model processor 160 can be configured to analyze the input audio information to determine the portion of the input audio information 110' ll〇a that is particularly significant to the human content of the audio content, and to input the audio information 110, ll〇a to the audio The part of the content that is less noticeable. Accordingly, the psychoacoustic model processor 160 can provide control information that is used by the audio encoder 100 to adjust the frequency domain audio representation 132 by the quantized resolution applied by the scaler/quantizer 150 and/or the scaler/quantizer 150. , 142 zoom. Thus, a perceptually important scaling factor band (ie, groups of adjacent spectral values that are particularly important for human perception of audio content) is scaled by a large scaling factor and quantized at a relatively high resolution, while being perceptually more The unimportant scaling factor bands (i.e., cluster neighboring spectral values) are scaled by a relatively small scaling factor and quantized with a relatively low quantization resolution. Thus, the scaled spectral values of the perceptually more important frequencies are typically quite larger than the spectral values of the perceptually less important frequencies. The audio encoder also includes an arithmetic coder 170 configured to receive the scaled and quantized version of the frequency domain audio representation 132 (or alternatively, the frequency domain audio representation 132 is followed by the version 142, or even the frequency domain audio The representation 132 itself, and based thereon, provides arithmetic codeword information 172a, 172b such that the arithmetic codeword information represents the frequency domain audio representation 15 2 . The audio encoder 100 also includes a one-bit stream load formatter 19 that is cancerated to receive its code §fl 17 2 a, 17 2b. The bit stream payload format 5|190 is also typically configured to receive additional information such as, for example, scaling factor information describing the scaling factor applied by the scaler/quantizer 150. Additionally, Bit 15 201126508 Meta-Flow Load Formatter 19 0 can be configured to receive other control information. The bitstream load formatter 190 is configured to provide a bitstream 112 based on information received in accordance with a desired bitstream syntax combination bitstream, as will be described below. In the following, the details of the arithmetic coder 170 will be described. Arithmetic coder 170 is configured to receive a plurality of tuples of frequency domain audio representation 132, such as four post-processing and scaled and quantized spectral values. The arithmetic coder includes a _ most significant bit plane extractor 174' configured to extract a most significant bit plane from a spectral tuple. It should be noted here that the most significant bit plane may contain one or even more bits (e.g., two or three bits) which are the most significant bits of the spectral value of the spectral value tuple. Thus, the most significant bit plane extractor 174 provides a most significant bit plane 176 of spectral value tuples (which are preferably adjacent in frequency). The arithmetic coder 17A also includes a group index 疋/疋 索引 index determinator 178 that is configured to map the most significant bit plane 176 to a set of index values ng and an element index value. This mapping can be performed using a lookup table, for example, the lookup table "Brown-s" discussed in detail below. The group index decision sub/element index determinant 178 can be configured to map certain combinations of values of the most significant bit plane 176 to a group of indices ng of a group containing only one element, and can be grouped into the most Other combinations of values of significant bit planes 176 map to one of a group comprising a plurality of combinations of values. Thus, the group index decision sub/element index determinant can be grouped to map such combinations of values of the most significant bit plane 176 containing a relatively high probability to a group containing only one or only some elements, and will contain A combination of values of the most significant bit plane 176 of a relatively low probability maps to a group containing more elements. Thus, an element index ne that is mapped to a value of a group containing only _single elements can be used with only a single value and can therefore be ignored. A combination of values that are mapped to a group comprising a plurality of elements may take a plurality of values. Thus, the group index decision sub/element index determinant 178 provides a set of index values ng (also referred to as 180a) and, if necessary, an element index value ne (also known as 18〇1), where the element index value can be Set to a preset value, or if the group ng mapped by the most significant bit plane 176 contains only a single element, it is omitted. The arithmetic coding region 170 also includes a first codeword determinant 18〇 which is combined to determine an arithmetic codeword acad_ng[pki][ng] representing the group index ng. Further, if the number of elements mm of the group ng is larger than the first code word determinant 18 〇, an arithmetic code word acup_ne[ne] indicating the element index can be provided. The arithmetic code word ac〇d_ne[ne]t representing the elementary index ne can be ignored. Optionally, codeword determinant 180 may also provide one or more overflow codewords (also referred to herein as "ARmi_ESCAPE"), indicating, for example, how many less significant bit planes are available (and, therefore, indicating The most significant bit plane is digitally weighted). The first codeword determinant (10) may be formulated to use - a selected cumulative frequency table having (or referenced) - cumulative frequency table index Pki' to provide a codeword associated with a group of indices ng. To determine the cumulative frequency table that should be selected, the arithmetic coder preferably includes a - state chasing (four) 182 that is configured to pass, for example, to observe the state of the arithmetic coder prior to the previously encoded spectral value tuple. The terminator 182 thus provides - status information 184, such as a status value called "s" or . The arithmetic coding area 17A also includes a cumulative frequency table selector 186 configured to receive status information (8) and provide information to the codeword determinant (10) for describing the selected cumulative frequency table. For example, the cumulative frequency table selector 186 can provide a cumulative frequency table index pki that describes a cumulative frequency table used by the codeword determinant in a set of 32 cumulative frequency tables. Alternatively, cumulative frequency table selector 186 can provide the entire selected cumulative frequency table to the codeword determinant. Therefore, the codeword determinant 180 can use the selected cumulative frequency table for providing the codeword acad_ng[pki][ng] ' of the group index ng such that the actual codeword acod of the coding group index ng is a ng[pki][ng] The pki is indexed in the ng value and cumulative frequency table, and thus depends on the current status information 184. Instead, the first codeword determinant 180 can use a preset (status-independent) cumulative frequency table for providing the codeword acad_ne[ne], however, the codeword acad_ne[ne] can depend on the number of elements in the selected group ng. . Further details regarding the encoding process and the obtained codeword format will be described below. Arithmetic encoder 170 further includes a less significant bit plane extractor 189a that can be configured to use only one of the most significant bit planes if one or more values of a spectral value tuple to be encoded are exceeded. The range of encoded values then extracts one or more less significant bit planes from the scaled and quantized frequency domain audio representation 152. The less significant bit planes may contain one or more bits per spectral value as desired. Thus, the less significant bit plane extractor 189a provides a less significant bit plane information 189b. Arithmetic encoder 170 also includes a second codeword determinant 189c that is configured to receive less significant bit plane information 189d' and to provide a representation of 〇, 1 or more less significant bit planes based thereon.内容, 1 or more codewords of the content "acod second codeword determinant 189c can be combined to apply an arithmetic coding calculus, to go to 18 201126508 any other coding algorithm, from less significant bit plane information 189b derives the less significant bit plane codeword "acod_r". It should be noted herein that the number of less significant bit planes may vary depending on the value of the scaled and quantized spectral values 152 such that if the current tuple's scaled and quantized spectral values are relatively small, there may be no less significant bit planes at all. So that if the current tuple's scaled and quantized spectral values are in the middle range, there may be a less significant bit plane, and if the scaled and quantized spectral values take a relatively large value, there may be more than one less. Significant bit plane. In summary, the arithmetic coder 170 is configured to encode a tuple scaled and quantized spectral value described by the information 152 using a hierarchical encoding process. The most significant bit plane (including, for example, one, two, or three bits per spectral value) is encoded to obtain an arithmetic codeword "acod_ng[pki][ng]" for a set of indices ng, in some cases A code word "acod_ne[ne]" of an element index ne. One or more less significant bit planes (each of which less significant bit planes contain, for example, one, two or three bits) are encoded to obtain one or more code words "acod_r". When encoding the most significant bit plane, the combination of values of the most significant bit planes is mapped to a group of ngs in a plurality of groups, wherein some of the groups contain only one element, and wherein each of the other groups in the group Contains multiple elements. Therefore, the probability of different combinations of values is considered. Subsequently, the group index ng and the element index ne (if needed) are encoded, wherein 32 different cumulative frequency tables are available for relying on a state of the arithmetic coder 170, ie encoding the group index ng by previously encoded spectral value tuples. . Therefore, the code words "acod_ng[pki][ng]" and "acod_ne[ne]" 19 201126508 are obtained, wherein if the group index ng is specified as a group containing more than one element, the latter code word is only included in the bit element. Stream 112. Additionally, if one or more less significant bit planes exist, one or more code words "aC〇d-r" are provided and included in the bit stream. Reset Description The audio encoder 1 〇 〇 can optionally be configured to determine whether an improvement in the bit rate can be obtained by resetting the context, for example, by setting the state index to a preset value. Thus, the audio encoder 100 can be configured to provide a reset information that indicates whether the context of the arithmetically encoded code is reset and also indicates whether the context of the differentiated decoding in a corresponding decoder should be reset (eg, Is "arith_reset_flag"). Details regarding the bit stream format and the cumulative frequency table of the application will be described below. 2. Audio Decoder In the following, an audio decoder in accordance with an embodiment of the present invention will be described. FIG. 2 is a block diagram showing the audio decanter. The audio decoder 200 is configured to receive a bit stream 21 〇 which represents an encoded audio message and which may be equal to the audio encoder 1 〇〇 providing the bit stream 112. The audio decoder 200 provides a decoded audio message 212 based on the bitstream 210. The speech decoder 200 includes an optional bitstream load variant item 220 that is configured to receive the bitstream 210 and retrieve an encoded frequency domain audio representation 222 from the bitstream 210. For example, the bitstream load variant item 220 can be grouped into a spectral element that the arithmetic unit 210 performs arithmetic coding, such as, for example, an arithmetic codeword "acod_ne[ne]"" of the element index ne and a representation frequency. Domain tone 20 201126508 The condition is not - less significant than the code word "acod_r" of one of the bit planes. Thus, the encoded frequency domain audio representation 222 constitutes (or contains) an arithmetically encoded representation of the spectral values. The bit stream load variant item 220 is further configured to extract additional control information from the bit stream, as shown in FIG. Additionally, the bitstream load variant item can optionally be configured to retrieve a state reset information 224 from the bitstream 2, which is also designated as an arithmetic reset flag or "arith_reset_flag". The audio decoder 200 includes an arithmetic decoder 230 and is also designated as a "spectral noise decoder." Arithmetic decoder 23A is configured to receive encoded frequency domain audio representation 220 and, optionally, receive state reset information 224. The arithmetic decoder 230 is also configured to provide a decoded frequency domain audio representation 232 that can include a decoded representation of the spectral values. For example, the decoded frequency domain audio representation 232 can include a decoded representation of a plurality of spectral value tuples' which are described by the encoded frequency domain audio representation 220. The audio decoder 200 also includes an optional inverse quantizer/rescaler 240' configured to receive the decoded frequency domain audio representation 232 and provide an inverse quantized and rescaled frequency domain based thereon. The audio indicates 242. The audio decoder 200 further includes an optional spectrum pre-processor 250' configured to receive the inverse quantized and rescaled frequency domain audio representation 242' and to provide inverse quantization and rescaling of the frequency domain based thereon. The audio table is not a pre-processed version 252 of 242. The audio decoder 2A also includes a frequency domain to time domain signal converter (transf〇rmer) 260, which is also designated as a "converter". The signal converter 26 is configured to receive the inverse quantized and rescaled frequency domain audio representation 242 (or alternatively, the inverse quantized and rescaled frequency domain audio representation 242, or the decoded frequency domain audio representation 232) 21 Preprocessed version 252' of 201126508 and provides a time domain representation 262 of the audio information based thereon. The frequency domain to time domain signal converter 26 can, for example, include a converter that performs a modified discrete cosine inverse transform (IMDCT) and a suitable windowing (and other auxiliary functions such as, for example, a superposition). The audio decoder 200 can further include an optional time domain post processor 270 configured to receive a time domain representation 262 of the audio information and to obtain decoded audio information 212 using a time domain post-processing. However, if the post-processing is omitted, the time domain representation 262 can be equal to the decoded audio information 212. It should be noted here that the inverse quantizer/rescaler 240, the spectrum preprocessor 250, the frequency domain to time domain signal converter 260, and the time domain post processor 270 can control the information to be controlled. The control information is deformed by the bit stream load. Item 220 is retrieved from bit stream 210. Summarizing the functionality of audio decoder 200 described above, a decoded audio representation 232, such as a set of spectral values associated with an audio frame encoding audio information, may be obtained using arithmetic decoder 230 based on encoded frequency domain representation 222. Subsequent 'the group' may, for example, be 1024 spectral values of the MDCT coefficients, inverse quantized, rescaled, and preprocessed. Therefore, a set of spectral values of inverse quantization, re-scaling, and spectral pre-processing (e.g., 1024 MDCT coefficients) are obtained. Thereafter, a time domain of an audio frame indicates the set of inverse quantized, rescaled, and spectrally preprocessed frequency domain values (e.g., MDCT coefficients). Therefore, a time domain representation of the 'one audio frame' is obtained. The time domain representation of a given audio frame can be combined with the time domain representation of the previous and/or subsequent audio frames. For example, a superposition between the time domain tables of the subsequent frames can be performed to smooth the transition between the time domain representations of adjacent audio frames, and obtain a reflex 22 201126508. The reconstruction of the gamma cymbal Λ Λ 212 of the frequency domain audio representation 232 is referred to, for example, the international standard 岱 (5) π 14496 3, the third part of the '4th subsection' which is discussed in detail. Some details regarding the arithmetic decoder 230 will be described hereinafter. The arithmetic decoding state 230 includes a group index determinant/element index determinant 280' which is configured to receive the arithmetic code word acad_ng[pki][ng;]"' of the t-tng and if the code word "(10) is available, It also receives the code word "aeGd" e_" of the το素索丨丨ne. The group index is determined to be ', and is configured to provide a -decode group index value %, and if there is a group index value 叩 described group packet 3 - element, the "decoding element index value ne is also provided. The determinant sub-element index determinant can be configured to provide a preset element index value ne', for example, if the group described by the group 丨value ng contains only one element. The group index determines the sub/element index determinant. It can be grouped into (4)—the material contains a list of cumulative frequencies in the 3 2 (four) _ rate table 'for the arithmetic code word 'aeGd-ng[pki][ng]” Export the group index value ng. The arithmetic decoder further includes - the most significant bit plane determinant 284' which is formulated to derive a 2-byte (or 3-byte) based on the __group index value 叩 and the _ element index value ne' A value 286 of the most significant bit plane of the spectral value. The arithmetic decoder 23 further includes a less significant bit plane determinant 288' that is configured to receive one or more codewords "ac" representing one of the spectral value tuples or one or more less significant bit planes. 〇d_r". Thus, the less significant bit plane determinants 288 are grouped to provide a decoded value 290 that provides one or more less significant bit planes. The audio decoder 2 〇〇 also includes an S-bit plane combiner 292 'which is configured to receive the resolution value 286 of the most significant bit plane of the 2011-21508 of the spectral value tuple, and such as > (4) t-day value 70 The group also receives the decoded value of the spectral value element 显 significant bit plane. Therefore, the bit plane 2 is supplied with a coded spectral value tuple, and the spectral valued tuple is a part of the semaphore frequency 1 236. Since then, the arithmetic decoder 230 exemplifies that the group of money provides a plurality of values_decoding values, called the set of complete decoded spectral values associated with the current frame of the audio content. The nose decoder 230 further includes a cumulative frequency table selector 2% configured to select one of the 32 cumulative frequency tables dependent on a state index 298 describing the state of the arithmetic lexiculator. The Arithmetic Decoder 2-step includes a state tracker 299 that is configured to track a state of the arithmetic decoder depending on the previously decoded spectral values of the set of equivalences. The status information responsive status reset information 224 can optionally be reset to a preset status message. Thus the 'cumulative frequency table selector 2% is configured to provide an index (eg, pki) of a selected cumulative frequency table to the group index decision sub/element index determinant 280 or a selected cumulative frequency table itself, application The group index ng is encoded in the dependency group index code word "acod_ng". To summarize the functionality of audio decoder 200, audio decoder 2 is configured to receive a one-bit rate efficient encoded frequency domain audio representation 222 and, based thereon, obtain-decode a thief audio representation. In (4) an arithmetic decoder 230 that learns the decoded frequency domain audio representation 232 based on the encoded frequency domain f-representation 222, a probability of different combinations of values of the most significant bit-planes is utilized using an arithmetic decoder 28, arithmetic decoding The 280 is configured to apply a cumulative frequency table. In addition, statistical dependencies between different spectral value tuples are utilized by relying on a state line 298 to select a different cumulative frequency table from a set of 32 different cumulative frequency tables that are previously calculated by observation. The decoded spectral value tuple is obtained. 3. Overview of Spectrum Low Noise Coding Tools In the following, details regarding encoding and decoding algorithms performed by, for example, arithmetic coder 170 and arithmetic decoder 230 will be described. Emphasis is placed on the description of the decoding algorithm. However, it should be noted that a corresponding encoding algorithm can be executed in accordance with the teachings of the decoding algorithm, where the mapping is reversed. It should be noted that the decoding discussed below is used to allow one of the spectral values of classical post-processing, scaling and quantization, so-called "spectral low noise coding". Spectral low noise coding is used in an audio coding/decoding concept to further reduce the redundancy of the quantized spectrum, which is obtained, for example, by an energy center time domain to frequency domain converter. The spectral low noise scheme used in embodiments of the present invention is based on an arithmetic coding and a dynamic adaptation context. In the preferred embodiment and hereinafter, the spectral values are processed by spectral low noise coding that combines four consecutive spectral values on the frequency to become a 4-tuple tuple coded. The low noise code is fed by the quantized spectral values and uses, for example, a context-dependent cumulative frequency table derived from four previously decoded adjacent 4-tuples. In this paper, the proximity of time and frequency is considered, as shown in Figure 4. A cumulative frequency table (described below) is in turn used by the arithmetic coder to generate a variable length binary code and used by the arithmetic decoder to derive decoded values from a variable length binary code. 25 201126508 For example, the arithmetic coder 170 relies on the respective probability of production. 4. The decoding process 4.1 Overview of the decoding process <A Summary of the Process of Referencing a Tuple In the following, the process of decoding a spectral value tuple is shown in Figure 3, and Figure 3 illustrates the decoding of a plurality of spectral values. The initialization of the context contains the use of the function ' or the function to derive the current up and down. The current context will contain ~ 310 in the process of decoding a complex number of spectral value tuples. Context initialization 31 〇 Selectively "anth_reset_context()" resets the context "anth_map_context(lg/4)" from a previous context. Resetting the context and deriving from a previous context are described below. The plurality of spectral value elements_decoding also includes an iteration of tuple decoding 31, which updates 314, which is updated by a function "adth_update_C〇ntext(a,b,c,d,I,lg/4) Execution, as described below. The s-tuple decoding 312 and the context update 314 are repeated lgM times, where ig/4 represents the number of spectral value tuples to be decoded. ◎ tuple decoding 312 includes a context value calculation 312a, a group of index decodings 312b, an element index. Decode 312c, a most significant bit plane measurement 312d, and a less significant bit plane addition 312e. The state value calculation 312a includes calculating a first state value using the function "arith_get_context(i)" where the function returns to the first state. The value s. The state value 26 201126508 calculation 312a also includes calculating a one-bit value lev obtained by moving the first state value s to the right side with 24 bits. The state value calculation 312a also includes calculating a second value t according to the formula shown in FIG. The group index decoding 312b includes an iterative execution of a decoding algorithm 312ba, wherein one is initialized to 0° before the first execution of the distinct method 312. / and the algorithm 312ba includes using the above function "arith_get_pk() A state index pki (which is also used as a cumulative frequency table index) is calculated depending on the second state value t. Algorithm 312ba also includes a dependent frequency index pki to select a cumulative frequency table in which a variable "cum_freq" can be set to an initial bit address of one of the 32 cumulative frequency tables depending on the state index pki. Also, a variable "cfl" can be initialized to a length of the selected cumulative frequency table equal to the number of symbols in the letter, i.e., the number of different values that can be decoded. From "arith_cf_ng[pki=0][545]" to "arith_cf_ng[pki=31][545]", the length of all cumulative frequency tables available for decoding the group index is 545, because 545 different group indexes and An overflow symbol is comparable to decoding. Subsequently, a group of indexes ng can be counted in the selection cumulative frequency table by executing a function "arith-decode()". When the group index ng is derived, the bit stream 210 named "acod-ng" can be evaluated (between 6g). Algorithm 312ba contains checking if the group index ng is equal to an overflow symbol "ARITH_ESCAPE". If the group index is not equal to the arithmetic overflow symbol, then algorithm 312ba is aborted ("intermittent" - case) and the remaining instructions of algorithm 3nba are therefore skipped. Thus, execution of the process continues with the original index decoding 312c (if needed) or with the most significant bit plane determination 312d. In contrast, 27 201126508 If the decoding group index ng is equal to the arithmetic overflow symbol "Arith_ESCAPE", the level value lev is increased by two. And, if the algorithm 312ba is executed for the first time, that is, if j=0, the second state value is increased by 4194304, otherwise the second state value t is set to 〇» and the variable 』 is repeated in the algorithm 3121^ Previously set to 1 ° as described above, the algorithm 312ba is repeated until the decoded group index ng is different from the arithmetic overflow symbol. Group index decoding 312b - upon completion, i.e., a group index value other than the arithmetic overflow symbol is decoded 'If necessary, element index decoding 312c is committed. For this reason, one of the groups (the number of elements) of the group having the group index ng is determined 'the base of the group specified by the group index mm mm in the table position ng by the table dgroups" "dgr〇ups[ng]" The eight least significant bits are described (bits 0-7). If the cardinality mm of the group specified by the group index % is greater than one, the prime index ne is obtained by performing an algorithm 312ca. The element index ne can optionally be set to 〇, or a different preset value. For example, the operation = 〇 can be executed before the conditional statement "if (mm > 1)". Algorithm 312 "contains an initial bit address cum_freq that determines an appropriate cumulative frequency table or a cumulative frequency sub-table." For example, the variable r cum - red call can be set to the sum of the initial bit address of the cumulative frequency table "arith_cf_ne" and the value (mm) * (mm - 1) / 2, as shown in Fig. 3. And, the variable cfl can be initialized to a respective length of the cumulative frequency table or the cumulative frequency sub-table' which is equal to the number of elements mm in the group index ng. Subsequently, the element index ne can be obtained by executing the function rarith-dec〇de(), wherein the selected cumulative frequency table (e.g., the sub-table of the table "arith_cf~ne") associated with the element index encoding is used. 28 201126508 Then the 'most significant bit plane value determination 312d is executed. For this purpose, the item of the table "dgvectors" is evaluated, the index of which is indexed by the element index ng by the value of the table "dgr〇ups", the most significant bit plane (for example, bits 8-15), and by The prime index ne is determined as shown in Figure 3. Specifically, the value of the most (four) bit plane of the first-spectrum value "a" (which belongs to the -spectral value tuple) is represented by the table "dgVeCtors" as an element index, >>8+ shape) was decided. Similarly, the most significant bit S-plane value of the second spectral value "b" (belonging to the spectral value tuple) is obtained by evaluating the index, >>8+ne)+1 evaluation table "dgvectors" . Similarly, a most significant bit plane value of a third spectral value "c" and a fourth spectral value "d" (which belongs to the spectral value tuple) is obtained, as indicated by reference numeral 312d in FIG. After ik, the less significant bit plane is obtained, for example, the reference number shown in Fig. 3 is 312e. For each of the less significant bit planes of the tuple, one of the 16 unary combinations is decoded. However, it should be noted that the value of obtaining a less significant bit plane is not particularly relevant to the present invention. 4·2 Decoding Sequence (Fig. 4) In the following, the decoding order of the spectral values will be described. The 4-bit quantized spectral coefficients are encoded with low noise and are transmitted from the lowest frequency coefficients and the traveling wave to the highest frequency coefficient (e. g., in the bit stream). From an advanced audio coding (for example using a modified discrete cosine inverse conversion as described in ISO/IEC 14496, Part 3, Part 4) - is called "x_ac_auant[g][win] In the array of [sfb;|[bin]", the transmission order of the low-pitched codewords (eg, ac〇d-ng, ac〇d_ne, ac〇d_r) is such that when they are received and stored in the array In the sequential encoding 29 201126508, 'bin' (frequency) is the fastest incremental index, and "§" is the slowest incremental index. In a codeword, the order of decoding is a, b, c, d. In other words, the values a, b, c, d are spectral values of adjacent frequencies, wherein the spectral value a is associated with a lower frequency than the spectral value b, and the spectral value is equal to the lower spectral value c. The frequency is associated, and the spectral value c is associated with a lower frequency of the spectral value d. The coefficients from the converted coded excitation (tcx) are stored directly in an array x_tcx-mvquant[win][bin], and the low noise coded code words are transmitted as such when they are received and stored in the array. When encoded, "bin" is the fastest incremental index, and "win" is the slowest incremental index. In the code, the decoding order is &, b, c, d. In other words, if the spectral value description - the linearity of the speech coder predicts the __ transform coding excitation of the kernel, then the spectral values a, b, c, d are associated with the adjacent and increasing frequencies of the transcoded excitation. Configurable to apply the decoded frequency domain audio table 232' which is provided by the arithmetic decoder 23 , for generating a time domain audio signal representation using the domain signal conversion "directly + +", and for: using the - frequency domain To the time domain decoder and a linear pre-stimulus (4) triggered by the rotation of the frequency domain... (4) The benefit of the wave is indirectly provided to provide an audio signal representation. a #H其魏 The arithmetic decoder 200 discussed in the present discussion is very suitable for decoding the spectral value of the phonological (four) 容--time-frequency-based code in the frequency domain, and the cup pair - 1 domain table No - the positive code is compiled in the linear domain. The second - linear chopper provides - the stimulus signal - the time frequency J does not. Therefore, the arithmetic decoding cries a ώ time-frequency domain table is not suitable for use in an audio decoder capable of processing frequency domain code 30 201126508 audio content and linear prediction. frequency domain coded audio content. 4.3. Context Initialization In the following, context initialization performed in a step 310 will be described. First, it may be that the flag "arith_reset flag" of a part of the bit stream determines whether the context should be reset. If the flag is TRUE, the function "arith_reset-context()" shown in Figure 5a is called. If the flag "arith_reset_flag" is FALSE, a mapping is performed between the previous context and the current context according to the algorithm "arith_map_context", as shown in Figure 5b. As shown, the reset of the context executed by the function "arith-reset_context()" contains the arrays q and qs (specified as, for example, qs[i].a, q[〇][i].a and q[l] [i].a) Initialization of items "a", "b", "c", and "d" to zero. In addition, the arrays q and qs (designated as qs[i].v, q[0][i].v, q[l][i].v) are initialized to -1. Also, the variable "previous_lg" is initialized to 1〇24. However, if it is decided not to reset the context, the context-mapping can be performed according to the algorithm "arith_map_context". As shown, the mapping depends on the core mode, where "c〇re_mode==l" indicates that the decoded spectral value is associated with a phenomenon-predicted frequency-domain encoded audio frame, and wherein "core_in〇cie==0 Indicates that the decoded spectral value is associated with a frequency domain encoded audio frame. It should be noted that 'if the number of spectral values associated with the current frequency domain coded audio frame is equal to the number of spectral values associated with the previous frame with i = 〇 to i = lg / 4 + 1' then the function "arith_map_context() The term q[〇][i] of the current context array q is set to the value 31 201126508 qs[i] of the previous context array qS. However, if the number of spectral values associated with the current audio frame is different from the number of spectral values associated with the previous audio frame, a more complex mapping is performed. However, the details of the mapping in this case are not particularly relevant to the key concepts of the present invention, so reference is made in detail to the pseudo-code of Figure 5b. 4-4 State Value Calculation (Fig. 5c) In the following, state value calculation 312a will be described in more detail. It should be noted that the first state value s (as shown in Figure 3) can be obtained by returning a value of one of the functions "arith_get-context(i)", a pseudo-code of the function as shown in Figure 5c. Regarding the calculation of the state value, reference is also made to Fig. 4, which shows the context used for a state s ten calculation. Figure 4 depicts a two-dimensional representation of the spectral value tuples, in time and frequency. One horizontal coordinate 41 〇 describes time, and one vertical coordinate 412 describes frequency. As shown in Fig. 4, the tuple 420 to be encoded is associated with a time index t0 and a frequency index i (remember that the spectral values of the tuple 420 to be decoded are associated with four different frequencies). As shown, for the time index to, the tuples with frequency indices i-Ι and i-2 have been decoded when the tuple 420 with frequency 丨i is to be decoded. As shown in FIG. 4, tuple 430 having a time index t0 and a frequency index i-Ι has been decoded before tuple 420 is decoded, and tuple 430 is considered for use with decoding tuple 420. Context. Similarly, a tuple 440 having a time index t-Ι and a frequency index i-丨, a tuple 444 having a time index t-Ι and a frequency index i, and having a time index t-1 and a frequency The tuple 448 of index i+1 has been decoded 32 201126508 before tuple 420 was decoded and is considered for decision of the context for decoding tuple 420. Phases, one other tuple has been decoded, they are represented by squares with dashed lines, and other tuples that are undecoded and drawn by dashed circles are used to determine the context for decoding tuples 420. . Referring now to Figure 5c, the function "arith_get_c〇ntext()" is shown. <Function, calculation details regarding the first context value "s" will be described. The function "arith_get_context()" contains a variable initialization 53〇a during which the variables t〇, tl, t2, and 13 depend on the item "V" of the array q at the index bit complex (0, i), (1^) , (O'W) and (〇i+1) are initialized. Therefore, the variables t〇 to t3 are initialized with the value of the item "v", which is associated with the tuples 444, 430, 440, 448 shown in Fig. 4, respectively. It should also be noted that the function "arith_get_context()" performs a subsequent check of a plurality of emotions/brothers, where the function "arith_get_context()" is terminated when a "return" instruction is reached, where the return instruction (or operator) Used to return its operator with the slice large state value s (following the return instruction or operator). The execution of the function "arith_get_context()" includes a first condition check 530b. If all variables t0, tl, t2, and t3 (values) are found to be less than 1 〇, the return value is calculated as indicated by reference numeral 530b, and the function "arith_get__context()" is terminated with the return of the return value. The execution of the function "arith_get_context()" also includes a second condition check 530c. If it is found in the second condition check 530c that all of the variables t〇, tl, t2, and t3 are less than 34, the variables t2 and t3 are conditionally modified as indicated by reference numeral 530c, and the return value is as shown by reference numeral 530c. calculated. Specifically, if the variable t2 is greater than 1 and less than 10, the variable t2 is set to 2. In contrast, 33 201126508 If the variable t2 is greater than or equal to 10, the variable 〇 is set to 3. Similarly, if the variable t3 is greater than 1 and less than 10, the variable [3] is set to 2. Conversely, if the variable t3 is greater than or equal to 1 〇, the variable 13 is set to 3. Therefore, the range of the value of the variable t2 is limited to a maximum positive value of 3. However, if the condition of the first condition check 530b and the condition of the second condition check 53 are not satisfied, a third condition check 530d is executed. If it is found that the variables t0 and tl are both less than 90 in the third condition check 530d, the return value such as > the test number 53〇d is calculated, wherein the values of the variables t2 and t3 are considered. ^, if the conditions of the first condition check 530b, the second condition check 530c, and the first condition check 530d are not satisfied, a fourth condition check 53A is executed, in which it is determined whether the variables t0 and t1 are both smaller than 544. In this case, the return value is calculated as reference numeral 53〇e, and the function adth-get^contextO" is terminated. However, if the condition check 530b, 530c, 530d, 530e does not cause the termination of the function "arith_get_context", then a context calculation 53〇f performs a context calculation 53〇f containing a variable initialization 530fa, a variable rescaling 53〇fb, Based on the table value adaptation 53〇fc and a return value calculation 53〇fd ° in the variable initialization 530fa, if the variable t0 uses a value greater than 1 'variable a〇, b〇, c〇, d〇 is set to the array q The values of the items aj, "b", "c", and "d" at the array position (〇i). The value corresponding to the 4-tuple 444 of Fig. 4 is opposite. If the value of the variable t0 is not greater than 1, the variables a〇, b0, c0, d〇 are initialized to zero. Similarly, if the value of the variable t1 is greater than 1, the variables al, bl, el, and dl are initialized to the values of the items "a", rb", ^e", and "d" of the array q at the position, which correspondingly have The value of the 4-tuple 430 of Figure 4 is shown. 34 201126508 Therefore, if the value of the variable to is greater than 1, the variables a 〇, b0, c0, do are set to the time value a t 及 and the spectral value a, b of a previously decoded frequency value tuple of the frequency index i , c, d. Similarly, the variables al, bl, cl, dl are set to the frequency values a, b, c, d ° of the previously decoded spectral value tuple and the time index tO and the frequency index i-1, followed by the variables aO, bO, cO , dO, al, bl, cl, dl are iteratively rescaled because the number representation is iteratively moved to the right bit until all variables aO, bO, cO, dO, al, bl, cl, dl are at -4 to + Within the scope of 3, including borders -4 and +3. After the variable is rescaled 53〇fb, the variable 1 indicates how long the set of variables aO, bO, cO, dO, al, bl, cl, dl are moved to the right, at least one right shift operation is performed. Therefore, the appropriate variables aO, bO, cO 'dO, al, bl, cl, dl are obtained, and they are all in the range between _4 and +3. Subsequently, the 530fc is implemented based on the value of the table. To this end, the variable t〇 is set to a value, if the variable tO is greater than 1 ', the value is determined by one of the tables (or arrays) "egroups". As shown, the items of position (4+a0, 4+b0, 4+c0, 4+d0) are used for this purpose. Similarly, if the value t1 is greater than 1, the variable t1 is set to a value consisting of a table position (4+&1,4+1)1,4+(;1,4+(!1) table The decision of "6 1> 〇叩 8" is final. The last 'one return value depends on variable 1 (indicating how long a right shift operation is applied), and the dependent variables to and tl are calculated, as indicated by reference numeral 530fd. 'It can be said that the return value of the function "arith_get_c〇ntext()" is determined by the most relevant bit plane of the tuples 444, 430, 440 and 448 of Fig. 4 35 201126508 and it should be noted that if the variable to is greater than or Equal to 544, or the variable U is greater than or equal to 544, a table lookup is performed, and the return value is used using a numerical calculation of multiplication and addition. Therefore, if the variable t〇&tl is greater than or specific to 544, then The return value of the function arith__get-context() is performed in a more liberal (liberate) and more detailed calculation. It should be noted that in the third figure of reference numeral 312a, the variable "lev" is returned by the function "arith_get_c〇ntext(1)". Export. The variable ^ is moved from the value s by shifting the value s to the right 24 bits. The state variable t is also executed by the value s by an AND operation between the value $ and the hexadecimal value "OxFFFFFF" and by adding a value of "1" to the result of the final operation. 4.5 Group Index Decoding (Fig. 5d, & Graph) In the following, the process of group index decoding 312b will be discussed, the process 3i2b is based on a previous calculation of the state value t described above, and an iterative binding of the algorithm M2a A call 'state value t (as shown in Figure 3) containing the function "amh_get~pk()" is a parameter. 4.5.1 Function "arith_get_pk()" (Fig. 5) Function "arith one get one Pk()" will be described later with reference to Figure 5d. The execution of the function "adth_get_pkO" includes initialization with an array of values psd, as indicated by reference numeral 54〇a. In addition, the function anth_get_pk〇" The initialization of the inclusion-indicator p and the variable bu is shown in reference numeral 5. The algorithm "adth_get_pk()" also contains the initialization of a variable (9), which is equal to 63*t, where [is the function "anlget-" When Pk〇 is called, it is given to the function "_-(10)-pk()" 36 The parameter of 201126508. Therefore, the item value s of the function "arith_get_pk()" can be equal to the variable t of the algorithm "tuples_decode()" shown in Fig. 3. The initialization of the variable i is as shown by reference numeral 540c. Function "arith_get_pk() It also includes an iterative operation of a hash table access 540 in which the hash table access 540d is repeated until an "interrupt" condition arrives, or until a "return" operator arrives. If the "interrupt" condition arrives, a range-based provision 540e of a return value is executed. However, if the return operator arrives, the operator returning the operator is returned and the function "arith_get_pk()" is terminated. The hash table fetch 540d includes an iterative execution of a first step 540da, a second step 540db, a third step 540dc, and a fourth step 540d. In a first step 540da, the variable j is set to the value of one of the tables "ari_pk_hash", where the index of the item is determined by the seven least significant bits of the variable i. In a second step 540db, it is determined whether the value of the variable j obtained in the first step 540da adopts a hexadecimal value of OxFFFFFFFF. In such a case, the iterative execution of the hash table fetch 540d is aborted, and the execution of the algorithm "arith_get_pk()" is continued to provide a return value based on the range. In other words, if the item of the table "ari_pk_hash" of the seven least significant bit addresses of the variable i uses the overflow value of OxFFFFFFFF, it is assumed that the state defined by the input variable t of the function "arith_get_pk()" is - the so-called The "non-significant" state, a return value should be assigned to the state based on the range provided by the range 540e. In the third step 54 〇 dc of the hash table fetch 540d, it is checked whether the most significant bit (e.g., bit 8 to 31) of the value of the variable j is equal to the value of the input variable t of the function "arith_get_pk()". In this case, the eight least significant bits of the variable j (bits 〇 to 7) are returned as a return value of the function "arith__get_pk()" and the function "arith_get_pk()" is terminated. However, if the condition of the third step 54 〇 dc is found not to be reached, the variable i is incremented by one (step 540 dd), and the hash table access 540d is repeated from the first step 54 〇da. The range-based provision 540e of a return value contains an initialization 540ea of the index p to a starting point in the array psci. The starting point is determined by the function "arith_get_pk() _ (value input bits t and 23 of the variable t, which corresponds to the number of overflow symbols "ARITH_ESCAPE" that have been decoded for the current tuple to be decoded. The bits 23 and 24 of the input variable t take the value "00", and the index p is initialized to the point of the first item "24" of the array psci. If the bits 23 and 24 of the input variable s use the value "01", the index p Initialized to the eighth item "30" of the array psci, if the 23rd and 24th bits of the input variable s take the value "10", the index P is initialized to the 15th item "5" of the array psci, and if the input variable The 23rd and 24th bits of t take the value "11", and the index p is initialized to the 22nd item "5" of the array pSci. In a subsequent step 540eb, the variable j is set to adopt the 22 most input variables t The value represented by the insignificant bit (bit 1 to bit 22) is as indicated by reference numeral 54 〇 eb. Subsequently, the decision is made that the array psci is returned to the term "arith_get__pk()" which refers to the return value. Make the following decision: • If the value of j is less than 436961, and if the value of j is also less than 252001 And if the value of j is also less than 243001, then the entry of the starting point determined in step 540ea is returned; 38 201126508 • If the value of j is less than 436961, and if the value of j is also less than 252001, and if the value of j is not less than 24300 Then, the first item after the starting point of the decision step 540ea is returned; • if the value of j is less than 436961, and if the value of j is not less than 252001, and if the value of j is less than 288993, then the decision is made in step 540ea The second term after the start point is returned; • If the value of j is less than 436961, and if the value of j is not less than 252001, and if the value of j is not less than 288993, then the third term after the starting point determined in step 540ea Returned; • If the value of j is not less than 436961, and if the value of j is less than 1609865, and if the value of j is also less than 880865, then the four items after the starting point determined in step 540ea are returned, • if j The value is not less than 436961, and if the value of j is less than 1609865, and if the value of j is not less than 880865, then the fifth item after the starting point determined in step 540ea is returned; • if the value of j is not less than 436961, and if If the value of j is less than 1609865, then The sixth item after the starting point determined in step 540ea is returned. For more details, refer to the algorithm of reference numeral 540ec in Fig. 5. In summary, the function "arith_get_pk" called with the state value t provides the value pki As a return value, as shown in reference numeral 312ba in FIG. The value of the variable pki is used to select a cumulative frequency table for execution of the function "arith_decode" as described in Figure 3. Therefore, the variable "cum_freq[]" is properly initialized to formulate the selected cumulative frequency table 39 201126508. 4.5_2 Function "arith_decode()" (Fig. 5e) In the following, the function of the function "arith_decode()" will be described in detail with reference to Fig. 5e. It should be noted that the function "arith_decode()" uses the auxiliary function "arith_first_symbol(void)", and if it is the first symbol of the sequence, it returns TRUE, otherwise it returns FALSE. The function "arith_decode()" also uses the helper function "arith_get_next_bit()", which gets and provides the next bit of the bitstream. In addition, the function "arith_decode()" uses the population variables "i〇w (low)", "high", and "value". In addition, the function "arith_decode()" receives the variable "cum_freq[]" as an input variable that points to the element of the selected cumulative frequency table (with element index or item index 〇). Also, the function "arith_decode" uses the input variable cfl, which represents the length of the selected cumulative frequency table defined by the variable "cum-freq[]". The function "arith_decode" contains a variable initialization 550a as a first step, and if the helper function "arith_first_symbol" indicates that the first symbol of a sequence symbol is being decoded, it is executed. Value Initialization 55〇a Dependency The plural number obtained from the bit stream using the helper function "arith__get_next_bit", for example, 20 bits initializes the variable "vaiue" such that the variable r value" takes the value represented by the bit. Further, the variable "1〇w" is initialized to the use value 0' and the variable "high" is initialized to the value 1〇48575. In a second step 550b, the variable "range" is set to a value that is one greater than the difference between the variables "high" and "low". The variable rcum" is set to a relative position between the value of the variable "low" and the value of the variable rhigh". Therefore, the variable "_" depends on the value of "value", for example, a value between 〇 and 2]6. The index P is initialized to a value that is one less than the starting address of the selected cumulative frequency. The algorithm "arith_dec〇de()" also contains an iterative cumulative frequency table. The lookup 550c iterative cumulative frequency table lookup is repeated until the variable is less than or equal to one. In the iterative cumulative frequency table lookup 550c, the index variable q is converted to a combined value of one-half of the value of the current value of the index variable p and the value of the variable cfl. If the value of the item % addressed by the index variable q of the selected cumulative frequency table is larger than the value of the variable "cum", the index variable p is set to the value of the index variable q, and the variable Cfl is incremented. Eventually, the variable cfl is shifted to the right bit, so the value of the variable Cfl is effectively divided by 2 and the modulo part is ignored. Thus the 'iteration cumulative frequency table lookup 550c effectively compares the value of the variable r cum ' with the plurality of terms of the selected cumulative frequency table to identify an interval in the selected cumulative frequency table that is bounded by the terms of the cumulative frequency table So that the value cum falls within the identified interval. Thus, the item of the selected cumulative frequency table defines an interval in which a respective symbol value is associated with each of the intervals of the selected cumulative frequency table. And, the interval width between two adjacent to values of the cumulative frequency table defines the probability of symbols associated with the equal intervals such that the selected cumulative frequency table completely defines a probability distribution of different symbols (or symbol values). Details regarding the available cumulative frequency table will be described below with reference to Fig. 16. Referring again to Figure 5e, the symbol value is derived from the value of the index variable p, where the 41 201126508 symbol value is derived as reference numeral 550d. Therefore, the difference between the index variable p and the start address "cum_freq" can be evaluated to obtain a symbol value, which is represented by the variable "symbol". The algorithm "arith_decode" also contains an adaptation 550e for the variables "high" and "low". If the symbol value represented by the variable "symbol" is different from 〇, the variable "high" is updated as indicated by reference numeral 550d. Also, the value of the variable "low" is updated as indicated by reference numeral 550e. The variable "high" is set to a value determined by the value of the variable "low", and the variable "range" and the item have the index "symbol -1" of the selected cumulative frequency table. The variable "low" is incremented, where the increment is determined by the variable "range" and the selected cumulative frequency table with the index "symbol". Therefore, the difference between the variables "low" and "h i g h" values depends on the difference between the two adjacent terms of the selected cumulative frequency table. Therefore, if a symbol value having a low probability is detected, the interval between the values of the variables "low" and "high" is reduced to a narrow width. Conversely, if the detected symbol value contains a relatively large probability, the width of the interval between the values of the variables "low" and "high" is set to a relatively large value. In addition, the width of the interval between the values of the variables "low" and "high" depends on the corresponding symbols of the detected symbols and the cumulative frequency table. The algorithm "arith_decode()" also contains an interval reforming 550f' wherein the interval determined in step 550e is iteratively moved and scaled until the "interrupt" condition is reached. In interval reforming 550f, a selected downshift operation 550fa is performed. If the variable "high" is less than 524286, no action is taken and the reforming continues with an interval size increase operation 560fb. However, if the variable 42 201126508 "high" is not less than 524286 and the variable "1〇w" is greater than or equal to 524286, the variables "vaiues", "i〇w" and "high" are reduced by 524286, so that the variable "low" and The definition of "high" moves down, and the value of the variable "^" is also shifted down. However, if the value of the variable rhigh is found to be not less than 524286, and the variable "low" is not greater than or equal to 524286, and the variable "1〇w" is greater than or equal to 262143, and the variable "high" is less than 786429, the variable "value", "low" and "high" are reduced by 262143, thereby shifting the interval between the values of the variables "high" and "low" and the value of the variable "value". However, if none of the above conditions are met, the interval reorganization is aborted. However, if any of the conditions evaluated above in step 55 〇fa are satisfied, the interval increase 550fb is executed. In the interval increase operation 55〇fb, the value of the variable "low" is doubled. Also, the value of the variable rhigh" is doubled, and the result of the doubling is increased by one. Also, the value of the variable "value" is doubled (moved to the left one bit) and one bit of the bit stream obtained by the helper function "arith_get_next_bit" is used as the least significant bit. Therefore, the interval between the values of the variables "low" and "high" is nearly doubled, and the precision of the variable "value" is increased using a new bit of the bit stream. As described above, steps 550fa and 550fb are repeated until the "interrupt" condition is reached, i.e., until the interval between the values of the variables "low" and "high" is sufficiently large. Regarding the function of the algorithm "arith_decode()", it should be noted that the interval between the values of "i〇w" and "high" is reduced in the adjacent frequency table of the cumulative frequency table depending on the reference variable "cum_freq" in step 550e. . If the interval between two adjacent values of the selected cumulative frequency table is small, that is, if the adjacent values of 43 201126508 are relatively close, the interval between the values of the variables "low" and "high" obtained at step 550e will be relatively small. Conversely, if the two adjacent terms of the cumulative frequency table are far apart, the interval between the values of the variables "low" and "high" obtained at step 550e will be relatively large. Thus, if the interval between the values of the variables "low" and "high" obtained in step 550e is relatively small, then a large number of interval reshaping steps will be performed to rescale the interval to a "sufficient" size (making the condition The conditions for evaluating 550fa have not been met). Therefore, a relatively large number of bits from the bit stream will be used to increase the precision of the variable "value". Conversely, if the interval size obtained at step 550e is relatively large, only a small interval of reforming steps 550fa and 550fb is required to repeat the interval between the variable "low" and "high" values to a "sufficient" size. . Therefore, only a relatively small number of bits from the bit stream will be used to increase the precision of the variable "value" and prepare for decoding of the next symbol. In summary, if a symbol containing one symbol is decoded, the symbol contains a relatively high probability, and a large interval is associated with the item of the selected cumulative frequency table, only a relatively small number of bits will be from the bit. The stream is read to allow decoding of a subsequent symbol. Conversely, if a symbol is decoded, it contains a relatively small probability, and a small interval is associated with it via the selected cumulative frequency table, then a relatively large number of bits will be used from that bit stream to prepare for the next Decoding of a symbol. Thus, the terms of the cumulative frequency table reflect a number of bits that are required to decode a sequence of symbols. By relying on a context, ie, relying on previously decoded symbols to change the cumulative frequency table, for example by relying on context to select different cumulative frequency tables of 201126508, random dependencies between different symbols can be utilized, which allows for the follow-up of A specific bit rate of the adjacent) symbol is effectively encoded. In summary, the function "arith_dec〇de()" described with reference to Fig. 5e is called with the cumulative frequency table "arith_cf_ng[pki] port, corresponding to the index pki returned by the function "arith_get_pk()", Determining the Group Index 4.5.3 Overflow Mechanism Although the decoding group index ng is the overflow symbol "ARITH_ESCAPE", an additional group index is decoded and the variable lev is incremented by 2. Thus, a numerical significance about the most significant bit το plane and information about a number of less significant bit planes to be decoded are obtained. If an overflow symbol "ARITH-ESCAPE" is decoded and the current tuple is decoded, the value % is changed to 11 and further increased by 41943〇4, which corresponds to setting the 23rd bit X of the variable to 1. If an overflow symbol is solved for the second and more times, the state variable t is taken. It is zero. In both cases, the updated state variable 1 is used for a new iteration of the group index decoding 312b when an overflow symbol factory ARITH_ESCAPE" is decoded. 4·6. Element index decoding (Fig. 5f) Once the decoded group index is not the overflow symbol r Arith_ESCAPE, the number of elements in the group ng and the group offset 透过 is searched by the algorithm according to the algorithm of Fig. 5f. "dgroups[]" was reduced. In other words, the variable mm is set to a value that is the least significant bit of the item of the table "dgroups[]" of a table position by the group index. <丨如位〇-7) decided. Similarly, the group offset og is determined by the more significant bits (bits 8 and above) of the entry of the table "dgroups[]", which is determined by the position offset of the variable ng defined 45 201126508. The group that is determined by ng is called the message. If the variable mm is greater than 1 ’, if the group has more than one element, the element index ne is solved by accumulating the cheek rate table “arith_decode()”. Length adth_cf_ne+((mm*(mm_1))>>1)[]A The cumulative frequency table is equal to mm. In other words, a segment of the table "adth_cf_ne[]" is selected to repeat the cumulative frequency table "arith_Cf_ne[]" to fully describe the probability distribution of a plurality of different numbers of elements of a group indexed by the group index. It should be noted that the offset of the different segments (or sub-forms) of "arith_cf_ne[]" in the tired frequency table is described by the formula (mm*(mm-l))»l). It should be noted that the variable "cum_freq" used as the algorithm "arith_decode()" is preferably initialized to the starting bit address of the segment (or sub-table) of the table or array "arith_cf_ne[]". The remaining group index ng is associated with the number of elements mm of the current group. Also, an input variable of the algorithm "arith_decode()", the variable Cfl is initialized to the value mm. Subsequently, the function "arith_decode〇" is called, and its operation has been described in detail above. However, for the decoding of the element index, the function "arith_decode()" uses a subform of the table or array "arith_cf_ne[]" instead of the cumulative frequency table "arith_cf_ng[pki=0][545]" to "arith_cf_ng[ One of pki=31][545]". Therefore, the element index ne is provided as a return value of the function "arith_decode()", which uses the many bits of the bit stream to obtain the element index ne. 4.7. Most significant bit plane decision (Fig. 5g) 46 201126508 Once the element index ne is decoded and returned as a return value of the function "arith_decode()", the most significant 4-tuple is sent by 2- The bit plane can be derived using the table "dgveCt〇r[]" according to the algorithm of the 5th figure. For example, a first spectral value "a" of the spectral value tuple can be set to an entry of a table or array "dgvectorsn", wherein the array element index (or table element seduce or simply "element index" or " The item index") is determined to be 4*(og+ne). Similarly, the second spectral value "b" of the spectral value tuple can be set to an item of the array "dgvectorst", wherein the array element index is determined by 4*(〇g+ne)+l. The third spectral value "c" of the spectral value tuple can be set to an item of the array "dgvectors[]", and the target _ *>=,,, Dan Τ - το素 index is 4* (〇g +ne)+2 decision. A fourth spectral value of the spectral value tuple can be set to an item of the array "dgvectorsU", wherein the element index is determined by 4*(〇g+ne)+3. Therefore, the spectral values "a", "b", "C", and "d" indicating the most significant bit plane of the spectral value tuple are derived from the array "-coffee []", wherein the spectral value "a" is determined. The items of "b" and "ce" 'd' are selected based on the group index ng and the element index ne (if available). 4.8. Less significant bit-plane decision (Norther map) The remaining bit-planes are then most significantly decoded to the least significant level by the cumulative frequency table "(4)(4)". The v-th order function "adth-decodeO" is decoded most significantly. For this purpose, the round variable "cum_freq" of the function "adth_decodeO" can be initialized to the start address of the array "adth_Cf_r[]". Also, the input variable cfl of the function "arith-decode" can be initialized to an appropriate value representing the length of the table "a argument 乂r[]", which is equal to 16 in the case of the tuple dimension 4 47 201126508. The function "adth-decodeO" returns a variable r representing the binary value of the less significant bit plane of the decoded tuple, which allows the algorithm to refine the 4-tuple according to the algorithm shown in Figure 5h. In other words, when "joining" a less significant bit plane, the first spectral value "a" is multiplied by 2 (or equivalently, shifted one bit to the left), and the least significant bit of the value r ( Bit 0) is added to the new least significant bit (this can be done using an OR operation). The second spectral value "b" is multiplied by 2, and the second bit (bit 1) of the value Γ is added to a least significant value of the spectral value "b". The third spectral value "c" is multiplied by 2 (or equivalently, shifted one bit to the left), and the third bit (bit 2) of the value Γ is added as the least significant bit. The fourth spectral value "d" is multiplied by 2 (or equivalently, shifted one bit to the left), and the fourth bit (bit 3) of the value r is added as a least significant bit. Refer to the algorithm shown in Figure 5 for details. 4.9. Context update (Fig. 5i) Once the 4-tuple (a, b, c, d) is fully encoded, ie, all less significant bit planes are added, the context tables q and qs are represented by the function "arith_update" A context is updated. In the following, the details of the function "adth_update-contextO" value will be described with reference to Figure 5i, and Figure 5i depicts the pseudo-code representation of the function. The function "arith_update_conteXt()" receives the spectral values "a", "b", "c", and "d" of the decoded 4-tuple, and the index "i" and the 4-tuple (or decoded 4-tuple) to be decoded. The current 4-tuple number lg/4 associated with the audio frame is used as an input variable. The function "arith_update_context()" contains the step 58 of copying the spectral values ", ", a |, b |, 48 201126508 "c", "d" to array 0 (^, for example, at position (l, i) (also The item "a" of the array q designated as "q[l][i].a") is set to use the first-spectrum value "a". The item % of the array (9) at the position (U) is set to the second The spectrum value "b". The item %" of the array q at the position (U) is set to the third spectrum value "c", and the item "d" of the array q at the position (l, i) is set to the third The spectrum value is "d". Therefore, the spectrum values "a", "b", and "d" are stored in b", "c", and "d". The items in the array q at the position (l, i)" a The function "arith_Update_context" also includes a step 580b of the item "V" of the array q set at the position (u). If the spectrum values of the current encoded tuple have the spectral values "a", "b", "c", If one of "d" is less than _4 or greater than or equal to 4, the item rv" of the array q at the position (l, i) is set to a value of 1 〇 24. Conversely, if all the spectral values are "a", " b", "c", "d" in the van Between -4 and +3, including the boundary 'item rv of array q at position (i, i)' is set to table or array "egroups[]" at position (4+a, 4+b, 4+c , 4+d^ — the term. Therefore, if one of the spectral values "a", "b", "c", and "d" is relatively large, the item "V" of the array q is typically set to a standard value of 1. 〇24, thereby causing the function "arith_get__context()" to perform a process of 530f during decoding of a neighboring spectral value tuple. The function "arith_update_context()" also includes 〆-map 580c, if a current frame The last spectral value tuple is decoded 'and the core mode is the linear prediction frequency domain core mode (for the case of an audio encoder that can switch between a frequency domain core mode and a linear prediction frequency domain core mode). "arith_update_context()" also includes a second mapping 580d, which is executed when the spectral value of the last tuple of the current audio frame is decoded and when core mode 49 201126508 is the frequency domain core mode 4. i; brother mapping 580c 'reference The pseudo code of the 5th figure. Regarding the second map 580d, reference is also made to Figure 4.10 Summary of the decoding process In the following, the decoding process will be briefly summarized. The 4-tuple demodulated spectral coefficients are encoded by low noise, and the line is transmitted from the lowest frequency coefficient to the high frequency coefficient. The coefficients are stored in the array "x-ac_quant[g][win][sfb][bin]", and the order of the low noise coded code blocks is such that when they are read and received and stored in the array, the order is decoded. , "bin" is the index of the fastest increment, and "g" is the index of the slowest increment. In a codeword, the decoding order is a, b, c, d. The coefficients from the transform coding excitation are stored directly in an array x__tcx-invquant[win][bin]", and the order of transmission of the low noise codewords is such that they are in the order in which they are received and stored in the array. The solution "Bin" is the index of the fastest increment, and rwjn" is the index of the slowest increment. In a codeword, the decoding order is a, b, c, d. First, the flag "arith_reset_flag" determines if the context must be reset. If the flag is true, the function "arith_reset_context" is called. 'The pseudo code of one of the functions is shown in Figure 5. Conversely, when the flag "arith_reset_flag" is false, a mapping is completed between the previous context and the current context according to the function "arith_map_context()", and a pseudo code representation of the function is shown in Figure 5b. The low noise decoder outputs a 4-tuple signaled quantized spectral coefficient. First, the context state is calculated based on four previous decoded groups 50 201126508 around the 4-tuple to be decoded. The state of the context is proposed by the function "arith_get_context〇", a pseudo-code representation of the function is shown in Figure 5c. Once the state is known, the most significant transmitted 2-bit plane group to which the 4-tuple belongs is decoded using the function "arith-decodeO", fed with the appropriate cumulative frequency table corresponding to the context state. This correspondence is done by the function "arith_get_pk()", and a pseudo code of the function is shown in Fig. 5d. Then, in order to determine the group index ng, the function "arith_dec〇de()" is called with the cumulative frequency table "arith_cf-ng[pki][]" corresponding to the index returned by the function number "arith_get_pk". An arithmetic coder (or decoder) is an integer implementation that uses a scaled label generation method. For detailed reference, for example, K. Sayood's book "Introduction to Data Compression",

Elsevier公司,2006年,第三版。第5e圖的偽c碼描述函數 「arith_decode〇」使用的演算法。 當解碼群索引nS是溢出符號時,ARITH_ESCAPE,一 附加群索引ng被解碼,且變量lev增加2。上下文之狀態也被 調整。一旦解碼群索引不是溢出符號,則ARITH_ESCApE, 群中的元素mm數量,群偏移〇g依據第5f圖所示演算法藉由 查找表格「dgroups[]」被推斷。 元素索引ne進而藉由以累積頻率表格(arith_cf_ne+ (mm*(mm-l))»l)□呼叫函數 rarith_dec〇de()」而被解碼。 一旦元素索引被解碼,4元組的最顯著按2位元平面可依據 第5g圖所示决算法以表格r dgVect〇r[]」被導出。 剩餘位元平面進而藉由以累積頻率表格「arith—c—r」呼 51 201126508 叫lev次函數「arith. decodeO」從最顯著被解碼為最不 顯著 位準。Elsevier, 2006, third edition. The pseudo-c code of Fig. 5e describes the algorithm used by the function "arith_decode". When the decoded group index nS is an overflow symbol, ARITH_ESCAPE, an additional group index ng is decoded, and the variable lev is incremented by 2. The state of the context is also adjusted. Once the decoded group index is not an overflow symbol, ARITH_ESCApE, the number of elements mm in the group, the group offset 〇g is inferred by the lookup table "dgroups[]" according to the algorithm shown in Fig. 5f. The element index ne is in turn decoded by the cumulative frequency table (arith_cf_ne+(mm*(mm-l))»l)□ call function rarith_dec〇de()”. Once the element index is decoded, the most significant 2-bit plane of the 4-tuple can be derived from the table r dgVect〇r[] using the decision algorithm shown in Figure 5g. The remaining bit plane is then decoded from the most significant to the least significant level by the cumulative frequency table "arith_c_r" called 51 201126508 called the lev function "arith. decodeO".

圖所示。 fq 及 qs 如苐Si 定義之說明如第5j圖所示。 5.映射表格 在依據本發明的一實施例中,特別有利The figure shows. The description of fq and qs as defined by 苐Si is shown in Figure 5j. 5. Mapping Tables In an embodiment in accordance with the invention, it is particularly advantageous

可训頁利的表格 被用於執行參考第Μ 圖討論的函數「arith_get一pk」,且被用於執行泉考第 論的函數「arith_decode」。 5 · 1 ·表格「arith_cf—ng_hash」The trainable page table is used to execute the function "arith_get-pk" discussed in the figure, and is used to execute the function "arith_decode" of the spring test. 5 · 1 · Form "arith_cf-ng_hash"

之一特別有利實施的内容示於第14圖之表格中。此處應、 意第14圖之表格列出之表格「arith_cf—ng hash[]」的項^亥 等項參考一個一維整數型項索引(也稱為「元素索引」或「= 列索引」),其例如被指定為「i」。如圖所示,第一行14ι〇 是一索引行’描述與各自列1412a到1412p相關聯的起始索 引。一第一值行1420顯示表格「arith—cf-ngJlash[]」項索弓丨 等於索引行1310中所示的起始索引的項值。一第二值七 1422顯示表格「arith_cf_ng—hash」之項,該等項索引比各 列在行1410中所示的起始索引大卜一第三值行1424描述表 格「arith_Cf_ng_hash」之項,其元素索引比對應列在行141〇 52 201126508 中所示的起始索引大2。類似地,行1426、1428、1430、1432、 1434缚示表格「arith_cf_ng_hash」之項,其元素索引比各 列1412a到1412p在行1410所示的起始索引大3(行1526)、 4(行 1428)、5(行 1430)、6(行 1432)或7(行 1434)。 例如,列1412a顯示在行1420到1434的具有元素索引 0、1、2、3、4、5、6及7的表格「arith_cf_ng_hash」之項。 類似地,列1412b顯示在行1420到1434的具有元素索引8、 9、10、11、12、13、14及 15的表格「arith_cf_ng_hash」之 項。對於其他列1412c到1412p而言,上述項的安排應用類 似。 5.2•表格「arith_cf_ne」 第15(1)圖到第15(10)圖繪示表格「arith_cf_ne」之項的 表格表示,當解碼元素索引ne時,其由函數「arith_decode()」 評估。 如圖所示,表格「arith_cf_ne」之項包含0與2699之間 的索引。一起始索引行1510描述與各列相關聯的起始索 弓丨。一第一列例如被指定為1512a,且一第二列例如被指定 為1512b。 一第一項行1520繪示表格「arith_cf_ne」的項,其之項 索引由對應列(例如,列1512、1512b)的起始索引行1510中 列出的值決定。項行 1522、1524、1526、1528、1530、1532、 1534顯示具有元素索引的表格「arith_cf_ne[]」,該等元素 索弓丨比各列(例如列1512a、1512b)在起始索引行1510中所示 的起始索引大1(行1522)、2(行1524)、3(行1526)、4(行 53 201126508 1528)、5(行 1530)、6(行 1532)或7(行 1534)。因此,第一列 1512a在行1520到1534顯示具有〇與7之間的項索引的表格 「arith一cf__ne口」之元素。第二列1512b在行1520到1534顯 示具有在8與15之間的項索引的表格「arith_cf_ne[]」之項。 5.3.表格「arith_cf_ng[pki][545]」 第16(1)圖到第16(32)圖繪示一組32個累積頻率表格 「arith—cf_ng[pki][545]」,其中之一由一音訊編碼器1〇〇或 一音訊解碼器200選擇,例如,用於執行函數 「arith—decode()」,即,用於解碼群索引ng。第16(1)圖到第 16(32)圖中所示32個累積頻率表格其中被選擇者在執行函 數「arith_decode()」時採用表格「cum_freq[]」之功能。 如第16(1)圖到第16(32)圖所示,變量pki的不同值與不 同表格相關聯。一累積頻率表格索引pki=〇與一表格1601相 關聯,一累積頻率表格索引pki=l與表格1602相關聯,且累 積頻率表格索引pki=2、pki=3、......pki=31與表格1602到 1632相關聯。 與表格索引?1^丨=0到咏丨=32相關聯的表格1601到1632之 結構相等’所以僅與表格索引pki=〇相關聯的表格16〇1之表 示的結構將被詳細描述。表格丨6〇1包含一索引行1640,其 顯示與表格1601之各列相關聯的起始索引。一第一列被指 定為1642a ’而一第二列被指定為1642b。 一第一項行1650表示表格1601的項,其之項索引等於 各列在索引行1640中所示的起始索引。類似地,項行1651 到1665顯示具有如下項索引的表格1601(與表格索引pki=〇 54 201126508 相關聯)之項,該等項索引比各列在索引行1640中所示的起 始索引大1(行 1651)、2(行 1652)、3(行 1653)、4(行 1654)、 5(行 1655)、6(行 1656)、7(行 1657)、8(行 1658)、9(行 1659)、 1〇(行 1660)、11(行 1661)、12(行 1662)、13(行 1663)、14(行 1664)、15(行1665)。因此’第一列1632a顯示具有在〇(值 16684)與15(值6352)之間的元素索引的表格1601之項的 值。第二列1642b顯示具有16(值6202)與31(值3547)之間的 元素索引的表格1601之項值。 自然,上述規則也適用於其他列。並且,表格16(2)到 16(32)中的元素項之安排等於表格1601之項的安排。 5.4.表格「dgroups[]」 第17(1)圖及第17(2)圖顯示一表格「dgroups[]」的項之 表示’該表格可由音訊編碼器100及音訊解碼器200應用。 例如’表格「dgroups[]」可被用於執行演算法 「tuples_decode〇」,如第3圖所示。並且,表格「dgroups[]」 可由第5f圖中的演算法應用,以決定一群中的元素數目mm 及群索引ng指定的一群之群偏移og。 表格「dgroups[]」的表示包含一索引行πΐ〇,其顯示 與各列’例如該表格表示的一第一列1712a及一第二列 1712b相關聯的起始索引。表示的一第一項行172〇顯示表格 「dgroups[]」的項’其之元素索引等於各列在索引行171〇 中所示的起始索引。類似地,項行1722、1724、1726、1728、 1730、1732、1734顯示表格「dgroups[]」的項,元素索引 比各列在索引行1710中所示的起始索引大丨(行1722)、2(行 55 201126508 1724)、3(行 1726)、4(行 1728)、5(行 1730)、6(行 1732)或7(行 1734)。例如,第一列1712&顯示在項行1722到1734中的具 有在0(行1720)與7(行1734)之間的元素索引的表格 「dgroupS[]」之項。類似地,第二列1712b顯示在行172〇到 1734中具有在8(行1720)與15(行1734)之間的元素索引的表 格dgroups[]」之值項。應注意表格「dgroups[]」之值項 被示於第17(1)圖及第17(2)圖中的一十六進記法,該記法由 別置「Ox」指示。最顯著的十六進數位被示於左側,而最 不顯著的十六進數位被示於右側。 5.5j;^「dgvectors[]」 第18(1)圖到弟18(11)圖繪示表格「dgvectors[]」之項的 表格表示。表格「dgvectors[]」可例如被用於音訊編碼器1〇〇 或音说解碼器2〇〇。例如,表格「dgvectors[]」可被用於第3 圖所示函數「tuples一decode()」的步驟312d,或第5g圖的演 异法之執行。因此,表格「dgVectors[]」可被用以將一群索 引及元素所以映射至一頻譜值元組的一最顯著位元平面之 值。 第18(1)圖到第18(11)圖的表格表示包含一索引行 1810,其包含於表格表示諸列(例如第一列1812&或一第二 列1812b)相關聯的起始索引。項行1820到1882顯示表格 「dgvectors[]」之項,其項索引等於各列在索引行181〇所示 的一對應起始索引。後續項行1822到1882以昇冪顯示表格 「dgVectors[]」之項,該等項比各列在索引行181〇中所示的 起始索引值大 1(行 1822)、2(行1824)、3(行 1826)、4、5、6、 56 201126508 7、8、9、10、Η、......29(行 1878)、30(行 1880)或31(行 1882)。 因此,第一列1812a在行1820到1882顯示的具有0與31之間 的元素索引的表格「dgvectors[]」之項,其中與項相關聯之 元素索引從左到右單調增加。類似地,第二列1812b在行 1820與1882之間顯示具有32與63之間的元素索引的表格 「dgvectors[]」(元素索引從左到右增加)。 5.6^;j^regroupsj 第19(1)圖到第19(32)圖顯示一表格 「egroups[a][b][c][d]」的一表格表示,其也可被視為一具 有四個元素索引a、b、c、d的4維陣列。應注意各該元素索 引a、b、c、d可採用0與7之間的值。表格或陣列 「egroups[a][b][c][d]」可被用於音訊編碼器100或音訊解碼 器200。例如,表格或陣列「egroups[a][b][c][d]」可被用於 函數「arith_get_context()」來導出返回值,且可被用於函 數「arith_update_context()」來決定以項索引(1,1+j)的陣列q 之項「V」。 陣列「egroups」的項被示於64個表格1901到1964中。 索引a與b之不同組合與各該表格1901至1904相關聯。例 如,組合a=0與b=0與第一表格1901相關聯,而組合a=0,b=l 與第二表格1902相關聯。 此處應注意不同表格1901至1964之結構是相同的,所 以在本文中將僅討論第一表格1901之結構。表格1901包含 一表示與各列1972a到1972h相關聯的第三索引c之值的索 引行1970。類似地,一索引列1980表示與表格1901之各列 57 201126508 1982a到1982h相關聯的第四索引d之索引值。因此,與表格 1901到1964其中之一的項相關聯的索引^由各列的索引行 之值決疋’而一項的第四索引d由項的各行之索引列之值決 定。例如’表格1901之列1972表示索引a=0, b=0, c=〇及d=(0 到7)(從左到右)的陣列「egr0UpS[a][b][c][d]」之項。並且, 行1982a表示從頂部到底部a=〇,b=〇,c=(〇到7)及d=〇的陣 列「egroups[a][b][c][d]」之項。並且,應注意陣列「egr〇ups」 之項以十/、進5己法表示,其以前置「〇x」指示。 6.性能評估及優勢 依據本發明之實施例使用一組更新的表格,如上所 述,當與一先前使用組的表格相比,它們大大減少頻譜低 雜訊編碼的記憶體需求。無損轉碼在位元率限制上是可能 的0 在下文中,發明的概念下的一先前使用的低雜訊編碼 之修改將被討論。 音sfl編碼概念,諸如例如所謂的統一語言及音訊編碼 器(USAC)使用一上下文適用算術編碼器(及解碼器)用於低 雜訊(或無損)編碼量化的頻譜係數(例如頻譜係數252)。與 异術編解碼益(編碼器或解碼器)相關聯的上下文適用允 許達到高低雜訊編碼性能。此技術的主要缺點來自於其在 一記憶體需求上的相對高複雜性,事實上,上下文適用要 求一相當大的組模型化不同機率分佈。先前使用的統一語 言及音訊編碼器(USAC)之惟讀記憶體(R〇M)消耗被評估為 大約150kWords ’其中熵編碼器表示總需求的大約73%。 58 201126508 本文的目的之—是接φ (或解碼ϋ),ι㈣二 表格用於算術編碼器 …轉低雜訊編解碼器 始性能__目當少__。解細之原 前狀Ϊ:文中,先前被實施的統-語言及音訊編碼器之目 狀怎將被描述。第7圖繪示 杂 ώ先則貝施的USAC低雜 解馬4編碼益或解碼器)的詳細記憶體需求的表格。 评圖之表格易於觀察到目前最需要的表格是與群 ^碼的上下文適用相關的表格「anth_(ng_hash[]」及 a=th_ef_ng[][]」。值得注意的是聚集元素㈣符號的累積 '「:<·及剩餘位元平面符號的表格「adth_cf_ne[]」及 a__Cf—r[]」可易於被代數地回覆,且不—定需要被儲存。 在下文中’關於該提出的新組減少的表格之一些細節 將被為述。依據本發明,建議以第14圖及第仏⑴圖到第 =32)圖提出的新組表格替換先前使用的表格 anth—cf一ng_hash[]」及 r arkh—cf_ng[][]」。新表格 arith__cf_ng_hash[]」及「arith cf ng[pki][545]」與先前使 用的表格(例如用於USAC參考模型中的表格)相比展現減 少的尺寸’且隨後被稱為r減少的表格」。使用減少表格的 低雜訊編解碼器(編碼器或解碼器)之記憶體需求被詳細列 在第8圖的表格中。 該新表格組相較於原始組顯示一大約為7的尺寸減少 因數。該減少可藉由減少機率分佈模型之數目,且藉由最 佳化選擇的模型而被達到。另外,上下文狀態與新定義的 模型之間的映射被最佳化。 59 201126508 在下文中’關於位元率的一性能評估將被提出。特別 地’-使用先前應用的「大」表格被產生的位元流,與依 據提出的「較小」表格提供的-轉碼位元流之間的一無損 轉碼將被討論。該触表格被證明能夠清晰地轉碼一以先 前使用的表格產生的位元流,該等表格也被稱為「參考模 型0表格」或「RMO表格」。轉碼是使用第9圖中描述的轉碼 方案被實行。 第9圖顯示從「參考模型〇表格」到「減少的表格」的 一無損轉碼之一方塊示意圖。如第9圖所示,評估設定包含 一 USAC RMO編碼器91〇,其接受頻譜值9〇8作為一項資 sfl,且使用參考模型〇的「舊」表格提供該等頻譜值的一算 術編碼的表示912。因此,一所謂的rRM〇」位元流被獲得, 其包含异術編碼頻譜值912。評估設定9〇〇也包含無損轉碼 920 ’其中RM0位元流的算術編碼頻譜值912由一熵解碼器 使用「舊」表格(參考模型〇表格)被解碼,以獲得解碼的頻 譜值924。該無損轉碼也包含一熵編碼器,其被組態成使用 「新」減少的表格編碼解碼頻譜值924,以獲得一減少的表 格位元流928。隨後,包含使用「舊」表格被編碼的頻譜值 之表示的RM0位元流與包含使用「新」表格被編碼的頻譜 值之表示的減少表格位元流928相比。 爲了證明之目的’所謂的RM0位元流使用一USAC參考 解碼器且使用「舊」表格被解碼,以獲得一所謂的RM0合 成結果942。並且,所謂的減少表格位元流928使用一USAC 參考解碼器以及使用「新」表格被解碼。因此,一減少表 60 201126508 格合成結果952被獲得。RM0合成結果942隨後與減少表格 之合成結果952比較以證實實施之正確性。 在下文中,分析之某些結果將被描述。第10圖及第11 圖之表格分別以RMO表格及減少表格顯示完整、序連、編 碼項的所有次段上的最小、最大及平均位元率。對於每一 運算模式’一個別次段長度由後續存取單元制組合決定, 該存取單元制長度接近100ms。次段長度及相應的位元率被 列在兩個表格中。上文提到,從RM〇表格位元流到減少表 才。位元流的無損轉碼對於每一運算模式被達到,即,在獲 得位元擷取合成時,位元儲存條件未被違反。第12圖之 表格比較當使用R Μ 0表格與減少表格時僅由核心編解碼器 產生的位元率。除了每秒64kWt(kbps)立體聲外,減少的表 格對母—運#模式在執行上平均比r则表格為佳,位元的 平均増長僅為64kbps_.G2%,其相對應於大約Q 5位元/訊 U而以減)表格產生的位元流仍匹配在此運算模式 的位元率需求。 第13圖之表格顯示對於各該運算模式而言,在將麵 ^格位7^流轉碼成減少表格位元流之後或之前聚集的位元 ^的最差及最佳情況1差被__次段基礎。可觀察 6在所有情財,減少表格雜能是極其—致的且極其穩 疋。對於一次段而言,夢由 -^ 5由以減少表格替換RM0表格的位 加在總位元率的㈣以下。另—方面,位元之減少 可達到多於6%。 所述USAC頻,酱低雜訊編碼的新表格被提議,其 61 201126508 之尺寸被大大減)’同時維持頻譜低雜訊編碼模組的高編 碼性能。達到的尺寸減少大約為-7的比*,或大於 90kWordS。提出的新組表格允許大大減少記憶體需求,且 因此降低實施複雜性。當崎合成輸出波形時,—位元精 確性對於每一運算模式被維持。 上述優勢藉由改進算術編碼的一特定部份被達到。在 二實施例中’—新散列表 ’ adth_ef_ng_hash[128]、-32 . arith^ng[32][545] , th_get』k()」被使用。算術編解碼器表格(當與先前使 用的算術編解碼絲格相比時)的更新改變狀態索引s到機 率㈣索邮及機率模型本身的映射。其不改變狀態索引 ★也不改^機率索引此後被使用編碼目前符合的方 法(即’目前4元組的群索引ng)。 :表格的主要優勢是減少用於儲存表格的記憶體需 字的尺現在具有大約15kW〇nlS(即,每32個位元15*1024個 —*寸而不疋大約110kW〇rds,同時維持編碼效率。 f來上述改進的重要層面之-是狀態索引與機率模型 奋 x月的映射。此映射藉由以狀態變l:t作為一輸入 啤Η函數anth—get』k()而被完成。 返::是機率模型索引W,其由算術編解碼 器使用為 ’”別付嬈(或從—組累積頻率表格選擇適當累積頻率 表格)的機率分佈(也稱為累積頻率)。One of the particularly advantageous implementations is shown in the table of Figure 14. Here, the item "arith_cf-ng hash[]" listed in the table of Figure 14 refers to a one-dimensional integer type index (also known as "element index" or "= column index". ), for example, is designated as "i". As shown, the first line 14 ι is an index line 'describes the starting index associated with the respective columns 1412a through 1412p. A first value line 1420 displays the item value of the table "arith_cf-ngJlash[]" equal to the start index shown in index line 1310. A second value seven 1422 displays an entry for the table "arith_cf_ng_hash" which describes the table "arith_Cf_ng_hash" as compared to the start index shown in row 1410. The element index is 2 greater than the starting index shown in row 141〇52 201126508. Similarly, rows 1426, 1428, 1430, 1432, 1434 bind the entries of the table "arith_cf_ng_hash" whose element index is 3 (rows 1526), 4 (rows) greater than the starting index shown by row 1410 for each column 1412a through 1412p. 1428), 5 (row 1430), 6 (row 1432), or 7 (row 1434). For example, column 1412a displays the entries of the table "arith_cf_ng_hash" with element indices 0, 1, 2, 3, 4, 5, 6, and 7 in rows 1420 through 1434. Similarly, column 1412b displays the entries of the table "arith_cf_ng_hash" with element indices 8, 9, 10, 11, 12, 13, 14, and 15 at rows 1420 through 1434. For the other columns 1412c to 1412p, the arrangement of the above items is similar. 5.2•Form “arith_cf_ne” Tables 15(1) to 15(10) show the table representation of the item “arith_cf_ne” in the table. When the element index ne is decoded, it is evaluated by the function “arith_decode()”. As shown, the entry in the table "arith_cf_ne" contains an index between 0 and 2699. A starting index row 1510 describes the starting chain associated with each column. A first column is designated, for example, 1512a, and a second column is designated, for example, 1512b. A first line 1520 shows the entry for the table "arith_cf_ne" whose index is determined by the values listed in the starting index row 1510 of the corresponding column (e.g., columns 1512, 1512b). Item rows 1522, 1524, 1526, 1528, 1530, 1532, 1534 display a table "arith_cf_ne[]" with an element index, which is in the starting index row 1510 for each column (eg, columns 1512a, 1512b). The starting index shown is 1 (row 1522), 2 (row 1524), 3 (row 1526), 4 (line 53 201126508 1528), 5 (row 1530), 6 (row 1532), or 7 (row 1534). . Therefore, the first column 1512a displays elements of the table "arith-cf__ne" having an item index between 〇 and 7 in rows 1520 to 1534. The second column 1512b displays entries of the table "arith_cf_ne[]" having an item index between 8 and 15 in rows 1520 through 1534. 5.3. Form "arith_cf_ng[pki][545]" Figures 16(1) through 16(32) show a set of 32 cumulative frequency tables "arith_cf_ng[pki][545]", one of which consists of An audio encoder 1 or an audio decoder 200 selects, for example, the function "arith_decode()", that is, for decoding the group index ng. The 32 cumulative frequency tables shown in Figs. 16(1) to 16(32) have the function of the table "cum_freq[]" when the selector performs the function "arith_decode()". As shown in Figures 16(1) through 16(32), the different values of the variable pki are associated with different tables. A cumulative frequency table index pki=〇 is associated with a table 1601, a cumulative frequency table index pki=l is associated with the table 1602, and the cumulative frequency table index pki=2, pki=3, ... pki= 31 is associated with tables 1602 through 1632. With a table index? The structure of the tables 1601 to 1632 associated with 1^丨=0 to 咏丨=32 is equal' so that only the structure represented by the table 16〇1 associated with the table index pki=〇 will be described in detail. Table 丨6〇1 contains an index row 1640 that displays the starting index associated with each column of table 1601. A first column is designated as 1642a' and a second column is designated as 1642b. A first item row 1650 represents the entry of table 1601 with an item index equal to the starting index of each column shown in index row 1640. Similarly, item rows 1651 through 1665 display items of table 1601 (associated with table index pki=〇54 201126508) having an index that is larger than the starting index shown in index row 1640 for each column. 1 (row 1651), 2 (row 1652), 3 (row 1653), 4 (row 1654), 5 (row 1655), 6 (row 1656), 7 (row 1657), 8 (row 1658), 9 ( Lines 1659), 1 (rows 1660), 11 (rows 1661), 12 (rows 1662), 13 (rows 1663), 14 (rows 1664), 15 (rows 1665). Thus the 'first column 1632a' displays the value of the entry of the table 1601 with the element index between 〇 (values 16684) and 15 (value 6352). The second column 1642b displays the value of the table 1601 having an index of elements between 16 (value 6202) and 31 (value 3547). Naturally, the above rules also apply to other columns. Also, the arrangement of the element items in Tables 16(2) through 16(32) is equal to the arrangement of the items in Table 1601. 5.4. Table "dgroups[]" Figures 17(1) and 17(2) show a representation of the item "dgroups[]". The table can be applied by the audio encoder 100 and the audio decoder 200. For example, the 'table "dgroups[]" can be used to execute the algorithm "tuples_decode", as shown in Figure 3. Also, the table "dgroups[]" can be applied by the algorithm in Fig. 5f to determine the number of elements mm in a group and the group offset og of a group specified by the group index ng. The representation of the table "dgroups[]" contains an index row π ΐ〇 which displays the starting index associated with each column 'e.g., a first column 1712a and a second column 1712b represented by the table. A first line 172 of the representation indicates that the item of the table "dgroups[]" has an element index equal to the starting index of each column as shown in index line 171A. Similarly, item rows 1722, 1724, 1726, 1728, 1730, 1732, 1734 display entries for the table "dgroups[]", the element index is greater than the starting index of each column shown in index row 1710 (line 1722). 2 (row 55 201126508 1724), 3 (row 1726), 4 (row 1728), 5 (row 1730), 6 (row 1732), or 7 (row 1734). For example, the first column 1712& displays the entry of the table "dgroupS[]" having the element index between 0 (rows 1720) and 7 (row 1734) in the item rows 1722 to 1734. Similarly, the second column 1712b displays the value entries for the table dgroups[]" having the element index between 8 (rows 1720) and 15 (row 1734) in rows 172 〇 through 1734. It should be noted that the value of the table "dgroups[]" is shown in the 16th (1) and 17th (2) figures, and the notation is indicated by "Ox". The most significant hexadecimal digits are shown on the left and the least significant hexadecimal digits are shown on the right. 5.5j;^"dgvectors[]" Figure 18(1) to Di 18(11) shows a table representation of the item "dgvectors[]". The table "dgvectors[]" can be used, for example, for an audio encoder 1 or a speech decoder 2 . For example, the table "dgvectors[]" can be used in step 312d of the function "tuples-decode()" shown in Fig. 3, or in the execution of the variant of the 5g map. Thus, the table "dgVectors[]" can be used to map a set of indices and elements to a value of a most significant bit plane of a spectral value tuple. The table representations of Figures 18(1) through 18(11) include an index row 1810 that is included in the table to indicate the starting index associated with the columns (e.g., the first column 1812& or a second column 1812b). Item lines 1820 through 1882 display the entry of the table "dgvectors[]" with an item index equal to a corresponding starting index for each column as indicated by index line 181'. Subsequent item lines 1822 through 1882 display the item "dgVectors[]" in ascending power, which is greater than the starting index value shown in index line 181 of each column by one (row 1822), 2 (row 1824). 3 (row 1826), 4, 5, 6, 56 201126508 7, 8, 9, 10, Η, ... 29 (row 1878), 30 (row 1880) or 31 (row 1882). Thus, the first column 1812a displays the entry of the table "dgvectors[]" with the element index between 0 and 31 displayed in rows 1820 through 1882, where the element index associated with the item monotonically increases from left to right. Similarly, the second column 1812b displays a table "dgvectors[]" with an element index between 32 and 63 between rows 1820 and 1882 (the element index increases from left to right). 5.6^;j^regroupsj Figures 19(1) through 19(32) show a table representation of the table "egroups[a][b][c][d]", which can also be considered as having A four-dimensional array of four elements indexing a, b, c, d. It should be noted that each of the element indices a, b, c, d can take a value between 0 and 7. The table or array "egroups[a][b][c][d]" can be used for the audio encoder 100 or the audio decoder 200. For example, the table or array "egroups[a][b][c][d]" can be used in the function "arith_get_context()" to derive the return value, and can be used in the function "arith_update_context()" to determine the item. The item "V" of the array q of the index (1, 1+j). The items of the array "egroups" are shown in 64 tables 1901 to 1964. Different combinations of indices a and b are associated with each of the tables 1901 to 1904. For example, the combination a=0 and b=0 are associated with the first table 1901, and the combination a=0, b=l is associated with the second table 1902. It should be noted here that the structures of the different tables 1901 to 1964 are the same, so only the structure of the first table 1901 will be discussed herein. Table 1901 contains a navigation line 1970 representing the value of the third index c associated with each column 1972a through 1972h. Similarly, an index column 1980 represents the index value of the fourth index d associated with each column 57 201126508 1982a through 1982h of table 1901. Therefore, the index associated with the item of one of the tables 1901 to 1964 is determined by the value of the index row of each column and the fourth index d of the item is determined by the value of the index column of each row of the item. For example, 'Table 190, column 1972, represents the array "egr0UpS[a][b][c][d] for the indexes a=0, b=0, c=〇 and d=(0 to 7) (from left to right). Item. Also, line 1982a represents an item "egroups[a][b][c][d]" from top to bottom a = 〇, b = 〇, c = (〇 to 7) and d = 〇. Also, it should be noted that the item "egr〇ups" of the array is represented by the ten/, five-input method, which is previously set to "〇x". 6. Performance Evaluation and Advantages An embodiment of the present invention uses a set of updated tables, as described above, which greatly reduces the memory requirements of spectrally low noise encoding when compared to a previously used group of tables. Lossless transcoding is possible in terms of bit rate limits. In the following, a modification of a previously used low noise code under the concept of the invention will be discussed. Tone sfl coding concepts such as, for example, the so-called Unified Language and Audio Encoder (USAC) use a context-applicable arithmetic coder (and decoder) for low noise (or lossless) coding of quantized spectral coefficients (eg, spectral coefficients 252) . The contextual application associated with the different codec (encoder or decoder) allows for high and low noise coding performance. The main drawback of this technique stems from its relatively high complexity in a memory requirement. In fact, contextual application requires a relatively large group to model different probability distributions. The read-only memory (R〇M) consumption of the previously used Unified Language and Audio Encoder (USAC) was evaluated to be approximately 150 kWords' where the entropy encoder represents approximately 73% of the total demand. 58 201126508 The purpose of this paper is to connect φ (or decode ϋ), ι (four) two tables for arithmetic coder ... turn down the noise codec. The initial performance __ is less __. The original case: In the text, the purpose of the previously implemented system-language and audio encoder will be described. Figure 7 is a table showing the detailed memory requirements of the USAC low miscellaneous horse 4 coding benefit or decoder of Miscellaneous. The table of the map is easy to observe that the most needed form is the table "anth_(ng_hash[]" and a=th_ef_ng[][]" related to the context of the group code. It is worth noting that the aggregation of the elements (four) is cumulative. '": <· and the table of the remaining bit plane symbols "adth_cf_ne[]" and a__Cf_r[]" can be easily replied algebraically, and do not need to be stored. In the following 'about the proposed new group Some details of the reduced form will be described. In accordance with the present invention, it is proposed to replace the previously used form anth_cf_ng_hash[] with the new set of tables presented in Figure 14 and Figures (1) through (32). r arkh—cf_ng[][]”. The new forms arith__cf_ng_hash[]" and "arith cf ng[pki][545]" exhibit reduced size compared to previously used forms (eg, for tables in the USAC reference model) and are subsequently referred to as r-reduced tables "." Memory requirements for low noise codecs (encoders or decoders) using reduced tables are detailed in the table in Figure 8. The new table group displays a size reduction factor of approximately 7 compared to the original group. This reduction can be achieved by reducing the number of probability distribution models and by optimizing the selected model. In addition, the mapping between the context state and the newly defined model is optimized. 59 201126508 In the following 'a performance evaluation on the bit rate will be proposed. In particular, a lossless transcoding between the bit stream generated using the previously applied "large" table and the -transcoded bit stream provided in accordance with the "smaller" form proposed will be discussed. The touch table is shown to be able to clearly transcode a bit stream generated by a previously used form, which is also referred to as a "reference model 0 form" or "RMO form". The transcoding is carried out using the transcoding scheme described in Figure 9. Figure 9 shows a block diagram of a lossless transcoding from the Reference Model 〇 Table to the Reduced Table. As shown in Fig. 9, the evaluation setting includes a USAC RMO encoder 91〇, which accepts the spectral value 9〇8 as a sfl, and provides an arithmetic coding of the spectral values using the “old” table of the reference model 〇 Representation 912. Thus, a so-called rRM〇" bit stream is obtained, which contains the isotactic encoded spectral value 912. The evaluation setting 9 〇〇 also includes lossless transcoding 920 'where the arithmetically encoded spectral value 912 of the RM0 bit stream is decoded by an entropy decoder using an "old" table (reference model 〇 table) to obtain a decoded spectral value 924. The lossless transcoding also includes an entropy coder configured to decode the spectral value 924 using a "new" reduced table code to obtain a reduced table bit stream 928. The RM0 bitstream containing the representation of the spectral values encoded using the "old" table is then compared to the reduced table bitstream 928 containing the representation of the spectral values encoded using the "new" table. For the purpose of proof, the so-called RM0 bit stream is decoded using a USAC reference decoder and using an "old" table to obtain a so-called RM0 synthesis result 942. Also, the so-called reduced table bitstream 928 is decoded using a USAC reference decoder and using a "new" table. Therefore, a reduction of the table 60 201126508 lattice synthesis result 952 was obtained. The RM0 synthesis result 942 is then compared to the reduction table synthesis result 952 to confirm the correctness of the implementation. In the following, some of the results of the analysis will be described. The tables in Figures 10 and 11 show the minimum, maximum and average bit rates on all sub-segments of complete, sequential, and coded items in RMO and reduced tables, respectively. For each mode of operation, the length of a different segment is determined by the combination of subsequent access units, which is approximately 100 ms in length. The length of the sub-segment and the corresponding bit rate are listed in two tables. As mentioned above, flow from RM〇 table bits to reduced tables. The lossless transcoding of the bit stream is achieved for each mode of operation, i.e., the bit storage condition is not violated when the bit extraction synthesis is obtained. The table in Figure 12 compares the bit rates produced by the core codec only when using the R Μ 0 table and reducing the table. In addition to the 64 kWt (kbps) per second stereo, the reduced table-to-mother-transport mode is better in the implementation than the r, and the average length of the bit is only 64 kbps_.G2%, which corresponds to approximately Q 5 bits. The bit stream generated by the meta/signal U and the subtracted table still matches the bit rate requirement in this mode of operation. The table in Fig. 13 shows that for each of the operation modes, the worst and best case 1 difference of the bit ^ aggregated after or after transcoding the face stream into the reduced table bit stream is __ Subsection basis. Observable 6 In all the wealth, reducing the form genus is extremely convincing and extremely stable. For a segment, the dream is replaced by (4) the total bit rate by the bit that replaces the RM0 table with the reduced table. On the other hand, the reduction of bits can reach more than 6%. The USAC frequency, a new table of low noise coding for sauces is proposed, the size of which is greatly reduced by 61 201126508 while maintaining the high coding performance of the spectrum low noise coding module. The size reduction achieved is approximately -7 ratio*, or greater than 90kWordS. The proposed new set of tables allows for a significant reduction in memory requirements and therefore reduces implementation complexity. When Saki synthesizes the output waveform, the bit precision is maintained for each operation mode. The above advantages are achieved by improving a specific part of the arithmetic coding. In the second embodiment, 'new hash table' adth_ef_ng_hash[128], -32. arith^ng[32][545], th_get』k()" is used. The update of the arithmetic codec table (when compared to the previously used arithmetic codec silk) changes the state index s to the probability (4) mapping of the mail and probability model itself. It does not change the state index. ★ Does not change the probability index. This is followed by the encoding method currently used (ie, the current 4-tuple group index ng). The main advantage of the table is that the size of the memory required to store the table is now about 15 kW 〇 nlS (ie, 15 * 1024 - every inch of 32 bits - * inch instead of about 110 kW 〇 rds while maintaining the code Efficiency. The important aspect of the above improvement is the mapping of the state index to the probability model. This mapping is done by changing the state: l:t as an input beer function anth_get』k(). The return: is the probability model index W, which is used by the arithmetic codec as the probability distribution (also referred to as the cumulative frequency) of the 'Do not pay (or select the appropriate cumulative frequency table from the -group cumulative frequency table).

、射、兩個步驟完成’該兩個步驟以函數中兩個不同 部份被完成。 N 62 201126508 在(函數「arith_get—Pk()」)第一部份540d,如果目前狀 態t是一顯著狀態,散列表rarith—cf_ng_hash」(也稱為 ari_pk_hash[])被考慮且被用於核對。一顯著狀態是在一訓 練階段被選擇使其「本身」映射至pki的狀態◊非顯著狀態 稍後在函數arith一get—pk()的第二部份550e使用預設映射(也 稱為基於範圍的映射)被映射至pki。依據本發明,顯著狀態 之數目被減少至67’其大大低於來自舊表格的先前22955個 顯著狀態。藉由減少顯著狀態之數目,表格 「arith_cf—ng_hash[]」(也稱為ari_pk_hash[])之尺寸被成比 例減少。並且希望性能將被影響。然而,性能可藉由保證 訓練被精確完成而被維持。該訓練可在一切換編碼結構中 針對AAC以及TCX被執行。此等預防當產生舊表格及保留 6 7個狀態是對於一非切換音訊編解碼器以及對於一切換音 sfL編解碼裔的最有用狀I時被採用。 下文編碼允許檢測目前狀態t是否是一顯著狀態。如果 是,則函數在相關聯機率模型索引pki返回中結束(因為返回 狀態被達到): i=63 *t; for (;;) ί j=ari_pk_hash[i&127]; if (j= = 〇xFFFFFFFFul) break; if ( = = ^) return j&255; /++,. 63 201126508 在上述編碼(又見參考數字^(^^的第一列的係數63決 疋散列表搜尋中的起始點。在訓練階段其使用一全局最佳 化被固定。在散列表中,狀態索引t在項之24個「最後」位 元(例如項之24個最左或最顯著位元)上被編碼。「第_」位 7C(例如,八個最右或最不顯著位元)相對應於關聯機率索 引。在一簡單實施中,表格僅包含67個項。然而,爲了最 小化對表格的存取數目,一溢出機制被使用。事實上值 OxFFFFFFFFul的128-67個溢出符號被插入且當一狀態不顯 著時,允許減少對表格的存取數目。在此情況中,當此一 付5虎遇到時,編碼直接跳至函數「arith—get_pk()」的一第 二部份540e(因為「中斷」狀態在此情況中被達到)。 映射函數的540e之第二部份被用於映射不顯著狀態。 該等狀態依據索引被分成七個不統一段。各段與一機率模 型相關聯。該映射被記錄在查找表psci□中。在第22個位元 t之外的位元指示位準預測之準確度。依據預測準確度,一 不同映射被使用。總計七段與四個準確度被考慮,它們對 應於儲存在psci□中的總計28個不同映射。該映射依下文被 完成: p=psci+7*(t>>22); j= t & 4194303; if (j<436961 ) i if( j<252001 ) return p[(j<243001 )?〇:!]; else return 64 201126508 p[(j<288993)?2:3]; } else { if ( j<1609865 ) return p[(j<880865)?4:5]; else return p[6]; } 最終,該等機率模型從128被減少至32。已發現許多先 前使用的模型僅很少被選擇或並不真正有用。一適當訓練 允許選擇僅32個表現模型。 綜上所述,依據本發明之實施例係有關於上述映射表 格,它們在一音訊編碼器或一音訊解碼器中被實施。新表 格之優勢部份來自於被執行的充分訓練。另外,本發明是 基於有關表格之最佳尺寸的考慮。 7.位元流語法 在下文中,一攜帶算術編碼頻譜資訊的位元流之位元 流語法將參考第6a圖到第6h圖被描述。 第6a圖繪示一所謂的USAC列資料塊 (「usac_raw_data_block()」)的一語法表示。 202USAC列資料塊包含一或一個以上單通道元素 (「single_channel_element()」)及/或一或一個以上通道對元 素(「channel_pair_element()」)° 現在參考第6b圖,一單通道元素的語法被描述。該單 通道元素依賴核心模式包含一線性預測域通道串流 65 201126508 (「lpd_channel_stream()」),或一頻域通道串流 (「fd_channel_stream()」)。 第6c圖繪示一通道對元素的一語法表示。一通道對元 素包含核心模式資訊(「core_mode0」,「core_model」)。 另外,通道對元素可包含一組態資訊「ics_info〇」。另外, 取決於核心模式資訊,通道對元素包含一線性預測域通道 串流或與該等通道的一第一個相關聯的一頻域通道串流’ 且該通道對元素也包含一線性預測域通道串流或與該等通 道的一第二個相關聯的一頻域通道串流。 組態資訊「ics_info()」,第6d圖所示的一語法表示’包 含複數個不同組態資訊項,它們與本發明不是特別相關。 一頻域通道串流(「fd_channel_stream〇」),第6e圖所 示的一語法表示,包含一增益資訊(「global—gain」)及一組 態資訊(「ics_info〇」)。另外,頻域通道串流包含比例因數 資訊(「scale_factor_data()」),其描述用於縮放不同比例因 數頻帶之頻譜值的比例因數,且例如由縮放器150及重新縮 放器240應用。頻域通道串流也包含算術編碼頻譜資料 (「ac_spectral—data〇」),其表示算術編碼頻譜值。 算術編碼頻譜資料(「ac_spectral_data()」),第6f圖所 示的一語法表示,包含一可任選算術重設旗標 (「arith_reset_flag」),其被用於選擇性地重設上下文,如 上所述。另外,算術編碼頻譜資料包含複數個算術資料塊 (「arith_data」),其攜帶算術編碼頻譜值。算術編碼資料 塊之結構取決於頻帶之數目(以變量「num_bands」表示)’ 66 201126508 且也取決於算術重設旗標,將在下文中討論。 算術編碼資料塊之結構將參考第6g圖被描述,第㈣ 顯=該等算術編碼資料塊之—語法絲。細編碼資料塊 的資料表示取決於要被編碼的頻譜值之數目匕,算術重抓 旗標之狀態,且也取決於上下文,即,先前編碼的頻譜值。 用於編碼目前組的頻譜值的上下文依據參考數字66〇 所示的上下文決定演算法被決定。算術料資料塊包含㈣ 組的碼字,各組碼字表示-頻譜值組碼字包含一 算術碼字rac〇d_ng[pki][ng]」,其表示使用丨與加位元之間 的一頻譜值元組的一群索引ng。如果包含—頻譜值元組的 群包括多於一個元素,則一組碼字也包含一算術碼字 「acod_ne[ne]」,其表示該頻譜值元組的一元素索引时。另 外,如果該頻譜值元組需要比一正確表示的最顯著位元平 面要求更多位元平面,該組碼字包含一或一個以上碼字 「acod~r[][][][]」。碼字「acod_ne[ne]」表示使用!與2〇位元 之間的元素索引,且碼字「ac〇d_r[][][][]」表示使用1與2〇 位元之間的一較不顯著位元平面。 然而,如果一或一個以上較不顯著位元平面被要求(除 最顯著位元平面之外)用於該頻譜值元組的一適當表示,其 藉由使用—或一個以上算術溢出碼字(「ARITH_ESCAPE」) 被發信。因此,可大體而言對於一頻譜值元組來說,決定 了所需要位元平面(最顯著位元平面,且,可能一或一個以 上附加較不顯著位元平面)之數目。如果一或一個以上較不 顯著位元平面被需要,其藉由一或一個以上算術溢出碼字 67 201126508 「aC〇d_ng[Pki][ARITH—ESCAPE]」被發信,溢出碼字依據 一目前選擇的累積頻率表格,變量pki提供的—累積頻率表 格索引被編碼。另外,如果一或一個以上算術溢出碼字被 包括在位元流中,該上下文適用,如參考數字664、662所 示。接隨一或一個以上算術溢出碼字之後,一算術碼字 「aC〇d_ng[Pki][ng]」被包括在位元流中,如參考數字663 所示,其中pki表示目前有效機率模型索引(考慮到包括算術 溢出碼字產生的上下文剌)’且其中ng表示與要被編碼的 該頻譜值元組之最顯著位元平面相關聯的群索引。該群索 引可在一編碼器中藉由評估表格dg向量被導出,該表格电 向量允許當與表格「dgroups」組合時導出與一頻譜值元組 相關聯的群索引ng及一元素索引ne。 如果包括該欲被編碼的頻譜值元組的群包含多於一個 元素,算術編碼資料塊包含使用一適當選擇的累積頻率表 格編碼元素索引ne的碼字「acod_ne[ng]」。 如上所述,任何較不顯著位元平面的存在導致一或一 個以上碼子ac〇d_r[] [][][]」的存在,各表示—較不顯著位 元平面的一元組4位元。該一或一個以上碼字 「acod一!·[][][][]」依據一對應累積頻率表格被編碼,該累積 頻率表格是恒定的且上下文無關的。 另外,應注意該上下文在編碼各頻譜值元組之後被更 新,如參考數字668所示,使得該上下文對於編碼兩個後續 頻譜值元組典型地不同。 第6h圖顯示定義以及定義算術編碼資料塊之語法之輔 201126508 助元素的說明。 综上所述,-位元流格式被描述,其可由音訊編碼器 100提供,且可由音訊解碼器200評估。算術編碼頻譜值之 位元流被編碼使得其適合解碼上述演算法。 另外,應大體注意到編碼是解竭的反運算,所以可大 體假定該編碼器使用上述表格執行—表格查找,這近似為 由解碼器執行的表格㈣之逆反。大體上,可以說瞭解解 碼演算法及/或所需位元流語法的該技藝中具有通常知識 者將容易地能卜算術編碼器,這提供在位元流語法 中定義的且算術解碼器需要的資料。 8.解碼方法 i在下文巾,—種基於—編碼音訊:#訊提供—解碼音訊 資讯的方法將參考第20圖被描述。方法2〇〇〇包含—某於頻 譜值的一算術編碼表示提供複數個解碼頻譜值的第1步驟 細。方法2_進__步包含—使崎碼頻譜值提供一時域 音訊表示的第二步驟2020。步驟2〇1〇包含依賴—狀離索引 從-組32個累積頻率表格選擇2G12_累積頻率表格:步驟 2010也包含一應用選擇的累積頻率表格從表示群索引的一 可變長度碼字導出-群索引的子步獅14。步繼刚也包 含使用群索引及元素索引導出一頻譜值元組的—最顯著位 元平面之值的子步驟2016,該元素索引表示群索引選擇的 一群中的-it素。步驟2_也包含—使用該頻譜值^組的 最顯著位元平面之值提供一解碼頻譜值元組的子步驟 2018。 69 201126508 9. 編碼方法 在下文中,一種基於一輸入音訊資訊提供一編碼音訊 資訊的方法將參考第21圖被討論。第21圖之方法2100包含 一基於輸入音訊資訊的一時域音訊表示提供一頻域音訊表 示的第一步驟,使得該頻域音訊表示包含一組頻譜值,且 使得一能量被集中於一子組頻譜值。方法2100也包含一編 碼該組頻譜值的一元組相鄰頻譜值,或編碼該組頻譜值的 一預處理版本的一相鄰頻譜值元組的第二步驟2120。步驟 2120包含一將一元組頻譜值的一最顯著位元平面之值映射 至一群索引及一元素索引的子步驟2122,該元素索引表示 群索引選擇的一群中的一元素。步驟2120也包含一依賴一 描述算術編碼器之狀態的狀態索引從一組32個累積頻率表 格中選擇一累積頻率表格的子步驟2124。步驟2120也包含 一使用選擇的累積頻率表格(在子步驟2124中選擇的)算術 編碼群索引以獲得一算術編碼可變長度碼字的子步驟 2126。 此處應注意第20圖及第21圖的方法2000及2100可由本 文關於發明的編碼方案描述的任何特徵及功能補充。另 外,方法2000及2100可由本文所討論設備的任何特徵及功 能補充。並且,本文描述的編碼表格較佳地與編碼方法及 解碼方法一起使用。 10. 實施替代方案 雖然某些層面以一設備為背景被描述,很明顯,此等 層面也表示對應方法的一描述,其中一方塊或裝置相對應 70 201126508 2一方法步驟或—方法步驟的特徵。_地,在-方法步 P之上下文中描述的層面也表示—對應方塊或-對應設備 7目或特徵的—描述。某些或所有的該等方法步驟可藉 由(或使用)一硬體設備’諸如例如1處理器、-可程式電 月旬或-電子電路被執行。在_些實施例中,某—或一個以 上的最重要方法步驟可由此—設備執行。 u的u g聽射破儲存於—數位儲存媒體或可 專輸媒體上諸如一無線傳輸媒體或一有線傳輸媒體諸 如網際網路被傳送。 〜據ί — 實知要求,本發明之實施例可以硬體或軟體 2 h &可❹_具有f子可讀控制信號儲存於其上 術儲存媒體’例如-軟碟、-軸、一藍光、一 DC、 PR〇M、— EPR〇M、— EEPROM 或-FLASH記 執行,該等數位儲存媒體與—可程式電腦系統協作 $此。與其協作)使得各方法被執行。· 可以是電腦可讀的。 予午篮 依據本發明之一此會祐加—人 ㈣H辦 二貫〜例包含-具有電子可讀控制信 們能夠以一可程式電腦系統運算,使得 本文描述的该專方法之—被執行。 腦產。^明之實施例可以—程式碼被實施成一電 =…當該電腦程式產品執行於一電腦之上時,該 執行該等方法之-。該程式碼可例如被 儲存於一機益可讀載體上。 八他實知例包含用於執行本文所述方法之—儲存於一 71 201126508 機器可讀載體上的電腦程式。 因此,換句話說,發明的方法之一實施例是一具有一 程式碼的電腦程式,當其執行於一電腦之上時執行本文所 述方法之一。 因此,發明的方法之一另外的實施例是一資料載體(或 一數位儲存媒體,或一電腦可讀媒體),其包含儲存於其上 用於執行本文所述方法之一的電腦程式。 因此,發明的方法之一另外的實施例是一資料串流或 一序列表示執行本文所述方法之一的電腦程式的信號。該 資料串流或該序列信號可例如被組態成經由一資料通信連 接,例如經由網際網路被轉移。 一另外的實施例包含一處理裝置,例如一電腦,或一 可程式邏輯裝置,被組態成或適應於執行本文所述方法之 ---- 一另外的實施例包含一電腦,其具有儲存於其上執行 本文所述方法之一的電腦程式。 在一些實施例中,一可程式邏輯裝置(例如一現場可程 式閘陣列)可被用以執行本文所述方法之某些或全部功 能。在一些實施例中,一現場可程式閘陣列可與一微處理 器協作以執行本文所述方法之一。大體上,該等方法較佳 地由任何硬體設備執行。 上述實施例僅是對本發明之原理的說明。應理解本文 所述安排及細節之修改及變化對該技藝中具有通常知識者 將是明顯的。因此意圖僅受限於所附申請專利範圍之範 72 201126508 圍、而非受限於本文中藉由實施例之描述及解釋提出的特 定細節。 雖然上文特別參考上述實施例被顯示及描述,該技藝 中具有通常知識者將理解可作成在形式及細節上的各種其 他變化而不違背其精神及範圍。也應理解可順應不同實施 例作成各種變化,而不違背本文所述及下文申請專利範圍 包括的較廣泛概念。 【圖式簡單說明3 第1 a - b圖繪示依據本發明之一實施例的一音訊編碼器 之方塊示意圖; 第2 a-b圖繪示依據本發明之一實施例的一音訊解碼器 之方塊示意圖; 第3圖繪示一用於解碼一頻譜值元組的一演算法 「tuples_decode()」的一偽程式碼表示; 第4圖繪示一狀態計算之上下文的一示意表示; 第5a圖繪示重設一上下文的一演算法 「arith_reset_context()」的一偽程式碼表示; 第5b圖繪示映射一上下文的一演算法 「arith_map_context()」的一偽程式碼表示; 第5c圖繪示獲得一上下文狀態值的一演算法 「arith_get_context()」的一偽程式碼表示; 第5d圖繪示從一狀態變量導出一累積頻率表格索引值 pki的一演算法「arith_get_pk(s)」的一偽程式碼表示; 第5 e圖繪示從一可變長度碼字算術解碼一符號的一演 73 201126508 算法「arith—decodeo」的一偽程式碼表示; 第5f圖綠示從-群索引叫導出—元素數目值咖及— 群偏移值og的一演算法的一偽程式碼表示; 第5g圖繪示基於群偏移值〇g及一元素索引值ne獲得— 頻譜值元組的一最顯著位元平面的頻譜值a、b、:d的一 演算法的一偽程式碼表示; 第5h圖繪示將一亓, 兀、且a、b、c、d的頻譜值與一較不 著位元平_值結合,叫得該元組a、b、e、蝴譜值的 -更新版本的演算法之―偽程式碼表示; 第5i圖繪示更紐^ τ 更新上下文的一演算 「arith_Update_c崎xt()」的一偽程式碼表^ 彳 第5j圖繪示概讀變量之圖例; 第6a圖繪示一統一言五士 κ 1 塊的-語法表示;°5及^編碼_道)原始資料 第_繪示-單通道元素的一語法表示; 第6C圖繪不一通道對元素的語法表示; 第6d圖繪示一「ic<5 似」_資訊的語法表示; 苐6e圖繪不一頻域通道串流的一語 第_繪示被算術一的_f料之—語法 苐6g圖繪不解碼一組頻譜值元組的— 丁 第他圖繪示資料元素及變量之圖例; 第7圖繪示一先前使用 一表格表示; ^術編㉟器的記憶體需求之 第8圖繪示依據本發明一〜 一异術編碼器的記憶體需求之 74 201126508 一表格表示; 第9圖繪示一評估透過依據本發明的算術編碼器獲得 的性能改良之設備的方塊示意圖; 第10圖繪示用於使用一先前使用的算術編碼器編碼不 同音訊資訊所要求的位元率之一表格表示; 第11圖繪示使用發明的概念編碼不同音訊資訊要求的 位元率之表格表示; 第12圖以一表格表示的形式繪示一先前使用的音訊編 碼器與依據本發明的一音訊編碼器產生的平均位元率之間 的一比較; 第13圖以一表格表示的形式繪示當較之於一先前使用 的概念,使用本發明的概念獲得的位元率減少與位元率增 加的一比較; 第14圖繪示一表格「arith_cf_ng_hash[]」之項的一表格 表示; 第15(1)圖到第15(10)圖繪示一表格「arith_cf_ne[]」之 項的一表格表示; 第16(1)圖到第16(32)圖繪示索引pki之32個不同值0到 31的一表格「arith_cf_ng[pki]」之項的一表格表示; 第17(1)圖到第17(2)圖繪示一表格「dgroups[]」之項的 一表格表示; 第18(1)圖到第18(11)圖繪示一表格「dvectors[]」之項 的一表格表示; 第19(1)圖到第19(32)圖繪示一表格 75 201126508 「egroups[a][b][c][d]」之項的一表格表示; 第20圖繪示一提供一音訊資訊的—解碼表示的方法之 流程圖;及 編碼表示的方法之 第21圖繪示一提供一音訊資訊的— 流程圖。 【主要元件符號說明】 100…音訊編碼器 110···輸入音訊資訊 110a…預處理之11〇 112、210···位元流 120…預處理器 130、260···信號轉換器 130a…視窗化MDCT轉換器 132···頻域音訊表示 140···頻譜後處理器 142…後處理之132 150···縮放器/量化器 152"·縮放及量化之132 160…心理聲學模型處理器 170…算術編碼器 172a、172b…算術碼字資訊 174…最顯著位元平面萃取器 176…最顯著位元平面 索弓丨決定子 180···第一碼字決定子 180a…群索引值ng 180b…元素索引值ne 182、299...狀態追縱器 184.·.狀態資訊 186、296···累積頻率表格選擇 器 188…描述選擇的累積頻率表 格之資訊 189a…較不顯著位元平面萃取 器 189b、189d·"較不顯著位元平 面資訊 189c…第二碼字決定子 190···位元流負載格式器 200…音訊解碼器 178、280…群索引決定子/元素212…解碼音訊資訊 76 201126508 220…位元流負載變形項 222…編碼頻域音訊表示 224…狀態重設資訊 230…算術解碼器 232…解碼頻域音訊表示 240···反向量化器/重新縮放器 242…反向量化及重新縮放之 頻域表示 250···頻譜預處理器 252…242之預處理版本 262…時域表示 270···時域後處理器 284…最顯著位元平面決定子 286…最顯著位元平面值 288···較不顯著位元平面決定 子 290…較不顯著位元平面解碼 值 292···位元平面組合器 298…狀態索引 310···上下文初始化 312----元組解碼 312a…上下文值計算 312b…群索引解碼 312c…元素索引解碼 312(l···最顯著位元平面測定 312e…較不顯著位元平面相加 312ba··.解碼演算法 312ca···演算法 410…橫座標 412…縱座標 420〜444···元組 530a、530fa、550a...變量初始 化 530b…第一條件檢查 530c…第二條件檢查 530cl···第三條件檢查 530e…第四條件檢查 530f·.·上下文計算 530fb…變量重新縮放 530fc.·.基於表格值適應 530fd…返回值計算 540a〜540c 、 550b 、 550d 、 660〜668…參考數字 540d…散列表接取 540e…基於範圍提供 540ea〜540ec…540e之步驟 550c···迭代累積頻率表格查找 77 201126508 550e…適應 550f…間隔重整 550fa···下移運算 5獅..·間隔尺"t増力π運算 580a、580b…步驟 580c…第一映射 580d…第二映射 900…評估設定 910“_USACRM0 編碼器 912…算術編碼頻譜值 920…無損轉碼 924…解碼頻譜值 928…減少的表格位元元流 942...RM0合成結果 952…減少表格合成結果 1410、1414〜1434、1510、 1514-1534 、 1710 ' 1714-1734 、 1810 、 1814-1882 、 1970 、 1980~1982h..·行 1412a〜1412p、1512a、1512b、 1712a、1712b、1812a、1812b、 1972a 〜1972h...列 1601-1665…表格 1901〜1964…表格 2000···方法 2010~2126…步驟 78, shot, two steps to complete 'The two steps are completed in two different parts of the function. N 62 201126508 In the first part 540d (function "arith_get-Pk()"), if the current state t is a significant state, the hash table rarith_cf_ng_hash (also known as ari_pk_hash[]) is considered and used for checking . A salient state is a state that is selected to map itself to pki during a training phase. A non-significant state is later used in the second part 550e of the function arith-get_pk() to use a preset mapping (also known as based on The mapping of the range is mapped to pki. In accordance with the present invention, the number of salient states is reduced to 67' which is much lower than the previous 22955 salient states from the old table. By reducing the number of salient states, the size of the table "arith_cf-ng_hash[]" (also known as ari_pk_hash[]) is reduced proportionally. And hope that performance will be affected. However, performance can be maintained by ensuring that training is accurately completed. This training can be performed for AAC and TCX in a handover coding structure. Such prevention is used when generating old forms and retaining 6 states for a non-switching audio codec and for the most useful form I of a switching tone sfL codec. The encoding below allows to detect if the current state t is a significant state. If so, the function ends in the relevant online rate model index pki return (because the return status is reached): i=63 *t; for (;;) ί j=ari_pk_hash[i&127]; if (j= = 〇 xFFFFFFFFul) break; if ( = = ^) return j&255; /++,. 63 201126508 In the above encoding (see also the reference number ^ (^^ the first column of the coefficient 63 determines the beginning of the hash table search) Point. During the training phase it is fixed using a global optimization. In the hash table, the state index t is encoded on the 24 "last" bits of the item (eg, the 24 most left or most significant bits of the item) The "#" bit 7C (eg, the eight rightmost or least significant bits) corresponds to the associated probability index. In a simple implementation, the table contains only 67 items. However, in order to minimize the storage of the table. Take the number, an overflow mechanism is used. In fact, 128-67 overflow symbols of the value OxFFFFFFFFul are inserted and when a state is not significant, the number of accesses to the table is allowed to be reduced. In this case, when this one pays 5 tigers When encountered, the code jumps directly to a second part 540e of the function "arith_get_pk()" (because "in the middle" The "off" state is reached in this case. The second portion of the mapping function 540e is used to map the insignificant state. The states are divided into seven non-uniform segments by index. Each segment is associated with a probability model. The mapping is recorded in the lookup table psci □. The bits outside the 22nd bit t indicate the accuracy of the level prediction. A different mapping is used depending on the prediction accuracy. A total of seven segments and four accuracy Considered, they correspond to a total of 28 different mappings stored in psci□. The mapping is done as follows: p=psci+7*(t>>22); j= t &4194303; if (j< 436961 ) i if( j<252001 ) return p[(j<243001 )?〇:!]; else return 64 201126508 p[(j<288993)?2:3]; } else { if ( j<1609865 ) return p[(j<880865)?4:5]; else return p[6]; } Finally, the probability models are reduced from 128 to 32. It has been found that many previously used models are rarely selected or not really Useful. A proper training allows the selection of only 32 performance models. In summary, embodiments in accordance with the present invention are related to the above mapping table. They are implemented in an audio encoder or an audio decoder. Portion of the advantages of the new tables from fully trained to be executed. Additionally, the present invention is based on considerations regarding the optimal size of the form. 7. Bit Stream Syntax In the following, a bit stream syntax of a bit stream carrying arithmetically encoded spectrum information will be described with reference to Figs. 6a to 6h. Figure 6a shows a grammatical representation of a so-called USAC column data block ("usac_raw_data_block()"). The 202USAC column data block contains one or more single channel elements ("single_channel_element()") and/or one or more channel pair elements ("channel_pair_element()"). Referring now to Figure 6b, the syntax of a single channel element is description. The single channel element dependent core mode includes a linear prediction domain channel stream 65 201126508 ("lpd_channel_stream()"), or a frequency domain channel stream ("fd_channel_stream()"). Figure 6c shows a grammatical representation of a channel pair of elements. The one channel pair element contains core mode information ("core_mode0", "core_model"). In addition, the channel pair element may contain a configuration information "ics_info". In addition, depending on the core mode information, the channel pair element includes a linear prediction domain channel stream or a frequency domain channel stream associated with a first one of the channels and the channel pair element also includes a linear prediction domain A channel stream or a frequency domain channel stream associated with a second of the channels. The configuration information "ics_info()", a syntax representation shown in Fig. 6d' contains a plurality of different configuration information items, which are not particularly relevant to the present invention. A frequency domain channel stream ("fd_channel_stream"), a syntax representation shown in Figure 6e, contains a gain information ("global-gain") and a set of state information ("ics_info"). In addition, the frequency domain channel stream contains scale factor information ("scale_factor_data()"), which describes the scaling factor used to scale the spectral values of the different scale factor bands, and is applied, for example, by the scaler 150 and the rescaler 240. The frequency domain channel stream also contains arithmetically encoded spectral data ("ac_spectral-data"), which represents the arithmetically encoded spectral values. Arithmetically encoded spectral data ("ac_spectral_data()"), a grammatical representation shown in Figure 6f, containing an optional arithmetic reset flag ("arith_reset_flag"), which is used to selectively reset the context, as above Said. In addition, the arithmetically encoded spectral data contains a plurality of arithmetic data blocks ("arith_data") that carry arithmetically encoded spectral values. The structure of the arithmetically encoded data block depends on the number of frequency bands (represented by the variable "num_bands") 66 201126508 and also depends on the arithmetic reset flag, which will be discussed below. The structure of the arithmetically encoded data block will be described with reference to Fig. 6g, and (4) = the syntax of the arithmetically encoded data blocks. The data representation of the finely coded data block depends on the number of spectral values to be encoded, the state of the arithmetic re-catch flag, and also on the context, i.e., the previously encoded spectral values. The context used to encode the spectral values of the current set is determined by the context decision algorithm shown by reference numeral 66〇. The arithmetic material data block includes (4) groups of code words, each group of code word representations - the spectral value group code words comprise an arithmetic code word rac〇d_ng[pki][ng]", which represents one between the use of the 丨 and the add bits A set of index ng of spectral value tuples. If a group containing a spectral value tuple includes more than one element, then a set of codewords also contains an arithmetic codeword "acod_ne[ne]" which represents an element index of the spectral value tuple. In addition, if the spectral value tuple requires more bit planes than a correctly represented most significant bit plane, the set of code words contains one or more code words "acod~r[][][][]" . The code word "acod_ne[ne]" indicates use! The element index between the two bits and the code word "ac〇d_r[][][][]" indicates the use of a less significant bit plane between the 1 and 2 bits. However, if one or more less significant bit planes are required (in addition to the most significant bit plane) for an appropriate representation of the spectral value tuple, by using - or more than one arithmetic overflow codeword ( "ARITH_ESCAPE") was sent. Thus, in general, for a spectral value tuple, the number of required bit planes (the most significant bit planes, and possibly one or more less significant bit planes) may be determined. If one or more less significant bit planes are needed, they are sent by one or more arithmetic overflow codewords 67 201126508 "aC〇d_ng[Pki][ARITH_ESCAPE]", the overflow codeword is based on a current The cumulative frequency table selected, the index of the cumulative frequency table provided by the variable pki is encoded. Additionally, if one or more arithmetic overflow codewords are included in the bitstream, the context applies, as indicated by reference numerals 664, 662. Following one or more arithmetic overflow codewords, an arithmetic codeword "aC〇d_ng[Pki][ng]" is included in the bitstream, as indicated by reference numeral 663, where pki represents the current effective probability model index (considering context 包括 including arithmetic overflow codeword generation)' and where ng represents the group index associated with the most significant bit plane of the spectral value tuple to be encoded. The group index can be derived in an encoder by evaluating the table dg vector, which allows the group index ng and an element index ne associated with a spectral value tuple to be derived when combined with the table "dgroups". If the group including the spectral value tuple to be encoded contains more than one element, the arithmetically encoded data block contains the code word "acod_ne[ng]" which encodes the element index ne using an appropriately selected cumulative frequency table. As noted above, the presence of any less significant bit plane results in the presence of one or more code ac〇d_r[] [][][]", each representation - a less significant one-tuple 4-bit of the bit plane . The one or more code words "acod one!·[][][][]" are encoded according to a corresponding cumulative frequency table, which is constant and context-independent. In addition, it should be noted that the context is updated after encoding each spectral value tuple, as indicated by reference numeral 668, such that the context is typically different for encoding two subsequent spectral value tuples. Figure 6h shows a description of the definition and definition of the syllabus of the arithmetically encoded data block. In summary, the -bit stream format is described, which may be provided by the audio encoder 100 and may be evaluated by the audio decoder 200. The bit stream of the arithmetically encoded spectral values is encoded such that it is suitable for decoding the above algorithms. In addition, it should be noted that the encoding is an inverse of the decompression, so it is generally assumed that the encoder performs the table lookup using the above table, which is approximated by the inverse of the table (4) performed by the decoder. In general, it can be said that those having ordinary knowledge in the art of decoding algorithms and/or required bitstream grammars will readily be able to arbitrarily coder, which is defined in the bitstream syntax and required by the arithmetic decoder. data of. 8. Decoding Method i In the following, a method based on - encoding audio: #讯送-decoding audio information will be described with reference to FIG. Method 2 〇〇〇 includes - an arithmetic coding representation of the spectral value representing the first step of providing a plurality of decoded spectral values. The method 2_in_step includes a second step 2020 of providing a time domain audio representation of the amplitude spectrum value. Step 2〇1〇 includes a dependency-like index from the group-32 cumulative frequency table to select a 2G12_cumulative frequency table: step 2010 also includes an application-selected cumulative frequency table derived from a variable length codeword representing the group index - The sub-step Lion 14 of the group index. Step 2 also includes a sub-step 2016 of deriving a value of the most significant bit plane of a spectral value tuple using the group index and the element index, the element index representing the -it element in the group selected by the group index. Step 2_ also includes the substep 2018 of providing a decoded spectral value tuple using the value of the most significant bit plane of the set of spectral values. 69 201126508 9. Coding method In the following, a method for providing a coded audio message based on an input audio message will be discussed with reference to FIG. The method 2100 of FIG. 21 includes a first step of providing a frequency domain audio representation based on a time domain audio representation of the input audio information, such that the frequency domain audio representation includes a set of spectral values, and an energy is concentrated in a subset Spectrum value. The method 2100 also includes a second set of neighboring spectral values encoding the set of spectral values, or a second step 2120 of encoding a pre-processed version of the set of spectral values. Step 2120 includes a sub-step 2122 of mapping a value of a most significant bit plane of the tuple spectrum values to a group of indices and an element index, the element index representing an element of the group of group index selections. Step 2120 also includes a sub-step 2124 of selecting a cumulative frequency table from a set of 32 cumulative frequency tables, dependent on a state index describing the state of the arithmetic coder. Step 2120 also includes a sub-step 2126 of using the selected cumulative frequency table (selected in sub-step 2124) to obtain an arithmetically encoded variable length codeword. It should be noted herein that the methods 2000 and 2100 of Figures 20 and 21 may be supplemented by any of the features and functions described herein with respect to the coding scheme of the invention. In addition, methods 2000 and 2100 can be supplemented by any of the features and functions of the devices discussed herein. Also, the encoding table described herein is preferably used with an encoding method and a decoding method. 10. IMPLEMENTING ALTERNATIVES While some aspects are described in the context of a device, it is clear that such levels also represent a description of the corresponding method, in which a block or device corresponds to a feature of a method step or method step. . _ Ground, the level described in the context of the method step P is also indicated - the corresponding block or - corresponding device - 7 - or feature - description. Some or all of these method steps may be performed by (or using) a hardware device such as, for example, a processor, a programmable circuit or an electronic circuit. In some embodiments, one or more of the most important method steps can be performed by the device. The u g is stored in a digital storage medium or on a removable medium such as a wireless transmission medium or a wired transmission medium such as the Internet. According to the requirements of the present invention, the embodiment of the present invention can be stored in a storage medium such as a floppy disk, an axis, or a blue light. , a DC, PR 〇 M, - EPR 〇 M, - EEPROM or - FLASH record execution, the digital storage media and the programmable computer system cooperate $ this. Work with it) so that each method is executed. · Can be computer readable. The lunch basket according to one of the present inventions will be added to the human (4) H office. The second instance contains an electronically readable control signal that can be operated by a programmable computer system so that the specific method described herein is executed. Brain production. The embodiment of the invention can be implemented as a code = when the computer program product is executed on a computer, the method is executed. The code can be stored, for example, on a machine readable carrier. Eight practical examples include computer programs for performing the methods described herein - stored on a machine readable carrier of 71 201126508. Thus, in other words, one embodiment of the inventive method is a computer program having a code that, when executed on a computer, performs one of the methods described herein. Thus, an additional embodiment of the inventive method is a data carrier (or a digital storage medium, or a computer readable medium) comprising a computer program stored thereon for performing one of the methods described herein. Thus, an additional embodiment of the inventive method is a data stream or a sequence of signals representing a computer program that performs one of the methods described herein. The data stream or the sequence of signals can be configured, for example, to be connected via a data communication, such as via an internet. A further embodiment comprises a processing device, such as a computer, or a programmable logic device, configured or adapted to perform the methods described herein - an additional embodiment comprising a computer having storage A computer program on which one of the methods described herein is performed. In some embodiments, a programmable logic device (e.g., a field programmable gate array) can be used to perform some or all of the functions of the methods described herein. In some embodiments, a field programmable gate array can cooperate with a microprocessor to perform one of the methods described herein. In general, the methods are preferably performed by any hardware device. The above embodiments are merely illustrative of the principles of the invention. It will be appreciated that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. Therefore, it is intended to be limited only by the scope of the appended claims While the invention has been shown and described with reference to the embodiments of the present invention, it will be understood that It will also be appreciated that various changes may be made in accordance with the various embodiments without departing from the broader scope of the invention as described herein. BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 a - b are block diagrams showing an audio encoder according to an embodiment of the present invention; and FIG. 2 ab is a block diagram of an audio decoder according to an embodiment of the present invention; Figure 3 is a schematic representation of a pseudo-code representation of an algorithm "tuples_decode()" for decoding a spectral value tuple; Figure 4 is a schematic representation of the context of a state calculation; Figure 5a A pseudo-code representation of an algorithm "arith_reset_context()" that resets a context is shown; Figure 5b shows a pseudo-code representation of an algorithm "arith_map_context()" that maps a context; Figure 5c A pseudo-code representation of an algorithm "arith_get_context()" for obtaining a context state value; Figure 5d shows an algorithm "arith_get_pk(s)" for deriving a cumulative frequency table index value pki from a state variable. A pseudo-code representation; Figure 5 e shows a pseudo-code representation of the algorithm "arith-decodeo" of a 2011 73508 algorithm for decoding a symbol from a variable-length codeword; Figure 5f shows a sub-group index call a - pseudo-code representation of an algorithm for the number-of-element values and the group offset value og; Figure 5g shows one of the spectral value tuples based on the group offset value 〇g and an element index value ne A pseudo-code representation of an algorithm of the spectral values a, b, :d of the most significant bit plane; Figure 5h shows the spectral values of a 亓, 兀, and a, b, c, d Without the bit-level combination of values, the pseudo-code representation of the algorithm of the tuple a, b, e, and the spectral value-updated version is called; the 5i diagram shows the update of the context τ A pseudo-code table of the calculation "arith_Update_c saki xt()" ^ 彳 Figure 5j shows a legend of the read-ahead variable; Figure 6a shows the grammatical representation of a unified five-kappa 1 block; °5 and ^ code _ Road) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 6e is not the same as the phrase of the frequency domain channel stream. The _f material is grammar. The grammar 苐6g map does not decode a set of spectral value tuples. The figure shows a legend of the data element and the variable; FIG. 7 shows a previously used table representation; FIG. 8 shows the memory requirement of the device 35, and shows the memory of the one-to-one encoder according to the present invention. 74 indicates a table representation of a device for improving performance obtained by an arithmetic coder according to the present invention; and FIG. 10 illustrates a different encoding for using a previously used arithmetic coder. A table representation of the bit rate required by the audio information; Figure 11 shows a tabular representation of the bit rate required to encode different audio information using the inventive concept; Figure 12 depicts a previously used form in a table representation A comparison between an audio encoder and an average bit rate produced by an audio encoder in accordance with the present invention; Figure 13 is a table representation showing the concept of the present invention when compared to a previously used concept A comparison of the obtained bit rate reduction with the increase of the bit rate; Figure 14 shows a table representation of a table "arith_cf_ng_hash[]"; 15th (1) to 15th (10) A table showing the entry of the table "arith_cf_ne[]"; Figure 16(1) to Figure 16(32) show a table of the 32 different values 0 to 31 of the index pki "arith_cf_ng[pki]" A table representation of the item; Figures 17(1) through 17(2) depict a table representation of the item "dgroups[]"; Figures 18(1) through 18(11) A table of items in the form "dvectors[]"; Figures 19(1) through 19(32) show a table 75 201126508 "egroups[a][b][c][d]" A table representation is shown; FIG. 20 is a flow chart showing a method for providing a decoding information and a method for decoding the representation; and FIG. 21 is a flowchart showing a method for providing an audio message. [Description of main component symbols] 100...audio encoder 110··· input audio information 110a...preprocessing 11〇112, 210·· bit stream 120...preprocessor 130, 260···signal converter 130a... Windowed MDCT Converter 132··· Frequency Domain Audio Representation 140··· Spectrum Post Processor 142... Post Processing 132 150···Scaler/Quantizer 152"·Scale and Quantization 132 160...Psychoacoustic Model Processing 170... Arithmetic encoder 172a, 172b... Arithmetic codeword information 174... Most significant bit plane extractor 176... Most significant bit plane 丨 子 determinant 180··· First codeword determinant 180a... Group index value Ng 180b... element index value ne 182, 299... state tracker 184.. state information 186, 296... cumulative frequency table selector 188... description of the selected cumulative frequency table information 189a... less significant bit Element plane extractor 189b, 189d·" less significant bit plane information 189c... second code word determinant 190··· bit stream load formatter 200... audio decoder 178, 280... group index determinant/element 212...decoding audio information 76 201126508 220 ...bitstream load deformation term 222...encoding frequency domain audio representation 224...state reset information 230...arithmetic decoder 232...decoded frequency domain audio representation 240··reverse quantizer/rescaler 242...inverse quantization and Re-scaled frequency domain representation 250·. Pre-processing version 262 of spectrum pre-processors 252...242... Time domain representation 270···Time domain post-processor 284... Most significant bit-plane decision 286...most significant bit Plane value 288··· less significant bit plane determinant 290... less significant bit plane decoding value 292···································· Decoding 312a... Context value calculation 312b... Group index decoding 312c... Element index decoding 312 (l··· Most significant bit plane measurement 312e... Less significant bit plane addition 312ba··. Decoding algorithm 312ca··· calculus Method 410: abscissa 412... ordinate 420~444·.·tuple 530a, 530fa, 550a...variable initialization 530b...first condition check 530c...second condition check 530cl···third condition check 530e... Four condition check 530f·.·context Calculate 530fb...variable rescaling 530fc.. based on table value adaptation 530fd...return value calculation 540a~540c, 550b, 550d, 660~668...reference number 540d...scatter list access 540e...provide 540ea~540ec...540e based on range Step 550c··· Iterative cumulative frequency table lookup 77 201126508 550e... Adaptation 550f... Interval reforming 550fa···Down shift operation 5 lion..·Spacer"t force π operation 580a, 580b...Step 580c...First mapping 580d...second mapping 900...evaluation setting 910"_USACRM0 encoder 912...arithmetic encoding spectral value 920...non-destructive transcoding 924...decoding spectral value 928...reduced table bitstream stream 942...RM0 synthesis result 952...reduction table Synthesis results 1410, 1414~1434, 1510, 1514-1534, 1710 '1714-1734, 1810, 1814-1882, 1970, 1980~1982h..· rows 1412a~1412p, 1512a, 1512b, 1712a, 1712b, 1812a, 1812b 1972a~1972h...column 1601-1665...Form 1901~1964...Form 2000···Method 2010~2126...Step 78

Claims (1)

201126508 七、申請專利範圍: 1. 一種用於基於一編碼音訊資訊提供一解碼音訊資訊的 音訊解碼器,該音訊解碼器包含: 一算術解碼器,用於基於複數個頻譜值的一算術編 碼表示提供複數個解碼頻譜值;及 一頻域到時域轉換器,用於使用該等解碼頻譜值提 供一時域音訊表示,以獲得解碼音訊資訊; 其中該算術解碼器被組態成依賴一狀態索引從表 示一群索引的一可變長度碼字導出該群索引; 其中該算術解碼器被組態成使用該群索引及一元 素索引導出一頻譜值元組的一最顯著位元平面之值,該 元素索引描述由該群索引選擇的一群中的一元素; 其中該算術解碼器被組態成使用該頻譜值元組的 該最顯著位元平面之值提供一解碼頻譜值元組;及 其中該算術解碼器被組態成依賴該狀態索引從一 組32個累積頻率表格中選擇一累積頻率表格,且將該選 擇的累積頻率表格應用於從表示該群索引的該可變長 度碼字導出該群索引。 2. 如申請專利範圍第1項所述之音訊解碼器, 其中該算術解碼器被組態成從該狀態索引導出一7 位元散列表索引值,且從一散列表獲得一散列表值,該 散列表包含128個散列表索引值於對應散列表項值上的 一映射,及 其中該算術解碼器被組態成決定與該從狀態索引 79 201126508 導出的放列表索引值相關聯的該散列表值是否為—溢 出值、一與該狀態索引相關聯的一有效累積頻率表格識 別符值'或一與該狀態索引相衝突的一無效累積頻率表 格識別符值,且如果與該從狀態索引導出的散列表索引 值相關聯的散列表值是一與該狀態索引相關聯的有效 累積頻率表格識別符值,則從該散列表值導出該累積頻 率表格索引值,且如果與該從狀態索引導出的散列表索 引值相關聯的該散列表值是該溢出值,則依賴其中包含 该狀態索引之值之一間隔的一識別提供一累積頻率表 格索引值;以及 其中该异術解碼器被組態成從由該狀態索引導出 的該散列表索引值表示的表項開始掃描該散列表之表 項,,直到與由狀態索引導出的該散列表索引值相關聯 的該散列表值是一溢出值或一與該狀態索引相關聯的 有效累積頻率表格識別符值為止,及 其中δ玄异術解碼器被組態成,如果掃描該散列表之 °亥4表項時達到的散列表值是該溢出值時,則依賴其中 含有該狀態索引之值的一間隔的一識別提供一累積頻 率表格索引值,且如果當掃描該散列表之表項達到的散 列表值是一與該狀態索引相關聯的有效累積頻率表格 識別付值時,則從掃描該散列表之表項時達到的該散列 表值導出該累積頻率表格索引值。 3.如申請專利範圍第2項所述之音訊解碼器,其中該散列 表被配置成將該7位元散列表索引值之67個值映射至有 80 201126508 效累積頻率表格識別符值之上,且將該7位元散列表索 引值之61個值映射至該溢出值上。 4. 如申請專利範圍第3項所述之音訊解碼器,其中該算術 解碼器被組態成將該狀態索引的67個不同值映射至26 個不同累積頻率表格索引值上,使得26個不同累積頻率 表格索引值與該狀態索引描述的67個不同顯著狀態相 關聯。 5. 如申請專利範圍第2-4項其中之一所述之音訊解碼器, 其中該算術解碼器被組態成將複數個非顯著狀態映射 至9個不同累積頻率表格索引值上。 6. 如申請專利範圍第1-5項其中之一所述之音訊解碼器, 其中該算術解碼器被組態成設定i=6 3 * t 且迭代執行該演算法: j=ari_pk_hash[i&127]; if (j==〇xFFFFFFFFul ) break; if ( (j»8)==t ) return j&255; i++; 直到第一條件j==OxFFFFFFFF 或第二條件(j»8)==t被滿足, 其中s表示該狀態索引, 其中i與j表示整數變量, 其中ari_pk_hash[i&128]表示一散列表的索引 (i&128)的一表項, 其中「&」表示一按位元邏輯AND運算子, 81 201126508 其中「>>8」表示移動8位元的一二進制右移運算, 其中「==」表示一識別條件的核對一識別條件, 其中「++」表示一增加一運算子, 其中「return j&255」表示返回變量j之該8個最不顯 著位元描述的一值作為一累積頻率表格索引值的一運 算;及 其中該算術解碼器被組態成執行該演算法 p=psci+7*(t>>22); j= t & 4194303; if (j<436961 ) { if (j<252001 ) return p[(j<243001)?0:1]; else return p[(j<288993)?2:3]; else if (j<1609865 ) return p[(j<880865)?4:5]; else return p[6]; } 以響應一中斷條件,其中 82 201126508 psci[28] = { 24,5,25,26,27,28,29,30,5,30,30,30,30,31,5,5, 5,5,5,5,5,5,5,5,5,5,5,5 定義一具有陣列索引0到27的不同累積頻率表格索 引值的陣列,且 其中該運算P = psci + 7 ·ί >> 22將該指標p設定成 具有一元素索引的一元素,該元素索引由7乘以該狀態 索弓丨t之最顯著位元表示的一值決定, 其中如果該條件被滿足’該等形式「return P[c〇nditi〇n?x:y]」之一運算返回具有一元素索引之該陣 列psci的一項,該元素索引由7乘以該狀態索引s與值入 的最顯著位元表示的一值與值χ之總和決定,且如果該 條件未被滿足,及當該散列表⑻—沐上沾如被定義如第 14圖中所科’則返回具有—元素㈣之辦列Psci的 —項’該元素索引由具有—元素索引之該陣列_之一 項’該το素索引由7乘以該狀態索引5的最顯著位元所表 示的一值與值y之總和決定。 、 之一所述之音訊解碼器, 態成依賴該編碼音訊資訊 7.如申請專利範圍第16項其中 其中該算術解碼器被組 的複數個位元獲得—碼字值 獲得描述在一 值之間的一範圍内 位置值, 較低範圍邊界值與—較高範圍邊界 的該碼字值的一相對位置的一相對 83 201126508 決定該相對位置值被包括在複數個由選擇累積頻 率表格之表項所定義之間隔中的哪一間隔, 響應於該相對位置值被包括在複數個由該選擇累 積頻率表格之表項所定義之間隔中的哪一間隔的決定 結果提供一符號資訊,及 依賴于與該符號資訊相關聯的該選擇的累積頻率 表格的一或一個以上表項更新該等範圍值之一或兩 個,及 重新縮放該等範圍邊界值之間的範圍,及 使用該編碼音訊資訊的一或一個以上附加位元更 新該碼字值。 8.如申請專利範圍第1-7項其中之一所述之音訊解碼器, 其中該算術解碼器被組態成基於該編碼音訊資訊,使用 32個累積頻率表格中之一被選擇表格,且使用下文演算 法arith_decode〇獲得該群索引, arith_decode() { if(arith_first_symbol〇) { value = 0; for (i=l; i<=20; i++) { value = (value«l) | arith_get_next_bit(); 84 201126508 l〇w=0; high=1048575; range = high-low+1; cum =((((int64) (value-low+l))«16)-((int64) l))/((int64) range); p = cum_freq-l; do { q=p+(cfl»l); if ( *q > cum ) { p=q; cfl++; } cfl»=l; } while ( cfl>l ); symbol = p-cum_freq+l; if(symbol) high=low+(((int64) range)*((int64)cum_freq[symbol-l]))»16 - 1; low += (((int64) range)* ((int64) cum_freq[symbol]))»16; 85 201126508 for (;;) { if ( high<524286) { } else if ( low>=524286) { value -=524286; low -=524286; high -=524286; else if ( low>=262143 && high<786429) value -= 262143; low -= 262143; high -= 262143; else break; low += low; high += high+1; value = (value«l) | arith_get_next_bit(); 86 201126508 return symbol; } 其中該算術解碼器被組態成如果一輔助函數 「arith_first_symbol()」指示一序列符號的一第一符號被 解碼,則依賴該編碼音訊資訊的20個位元初始化一碼字 值「value」; 其中一輔助函數「arith_get_next_bit()」提供該編碼 音訊資訊的下一位元, 一運算子「<<」表示一布林左移運算,以將該運算 子「<<」前的一運算元以一接隨運算子「<<」後的運算 元指定的若干位元移至左位元, 其中運算子「I」指定一布林或運算, 其中運算子「(int64)」表示欲被用於一後續運算元 的一表示之一數字類型是一 64位元整數數字類型, 其中「cum_freq」是一選擇累積頻率表格之一第一 表項的一位址值, 其中「*q」表示具有一項索引q-cum_freq的該選擇 累積頻率表格的一(q-cum_freq+l)-th表項, 其中cfl一被初始化為該已選擇的累積頻率表格之 一長度且在演算法「arith_decode〇」的一處理期間被修 改的變量, 其中「cum_freq[symbol-l]」表示該已選擇累積頻 87 201126508 率表格的一symbol-th表項;且 其中一運算「f〇r(;;){···}」指示重複括號「{“ and “}」 中所包括的一指令塊,直到一「中斷」指令到達。 9. 如申請專利範圍第8項所述之音訊解碼器,其中該32個 累積頻率表格與〇與31之間的3 2個累積頻率表格索引值 pki相關聯,且 其中該等累積頻率表格依據第16(1)圖到第16(32) 圖的表格表示被定義。 10. 如申請專利範圍第1-9項其中之一所述之音訊解碼器, 其中該算術解碼器被組態成決定由該群索引表示的一 群之群元素數目, 如果由該群索引表示的該群包含多於一個元素,評 估一元素索引的一編碼表示,以獲得該元素索引的一解 碼表示, 依賴該群索引,且如果由該群索引表示的該群包含 多於一個元素,依賴該元素索引以決定一查找位址基 底,且 依賴該查找位址基底決定一查找表中的最顯著位 元平面的查找值。 11. 如申請專利範圍第1-10項其中之一所述之音訊解碼 器,其中該算術解碼器被組態成使用如下定義的一演算 法「arith_get_context」解碼一目前頻譜值元組的狀態: arith_get_context() 88 201126508 t0=q[0][i].v+l; tl=q[l][i-l].v+l; t2=q[〇][i-l].v+l; t3=q[0][i+l].v+l; if ( (t0<10) && (tl<10) && (t2<10) && (t3<10) ){ if ( t2>l ) t2=2; if ( t3>l ) t3:2; return 3*(3*(3*(3*(3*(10*(10*t0+tl))+t2)+t3))); if ( (t0<34) && (tl<34) && (t2<34) && (t3<34) ){ if ( (t2>l) && (t2<10)) t2=2; else if ( t2>=10 ) t2=3; if ( (t3>l) && (t3<10)) t3=2; else if ( t3>=10 ) t3=3; return 252000+4*(4*(34*(34*t0+tl))+t2)+t3; 89 201126508 if ( (t0<90) && (tl<90) ) return 880864+90*(90*t0+tl); if ( (t0<544) && (tl<544) ) return 1609864 + 544*t0+tl; if ( t0>l ){ a0=q[0][i].a; b0=q[0][i].b; c0=q[0][i].c; d0=q[0][i].d;} else a0=b0=c0=d0=0; if ( tl>l ){ al=q[l][i-l].a; bl=q[l][i-l].b; cl=q[l][i-l].c; dl=q[l][i-l].d;} else al=bl=cl=dl=0; 1=0; 90 201126508 aO»=l; bO»=l; cO»=l; dO»=l; al»=l; bl»=l; cl»=l; dl»=l; 1++; while ( (a0<-4) || (a0>=4) I丨(b0<-4) || (b0>=4)丨丨 (c0<-4) || (c0>=4) || (d0<-4) || (d0>=4) || (al<-4) || (al>=4) || (bl<-4) || (bl>=4) || (cl<-4) || (cl>=4) || (dl<-4) || (dl>=4)); if ( tO>l ) t0=l+(egroups[4+a0][4+b0][4+c0][4+d0] » 16); if ( tl>l ) tl = l+(egroups[4+al][4+bl][4+cl][4+dl] » 16); return 1609864 + ((l«24)|(544*tO+tl)); 91 201126508 其中q[0][i]_v表示與如該目前頻譜值元組的相同 頻率相關聯的一先前音訊訊框的一頻譜值元組的一上 下文變量; 其中q[l][i-l]表示與低於該目前之頻譜值元組的頻 率相關聯的一目前音訊訊框的一頻譜值元組的一上下 文變量; 其中q[0][i-l]表示低於該目前之頻譜值元組的頻率 相關聯的該先前音訊訊框的一頻譜值元組的一上下文 變量; 其中q[0][i+l]表示與高於目前之頻譜值元組的頻率 相關聯的該先前音訊訊框的一頻譜值元組的一上下文 變量; 其中「&&」表示一邏輯AND運算; 其中q[0][i].a表示與該先前之頻譜值元組相同的頻 率相關聯的該目前音訊訊框的一頻譜值元組的一第一 元組值「a」; 其中q[0][i].b表示與該目前頻譜值元組相同的頻率 相關聯的該先前音訊訊框的一頻譜值元組的—第二元 組值「b」; 其中q[〇][i].c表示與該目前頻譜值元組相同的頻率 相關聯的該先前音訊訊框的一頻譜值元組的一第二元 組值「c」; 其中q[〇][i].d表示與該目前頻譜值元組相同的頻率 92 201126508 相關聯的該先前音訊訊框的一頻譜值元組的一第四元 組值「d」; 其中q[l][i-l].a表示低於該目前頻譜值頻率元組相 關聯的該目前音訊訊框的一頻譜值元組的一第一元組 值「a」; 其中q[l][i-l].b表示低於該目前頻譜值頻率元組相 關聯的該目前音訊訊框的一頻譜值元組的一第二元組 值「b」; q[l][i-l].c表示與低於目前頻譜值頻率元組相關聯 的該目前音訊訊框的一頻譜值元組的一第三元組值 「c」; q[l][i-l].d表示與低於目前頻譜值頻率元組相關聯 的該先前音訊訊框的一頻譜值元組的一第四元組值 「d」; 其中 aO、bO、cO、dO、al、bl、cl、dl 是表示一 2-s 補充表示的一發信數目的變量; 其中「>>」表示一邏輯右移運算子;及 其中「egroups[4+a0][4+b0][4+c0][4+d0]」定義一具 有一四維陣列「egroups」的項目索引4+aO、4+bO、4+cO、 4+dO的一項目, 其中「egroups[4+al][4+bl][4+cl][4+dl]」定義具有 該四維陣列「egroups」的項目索引4+al、4+bl、4+cl、 4+dl的一項目, 其中該陣列「egroups」依據第19(1)圖到第19(32) 93 201126508 圖的該表格表示被定義。 12.如申請專利範圍第11項所述之音訊解碼器, 其中該算術解碼器被組態成使用下文演算法更新 該等上下文變量: arith_update_context() { q[l][i].a=a; q[l][i]-b=b; q[l][i]-c=c; q[l][i].d=d; if ( (a<-4) || (a>=4) || (b<-4) || (b>=4) || (c<-4) || (c>=4) || (d<-4) || (d>=4)) { q[l][i].v =1024; } else q[l][i].v=egroups[4+a][4+b][4+c][4+d]; if(i==lg/4 && core_mode==l){ qs[0]=q[l][0]; ratio= ((float) lg)/((float)1024); for(j=0; j<256;j++) 94 201126508 k = (int) ((float) j*ratio); qs[l+k] = q[l][l+j]; } qs[previous—lg/4+1] = q[l][lg/4+l]; previous_lg = 1024; if(i==lg/4 && core_mode==0) { for(j=0; j<258; j++) { qs[j] =q[l][j]; } previous_lg = min(1024,lg); } } 其中,a、b、c、d表示一目前完全解碼頻譜值元組 的值; 其中lg/4表示與該目前音訊訊框相關聯的若干4元 組; 其中「core_mode== 1」表示一線性預測域核心模 式;及 .其中「core_mode==0」表示一頻域核心模式; 其中一「(float)」運算子表示使用一浮點數字表示; 95 201126508 及 其中一「(int)」運算子表示使用一整數表示。 13. —種用於基於一輸入音訊資訊提供一編碼音訊資訊的 音訊編碼器,該音訊編碼器包含: 一能量集中時域到頻域轉換器,用於基於該輸入音 訊資訊的一時域表示提供一頻域音訊表示,使得該頻域 音訊表示包含一組頻譜值; 一算術編碼器被組態成使用一可變長度碼字編碼 一相鄰頻譜值元組,或其之一預處理版本; 其中該算術編碼器被組態成將一頻譜值元組的一 最顯著位元平面之值映射至一群索引及一元素索引 上,該元素索引描述在一群中由該群索引選擇的一元 素; 其中該算術編碼器被組態成依賴該算術編碼器的 一狀態索引從一組32個累積頻率表格中選擇一累積頻 率表格;及 其中該算術編碼器被組態成使用該已選擇累積頻 率表格算術編碼該群索引,以獲得一算術編碼可變長度 碼字。 14. 一種用於基於一編碼音訊表示提供一解碼音訊表示的 方法,該方法包含: 基於該等頻譜值之一算術編碼表示提供複數個解 碼頻譜值;及 使用該解碼頻譜值提供一時域音訊表示; 96 201126508 其中基於該等頻譜值的一算術編碼表示提供複數 個解碼頻譜值,包含 依賴-狀㈣引從-組32個累積頻率表格中選擇 —累積頻率表格, 應用該選擇的累積頻率表格從表示該群索引的— 可變長度碼字導出一群索引, 該群索引及一元素索引導出一頻譜值元組的 —最顯著位元平面之值,該元素索引表示在一群中由該 群索引選擇的的一元素;及 使用该頻譜值元組的最顯著位元平面之值提供一 解碼頻譜值元組。 15. 一種基於一輸入音訊表示提供一 法’該方法包含: 編碼音訊表示的 方 基於-輸入音訊資訊的一時域音訊表示提供一頻 域音訊表示,使得該頻域音訊表示包括譜值,且 使得一能量在該等頻譜值的一子組中被集中;及 編碼該組賴值的相_譜值,或該_譜值的一 預處理版本n组,其中該相鄰頻譜m组的編碼 包含 將該頻譜值元組的-最顯著位元平面之值映射至 -群索引及-元素索引上,該元素索引表示在—群内由 該群索引選擇的一元素, 依賴描述該算術編碼之一狀態的一狀態索引從一 組32個累積頻率表格中選擇一累積頻率表格,及 97 201126508 使用該選擇的累積頻率表格算術編碼該群索引,以 獲得一算術編碼可變長度碼字。 16.—種電腦程式,當在一電腦上執行時可供執行如申請專 利範圍第14項或第15項所述之方法。 98201126508 VII. Patent Application Range: 1. An audio decoder for providing a decoded audio message based on a coded audio message, the audio decoder comprising: an arithmetic decoder for an arithmetic coded representation based on a plurality of spectral values Providing a plurality of decoded spectral values; and a frequency domain to time domain converter for providing a time domain audio representation using the decoded spectral values to obtain decoded audio information; wherein the arithmetic decoder is configured to rely on a state index Deriving the group index from a variable length codeword representing a group of indices; wherein the arithmetic decoder is configured to derive a value of a most significant bit plane of a spectral value tuple using the group index and an element index, An element index describing an element of the group selected by the group index; wherein the arithmetic decoder is configured to provide a decoded spectral value tuple using the value of the most significant bit plane of the spectral value tuple; The arithmetic decoder is configured to select a cumulative frequency table from a set of 32 cumulative frequency tables depending on the state index, and select the cumulative frequency table The cumulative frequency table showing applied from the variable length code words of the group index deriving the group index. 2. The audio decoder of claim 1, wherein the arithmetic decoder is configured to derive a 7-bit hash table index value from the state index and obtain a hash table value from a hash table, The hash table includes a mapping of 128 hash table index values on corresponding hash table entry values, and wherein the arithmetic decoder is configured to determine the hash associated with the drop list index value derived from the state index 79 201126508 Whether the list value is an overflow value, a valid cumulative frequency table identifier value associated with the state index or an invalid cumulative frequency table identifier value that conflicts with the state index, and if the slave state index is The hash table value associated with the derived hash table index value is a valid cumulative frequency table identifier value associated with the state index, the cumulative frequency table index value is derived from the hash table value, and if the slave state index is The hash table value associated with the derived hash table index value is the overflow value, and is dependent on an identification provided in which one of the values of the state index is included a cumulative frequency table index value; and wherein the foreign-speaker decoder is configured to scan an entry of the hash table from an entry represented by the hash table index value derived from the state index until it is derived from the state index The hash table value associated with the hash table index value is an overflow value or a valid cumulative frequency table identifier value associated with the state index, and wherein the delta hypersynthesis decoder is configured to scan if When the hash table value reached by the hash table of the hash table is the overflow value, a cumulative frequency table index value is provided depending on an identification of an interval in which the value of the state index is included, and if the volume is scanned, When the hash table value reached by the list entry is a valid cumulative frequency table identification value associated with the status index, the cumulative frequency table index value is derived from the hash table value reached when scanning the entry of the hash table. . 3. The audio decoder of claim 2, wherein the hash table is configured to map 67 values of the 7-bit hash table index value to a value of 80 201126508 effective cumulative frequency table identifier And mapping 61 values of the 7-bit hash table index value to the overflow value. 4. The audio decoder of claim 3, wherein the arithmetic decoder is configured to map 67 different values of the state index to 26 different cumulative frequency table index values such that 26 different The cumulative frequency table index value is associated with 67 different salient states described by the state index. 5. The audio decoder of any one of claims 2-4, wherein the arithmetic decoder is configured to map a plurality of non-significant states onto nine different cumulative frequency table index values. 6. The audio decoder of any one of claims 1-5, wherein the arithmetic decoder is configured to set i=6 3 * t and iteratively execute the algorithm: j=ari_pk_hash[i&127]; if (j==〇xFFFFFFFFul ) break; if ( (j»8)==t ) return j&255;i++; until the first condition j==OxFFFFFFFF or the second condition (j»8)== t is satisfied, where s represents the state index, where i and j represent integer variables, where ari_pk_hash[i&128] represents an entry of an index of the hash table (i&128), where "&" indicates a press Bit logical AND operator, 81 201126508 where ">>8" represents a binary right shift operation of moving 8 bits, where "==" indicates a check condition for identifying a condition, where "++" indicates Adding an operator, where "return j&255" represents a value returned by the eight least significant bits of the variable j as an operation of a cumulative frequency table index value; and wherein the arithmetic decoder is configured Execute the algorithm p=psci+7*(t>>22); j= t &4194303; if (j<436961 ) { if (j<252001 ) return p[(j<243001)?0:1]; else return p[(j<288993)?2:3]; else if (j<1609865 ) return p[(j<880865 )?4:5]; else return p[6]; } in response to an interrupt condition, where 82 201126508 psci[28] = { 24,5,25,26,27,28,29,30,5,30, 30, 30, 30, 31, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 define a different cumulative frequency table index value with array indices 0 to 27. Array, and wherein the operation P = psci + 7 · ί > 22 sets the index p to an element having an element index, which is multiplied by 7 by the most significant bit of the state a value representation of the meta-representation, wherein if the condition is satisfied, the operation of the form "return P[c〇nditi〇n?x:y]" returns an item of the array psci having an element index, the element The index is determined by multiplying 7 by the state index s and the sum of the value and the value χ represented by the most significant bit of the value, and if the condition is not satisfied, and when the hash table (8) is as defined The figure in Figure 14 returns the item of the Psci with the element (4). Element having index - 7 multiplied by the most significant bit of the state index 5, the index of the array element of one item _ 'το the pixel value and a sum value of the index y of the decision table illustrated. And the audio decoder according to one of the states, dependent on the encoded audio information. 7. According to claim 16 of the patent application, wherein the arithmetic decoder is obtained by a plurality of bits of the group - the codeword value is obtained and described in a value An in-range position value between, a lower range boundary value and a relative position of the relative position of the codeword value of the higher range boundary. 201126508 determines that the relative position value is included in a plurality of tables of selected cumulative frequency tables. Which of the intervals defined by the item provides a symbol information, and depends on the decision result of which of the intervals defined by the entry of the selected cumulative frequency table is included in the relative position value Updating one or more of the range values for one or more entries of the selected cumulative frequency table associated with the symbol information, and rescaling the range between the range boundary values, and using the encoded audio One or more additional bits of information update the codeword value. 8. The audio decoder of any one of clauses 1-7, wherein the arithmetic decoder is configured to select a table using one of 32 cumulative frequency tables based on the encoded audio information, and Use the algorithm arith_decode below to get the group index, arith_decode() { if(arith_first_symbol〇) { value = 0; for (i=l; i<=20; i++) { value = (value«l) | arith_get_next_bit() 84 201126508 l〇w=0; high=1048575; range = high-low+1; cum =((((int64) (value-low+l))«16)-((int64) l))/( (int64) range); p = cum_freq-l; do { q=p+(cfl»l); if ( *q > cum ) { p=q; cfl++; } cfl»=l; } while ( cfl>l ); symbol = p-cum_freq+l; if(symbol) high=low+(((int64) range)*((int64)cum_freq[symbol-l]))»16 - 1; low += (((int64) Range)* ((int64) cum_freq[symbol]))»16; 85 201126508 for (;;) { if ( high<524286) { } else if ( low>=524286) { value -=524286; low -=524286 ; high -=524286; else if ( low>=262143 &&high<786429) value -= 262143; low -= 262143; high -= 262143; else break; low += low; high += high+1; value = (value«l) | arith_get_next_bit(); 86 201126508 return symbol; } where the arithmetic decoder is configured If a helper function "arith_first_symbol()" indicates that a first symbol of a sequence of symbols is decoded, a codeword value "value" is initialized depending on 20 bits of the encoded audio information; one of the auxiliary functions "arith_get_next_bit() Providing the next bit of the encoded audio information, an operator "<<" indicates a Boolean left shift operation to perform an operation on the operator before the operator "<<<" The number of bits specified by the operand after the subordinate "<<" is moved to the left bit, where the operator "I" specifies a Boolean OR operation, where the operator "(int64)" indicates that it is to be used for a subsequent One of the representations of the operand is a 64-bit integer number type, where "cum_freq" is a one-bit value of the first entry of one of the cumulative frequency tables, where "*q" indicates an index The choice of q-cum_freq is tired a (q-cum_freq+l)-th entry of the frequency table, wherein cfl is initialized to one of the lengths of the selected cumulative frequency table and is modified during a process of the algorithm "arith_decode", wherein "cum_freq[symbol-l]" indicates a symbol-th entry of the selected cumulative frequency 87 201126508 rate table; and one of the operations "f〇r(;;;){···}" indicates repeated parentheses "{" An instruction block included in and "}" until an "interrupt" instruction arrives. 9. The audio decoder of claim 8, wherein the 32 cumulative frequency tables are associated with 32 cumulative frequency table index values pki between 〇 and 31, and wherein the cumulative frequency tables are based on The table representations from Figures 16(1) through 16(32) are defined. 10. The audio decoder of any of claims 1-9, wherein the arithmetic decoder is configured to determine a number of group elements of the group represented by the group index, if represented by the group index The group contains more than one element, evaluating a coded representation of an element index to obtain a decoded representation of the element index, relying on the group index, and if the group represented by the group index contains more than one element, relying on the group The element index determines a lookup address base and relies on the lookup address base to determine a lookup value for the most significant bit plane in a lookup table. 11. The audio decoder of any one of claims 1-10, wherein the arithmetic decoder is configured to decode a state of a current spectral value tuple using an algorithm "arith_get_context" as defined below: Arith_get_context() 88 201126508 t0=q[0][i].v+l; tl=q[l][il].v+l; t2=q[〇][il].v+l; t3=q [0][i+l].v+l; if ( (t0<10) &&(tl<10)&&(t2<10)&&(t3<10) ){ if (t2>l) t2=2; if ( t3>l ) t3:2; return 3*(3*(3*(3*(3*(10*(10*t0+tl))+t2)+t3 ))); if ( (t0<34) &&(tl<34)&&(t2<34)&&(t3<34) ){ if ( (t2>l) &&;(t2<10))t2=2; else if ( t2>=10 ) t2=3; if ( (t3>l) &&(t3<10))t3=2; else if ( t3>= 10) t3=3; return 252000+4*(4*(34*(34*t0+tl))+t2)+t3; 89 201126508 if ( (t0<90) &&(tl<90) ) Return 880864+90*(90*t0+tl); if ( (t0<544) &&(tl<544) ) return 1609864 + 544*t0+tl; if ( t0>l ){ a0=q[ 0][i].a; b0=q[0][i].b; c0=q[0][i].c; d0=q[0] [i].d;} else a0=b0=c0=d0=0; if ( tl>l ){ al=q[l][il].a; bl=q[l][il].b; cl q=q[l][il].d;} else al=bl=cl=dl=0; 1=0; 90 201126508 aO»=l; bO»=l ; cO»=l; dO»=l; al»=l; bl»=l; cl»=l; dl»=l; 1++; while ( (a0<-4) || (a0>=4 I丨(b0<-4) || (b0>=4)丨丨(c0<-4) || (c0>=4) || (d0<-4) || (d0>=4) | | (al<-4) || (al>=4) || (bl<-4) || (bl>=4) || (cl<-4) || (cl>=4) || ( Dl<-4) || (dl>=4)); if ( tO>l ) t0=l+(egroups[4+a0][4+b0][4+c0][4+d0] » 16); If ( tl>l ) tl = l+(egroups[4+al][4+bl][4+cl][4+dl] » 16); return 1609864 + ((l«24)|(544*tO+ Tl)); 91 201126508 wherein q[0][i]_v represents a context variable of a spectral value tuple of a previous audio frame associated with the same frequency of the current spectral value tuple; wherein q[l [il] represents a context variable of a spectral value tuple of a current audio frame associated with a frequency lower than the current spectral value tuple; wherein q[0][il] indicates lower than the current one Spectral value tuple a context variable of a spectral value tuple of the previous audio frame associated with the frequency; wherein q[0][i+1] represents the previous audio frame associated with a frequency higher than the current spectral value tuple a context variable of a spectral value tuple; wherein "&&" represents a logical AND operation; wherein q[0][i].a represents the same frequency associated with the previous spectral value tuple a first tuple value "a" of a spectral value tuple of the current audio frame; wherein q[0][i].b represents the previous audio frame associated with the same frequency of the current spectral value tuple a second tuple value "b" of a spectral value tuple; wherein q[〇][i].c represents a spectral value of the previous audio frame associated with the same frequency of the current spectral value tuple A second tuple value "c" of the tuple; wherein q[〇][i].d represents a spectral value tuple of the previous audio frame associated with the same frequency of the current spectral value tuple 92 201126508 a fourth ternary value "d"; wherein q[l][il].a represents the target associated with the current spectral value frequency tuple a first tuple value "a" of a spectral value tuple of the audio frame; wherein q[l][il].b represents one of the current audio frames associated with the current spectral value frequency tuple a second tuple value "b" of the spectral value tuple; q[l][il].c represents a tuple of a spectral value tuple of the current audio frame associated with the current spectral value frequency tuple The third tuple value "c"; q[l][il].d represents a fourth tuple value of a spectral value tuple of the previous audio frame associated with the current spectral value frequency tuple. d"; wherein aO, bO, cO, dO, al, bl, cl, dl are variables representing a number of transmissions of a 2-s supplementary representation; wherein ">>" represents a logical right shift operator; And "egroups[4+a0][4+b0][4+c0][4+d0]" defines an item index 4+aO, 4+bO, 4+cO, which has a four-dimensional array "egroups", An item of 4+dO, where "egroups[4+al][4+bl][4+cl][4+dl]" defines the item index 4+al, 4+bl, which has the four-dimensional array "egroups", An item of 4+cl, 4+dl, where the array "egroups" is based on Figure 19(1) 19 (32) 93 201 126 508 This table indicates Fig defined. 12. The audio decoder of claim 11, wherein the arithmetic decoder is configured to update the context variables using an algorithm: arith_update_context() { q[l][i].a=a ; q[l][i]-b=b; q[l][i]-c=c; q[l][i].d=d; if ( (a<-4) || (a> =4) || (b<-4) || (b>=4) || (c<-4) || (c>=4) || (d<-4) || (d>=4 )) { q[l][i].v =1024; } else q[l][i].v=egroups[4+a][4+b][4+c][4+d]; if (i==lg/4 && core_mode==l){ qs[0]=q[l][0]; ratio= ((float) lg)/((float)1024); for(j= 0; j<256;j++) 94 201126508 k = (int) ((float) j*ratio); qs[l+k] = q[l][l+j]; } qs[previous-lg/4+ 1] = q[l][lg/4+l]; previous_lg = 1024; if(i==lg/4 && core_mode==0) { for(j=0; j<258; j++) { Qs[j] =q[l][j]; } previous_lg = min(1024,lg); } } where a, b, c, d represent the value of a currently fully decoded spectral value tuple; where lg/4 Representing a number of 4-tuples associated with the current audio frame; wherein "core_mode== 1" represents a linear prediction domain core mode; and "Core_mode == 0" indicates a frequency domain kernel mode; one "(float)" operator to indicate the use of a floating point number represents; 95 201 126 508 and one of the "(int)" operator represents an integer representation using. 13. An audio encoder for providing encoded audio information based on an input audio message, the audio encoder comprising: an energy concentrated time domain to frequency domain converter for providing a time domain representation based on the input audio information a frequency domain audio representation such that the frequency domain audio representation comprises a set of spectral values; an arithmetic coder configured to encode a neighboring spectral value tuple using a variable length codeword, or a preprocessed version thereof; Wherein the arithmetic coder is configured to map a value of a most significant bit plane of a spectral value tuple to a group of indices and an element index, the element index describing an element selected by the group index in a group; Wherein the arithmetic coder is configured to select a cumulative frequency table from a set of 32 cumulative frequency tables dependent on a state index of the arithmetic coder; and wherein the arithmetic coder is configured to use the selected cumulative frequency table The group index is arithmetically encoded to obtain an arithmetically encoded variable length codeword. 14. A method for providing a decoded audio representation based on an encoded audio representation, the method comprising: providing a plurality of decoded spectral values based on one of the spectral values; and providing a time domain audio representation using the decoded spectral values 96 201126508 wherein an arithmetic coding representation based on the spectral values provides a plurality of decoded spectral values, including a dependency-like (four) derivative-to-group 32 cumulative frequency table selection-cumulative frequency table, applying the selected cumulative frequency table from The variable length codeword representing the group index derives a group of indices, the group index and an element index deriving a value of a most significant bit plane of a spectral value tuple, the element index being selected by the group index in a group An element of the ; and the value of the most significant bit plane using the spectral value tuple provides a decoded spectral value tuple. 15. A method for providing an input audio representation based on: a method for encoding an audio representation to provide a frequency domain audio representation based on a time domain audio representation of the input audio information such that the frequency domain audio representation includes spectral values and An energy is concentrated in a subset of the spectral values; and a phase-spectrum value encoding the set of values, or a pre-processed version n of the _ spectral value, wherein the encoding of the adjacent spectral m-group comprises Mapping the value of the most significant bit plane of the spectral value tuple to a -group index and an element index, the element index representing an element selected by the group index within the group, dependent on describing one of the arithmetic codes A state index of the state selects a cumulative frequency table from a set of 32 cumulative frequency tables, and 97 201126508 arithmetically encodes the group index using the selected cumulative frequency table to obtain an arithmetically encoded variable length codeword. 16. A computer program that, when executed on a computer, can be executed as described in claim 14 or claim 15. 98
TW099102412A 2009-01-28 2010-01-28 Audio encoder, audio decoder, method for encoding an input audio information, method for decoding an input audio information and computer program using improved coding tables TW201126508A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US14788509P 2009-01-28 2009-01-28

Publications (1)

Publication Number Publication Date
TW201126508A true TW201126508A (en) 2011-08-01

Family

ID=42245645

Family Applications (1)

Application Number Title Priority Date Filing Date
TW099102412A TW201126508A (en) 2009-01-28 2010-01-28 Audio encoder, audio decoder, method for encoding an input audio information, method for decoding an input audio information and computer program using improved coding tables

Country Status (3)

Country Link
AR (1) AR075200A1 (en)
TW (1) TW201126508A (en)
WO (1) WO2010086342A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3001593A1 (en) * 2013-01-31 2014-08-01 France Telecom IMPROVED FRAME LOSS CORRECTION AT SIGNAL DECODING.
US20140358565A1 (en) 2013-05-29 2014-12-04 Qualcomm Incorporated Compression of decomposed representations of a sound field
EP2824661A1 (en) 2013-07-11 2015-01-14 Thomson Licensing Method and Apparatus for generating from a coefficient domain representation of HOA signals a mixed spatial/coefficient domain representation of said HOA signals
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
CN112509591B (en) * 2020-12-04 2024-05-14 北京百瑞互联技术股份有限公司 Audio encoding and decoding method and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101237413B1 (en) * 2005-12-07 2013-02-26 삼성전자주식회사 Method and apparatus for encoding/decoding audio signal

Also Published As

Publication number Publication date
WO2010086342A1 (en) 2010-08-05
AR075200A1 (en) 2011-03-16

Similar Documents

Publication Publication Date Title
US9171550B2 (en) Context-based arithmetic encoding apparatus and method and context-based arithmetic decoding apparatus and method
RU2591663C2 (en) Audio encoder, audio decoder, method of encoding audio information, method of decoding audio information and computer program using detection of group of previously decoded spectral values
EP3573056B1 (en) Audio encoder and audio decoder
AU2011287747B2 (en) Audio encoder, audio decoder, method for encoding an audio information, method for decoding an audio information and computer program using an optimized hash table
CN102844809A (en) Audio encoder, audio decoder, method for encoding and audio information, method for decoding an audio information and computer program using a hash table describing both significant state values and interval boundaries
TW201126508A (en) Audio encoder, audio decoder, method for encoding an input audio information, method for decoding an input audio information and computer program using improved coding tables
HK40103544B (en) Audio encoder and audio decoder
HK40103544A (en) Audio encoder and audio decoder
HK40096560A (en) Audio encoder and audio decoder
HK40096560B (en) Audio encoder and audio decoder
HK40108740B (en) Audio decoding method
HK40108740A (en) Audio decoding method
HK40108741A (en) Audio encoder and audio decoder
HK40108741B (en) Audio encoder and audio decoder
HK40097132A (en) Audio decoder
HK40097132B (en) Audio decoder
HK40018193A (en) Audio encoder and audio decoder
HK40018193B (en) Audio encoder and audio decoder
HK40064511A (en) Audio encoder and audio decoder
HK40064511B (en) Audio encoder and audio decoder
HK1253032B (en) Audio encoder and audio decoder
HK1155845B (en) Audio encoder and audio decoder