TW200816653A

TW200816653A - Data query method and data coding method thereof

Info

Publication number: TW200816653A
Application number: TW95136094A
Authority: TW
Inventors: Cher-Sheng Cheng; Jyh-Jiun Shann; Chung-Ping Chung
Original assignee: Univ Nat Chiao Tung
Priority date: 2006-09-28
Filing date: 2006-09-28
Publication date: 2008-04-01
Also published as: TWI329427B

Abstract

A data query method and a data coding method are disclosed. First, the method includes the step of dividing a plurality of data into a plurality of groups, wherein a first data of each group represents a skip pointer and the other data of each group represent inner pointers. Then, the skip pointers are coded with a first coding algorithm process to generate a plurality of coded skip pointer data. A coding required bits space of the inner pointers is predicted by using a bits space prediction process. Then, the inner pointers of each group are coded respectively with a second coding algorithm process to generate a plurality of coded inner pointer data. Finally, the coded skip pointer data and the inner pointer data are posted sequentially to generate an output coded data based on the group post sequence and the coding required bits space.

Description

200816653 七、指定代表圖： (一）本案指定代表圖為：第（2)圖。 (二）本代表圖之元件符號簡單說明： 20〜24 ··步驟流程；200816653 VII. Designation of representative representatives: (1) The representative representative of the case is: (2). (2) A brief description of the symbol of the representative figure: 20~24 ··Step flow;

八、本案若有化學式時，請揭示最能顯示發明特徵的化學式: 九、發明說明：【發明所屬之技術領域】本發明是有關於一種資料搜尋方法及其數值資料編碼方法，特別是有關於一種内建有可跳躍式解碼能力之轉置檔案以實現快速查詢之數值資料編碼方法。8. If there is a chemical formula in this case, please disclose the chemical formula that best shows the characteristics of the invention: IX. Description of the invention: [Technical field of invention] The present invention relates to a data searching method and a numerical data encoding method thereof, particularly A transcoding file with built-in skipping decoding capability to implement a numerical query encoding method for fast query.

200816653 【先前技術】檢旁透=際：路上進行資料檢索已成為目前最常用的資訊 = Ϊ有相同關鍵字之網頁資料。-個資訊檢索 ΐίιΐ存有許多的文件，少職千個，多則數十億個。在位Ξΐί的貧訊檢索系統中，文件的數目正以每科增加一成長。為了讓使用者可以最短的時間内找到他們所尚的-貝料，資訊檢索系統會建立一個索引結構。統上請-參^ I Ϊ I/係緣示一索引結構之示意圖。傳 ·、 ^吊使用的索引結構是轉置索引（inverted m ，主要兩個棺案組成··一個是索引標案（ϋχ 」=)，一個是轉置檔案（inverted file)。所有出現在 = 斤儲存之文件中的關鍵字（terms)都會被記錄在索 1杈案，例，如圖中關鍵字 “computer”及 architecture ，並利用一個指標指到轉置樓宰中相對應的轉置串列（inverted Ust)，例如關 computer 係對應於轉置檔案中位址355之轉置串列 11，而關鍵字“architecture”係對應於轉置檔案中位址252之轉置串列1〇。轉置串列係紀錄此關鍵字出現於哪些文件中。例如，在資料庫中，文件丨、文件2、文件^、文件12、文件20、文件72及文件80之内容具有關鍵子architecture ，因此轉置串列10紀錄1、2、11、 12、20、72及80之數值。同理，轉置串列1〇紀錄j、3、 12、13、20、73及80之數值。一般而言，索引槽案所需的空間比較小，可以儲存在記憶體中，而轉置檔^較大，必須被壓縮並儲存於磁碟中。轉置袼案比較當使用者使用轉置索引處理一道檢索命令（query) 200816653 t資‘仏索糸統會先到索引槽案搜尋檢索命令中提及的關鍵字，並取得關鍵字相對應的轉置串列，然後將各相關轉置串列中的文件編號依邏輯運算子（AND“、'、〇R等）予以比對，以找出符合檢索命令的所有文件編號，讓使用者可藉由這些文件編號取得真正的文件。例如，當使 1 者輸入關鍵字為” architecture <and> c⑽puter” 時，則資訊檢索系統係對轉置串列10及轉置串列u進行AND運算，找出同時紀錄於轉置串列1〇及轉置串列u 之數值，以第1圖為例，對應關鍵字為” architecture <and> computer”之文件為文件！、文件12、文件⑽及文件 80。而關鍵字為” architecture <〇r> c⑽puter” 之文件為文件1、文件2、文件3、文件1丨、文件12、 =13、文件20、文件72、文件73及文件8〇。藉由上述 ^程，使用者便可於文件資料庫找出欲得到之資料。由於轉置檔案比較大，必須被壓縮並儲存於磁碟，因此，若將轉置串列的數值進行編碼成較節省儲存办 :不定長f的位元串列流，可以減少磁碟讀取所需。見以加快查珣速度。然而，這也導致轉置串列的數解碼過程必須一個數值一個數值的循序解碼， ^置串列數?的效率。例如當使用者輸人關鍵二串列 '中，”，rface”於文件心現，但疋資讯檢索系統仍必須將轉置串列丨〇之一循序解碼，一直解碼到數值72方能進行比對。一為了讓轉置串列編碼後的位元串列流在數過程中，減少多餘的解碼動作，習知技藝合、外的資訊或設計特殊的資料結構，在位：二 200816653 些跳躍點，賦予其开、a、所提出實現跳躍點碼的能力。然而習知技藝相當地大，致使無_σ /’所需增加的儲存空間都點。、、 70串列流中插入太多的跳躍有鑑於習知括之’本發明人基問題’為了能夠兼顧解決 -種資料搜尋方法3:::，與諸多實務經驗，提出少的儲存空間，於ς 貧料編碼方法，可以使用較插入跳躍點，並样編竭後的位元串列流密集地善上述習知技藝心所亡出的資料搜尋方法，以作為改 a缺點之實現方式與依據。【發明内容】有鐘於此，士拉方法及其數值資目的就是在提供-種資料搜尋省儲存空間之功欵、”、、/，以兼具可跳躍式解碼及節根據本發明+ 以對複數個目的，提出—種數值資料編碼方法，用設定一分群^之數值騎編碼’财法包訂列步驟：成複數個^ 絲肋分群參賴難W特之數值分每一兮此砰、、且’母一該些群組之數值數目等於該分群參數，七/二群組之第一個數值係代表為跳躍點，而其他數值係 Γ表為内部點；根據一第一編碼演譯程序對該些群組之跳躍點進行編碼，以產生複數個跳躍點編碼值；根據一空間預估私序，預估該些群組之該些内部點以一第二編碼演譯程序進行編碼所需之位元保留空間；根據該第二編碼演澤程序對該些群組之該些内部點進行編碼，以產生複數個内部點編石馬值；根據該些群組之排列順序及該些位元保留空間，將該些 8 200816653 跳躍點編碼值及該些内部點編碼值依序排列，以產生一輸出編碼值。此外，本發明更提出一種資料搜尋方法，用以搜尋一編碼數值資料中是否包含一目標數值，該編碼數值資料係以上述之數值資料編碼方法進行編碼，該編碼數值資料包含複數個跳躍點編碼值及複數個内部點編碼值，該方法包含下列步驟：（a)根據第一編碼演譯程序，於該編碼數值資料中取得兩跳躍點編碼值，並解碼出一第一比對跳躍點及一第二比對跳躍點；（b)判斷該目標數值是否介於該第一比對跳躍點及該第二比對跳躍點之間，若是，則執行步驟（c)，若否，則執行步驟（d) ; (c)取得位於該第二比對跳躍點隨後之内部點編碼值，並根據一第二編碼演譯程序對該内部點編碼值進行解碼，以取得至少一内部點數值，並判斷該些内部點是否包含該目標數值；（d)根據該第一比對跳躍點、該第二比對跳躍點及一空間預估程序，估算一位元保留空間，並間隔該位元保留空間之後取得另一跳躍點編碼值，並解碼出一另一跳躍點，將該第二比對跳躍點設為該第一比對跳躍點，並將該另一跳躍點設為該第二比對跳躍點，執行步驟（b)。茲為使貴審查委員對本發明之技術特徵及所達到之功效有更進一步之瞭解與認識，謹佐以較佳之實施例及配合詳細之說明如後。【實施方式】以下將參照相關圖式，說明依本發明較佳實施例之資料搜尋方法及其數值資料編碼方法，為使便於理解，下 200816653 述實施例中之相同元件係以相同之符號標示來說明。請參閱第2圖至第4圖，第2圖為本發明之數值資料編碼方法之步驟流程圖，第3圖為本發明之數值序列之分群示意圖，而第4圖為本發明之輸出編碼資料之位元串列流示意圖。在第2圖中，數值資料編碼方法係用以對複數個已排序之數值進行編碼，此方法包含下列步驟：步驟20 :設定一分群參數η，並根據分群參數η將這些已排序之數值分成m個群組，如第3圖所示，每一群組包含 η個數值，每一群組之第一個數值係代表為跳躍點，而其他數值係代表為内部點；步驟21 :根據一第一編碼演譯程序對第3圖所示之跳躍點1至跳躍點m進行編碼，以產生m個跳躍點編碼值，如第 4圖所示之E1至Em，其中第4圖所示之E1至Em分別表示第 3圖所示之跳躍點1至跳躍點m根據一第一編碼演譯程序產生之編碼值；步驟22 :根據一空間預估程序，預估每一第3圖所示之内部點群以一第二編碼演譯程序進行編碼所需之位元保留空間，如第4圖所示之S1及S2，其中第4圖所示之S1及S2 分別表示第3圖所示之内部點群1及内部點群2根據一空間預估程序，預估以一第二編碼演譯程序進行編碼所需之位元保留空間；步驟23 :根據一第二編碼演譯程序對每一内部點群進行 200816653 編碼’以產生複數個内部點編碼值，如第4圖所示之？。至 Plx ^ P21至P2y，其中第4圖所示之pu至ρΐχ及p21至 P2y/分別表示第3圖所示之内部點D2至此及Dn+2至D2n以一第二編碼演譯程序進行編碼產生之編碼值；步驟24:根據這些群組之排列順序及這些位元保留空間’將這些跳躍點編碼值及這些内部點編碼值依序排列，以產生一輸出編碼資料。其中，若這些群組之最後群組之數值數目小於分群參數時，最後群組之數值係代表殘餘點，以一第一編碼演譯程序進行編碼，如第4圖所示之Em+1及Em+2，其中第4圖所示之Em+1及Em+2分別表示第3圖所示之殘餘點Dmn+2及Dmn+3 根據一第一編碼演譯程序產生之編碼值。上述空間預估程序，用以預估内部點群以―第二編碼演譯程序進行編瑪所 :因此内部點編碼值之位元長度係小於或如弟4圖所示，輸出編碼資料資個跳躍點編碼值_，接著配置内; 則㈣千，从J 所空間小於位元保留空問ς】貝Η呆留剩餘空間τ。續之，純元 a間Si’ 點編碼值E3及㈣點編碼值P11至配置跳躍點編碼值及㈣闕碼值 '配置跳躍位元保留空間，則保留剩餘空間。‘场馬值之仏長度小於碼種序及上述第-編碼演譯㈣較佳的是由卜卿編 200816653 prefix-free coding編碼程序所組成，而較佳的是interp〇latiVecoding編碼程序。、〜貝澤程序 5月簽閱第5圖，其係為本發明之數 =列之步驟流程圖。圖中，數值資方二= 個已排序之數值<5, 8, 12, 13, 15, 18, 23去用^對11 進行編碼，此實施例包含下列步驟：，，2, 33> 步驟5(^設定一分群參數為4,將π個叙#八 8，12 ’ 13>、<15，18 ’ 23，28>及<29，1 $值為<5，其中5、15及29為跳躍點，8、12及n ^入W〉共3群組，的内部點’而18'23及28是介於15之間而32及33為殘餘點； ’、之間的内部點，步驟51 ··使用d-gap編碼程序與prefix 編碼程序對跳躍點（5、15、29)與殘餘點（32 =灿呢並分別以PF⑸、PF (15-5)、PF⑵—15) /f仃編石馬，200816653 [Prior Art] Checking the data: The data search on the road has become the most commonly used information at the moment = web pages with the same keywords. - An information retrieval ΐίιΐ has a lot of files, thousands of jobs, and billions. In the poor search system in Ξΐί, the number of files is increasing by one. In order to allow users to find what they are looking for in the shortest possible time, the information retrieval system will establish an index structure. On the system, please refer to - I Ϊ I / 缘示 shows a schematic diagram of the index structure. The index structure used by ··, ^ hang is the transposed index (inverted m, the main two files are composed of one index index (ϋχ = =)), one is the inverted file (inverted file). All appear in = The keywords in the file stored in the kilogram will be recorded in the file, for example, the keywords "computer" and architecture in the figure, and use an indicator to refer to the corresponding transposed string in the transposed flat. Inverted Ust, for example, the computer corresponds to the transpose string 11 of the address 355 in the transposed file, and the keyword "architecture" corresponds to the transposed string 1 of the address 252 in the transposed file. The transpose string records which files the keyword appears in. For example, in the database, the contents of the file 丨, file 2, file ^, file 12, file 20, file 72, and file 80 have key sub-architectures, so The transposed series 10 records the values of 1, 2, 11, 12, 20, 72, and 80. Similarly, the transposed string 1 records the values of j, 3, 12, 13, 20, 73, and 80. Words, the space required for the index slot case is relatively small and can be stored in memory. In the middle, and the transposition file is larger, it must be compressed and stored in the disk. Transpose the case comparison when the user uses the transposition index to process a search command (query) 200816653 t The index slot searches for the keywords mentioned in the retrieval command, and obtains the transposed concatenations corresponding to the keywords, and then the file numbers in the respective transposed concatenations are logical operators (AND ", ', 〇 R, etc. ) to compare to find all the file numbers that match the search command, so that the user can get the real file by using these file numbers. For example, when one enters the keyword "architecture <and> c(10)puter", Then, the information retrieval system performs an AND operation on the transposed series 10 and the transposed series u to find the value recorded in the transposed string 1〇 and the transposed string u at the same time, taking the first picture as an example, corresponding to the key The files with the word "architecture <and> computer" are file !, file 12, file (10), and file 80. The files whose keywords are "architecture <〇r> c(10)puter" are file 1, file 2, file 3, Document 1 file 1 2. =13, file 20, file 72, file 73 and file 8. By the above procedure, the user can find out the information to be obtained in the file database. Since the transposed file is relatively large, it must be compressed and It is stored on the disk. Therefore, if the value of the transposed string is encoded into a more economical storage: a bit string stream of indefinite length f can reduce the need for disk reading. See to speed up the investigation. However, this also causes the number decoding process of the transposed series to have a numerical value of one value for sequential decoding, and the efficiency of the number of strings. For example, when the user enters the key two series ', ', rface' is in the file, but the information retrieval system still has to decode the transposed serial sequence one by one, and decode it until the value is 72. Comparison. In order to let the transposed serial-coded bit string stream flow in the number process, reduce the redundant decoding action, the conventional technology, the external information or the design special data structure, in place: two 200816653 jump points, Give it the ability to open, a, and implement the jump point code. However, the conventional techniques are quite large, resulting in an increase in the storage space required for no _σ /'. Inserting too many jumps into the 70-column stream, in view of the well-informed 'inventor-based problem', in order to be able to solve the problem--a data search method 3:::, with many practical experiences, propose less storage space, In the method of encoding the poor material, it is possible to use the data insertion method that is more than the insertion of the jumping point, and the bit string stream after the compilation is intensively good, which is the realization method of the above-mentioned shortcomings. in accordance with. [Summary of the Invention] In this case, the Syrah method and its numerical value are to provide a kind of data to search for the storage space of the province, ",, /, to have both hoppable decoding and section according to the present invention + For a plurality of purposes, a numerical data encoding method is proposed, which is set by the value of a group of ^ riding code 'financial package ordering steps: into a plurality of ^ ribs group to participate in the difficulty of the value of each special value , and the number of the number of the group is equal to the grouping parameter, the first value of the seventh/two group is represented by the jumping point, and the other numerical system is the internal point; according to a first code The translation program encodes the jump points of the groups to generate a plurality of jump point code values; and according to a spatial estimation private order, the internal points of the groups are estimated to be performed by a second code interpreter Encoding the required bit space; encoding the internal points of the groups according to the second coded program to generate a plurality of internal point coded horse values; according to the order of the groups The bits retain space, The 8 200816653 jump point code values and the internal point code values are sequentially arranged to generate an output code value. In addition, the present invention further provides a data search method for searching whether a coded value data includes a target value. The encoded numerical data is encoded by the above numerical data encoding method, the encoded numerical data comprising a plurality of jumping point encoded values and a plurality of internal point encoded values, the method comprising the following steps: (a) according to the first encoding interpretation program Obtaining two jump point code values in the coded value data, and decoding a first comparison jump point and a second comparison jump point; (b) determining whether the target value is between the first comparison jump point And between the second comparison jump points, if yes, performing step (c), if not, performing step (d); (c) obtaining an internal point code value subsequent to the second comparison jump point, and Decoding the internal point code value according to a second code interpreter to obtain at least one internal point value, and determining whether the internal points include the target value; (d) according to a first comparison jump point, the second comparison jump point, and a spatial estimation program, estimating a one-bit reserved space, and obtaining another jump point code value after the bit space is reserved, and decoding another one Jumping point, setting the second comparison jump point to the first comparison jump point, and setting the other jump point as the second comparison jump point, performing step (b). For a better understanding of the technical features and the efficacies of the present invention, the preferred embodiments and the detailed description are as follows. [Embodiment] Hereinafter, preferred embodiments of the present invention will be described with reference to the related drawings. The data search method of the embodiment and the numerical data encoding method thereof are described with the same reference numerals in the following embodiments of the following embodiments of the present invention in order to facilitate understanding. Please refer to FIG. 2 to FIG. 4, and FIG. A flow chart of the steps of the numerical data encoding method of the present invention, FIG. 3 is a schematic diagram of the grouping of the numerical sequence of the present invention, and FIG. 4 is a schematic diagram of the bit stream of the output encoded data of the present invention. In Fig. 2, the numerical data encoding method is used to encode a plurality of sorted values, and the method comprises the following steps: Step 20: setting a grouping parameter η, and dividing the sorted values according to the grouping parameter η m groups, as shown in Figure 3, each group contains n values, the first value of each group is represented as a jumping point, and the other values are represented as internal points; Step 21: According to one The first code interpreter encodes the jump point 1 to the jump point m shown in FIG. 3 to generate m jump point code values, as shown in FIG. 4, E1 to Em, where FIG. 4 shows E1 to Em respectively represent the code values generated by the jump point 1 to the jump point m shown in FIG. 3 according to a first code interpretation program; Step 22: According to a spatial estimation program, each figure 3 is estimated The internal point group is a bit reserved space required for encoding by a second code interpreter, such as S1 and S2 shown in FIG. 4, wherein S1 and S2 shown in FIG. 4 respectively indicate FIG. The internal point group 1 and the internal point group 2 are estimated to be a second according to a spatial estimation program. The bit space reserved for encoding the interpreter to encode; Step 23: Perform 200816653 encoding for each internal point group according to a second encoding interpreter to generate a plurality of internal point code values, as shown in FIG. What? . To Plx^P21 to P2y, wherein pu to ρΐχ and p21 to P2y/ shown in FIG. 4 respectively represent the internal point D2 shown in FIG. 3 and Dn+2 to D2n are encoded by a second coding interpretation program. The encoded value is generated; Step 24: Arranging the jump point code values and the internal point code values in order according to the order of the groups and the bit reserved spaces to generate an output coded data. Wherein, if the number of values of the last group of the groups is smaller than the grouping parameter, the value of the last group represents the residual point, and is encoded by a first coded interpretation program, such as Em+1 and FIG. Em+2, where Em+1 and Em+2 shown in Fig. 4 respectively represent the coded values generated by the residual points Dmn+2 and Dmn+3 shown in Fig. 3 according to a first coding interpretation program. The space estimation program is used to estimate the internal point group to be coded by the second code interpretation program: therefore, the bit length of the internal point code value is less than or as shown in the figure 4, and the output code data is used. The jump point code value _, then the configuration; then (four) thousand, from J space is less than the bit space reserved ς] Bessie stays the remaining space τ. Continued, the pure element a Si' point coded value E3 and (four) point coded value P11 to the configuration jump point coded value and (4) the code value 'configuration jump bit reserved space, then the remaining space is reserved. Preferably, the length of the field horse is less than the code order and the above-mentioned first-coded interpretation (4) is preferably composed of the 200816653 prefix-free coding coding program, and preferably the interp〇latiVecoding coding program. ~Bazer program May 5th signing the fifth chart, which is the flow chart of the number of steps in the invention. In the figure, the value of the second party = the sorted value < 5, 8, 12, 13, 15, 18, 23 to encode with ^, this embodiment includes the following steps: ,, 2, 33 > Step 5 (^Set a subgroup parameter to 4, π 述#8,12 '13>, <15,18 '23,28> and <29,1 $ value is <5, where 5, 15 and 29 is the jumping point, 8, 12 and n ^ into W> a total of 3 groups, the internal point 'and 18'23 and 28 are between 15 and 32 and 33 are the residual points; ', the internal point between , Step 51 ·· Use the d-gap encoding program and the prefix encoding program to jump points (5, 15, 29) and residual points (32 = Can and PF (5), PF (15-5), PF (2) - 15) / f仃编石马,

(33-32)表示編碼結果，其中，d—gap PF prefix-free coding編碼程序為熟悉此領域者^程序與不再贅述；厅无、知’在此步驟 52:使用 inte_ative CGdin 點群（8、12、13)及内部點群（18、23、28) ^秸序對内部別以.12、13)及_、23、28)表示編4^馬^分 mterpolative coding編碼程序為熟悉此領 :、甲，此不再贅述；苓所热知，在步驟53:使用表1所示之空間預估表估算内部點群（8、 12 200816653 12、13)及内部點群（18、23、28)以 interp〇latiVe coding 編碼程序進行編碼所需之較大位元保留空間，其中g為分群參數，D為包圍内部點群之兩端跳躍點之差值再減1，力為「1〇82(1)-2)"1-2，而兩内部點群之位元保留空間之估算結果分別為8位元及10位元； ^ 〇 ifD = 3 位元保留空間= = \ 2 ifD = 4 3(h + l) + l i/4<D<3x2h+3(33-32) indicates the coding result, in which the d-gap PF prefix-free coding coding program is familiar with the field of the program and will not be described again; Hall No, know 'in this step 52: use the inte_ative CGdin point group (8 , 12, 13) and internal point groups (18, 23, 28) ^ The straw order is internally represented by .12, 13) and _, 23, 28). 4^马^分mterpolative coding coding program is familiar with this :, A, this will not be repeated; as far as I know, in step 53: use the spatial prediction table shown in Table 1 to estimate the internal point group (8, 12 200816653 12, 13) and the internal point group (18, 23, 28) The larger bit reserved space required for encoding by the interp〇latiVe coding encoding program, where g is the grouping parameter, and D is the difference between the jumping points of the two points surrounding the inner point group, and then the force is "1". 82(1)-2)"1-2, and the estimation results of the bit reserved space of the two internal point groups are 8 bits and 10 bits respectively; ^ 〇ifD = 3 bits reserved space = = \ 2 ifD = 4 3(h + l) + li/4<D<3x2h+3

3(h + l) + 2 i/3x2h+3<D 表1 步驟54 ·將跳躍點編碼值及内部點編碼值依序排列，以產生一編碼位元串列流，其中因為IP(8、12、13)之位元長度為7位元，其小於位元保留空間之估算結果，因此需保留 1位兀（8-7=1)，而IP(18、23、28)之位元長度為1〇位元， /、位元保留空間之估算結果相同，編碼後產生之位元串列流依序為 PF(5)、PF(15-5)、IP(8, 12, 13)、保留位元、 PF(29〜15)、ip(18, 23, 28)、PF(32-29)及 PF(33-32)。漭。請參閱第6圖，其係為本發明之資料搜尋方法之步驟圖。此方法用以搜尋一編碼數值資料中是否包含一目 ^值，此編碼數值資料係以上述之數值資料編碼方法進行邱:，此編碼數值資料包含複數個跳躍點編碼值及複數個内、、蝙碼值，該方法包含下列步驟： ^驟60 ·根據第-解碼演譯程序，於該編碼數值資料 13 200816653 中取知兩跳]fL馬值，並解碼第— 二比對跳躍點；二驟61·判斷該目標數值是否介於該第一比對跳躍點及該第一比對跳躍點之間’若是，則執行步驟⑽，若否，則執行步驟63 ; 、 f 步驟62:取得位於該第二比對跳躍點隨後之内部點編碼值，並根據-第二解碼演譯程賴制部關碼值進行解至少—内部點數值，並判斷該㈣部點是否包含步驟63:根據該第二比對跳躍點及-空間預估程序，估算一位元保留空間：並；隔=躍：留空間之後取得另一跳躍點編碼值，並解碼出一，’、點，將該第二比對跳躍點設為該第—比對跳躍點=躍一跳躍點設為該第二比對跳躍點，執行步驟61。兀將該另上述第-解碼演譯程序較佳的是由 prefW coding解碼程序所組成，而第二 2序及較佳的是interpolative coding解碼程序。、、、〜澤程序 -以第5圖所示之編碼位元串列流為例元串列流中是否包含一目標數值Μ，首先，、匕編螞位程序與prefix-free coding解碼程序，彳=d〜肋P解螞取得PF⑸及PFU5-5)，並解石馬出點串列流中為23不介於5及15之間’因此’接著根據上^；'5 ^ 估算 14 200816653 出跳躍點5及跳躍點15之間之位元保留空間為8位元，因此自PF(15-5)隨後間隔8位元，取得PF(29-15)並解碼出跳躍點29。因為23介於15及29之間，因此接著取出 IP(18,23,28)，並解碼出内部點18、23及28，並判斷出此編碼位元串列流包含目標數值23。以上所述僅為舉例性，而非為限制性者。任何未脫離本發明之精神與範疇，而對其進行之等效修改或變更，均應包含於後附之申請專利範圍中。【圖式簡單說明】第1圖係繪示一索引結構之示意圖；第2圖係為本發明之數值資料編碼方法之步驟流程圖；第3圖係為本發明之數值序列之分群示意圖；第4圖係為本發明之輸出編碼資料之位元串列流示意圖；第5圖係為本發明之數值資料編碼方法之實施例之步驟流程圖；以及第6圖係為本發明之資料搜尋方法之步驟流程圖。【主要元件符號說明】 10，11，12 :轉置串列； 20〜24 :步驟流程；3(h + l) + 2 i/3x2h+3<D Table 1 Step 54: Arranging the jump point code value and the internal point code value in order to generate a coded bit string stream, because IP (8, The bit length of 12, 13) is 7 bits, which is less than the estimation result of the bit reserved space, so it is necessary to reserve 1 bit 8 (8-7=1), and the bit length of IP (18, 23, 28) For the 1 〇 bit, the estimation result of the /, bit reserved space is the same, and the bit stream generated after encoding is PF(5), PF(15-5), IP(8, 12, 13), Reserved bits, PF (29~15), ip (18, 23, 28), PF (32-29), and PF (33-32). expansive. Please refer to Fig. 6, which is a step diagram of the data searching method of the present invention. The method is used for searching whether a coded value data includes a value of a value, and the coded value data is performed by the above numerical data coding method: the coded value data includes a plurality of jump point code values and a plurality of numbers, and bats The code value includes the following steps: Step 60: According to the first decoding decoding program, the two-hop]fL horse value is obtained in the encoded numerical data 13 200816653, and the second-to-two comparison jumping point is decoded; 61. Determine whether the target value is between the first comparison jump point and the first comparison jump point. If yes, execute step (10). If not, execute step 63; f, step 62: obtain the location And secondly comparing the internal point code value of the jump point, and decomposing at least the internal point value according to the second decoding interpretation process, and determining whether the (four) part includes step 63: according to the Two comparison jump point and - space estimation program, estimate one yuan reserved space: and; interval = jump: after leaving space to obtain another jump point code value, and decode one, ', point, the second ratio Set the jump point to this The first-comparison jump point = the jump one jump point is set to the second comparison jump point, and step 61 is performed. Preferably, the other first decoding decoding program is composed of a prefW coding decoding program, and the second binary sequence and preferably an interpolative coding decoding program. ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,彳=d~ rib P solves the problem of obtaining PF(5) and PFU5-5), and the solution of the stone in the outflow stream is 23 not between 5 and 15 'so' then according to the upper ^; '5 ^ estimate 14 200816653 Since the bit reserved space between the jump point 5 and the jump point 15 is 8 bits, the PF (29-15) is obtained from the PF (15-5) and then 8 bits are obtained, and the jump point 29 is decoded. Since 23 is between 15 and 29, the IP (18, 23, 28) is then fetched, and the internal points 18, 23, and 28 are decoded, and it is judged that the encoded bit stream contains the target value 23. The above is intended to be illustrative only and not limiting. Any changes or modifications to the spirit and scope of the present invention are intended to be included in the scope of the appended claims. BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 is a schematic diagram showing an index structure; FIG. 2 is a flow chart showing the steps of the numerical data encoding method of the present invention; FIG. 3 is a schematic diagram of a grouping of numerical sequences of the present invention; 4 is a schematic diagram of a bit stream of the output coded data of the present invention; FIG. 5 is a flow chart of steps of an embodiment of the numerical data encoding method of the present invention; and FIG. 6 is a data searching method of the present invention. Step flow chart. [Main component symbol description] 10,11,12: transpose series; 20~24: step flow;

Sl，S2 :位元保留空間； T :保留位元；Sl, S2: bit reserved space; T: reserved bit;

Dl，D2, Dn，Dn+l，Dn+2, D2n，Dmn+l，Dmn+2, Dmn+3 :跳躍點編碼值； 15 200816653Dl, D2, Dn, Dn+l, Dn+2, D2n, Dmn+l, Dmn+2, Dmn+3: jump point code value; 15 200816653

El，E2，E3，Em，Em+l，Em+2 :跳躍點編碼值； Pll，Plx，P21，P2y :内部點編碼值； 50〜54 :步驟流程；以及 60〜63 :步驟流程。 16El, E2, E3, Em, Em+l, Em+2: jump point code value; Pll, Plx, P21, P2y: internal point code value; 50~54: step flow; and 60~63: step flow. 16

Claims

200816653 Patent application scope: The number of values of the group for a plurality of sorted values is equal to the value of the grouping of the groupings; the first value is represented by a jumping point' and the same is performed according to the -code-interpreting program a code to generate a plurality of jump point code values; a burst domain 仃, a flat code; a temperature: 内部 two sequence group of the internal points of the - (four) 兮: order entry 所需 encoding required bit space; The Fc Kang-Di-code interpreter encodes the groups to generate a plurality of internal point code values; and a sequence of ordering hops and a space reserved for the bits, the code values and the code values of the internal points Arranging sequentially to generate a numerical data encoding method for losing 2 4 mi1 rewards, wherein the first code: the program is d, the encoding program and the Prefix-free coding encoding program. 1=1==1 Reward value (4) Code branch, where the first, such as Shen; Vi di-糸 is the lnterP〇latlVe C〇ding encoding program. ^ Declaring the method of encoding the numerical data described in the patent, the number of values in the first paragraph is less than the number of the group, and the final group, and the numerical fine-coded sequence is encoded. • Two kinds of ρ search for money, heart search - code value (4) Whether the paper contains the code value data is based on the value of the first item of the patent application (4) coded square money line code, the code value # material contains complex 17 200816653 several jump points Compilation and Wei column steps: 卩"Ί is the code value' This method contains the following (a) According to the first decoding interpretation program, 扒 jump point coded value, and decode a #二Θ, flat code value poor material Get two jump points in the middle; brother ~ comparison jump point and a second comparison (b) to determine the target value is between the two pairs of jump points, if the county, the rainbow brother compares the jump point and the first step ( d);, step (c), if not, perform the internal point code value according to the second calculus net point, and obtain at least - internal point;; order; the value is decoded to the scalar value; ^ Pray whether these internal points contain the item (d) according to the first comparison jump point, the first estimation procedure, the estimation - the position reserved empty = hop and one space to obtain another jump point The bit reserved space hop is set to the second alignment #+ 踣 point and will One jump 6, such as Shen Shu, and jump to the point, perform step (8). For example, the third interpretation of the patented cadre of the Jiaming _ d-_ solution ^; ^, shoot the first solution to the program Composition: U~free coding 7 = Please refer to the data range 苐5 items for information search interpolate coding%^ 18