
WO2009003328A1 - Data query system and method - Google Patents

Data query system and method

Info

Publication number
WO2009003328A1
Authority
WO
WIPO (PCT)
Prior art keywords
word segmentation
homophone
module
search
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2007/003409
Other languages
French (fr)
Chinese (zh)
Inventor
Lili Liu
Qiang Lu
Bo You
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Publication of WO2009003328A1

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures

Definitions

  • a search engine system is a system that collects information resources and provides information inquiry to users after processing and organizing the information. The user can input the content that he wants to query, and the search engine system quickly and accurately locates the information required by the user in the massive information according to the user's input, and returns the search result to the user.
  • when analyzing a user's input string, a search engine generally adopts one of the following approaches: directory search, in which the system presets the search terms and the user selects a keyword by browsing hierarchical directory levels; querying with the user's input directly as the keyword; and segmenting the user's query request according to a dictionary, then querying with the segmentation results as keywords.
  • if the user's input contains an error and dictionary-based segmentation fails, the search engine system splits the term into single characters for the query, so the results are numerous and mostly junk, and the user cannot find the desired result in such a mass of data.
  • the data query system includes: an entry module, configured to receive the query string the user wants to query; a word segmentation module, configured to perform preliminary word segmentation on the query string, determine whether a homophone search is needed for the preliminary word segmentation result, and process the preliminary word segmentation result according to that determination; and a query module, configured to perform a data query according to the processed final word segmentation result.
  • the word segmentation module includes: a Chinese word segmentation module, configured to perform preliminary word segmentation on the query string according to specific rules and, when no homophone search is needed for the preliminary word segmentation result, to send that result to the entry module; a search judgment module, configured to determine whether a homophone search is needed for the preliminary word segmentation result; and a homophone processing module, configured, when a homophone search is needed, to label the preliminary word segmentation result with pinyin and to perform a homophone search on it according to the pinyin labeling information.
  • the homophone processing module includes: a pinyin labeling module, configured to label the preliminary word segmentation result with pinyin; and a homophone search module, configured to search the homophone dictionary for homophones of the preliminary word segmentation result according to the pinyin labeling information.
  • the query module includes: a data query module, configured to perform a data query according to the preliminary word segmentation result or the final word segmentation result; and a data indexing module, configured to index the data to be queried and to maintain the index.
  • the data query system according to the present invention may further include a homophone dictionary module for generating and storing a homophone dictionary and updating it in real time.
  • the data query method includes the following steps: S302, receiving the query string the user wants to query; S304, performing preliminary word segmentation on the query string, determining whether a homophone search is needed for the preliminary word segmentation result, and processing the preliminary word segmentation result according to that determination; and S306, performing a data query according to the processed final word segmentation result.
  • step S304 includes the following steps: S3042-1, performing preliminary word segmentation on the query string; S3044-1, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-1, when no homophone search is needed for the preliminary word segmentation result, proceeding directly to step S306.
  • optionally, step S304 includes the following steps: S3042-2, performing preliminary word segmentation on the query string; S3044-2, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-2, when a homophone search is needed, searching the homophone dictionary for homophones of the preliminary word segmentation result and then proceeding to step S306.
  • in step S306, the data to be queried is also indexed, and the index is maintained.
  • the data query method according to the present invention may further include the step of generating and storing a homophone dictionary and updating it in real time.
  • the user can obtain highly accurate results by entering accurate, concise keywords, and can also search by entering pinyin.
  • FIG. 1 is a block diagram of a data query system in accordance with an embodiment of the present invention
  • FIG. 2 is a block diagram of a data query system in accordance with another embodiment of the present invention.
  • FIGS. 3A through 3C are flowcharts of the data query method and steps performed by the systems shown in FIGS. 1 and 2.
  • as shown in FIG. 1, the data query system according to an embodiment of the present invention includes: an entry module 102, configured to receive the query string the user wants to query; a word segmentation module 104, configured to perform preliminary word segmentation on the query string, determine whether a homophone search is needed for the preliminary word segmentation result, and process the preliminary word segmentation result according to that determination; and a query module 106, configured to perform a data query according to the processed final word segmentation result.
  • the word segmentation module 104 includes: a Chinese word segmentation module 1042, configured to perform preliminary word segmentation on the query string according to specific rules and, when no homophone search is needed, to send the preliminary word segmentation result to the entry module; a search judgment module 1044, configured to determine whether a homophone search is needed for the preliminary word segmentation result; and a homophone processing module 1046, configured, when a homophone search is needed, to label the preliminary word segmentation result with pinyin and to perform a homophone search on it according to the pinyin labeling information.
  • the homophone processing module 1046 includes: a pinyin labeling module, configured to label the preliminary word segmentation result with pinyin; and a homophone search module, configured to search the homophone dictionary for homophones of the preliminary word segmentation result according to the pinyin labeling information.
  • the query module 106 includes: a data query module 1062, configured to perform a data query according to the preliminary word segmentation result or the final word segmentation result; and a data indexing module 1064, configured to index the data to be queried and to maintain the index.
  • the data query system according to the present invention may further include a homophone dictionary module for generating and storing a homophone dictionary and updating it in real time.
  • as shown in FIG. 2, the data query system according to another embodiment includes the following parts: a query entry module 202, configured to receive the user's input through interaction with an external system, send the received text to the word segmentation module, receive the segmentation information returned from the word segmentation module, and search using the segmentation information as keywords.
  • the word segmentation module 204 is configured to call the Chinese word segmentation module to obtain a preliminary word segmentation result and to determine from that result whether a homophone search is needed. If a homophone search is needed, it sends the information to be processed to the homophone processing module and receives the homophone information returned from that module. It returns the homophone information to the query entry module together with the other segmentation information.
  • the Chinese word segmentation module 206 is configured to segment the user's input according to different word segmentation strategies.
  • the homophone processing module 208 is configured to receive text from the word segmentation module, send the text to the pinyin labeling module, and receive the pinyin labeling information returned from the pinyin labeling module.
  • the pinyin labeling module 210 is configured to convert text into the corresponding pinyin. In the query flow, the labeling result is sent to the query entry module; in the homophone dictionary maintenance flow, the labeling result is sent to the homophone dictionary maintenance module.
  • the homophone search module 212 is configured to look up the polyphonic-word dictionary, extract the homophone with the highest frequency of occurrence, and send the lookup result to the homophone processing module.
  • the homophone dictionary maintenance module 214 is configured to maintain the homophone dictionary used by the homophone system. It calls the pinyin labeling module to label the dictionary word by word, forming the homophone dictionary, and updates the homophone dictionary synchronously whenever the dictionary is updated. During pinyin labeling, a polyphonic character is labeled with each of its pinyin readings, producing multiple entries. The homophone dictionary is sorted in pinyin order.
  • the query module 216 is configured to receive keywords from the query entry module, form query conditions from the keywords, query the index file to obtain results matching the user's query request, and return the query results to the user.
  • the index module 218 is configured to maintain the full-text index. The index module scans every word in the information to be searched and indexes each word, recording the number of times and the positions at which the word appears in the article.
  • the query entry module sends the query string to the word segmentation module; the word segmentation module calls the Chinese word segmentation module to segment the user's query string according to different word segmentation strategies and determines from the segmentation result whether a homophone search is needed. If no homophone search is needed, the segmentation result is returned directly to the query module and the data search flow begins. If a homophone search is needed, the information to be processed is sent to the homophone processing module.
  • the homophone processing module receives the information to be processed and sends it to the pinyin labeling module; the pinyin labeling module labels the text with pinyin and returns the labeling result to the homophone processing module.
  • after receiving the pinyin labeling information, the homophone processing module calls the homophone query module to look up the pinyin labeling information in the homophone dictionary and, once the lookup result is obtained, returns it to the word segmentation module.
  • the word segmentation module combines the information returned by the homophone processing module with the keywords obtained from the other segments and returns them to the query entry module as the search keywords.
  • the query entry module sends the keywords to the query module.
  • the query module uses the keywords to search the index library and returns the matching information to the query entry module.
  • the query entry module adjusts the results and returns them to the user.
  • the data query system may further use the index module to index the data to be searched, maintain the index, and synchronously update the index library when the source information is updated.
  • as shown in FIG. 3A, the data query method includes the following steps: S302, receiving the query string the user wants to query; S304, performing preliminary word segmentation on the query string, determining whether a homophone search is needed for the preliminary word segmentation result, and processing the preliminary word segmentation result according to that determination; and S306, performing a data query according to the processed final word segmentation result.
  • as shown in FIG. 3B, step S304 includes the following steps: S3042-1, performing preliminary word segmentation on the query string; S3044-1, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-1, when no homophone search is needed for the preliminary word segmentation result, proceeding directly to step S306.
  • as shown in FIG. 3C, step S304 includes the following steps: S3042-2, performing preliminary word segmentation on the query string; S3044-2, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-2, when a homophone search is needed, searching the homophone dictionary for homophones of the preliminary word segmentation result and then proceeding to step S306.
  • in step S306, the data to be queried is also indexed, and the index is maintained.
  • the data query method according to the present invention may further include the step of generating and storing a homophone dictionary and updating it in real time.
  • when analyzing user input, the invention can handle erroneous input such as homophone characters and fuzzy sounds, automatically convert it into standard input, return the query results after the search, and notify the user of the input error.
  • with the present invention, the user can quickly and conveniently find the information needed, and the barrier to using a search engine is lowered. The user can even look up information by entering pinyin directly.
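The fuzzy-sound handling mentioned above can be pictured as a normalization step applied to pinyin before the dictionary lookup. The following is a minimal sketch under stated assumptions: the patent does not say how fuzzy sounds are matched, so the folding table `NASAL_FOLD` and the functions `normalize_syllable` and `fuzzy_key` are hypothetical, and only the front/back nasal pairs are covered.

```python
from typing import List

# Assumed fuzzy-sound pairs: each back nasal final is folded into its
# front nasal counterpart so that both spellings share one lookup key.
NASAL_FOLD = {"ang": "an", "eng": "en", "ing": "in"}


def normalize_syllable(syllable: str) -> str:
    """Fold a back nasal final (-ang, -eng, -ing) into the front nasal final."""
    for back, front in NASAL_FOLD.items():
        if syllable.endswith(back):
            return syllable[: -len(back)] + front
    return syllable


def fuzzy_key(syllables: List[str]) -> str:
    """Build a toneless lookup key that ignores the front/back nasal distinction."""
    return " ".join(normalize_syllable(s.lower()) for s in syllables)


# Both spellings collapse to the same key, "yin han".
print(fuzzy_key(["yin", "hang"]) == fuzzy_key(["ying", "han"]))  # True
```

If dictionary keys are built with the same function, a query typed with the wrong nasal final still reaches the intended entry; the cost is that genuinely different words may share a key, so a ranking step is still needed.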

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

A data query system and method are disclosed. The data query system comprises an entry module for receiving the query string which the user wants to search; a segmenting module for primarily segmenting the query string, determining whether to search the homophonic words for the segmenting result, and processing the primary segmenting result according to the determining result; and a query module for making data query according to the final segmenting result processed.

Description

DATA QUERY SYSTEM AND METHOD

FIELD OF THE INVENTION

The present invention relates to the field of search engines, and more particularly to a data query system and method.

BACKGROUND OF THE INVENTION

A search engine system collects information resources and, after processing and organizing that information, provides an information query service to users. The user enters the content to be queried, and the search engine system quickly and accurately locates the information the user needs within a massive body of information according to that input, then returns the search results to the user.

At present, when analyzing a user's input string, a search engine generally takes one of the following approaches: directory search, in which the system presets the search terms and the user selects a keyword by browsing hierarchical directory levels; querying with the user's input directly as the keyword; and segmenting the user's query request into words according to a dictionary, then querying with the segmentation results as keywords. These approaches have the following defects:

1) If the user selects from keywords preset by the system, the available choices are limited and cannot satisfy the full range of user needs.

2) If keywords are selected through a directory, the large number of categories and keywords makes it time-consuming for the user to find the keyword needed.

3) If the query is run directly on the keywords the user enters, the user's description must be especially concise and clear. Because a search engine faces massive data from many sources and with varied structures, the way information is described may differ from the user's input. As a result, the user may fail to obtain the information actually needed, and the search may return relatively little information. If the user's input is inaccurate, the search may even return no results at all.

4) If segmentation is performed according to a dictionary and the user's input contains an error, segmentation may fail; the search engine system then splits the term into single characters for the query. This produces a large number of results, most of them junk, and the user cannot find the desired result in such a mass of data.

SUMMARY OF THE INVENTION

In view of one or more of the problems described above, the present invention provides a data query system and method.

A data query system according to an embodiment of the present invention includes: an entry module, configured to receive the query string the user wants to query; a word segmentation module, configured to perform preliminary word segmentation on the query string, determine whether a homophone search is needed for the preliminary word segmentation result, and process the preliminary word segmentation result according to that determination; and a query module, configured to perform a data query according to the processed final word segmentation result.
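The segmentation failure described in defect 4, and the judgment the word segmentation module makes afterwards, can be sketched as follows. This is an illustration only: the patent does not name a segmentation algorithm or a judgment rule, so forward maximum matching, the tiny `LEXICON`, and the heuristic in `needs_homophone_search` (two or more adjacent out-of-lexicon single characters) are all assumptions.

```python
from typing import List, Set

# Hypothetical lexicon; a real segmenter would load a full dictionary.
LEXICON: Set[str] = {"数据", "查询", "系统", "搜索", "引擎"}


def forward_max_match(query: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Greedy dictionary segmentation; unknown text falls back to single characters."""
    tokens: List[str] = []
    i = 0
    while i < len(query):
        # Try the longest candidate first, shrinking until a lexicon word matches.
        for size in range(min(max_len, len(query) - i), 0, -1):
            piece = query[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def needs_homophone_search(tokens: List[str], lexicon: Set[str]) -> bool:
    """Assumed heuristic: a run of two or more out-of-lexicon single characters."""
    run = 0
    for token in tokens:
        if len(token) == 1 and token not in lexicon:
            run += 1
            if run >= 2:
                return True
        else:
            run = 0
    return False


print(forward_max_match("数据查询", LEXICON))  # ['数据', '查询']
print(forward_max_match("数剧查询", LEXICON))  # ['数', '剧', '查询']
```

The second query contains a wrong character (剧 instead of 据), so the first word breaks into single characters; under the assumed heuristic this is the signal to hand the fragment to homophone processing instead of querying on single characters.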
The word segmentation module includes: a Chinese word segmentation module, configured to perform preliminary word segmentation on the query string according to specific rules and, when no homophone search is needed for the preliminary word segmentation result, to send that result to the entry module; a search judgment module, configured to determine whether a homophone search is needed for the preliminary word segmentation result; and a homophone processing module, configured, when a homophone search is needed, to label the preliminary word segmentation result with pinyin and to perform a homophone search on it according to the pinyin labeling information.

The homophone processing module includes: a pinyin labeling module, configured to label the preliminary word segmentation result with pinyin; and a homophone search module, configured to search the homophone dictionary for homophones of the preliminary word segmentation result according to the pinyin labeling information.

The query module includes: a data query module, configured to perform a data query according to the preliminary word segmentation result or the final word segmentation result; and a data indexing module, configured to index the data to be queried and to maintain the index.

The data query system according to the present invention may further include a homophone dictionary module for generating and storing a homophone dictionary and updating it in real time.
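To make the pinyin labeling and homophone dictionary described above more concrete, here is a minimal sketch. It assumes toneless pinyin keys and a small hand-written table; `CHAR_PINYIN`, `label_pinyin`, `build_homophone_dictionary`, and `find_homophones` are hypothetical names, not part of the patent, and a real system would use a complete character-to-pinyin table.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical character-to-pinyin table (toneless). Polyphonic characters
# such as 行 and 乐 carry more than one reading.
CHAR_PINYIN: Dict[str, List[str]] = {
    "银": ["yin"], "行": ["hang", "xing"],
    "音": ["yin"], "乐": ["yue", "le"],
    "快": ["kuai"],
}


def label_pinyin(word: str) -> List[Tuple[str, ...]]:
    """Return every reading of a word; polyphonic characters multiply the readings."""
    per_char = [CHAR_PINYIN[ch] for ch in word]
    return [tuple(reading) for reading in product(*per_char)]


def build_homophone_dictionary(words: List[str]) -> Dict[Tuple[str, ...], List[str]]:
    """Label each word and group the words by pinyin key, with keys in pinyin order."""
    entries: Dict[Tuple[str, ...], List[str]] = {}
    for word in words:
        for key in label_pinyin(word):
            entries.setdefault(key, []).append(word)
    return dict(sorted(entries.items()))


HOMOPHONES = build_homophone_dictionary(["银行", "音乐", "快乐"])


def find_homophones(token: str) -> List[str]:
    """Look up dictionary words that share any pinyin reading with the token."""
    found: List[str] = []
    for key in label_pinyin(token):
        for word in HOMOPHONES.get(key, []):
            if word not in found:
                found.append(word)
    return found


print(find_homophones("音行"))  # ['银行']
```

Keying the dictionary by pinyin means the mistyped 音行 and the intended 银行 meet at the same entry. Characters missing from `CHAR_PINYIN` raise `KeyError` in this sketch; a fuller version would need a policy for them.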
A data query method according to an embodiment of the present invention includes the following steps: S302, receiving the query string the user wants to query; S304, performing preliminary word segmentation on the query string, determining whether a homophone search is needed for the preliminary word segmentation result, and processing the preliminary word segmentation result according to that determination; and S306, performing a data query according to the processed final word segmentation result.

Step S304 includes the following steps: S3042-1, performing preliminary word segmentation on the query string; S3044-1, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-1, when no homophone search is needed for the preliminary word segmentation result, proceeding directly to step S306.

Optionally, step S304 includes the following steps: S3042-2, performing preliminary word segmentation on the query string; S3044-2, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-2, when a homophone search is needed, searching the homophone dictionary for homophones of the preliminary word segmentation result and then proceeding to step S306.

In step S306, the data to be queried is also indexed, and the index is maintained. The data query method according to the present invention may further include the step of generating and storing a homophone dictionary and updating it in real time.

In the present invention, the user can obtain highly accurate results by entering accurate, concise keywords, and can also search by entering pinyin. When the information entered by the user contains a wrong character that is a homophone or a fuzzy-sound match (for example, a confusion between front and back nasal finals), the present invention can correct it automatically and obtain the correct search results.
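The control flow of steps S302 through S306, including the two branches of S304, can be sketched as a single function. This is a sketch under assumptions: the callables passed in are stand-ins for the modules, `run_query` and `INDEX` are hypothetical names, and whitespace splitting replaces real Chinese segmentation purely to keep the example short.

```python
from typing import Callable, Dict, List


def run_query(
    query: str,  # S302: the query string received from the user
    segment: Callable[[str], List[str]],
    needs_homophones: Callable[[List[str]], bool],
    correct: Callable[[List[str]], List[str]],
    search: Callable[[List[str]], List[str]],
) -> List[str]:
    # S3042: preliminary word segmentation.
    tokens = segment(query)
    # S3044: decide whether a homophone search is needed.
    if needs_homophones(tokens):
        # S3046-2: replace tokens using the homophone dictionary.
        tokens = correct(tokens)
    # S3046-1: otherwise the preliminary result is already the final result.
    # S306: data query with the final segmentation result.
    return search(tokens)


# Hypothetical stand-ins for the index and the modules.
INDEX: Dict[str, List[str]] = {"银行": ["doc1"], "利率": ["doc2"]}

result = run_query(
    "音行 利率",
    segment=str.split,
    needs_homophones=lambda toks: any(t not in INDEX for t in toks),
    correct=lambda toks: ["银行" if t == "音行" else t for t in toks],
    search=lambda toks: [doc for t in toks for doc in INDEX.get(t, [])],
)
print(result)  # ['doc1', 'doc2']
```

Passing the modules in as callables mirrors the separation between the segmentation, homophone, and query modules, and lets each be replaced or tested on its own.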
BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:

FIG. 1 is a block diagram of a data query system according to an embodiment of the present invention;

FIG. 2 is a block diagram of a data query system according to another embodiment of the present invention; and

FIGS. 3A through 3C are flowcharts of the data query method and steps performed by the systems shown in FIGS. 1 and 2.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.

Referring to FIG. 1, a data query system according to an embodiment of the present invention is described. As shown in FIG. 1, the data query system includes: an entry module 102, configured to receive the query string the user wants to query; a word segmentation module 104, configured to perform preliminary word segmentation on the query string, determine whether a homophone search is needed for the preliminary word segmentation result, and process the preliminary word segmentation result according to that determination; and a query module 106, configured to perform a data query according to the processed final word segmentation result.
The word segmentation module 104 includes: a Chinese word segmentation module 1042, configured to perform preliminary word segmentation on the query string according to a specific rule and, when no homophone search is needed for the preliminary word segmentation result, to send the preliminary word segmentation result to the entry module; a search judging module 1044, configured to determine whether a homophone search is needed for the preliminary word segmentation result; and a homophone processing module 1046, configured, when a homophone search is needed, to annotate the preliminary word segmentation result with pinyin and to perform a homophone search on it according to the pinyin annotation information.

The homophone processing module 1046 includes: a pinyin annotation module, configured to annotate the preliminary word segmentation result with pinyin; and a homophone search module, configured to perform a homophone search on the preliminary word segmentation result in the homophone dictionary according to the pinyin annotation information.

The query module 106 includes: a data query module 1062, configured to perform a data query according to the preliminary word segmentation result or the final word segmentation result; and a data indexing module 1064, configured to index the data to be queried and to maintain the index.

The data query system according to the present invention may further comprise a homophone dictionary module for generating and storing the homophone dictionary and for updating it in real time.

Referring to FIG. 2, a data query system according to another embodiment of the present invention is illustrated. As shown in FIG. 2, the data query system includes the following parts.

The query entry module 202 is configured to receive the content input by the user through interaction with an external system, send the received text to the word segmentation module, receive the word segmentation information returned from the word segmentation module, and search using that word segmentation information as keywords.

The word segmentation module 204 is configured to call the Chinese word segmentation module to obtain a preliminary word segmentation result and to determine, according to that result, whether a homophone search is needed. For segments that need a homophone search, it sends the information to be processed to the homophone processing module, receives the homophone information returned from that module, and returns the homophone information to the query entry module together with the other word segmentation information.

The Chinese word segmentation module 206 is configured to segment the information input by the user according to different word segmentation strategies.

The homophone processing module 208 is configured to receive the text information from the word segmentation module, send it to the pinyin annotation module, and receive the pinyin annotation information returned from the pinyin annotation module.

The pinyin annotation module 210 is configured to convert the text information into the corresponding pinyin information. In the query process, the annotation result is sent to the query entry module; in the homophone dictionary maintenance process, the annotation result is sent to the homophone dictionary maintenance module.

The homophone search module 212 is configured to search the homophone dictionary, extract the homophone with the highest frequency of occurrence, and send the search result to the homophone processing module.
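The cooperation of the pinyin annotation module and the homophone search module can be illustrated with a small sketch. The tables `CHAR_PINYIN` and `HOMOPHONE_DICT`, their contents, the frequency values, and the function names are all hypothetical; tones are omitted for brevity, and a real annotation module would cover the full character set.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical per-character pinyin table (tones omitted).
CHAR_PINYIN: Dict[str, str] = {
    "背": "bei", "景": "jing", "北": "bei", "京": "jing", "天": "tian", "气": "qi",
}

# Hypothetical homophone dictionary: pinyin key -> (word, frequency) entries.
HOMOPHONE_DICT: Dict[str, List[Tuple[str, int]]] = {
    "bei jing": [("北京", 9500), ("背景", 4200)],
    "tian qi": [("天气", 8000)],
}


def annotate_pinyin(word: str) -> Optional[str]:
    """Pinyin annotation: return None if any character is missing from the table."""
    syllables = []
    for ch in word:
        if ch not in CHAR_PINYIN:
            return None
        syllables.append(CHAR_PINYIN[ch])
    return " ".join(syllables)


def search_homophone(word: str) -> str:
    """Return the most frequent word sharing the pinyin of `word`, else `word` itself."""
    key = annotate_pinyin(word)
    candidates = HOMOPHONE_DICT.get(key, []) if key is not None else []
    if not candidates:
        return word
    # Pick the entry with the highest frequency of occurrence.
    return max(candidates, key=lambda entry: entry[1])[0]
```

With these toy tables, a query segment typed as 背景 would be replaced by the more frequent homophone 北京, while a segment with no dictionary entry is passed through unchanged.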
The homophone dictionary maintenance module 214 is configured to maintain the homophone dictionary used by the homophone processing. It calls the pinyin annotation module to annotate the word dictionary entry by entry, thereby forming the homophone dictionary, and updates the homophone dictionary synchronously whenever the word dictionary is updated. During pinyin annotation, a word with more than one pronunciation is annotated with each of its pinyin readings and so forms multiple entries. The homophone dictionary is sorted in pinyin order.

The query module 216 is configured to receive keywords from the query entry module, generate query conditions, query the index file to obtain the results matching the user's query request, and return the query results to the user.

The index module 218 is configured to maintain the full-text index. It scans every word in the information to be searched and builds an index entry for each word, recording the number of occurrences of the word in the text and the positions at which it occurs.

When a data query is performed using the data query system shown in FIG. 2, the user interacts with the query entry module and submits a query string to it. The query entry module sends the query string to the word segmentation module. The word segmentation module calls the Chinese word segmentation module to segment the user's query string according to different word segmentation strategies, and determines from the word segmentation result whether a homophone search is needed. If no homophone search is needed, the word segmentation result is returned directly to the query module and the data search process begins. If a homophone search is needed, the information to be processed is sent to the homophone processing module.
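The dictionary maintenance described for module 214 (one entry per pronunciation, sorted by pinyin) might look like the sketch below. The `READINGS` table, the function names, and the use of binary search for lookup are assumptions for illustration; the specification states only that the dictionary is sorted in pinyin order.

```python
import bisect
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical readings per character; "行" has two pronunciations.
READINGS: Dict[str, List[str]] = {"银": ["yin"], "行": ["hang", "xing"], "人": ["ren"]}


def build_homophone_dictionary(vocabulary: List[str]) -> List[Tuple[str, str]]:
    """Annotate each word with every pinyin reading and sort the entries by pinyin."""
    entries: List[Tuple[str, str]] = []
    for word in vocabulary:
        per_char = [READINGS[ch] for ch in word]
        # A word with a multi-reading character yields one entry per combination.
        for combo in product(*per_char):
            entries.append((" ".join(combo), word))
    entries.sort()
    return entries


def lookup(entries: List[Tuple[str, str]], pinyin: str) -> List[str]:
    """Binary-search the sorted entries for all words with the given pinyin."""
    start = bisect.bisect_left(entries, (pinyin, ""))
    words = []
    for key, word in entries[start:]:
        if key != pinyin:
            break
        words.append(word)
    return words
```

Sorting by pinyin is what makes the binary search possible; rebuilding or re-sorting after each vocabulary change is one simple way to keep the two dictionaries synchronized.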
The homophone processing module receives the information to be processed and sends it to the pinyin annotation module, which annotates the text with pinyin and returns the annotation result to the homophone processing module. After receiving the pinyin annotation information, the homophone processing module calls the homophone search module to look up the pinyin annotation information in the homophone dictionary and, once the search result is obtained, returns it to the word segmentation module. The word segmentation module combines the information obtained from the homophone processing module with the keyword information obtained from the other segments and returns the combined result to the query entry module as the search keywords. The query entry module sends the keywords to the query module. The query module uses the keywords to search the index library and returns the matching information to the query entry module, which adjusts the result and returns it to the user.

In addition, the data query system may use the index module to index the data to be searched, maintain the index, and synchronously update the index library when the source information is updated.

Referring to FIGS. 3A through 3C, the data query method performed by the systems shown in FIGS. 1 and 2 will be described.

As shown in FIG. 3A, the data query method includes the following steps: S302, receiving the query string that the user wants to query; S304, performing preliminary word segmentation on the query string, determining whether a homophone search is needed for the preliminary word segmentation result, and processing the preliminary word segmentation result according to that determination; and S306, performing a data query according to the processed final word segmentation result.

As shown in FIG. 3B, step S304 includes the following steps: S3042-1, performing preliminary word segmentation on the query string; S3044-1, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-1, when no homophone search is needed for the preliminary word segmentation result, proceeding directly to step S306.

As shown in FIG. 3C, step S304 includes the following steps: S3042-2, performing preliminary word segmentation on the query string; S3044-2, determining whether a homophone search is needed for the preliminary word segmentation result; and S3046-2, when a homophone search is needed for the preliminary word segmentation result, performing the homophone search on the preliminary word segmentation result in the homophone dictionary and then proceeding to step S306.

In step S306, the data to be queried is also indexed, and the index is maintained.

The data query method according to the present invention may further comprise the steps of generating and storing the homophone dictionary and updating the homophone dictionary in real time.

When analyzing user input, the invention can handle erroneous input such as homophonic characters and fuzzy pronunciations, automatically convert it into standard input, return the query results to the user after the search, and prompt the user about the erroneous input. With the invention, users can quickly and conveniently find the information they need, and the threshold for using a search engine is lowered. A user can even find information by entering pinyin directly.

The above is only an embodiment of the present invention and is not intended to limit the present invention; various modifications and changes can be made to the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and scope of the invention are intended to be included within the scope of the appended claims.
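The indexing that accompanies step S306, in which each word is recorded with its number of occurrences and its positions, corresponds to a positional inverted index. The sketch below is a minimal illustration under stated assumptions: documents are taken to be already segmented into word lists, the names `build_index` and `query` are hypothetical, and matching on all keywords (AND semantics) is a choice made for the example rather than something the description specifies.

```python
from collections import defaultdict
from typing import Dict, List


def build_index(documents: Dict[str, List[str]]) -> Dict[str, Dict[str, List[int]]]:
    """Map each word to {document id: positions}; the count is len(positions)."""
    index: Dict[str, Dict[str, List[int]]] = defaultdict(dict)
    for doc_id, words in documents.items():
        for position, word in enumerate(words):
            index[word].setdefault(doc_id, []).append(position)
    return dict(index)


def query(index: Dict[str, Dict[str, List[int]]], keywords: List[str]) -> List[str]:
    """Return the sorted ids of documents that contain every keyword."""
    if not keywords:
        return []
    doc_sets = [set(index.get(keyword, {})) for keyword in keywords]
    return sorted(set.intersection(*doc_sets))
```

Because the final word segmentation result is just a list of keywords, the output of the homophone processing can be passed to `query` unchanged.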

Claims

1. A data query system, comprising:
an entry module, configured to receive a query string that the user wants to query;
a word segmentation module, configured to perform preliminary word segmentation on the query string, determine whether a homophone search is needed for the preliminary word segmentation result, and process the preliminary word segmentation result according to the determination result; and
a query module, configured to perform a data query according to the processed final word segmentation result.

2. The data query system according to claim 1, wherein the word segmentation module comprises:
a Chinese word segmentation module, configured to perform preliminary word segmentation on the query string according to a specific rule and, when no homophone search is needed for the preliminary word segmentation result, to send the preliminary word segmentation result to the entry module;
a search judging module, configured to determine whether a homophone search is needed for the preliminary word segmentation result; and
a homophone processing module, configured, when a homophone search is needed for the preliminary word segmentation result, to annotate the preliminary word segmentation result with pinyin and to perform a homophone search on the preliminary word segmentation result according to the pinyin annotation information.

3. The data query system according to claim 2, wherein the homophone processing module comprises:
a pinyin annotation module, configured to annotate the preliminary word segmentation result with pinyin; and
a homophone search module, configured to perform a homophone search on the preliminary word segmentation result in a homophone dictionary according to the pinyin annotation information.

4. The data query system according to any one of claims 1 to 3, wherein the query module comprises:
a data query module, configured to perform a data query according to the preliminary word segmentation result or the final word segmentation result; and
a data indexing module, configured to index the data to be queried and to maintain the index.

5. The data query system according to claim 4, further comprising:
a homophone dictionary module, configured to generate and store the homophone dictionary and to update the homophone dictionary in real time.

6. A data query method, comprising the following steps:
S302, receiving a query string that the user wants to query;
S304, performing preliminary word segmentation on the query string, determining whether a homophone search is needed for the preliminary word segmentation result, and processing the preliminary word segmentation result according to the determination result; and
S306, performing a data query according to the processed final word segmentation result.

7. The data query method according to claim 6, wherein step S304 comprises the following steps:
S3042-1, performing preliminary word segmentation on the query string;
S3044-1, determining whether a homophone search is needed for the preliminary word segmentation result; and
S3046-1, when no homophone search is needed for the preliminary word segmentation result, proceeding directly to step S306.

8. The data query method according to claim 6, wherein step S304 comprises the following steps:
S3042-2, performing preliminary word segmentation on the query string;
S3044-2, determining whether a homophone search is needed for the preliminary word segmentation result; and
S3046-2, when a homophone search is needed for the preliminary word segmentation result, performing the homophone search on the preliminary word segmentation result in a homophone dictionary, and then proceeding to step S306.

9. The data query method according to any one of claims 6 to 8, wherein, in step S306, the data to be queried is also indexed and the index is maintained.

10. The data query method according to claim 9, further comprising the following step: generating and storing the homophone dictionary, and updating the homophone dictionary in real time.
PCT/CN2007/003409 2007-06-29 2007-11-30 Data query system and method Ceased WO2009003328A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200710126036.4 2007-06-29
CNA2007101260364A CN101082936A (en) 2007-06-29 2007-06-29 Data enquiring system and method

Publications (1)

Publication Number Publication Date
WO2009003328A1 (en) 2009-01-08

Family

ID=38912505

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/003409 Ceased WO2009003328A1 (en) 2007-06-29 2007-11-30 Data query system and method

Country Status (2)

Country Link
CN (1) CN101082936A (en)
WO (1) WO2009003328A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101685021B (en) * 2008-09-24 2012-12-26 高德软件有限公司 Method and device for acquiring point of interest
CN101699440B (en) * 2009-11-24 2011-12-07 中国电信股份有限公司 Service-based retrieving method and service-based retrieving system
CN101853280B (en) * 2010-05-19 2012-07-04 北京友录在线科技发展有限公司 Method for searching for contacts in hand-held equipment
CN102467544B (en) * 2010-11-16 2015-01-21 中国电信股份有限公司 Information smart searching method and system based on space fuzzy coding
CN103530380B (en) * 2013-10-17 2017-10-17 北京奇虎科技有限公司 A kind of vertical search device and method
CN103577591B (en) * 2013-11-12 2017-02-01 广东金宇恒软件科技有限公司 Method and device for generating accounting voucher
WO2016154838A1 (en) * 2015-03-29 2016-10-06 王志强 Method for providing product information while displaying trademark homophones, and trademark search system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1335574A (en) * 2001-09-05 2002-02-13 罗笑南 Intelligent semantic searching method
DE10339697A1 (en) * 2003-08-28 2005-04-07 Siemens Ag Search request processing method e.g. for search engine and index service, involves dividing retrieval query using dismantling function for search machine inquiry and listing service inquiry
US20050187920A1 (en) * 2004-01-23 2005-08-25 Porto Ranelli, Sa Contextual searching
CN1873642A (en) * 2006-04-29 2006-12-06 上海世纪互联信息系统有限公司 Searching engine with automating sorting function
CN1909522A (en) * 2006-08-18 2007-02-07 北京金山软件有限公司 Method for acquiring front-page keyword and its application system
CN101075308A (en) * 2006-11-08 2007-11-21 腾讯科技(深圳)有限公司 Method for editing e-mail
CN101079032A (en) * 2006-06-23 2007-11-28 腾讯科技(深圳)有限公司 Digital string fuzzy match method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408794A (en) * 2017-08-17 2019-03-01 阿里巴巴集团控股有限公司 A kind of frequency dictionary method for building up, segmenting method, server and client side's equipment
CN109977398A (en) * 2019-02-21 2019-07-05 江苏苏宁银行股份有限公司 A kind of speech recognition text error correction method of specific area
CN109977398B (en) * 2019-02-21 2023-06-06 江苏苏宁银行股份有限公司 Speech recognition text error correction method in specific field
CN110851484A (en) * 2019-11-13 2020-02-28 北京香侬慧语科技有限责任公司 Method and device for obtaining multi-index question answers
CN112686041A (en) * 2021-01-06 2021-04-20 北京猿力未来科技有限公司 Pinyin marking method and device
CN112686041B (en) * 2021-01-06 2024-06-04 北京猿力未来科技有限公司 Pinyin labeling method and device

Also Published As

Publication number Publication date
CN101082936A (en) 2007-12-05

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 07845772; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 07845772; Country of ref document: EP; Kind code of ref document: A1)