CN112052361A - Searching method, searching device, terminal and storage medium - Google Patents

Publication number
CN112052361A
Authority
CN
China
Prior art keywords
character
data
characters
index
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910493017.8A
Other languages
Chinese (zh)
Inventor
高才
李俊良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910493017.8A
Publication of CN112052361A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a search method, device, terminal and storage medium. The method includes: the terminal acquires a keyword; the terminal searches a local database for a retrieval index keyed by the acquired keyword, where the local database contains a retrieval index corresponding to at least one piece of data, and the retrieval index of a piece of data is created by segmenting the characteristic characters of that data with a single-character segmentation scheme and building the index, based on full-text search (FTS) technology, from the character sequences obtained by the segmentation; the data in the local database that corresponds to the retrieval index found is taken as the search result; and the search result is output.

Figure 201910493017

Description

Search method, device, terminal and storage medium

Technical Field

The present invention relates to the field of data processing, and in particular to a search method, device, terminal and storage medium.

Background

Smart devices have continued to evolve as more and more users adopt them, and it has become common for users to look up information of interest on a smart device.

However, in the related art, the results generated for a user's query often suffer from problems such as slow queries and inaccurate results.

Summary of the Invention

To solve the existing technical problems, embodiments of the present invention provide a search method, device, terminal and storage medium.

The technical solutions of the embodiments of the present invention are implemented as follows:

An embodiment of the present invention provides a search method, applied to a terminal, including:

acquiring a keyword;

searching a local database for a retrieval index keyed by the acquired keyword, where the local database contains a retrieval index corresponding to at least one piece of data, and the retrieval index of a piece of data is created by segmenting the characteristic characters of that data with a single-character segmentation scheme and building the index, based on full-text search (FTS) technology, from the character sequences obtained by the segmentation;

taking the data in the local database that corresponds to the retrieval index found as the search result;

outputting the search result.

In the above solution, the method further includes:

for first data applied locally, determining the characteristic characters of the first data from the first data;

segmenting the extracted characteristic characters with the single-character segmentation scheme to obtain character sequences;

creating a retrieval index for the first data from the obtained character sequences, based on FTS technology;

storing the created retrieval index in the local database.

In the above solution, segmenting the extracted characteristic characters with the single-character segmentation scheme includes:

for multiple characters of the same type in the extracted characteristic characters, splitting them into single characters, and additionally forming sequences by combining each single character with the other characters among the multiple characters.

In the above solution, forming sequences by combining each single character with the other characters among the multiple characters includes:

for the first of the multiple same-type characters, appending the second character after the first character to obtain a first character sequence; appending the third character after the first character sequence to obtain a second character sequence; and so on, until the last of the multiple characters has been appended to obtain a character sequence;

for the second of the multiple same-type characters, appending the third character after the second character to obtain a third character sequence; appending the fourth character after the third character sequence to obtain a fourth character sequence; and so on, until the last of the multiple characters has been appended to obtain a character sequence;

for each remaining character other than the first and second characters, proceeding in the same way, up to the last of the same-type characters;

all the character strings obtained by the segmentation together form the character sequence.

In the above solution, creating a retrieval index for the first data from the obtained character sequences based on full-text search technology includes:

creating the retrieval index for the first data from the obtained character sequences based on FTS5 technology.

In the above solution, the retrieval index for the first data is created at the same time as the first data is obtained locally.

In the above solution, after the first data has been applied on the terminal, a retrieval index is created for the first data when it is determined that none has been created for it.

In the above solution, determining that no retrieval index has been created for the first data includes:

determining that no retrieval index has been created for the first data when no retrieval index of the first data is stored in the local database.

In the above solution, the data searched for is a locally installed application (APP).

An embodiment of the present invention further provides a search device, arranged on a terminal, including:

an acquisition unit, configured to acquire a keyword;

a lookup unit, configured to search a local database for a retrieval index keyed by the acquired keyword, where the local database contains a retrieval index corresponding to at least one piece of data, and the retrieval index of a piece of data is created by segmenting the characteristic characters of that data with a single-character segmentation scheme and building the index, based on FTS technology, from the character sequences obtained by the segmentation;

a processing unit, configured to take the data in the local database that corresponds to the retrieval index found as the search result, and to output the search result.

An embodiment of the present invention further provides a terminal, including a processor and a memory for storing a computer program capable of running on the processor,

wherein the processor is configured to perform the steps of any of the above methods when running the computer program.

An embodiment of the present invention further provides a storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any of the above methods.

With the search method, device, terminal and storage medium provided by the embodiments of the present invention, a keyword is acquired; a local database is searched for a retrieval index keyed by the acquired keyword, where the local database contains a retrieval index corresponding to at least one piece of data, and the retrieval index of a piece of data is created by segmenting the characteristic characters of that data with a single-character segmentation scheme and building the index, based on FTS technology, from the character sequences obtained by the segmentation; the data in the local database that corresponds to the retrieval index found is taken as the search result; and the search result is output. Because the retrieval index is built with single-character segmentation, the segmentation granularity is a single character, so the query results are accurate. Because the retrieval index is created with FTS technology, retrieval time is short.

Brief Description of the Drawings

FIG. 1 is a schematic flowchart of a retrieval method according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of creating a retrieval index and performing retrieval according to an application embodiment of the present invention;

FIG. 3 is a schematic diagram of retrieval results according to an application embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a retrieval device according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

The present invention is described in further detail below with reference to the drawings and specific embodiments.

It should be noted that in the embodiments of the present invention, "first", "second" and the like are used to distinguish similar objects and do not necessarily describe a particular order or sequence.

In addition, the technical solutions described in the embodiments of the present invention may be combined arbitrarily, provided they do not conflict.

In the embodiments of the present invention, "multiple" means at least two, for example two or three, unless explicitly and specifically defined otherwise.

As noted above, the results generated for a user's query often suffer from problems such as slow queries and inaccurate results. For example, when a user wants to look up an installed APP locally (which can be understood as recalling an installed APP), the related art may recall the installed APP in one of the following ways:

In the first approach, the query syntax of the SQLite engine is used with a fuzzy-query technique, such as the LIKE query syntax, to recall the installed APP;

In the second approach, the installed APP is recalled through traditional word segmentation combined with FTS.

With the first approach, the use of fuzzy queries makes querying inefficient; that is, a query takes a relatively long time, and the interface may stutter while the query runs.

With the second approach, the segmentation granularity of traditional word segmentation is not well suited to the task, so the APP the user wants cannot always be recalled; that is, the corresponding APP name may not be returned, and the query results are inaccurate.

In summary, the recall schemes in the related art all have certain shortcomings.

Based on this, in the various embodiments of the present invention, characters are segmented one character at a time, and the character sequences obtained by the segmentation are used to build an index that takes advantage of the characteristics of FTS.

In the solution of the embodiments of the present invention, the retrieval index is built with single-character segmentation, so the segmentation granularity is a single character and the query results are accurate. The retrieval index is also created with FTS technology, so retrieval time is short.

An embodiment of the present invention provides a search method applied to a terminal. As shown in FIG. 1, the method includes:

Step 101: acquire a keyword;

Step 102: search a local database for a retrieval index keyed by the acquired keyword;

Here, the local database contains a retrieval index corresponding to at least one piece of data; the retrieval index of a piece of data is created by segmenting the characteristic characters of that data with a single-character segmentation scheme and building the index, based on FTS technology, from the character sequences obtained by the segmentation.

Step 103: take the data in the local database that corresponds to the retrieval index found as the search result;

Step 104: output the search result.

In practical application, the terminal may be a mobile terminal, such as a mobile phone or a tablet computer (pad).

In step 101, acquiring a keyword means acquiring a keyword entered by the user. In practical application, the user may enter the keyword through a search interface shown on the terminal's display. The keyword entered by the user is used to query the data the user needs (the data searched for), such as a locally installed APP, contact information in the local address book, or a local short message.

In practical application, the keyword may consist of characters in various languages, such as text, pinyin, symbols or digits.

In step 102, in other words, the keyword entered by the user is used directly as the retrieval index. Because the retrieval indexes in the database are created by segmenting the characteristic characters of the corresponding data with the single-character segmentation scheme and building the index, based on FTS technology, from the resulting character sequences, any keyword that occurs in the characteristic characters of stored data is already present in the database as an index key. The terminal therefore only needs to look up the retrieval index keyed by the acquired keyword in the local database, and the keyword needs no further processing before the lookup.
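Steps 101 to 104 can be sketched as a direct key lookup. The following is a minimal illustration only: a plain Python dictionary stands in for the FTS-backed local database, and the names `build_index` and `search` are hypothetical, not taken from the patent.

```python
from collections import defaultdict


def build_index(names):
    """Map every contiguous character run of each name to the names containing it.

    Assumption: a dict replaces the FTS index purely for illustration.
    """
    index = defaultdict(set)
    for name in names:
        for start in range(len(name)):
            for end in range(start + 1, len(name) + 1):
                index[name[start:end]].add(name)
    return index


def search(index, keyword):
    """Steps 102-103: use the keyword as-is as the index key and return its data."""
    return sorted(index.get(keyword, set()))


index = build_index(["使用技巧", "天气"])
print(search(index, "用技"))  # step 104: output the search result
```

Because every contiguous run was stored when the index was built, the lookup needs no tokenization or pattern matching of the keyword at query time.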

In practical application, a corresponding retrieval index needs to be built for local data.

Based on this, in one embodiment, the method may further include:

for first data applied locally, determining the characteristic characters of the first data from the first data;

segmenting the extracted characteristic characters with the single-character segmentation scheme to obtain character sequences;

creating a retrieval index for the first data from the obtained character sequences, based on FTS technology;

storing the created retrieval index in the local database.

Here, data applied locally refers to data applied on the terminal, such as an APP installed on the terminal, contact information in the terminal's address book, or a short message received by the terminal.

In practical application, the characteristic characters determined differ by type of data. For example, for a locally installed APP, the characteristic characters determined may include at least one of: the APP's name, synonyms for the APP's purpose, the pinyin of the APP's name, and the first letter of the pinyin of each character in the APP's name. For contact information in the address book, the characteristic characters determined may include at least one of: the contact's name, the pinyin of the contact's name, and the first letter of the pinyin of each character in the contact's name. The embodiments of the present invention do not limit the specific process of determining the characteristic characters.
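As a rough illustration of what the characteristic characters of one entry might look like, the sketch below collects a name, its full pinyin and its pinyin initials. The function name `characteristic_strings` is hypothetical, and the pinyin syllables are passed in by the caller, since the patent does not specify how pinyin is obtained.

```python
def characteristic_strings(name, syllables):
    """Collect the strings an entry should be searchable by.

    `syllables` holds the pinyin of each character in `name`; it is supplied
    by the caller (an assumption made for this sketch).
    """
    full_pinyin = "".join(syllables)
    initials = "".join(s[0] for s in syllables if s)
    return [name, full_pinyin, initials]


print(characteristic_strings("使用技巧", ["shi", "yong", "ji", "qiao"]))
```

Each returned string would then go through single-character segmentation separately, so that a user typing Chinese characters, pinyin or initials can reach the same entry.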

In one embodiment, segmenting the extracted characteristic characters with the single-character segmentation scheme includes:

for multiple characters of the same type in the extracted characteristic characters, splitting them into single characters, and additionally forming sequences by combining each single character with the other characters among the multiple characters.

In one embodiment, forming sequences by combining each single character with the other characters among the multiple characters includes:

for the first of the multiple same-type characters, appending the second character after the first character to obtain a first character sequence; appending the third character after the first character sequence to obtain a second character sequence; and so on, until the last of the multiple characters has been appended to obtain a character sequence;

for the second of the multiple same-type characters, appending the third character after the second character to obtain a third character sequence; appending the fourth character after the third character sequence to obtain a fourth character sequence; and so on, until the last of the multiple characters has been appended to obtain a character sequence;

for each remaining character other than the first and second characters, proceeding in the same way, up to the last of the same-type characters;

all the character strings obtained by the segmentation together form the character sequence.

Here, in practical application, the multiple characters of the same type may be multiple Chinese characters, multiple digits, multiple letters, and so on.

In practical application, the characteristic characters obtained are arranged in order, and each character of the same type among them is segmented in the manner described above.

To illustrate the above process, assume that the characteristic characters determined for an APP are 使用技巧 ("usage tips"). Then:

The first character is 使; segmentation yields: 使, 使用, 使用技, 使用技巧;

The second character is 用; segmentation yields: 用, 用技, 用技巧;

The third character is 技; segmentation yields: 技, 技巧;

The fourth character is 巧; segmentation yields: 巧.

The character sequence therefore contains: 使, 使用, 使用技, 使用技巧, 用, 用技, 用技巧, 技, 技巧, 巧.
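The worked example above amounts to enumerating every contiguous run of characters, grouped by starting position. A minimal sketch follows; the function name `char_sequences` is hypothetical, and the whole input is assumed to be a single run of same-type characters.

```python
def char_sequences(text):
    """Return every contiguous run of characters in `text`, ordered by start position.

    Assumption: `text` is one run of same-type characters (for example, all
    Chinese characters), as in the worked example.
    """
    sequences = []
    for start in range(len(text)):
        # Grow the run one character at a time from the current start.
        for end in range(start + 1, len(text) + 1):
            sequences.append(text[start:end])
    return sequences


print(char_sequences("使用技巧"))
```

A run of n characters yields n(n+1)/2 sequences, which is 10 for the four-character name, so this scheme is best suited to short strings such as APP or contact names.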

In one embodiment, creating a retrieval index for the first data from the obtained character sequences based on full-text search technology includes:

creating the retrieval index for the first data from the obtained character sequences based on FTS5 technology.
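One possible way to feed the character sequences into FTS5 is sketched below with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the patent's implementation: the table name `app_index`, its columns and the helper functions are hypothetical, the sequences are stored space-separated so that the default tokenizer treats each one as a token, and the SQLite build in use must include FTS5.

```python
import sqlite3


def char_sequences(text):
    """Every contiguous run of characters in `text`."""
    return [text[i:j] for i in range(len(text)) for j in range(i + 1, len(text) + 1)]


def index_app(conn, name):
    # Store the sequences space-separated so each one is indexed as a token.
    conn.execute(
        "INSERT INTO app_index (name, sequences) VALUES (?, ?)",
        (name, " ".join(char_sequences(name))),
    )


def search(conn, keyword):
    # Quote the keyword so FTS5 treats it as one string rather than query syntax.
    query = '"' + keyword.replace('"', '""') + '"'
    rows = conn.execute("SELECT name FROM app_index WHERE app_index MATCH ?", (query,))
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
# `name` is stored but not indexed; only `sequences` is searched.
conn.execute("CREATE VIRTUAL TABLE app_index USING fts5(name UNINDEXED, sequences)")
index_app(conn, "使用技巧")
index_app(conn, "天气")
print(search(conn, "用技"))
```

Storing pre-segmented sequences keeps the query side trivial: the keyword is matched as a whole token, with no custom tokenizer required.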

In practical application, the retrieval index for the first data may be created as soon as the terminal obtains the first data.

Based on this, in one embodiment, at the same time as the first data is obtained locally, the characteristic characters of the first data are determined from the first data.

For example, when the first data is a locally installed APP, after the terminal obtains the APP installation package from the network side, it creates a retrieval index for the APP while the APP is being installed (during the installation process) and stores the index in the local database.

As another example, when the first data is contact information in the address book, the terminal creates a retrieval index for the contact information as soon as it obtains that information, and stores the index in the local database (for example, the interval between the moment the contact information is obtained and the moment index creation starts is less than a preset duration, such as 5 ms).

As another example, when the first data is a short message, the terminal creates a retrieval index for the short message as soon as it receives the message, and stores the index in the local database (for example, the interval between the moment the short message is received and the moment index creation starts is less than a preset duration, such as 3 ms).

It should be noted here that, as described above, creating a retrieval index includes determining the characteristic characters, segmenting them, and creating the retrieval index based on FTS from the character sequences obtained by the segmentation. The specific process has been described in detail above and is not repeated here. The same applies to the idle-time examples below.

In practical application, the retrieval index for the first data may also be created after the terminal has applied the first data, when the terminal is idle (for example, when the processor load is below a preset load).

Based on this, in one embodiment, after the first data has been applied on the terminal, a retrieval index is created for the first data when it is determined that none has been created for it.

For example, when the first data is a locally installed APP, after installing the APP the terminal creates a retrieval index for the APP when the terminal is idle and the network connection is normal, and stores the index in the local database.

As another example, when the first data is contact information in the address book, after obtaining the contact information the terminal creates a retrieval index for it when the terminal is idle, and stores the index in the local database.

As another example, when the first data is a short message, after receiving the short message the terminal creates a retrieval index for it when the terminal is idle, and stores the index in the local database.

In practical application, the terminal looks up the retrieval index of the first data in the local database; when no retrieval index of the first data is found in (that is, stored in) the local database, the terminal determines that no retrieval index has been created for the first data.
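The deferred path described above can be sketched as a check-then-create step. In this minimal illustration a dictionary stands in for the local database, and `ensure_indexed` and the `is_idle` callback are hypothetical names; how idleness is actually measured is left to the caller.

```python
def ensure_indexed(index_store, data_id, text, is_idle):
    """Create the retrieval index for `data_id` only if none is stored yet.

    Returns True when an index was created by this call.
    """
    if data_id in index_store:
        return False  # an index is already stored, nothing to do
    if not is_idle():
        return False  # defer until the terminal is idle
    index_store[data_id] = [
        text[i:j] for i in range(len(text)) for j in range(i + 1, len(text) + 1)
    ]
    return True


store = {}
print(ensure_indexed(store, "app-1", "天气", lambda: True))   # index created
print(ensure_indexed(store, "app-1", "天气", lambda: True))   # already present
```

Checking the store first makes the operation safe to repeat, so a periodic sweep can call it for every local item without rebuilding existing indexes.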

In step 104, in practical application, the search result is presented on the terminal; more specifically, it is shown on the terminal's display.

With the solution provided by the embodiments of the present invention, a keyword is acquired; a local database is searched for a retrieval index keyed by the acquired keyword, where the local database contains a retrieval index corresponding to at least one piece of data, and the retrieval index of a piece of data is created by segmenting the characteristic characters of that data with a single-character segmentation scheme and building the index, based on FTS technology, from the character sequences obtained by the segmentation; the data in the local database that corresponds to the retrieval index found is taken as the search result; and the search result is output. Because the retrieval index is built with single-character segmentation, the segmentation granularity is a single character, so the query results are accurate. Because the retrieval index is created with FTS technology, retrieval time is short.

The present invention is described in further detail below with reference to an application embodiment.

In this application embodiment, installed APPs are searched for locally; that is, installed APPs are recalled.

In this application embodiment, a retrieval index is created for an APP when the APP is installed.

As shown in FIG. 2, when an APP is to be installed, the installation module of the terminal receives the user's installation instruction and obtains the data of the APP from the application store (server). The global search process subscribes to messages about the APP, and the terminal's real-time character-sequence single-character segmentation module obtains information about the APP from the server's database (DB) in order to determine the characteristic characters, for example the name of the APP. The module then segments the characteristic characters using the character-sequence single-character segmentation method. (This can be understood as word segmentation in the sense of Chinese word segmentation: splitting a sequence of Chinese characters into individual words, that is, regrouping a continuous character sequence into a word sequence according to a given rule.) Assuming the APP is named "使用技巧" ("usage tips"), segmentation yields the following:

使, 使用, 使用技, 使用技巧;

用, 用技, 用技巧;

技, 技巧;

巧.

These 10 tokens form the word sequence. The FTS index module of the terminal then feeds the word sequence into the FTS engine to build the corresponding retrieval index.
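The segmentation above amounts to enumerating every contiguous substring of the name, grouped by starting character. A minimal Python sketch, assuming that reading of the example; the function name `single_char_tokens` is hypothetical and does not come from the application:

```python
from typing import List


def single_char_tokens(text: str) -> List[str]:
    """Return every contiguous substring of text, ordered by start position.

    For each starting character, the token grows one character at a time
    until the end of the text is reached.
    """
    tokens = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            tokens.append(text[start:end])
    return tokens


# The APP name used in the description.
tokens = single_char_tokens("使用技巧")
print(tokens)
```

For a four-character name this yields the 10 tokens listed above, in the same order.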

In practice, the terminal's full-build module for character-sequence single-character segmentation checks at a preset time whether a retrieval index has been created for the APP. If not, it obtains information about the APP from the server's DB, derives the APP's characteristic characters and word sequence in the manner described above, and the FTS index module then feeds the word sequence into the FTS engine to build the corresponding retrieval index.

If information about the APP could not be obtained in time during installation (for example, because of network problems), the full-build module can obtain it later and derive the APP's characteristic characters in the manner described above. For example, the full-build module traverses the local data every day; when it finds that no retrieval index has been built for the APP, it waits until the terminal is idle and the network connection is normal, then obtains the information about the APP and derives its characteristic characters in the manner described above.
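The periodic back-fill described here can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the index is modeled as a plain dictionary, and `backfill_missing_indexes`, `fetch_name`, and the package identifiers are hypothetical names.

```python
from typing import Callable, Dict, Iterable, List, Optional


def name_tokens(name: str) -> List[str]:
    """Every contiguous substring of name (single-character segmentation)."""
    return [name[i:j] for i in range(len(name)) for j in range(i + 1, len(name) + 1)]


def backfill_missing_indexes(
    installed_app_ids: Iterable[str],
    index: Dict[str, List[str]],
    fetch_name: Callable[[str], Optional[str]],
    terminal_idle: bool,
    network_ok: bool,
) -> List[str]:
    """Index every installed APP that has no index entry yet.

    Does nothing unless the terminal is idle and the network is usable.
    Returns the identifiers of the APPs indexed during this pass.
    """
    if not (terminal_idle and network_ok):
        return []
    newly_indexed = []
    for app_id in installed_app_ids:
        if app_id in index:
            continue  # a retrieval index already exists
        name = fetch_name(app_id)  # stands in for the server DB lookup
        if name is None:
            continue  # information still unavailable; retry on a later pass
        index[app_id] = name_tokens(name)
        newly_indexed.append(app_id)
    return newly_indexed


# Hypothetical data: one APP whose name is known, one whose name is not.
catalog = {"com.example.tips": "使用技巧"}
index: Dict[str, List[str]] = {}
skipped = backfill_missing_indexes(
    ["com.example.tips"], index, catalog.get, terminal_idle=False, network_ok=True
)
done = backfill_missing_indexes(
    ["com.example.tips", "com.example.unknown"],
    index,
    catalog.get,
    terminal_idle=True,
    network_ok=True,
)
```

An APP whose information cannot be fetched is simply left unindexed, so the next daily pass picks it up again.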

As shown in FIG. 2, when the user wants to search for the APP, global search invokes the local retrieval service, and the user enters a keyword, for example "用技", through the global search interface. After receiving the keyword, the retriever sends it to the FTS index module, which looks up in the local database the retrieval index keyed by "用技". The APP "使用技巧" is then shown on the display, as illustrated in FIG. 3.
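The lookup step can be illustrated with a small in-memory inverted index. This is a simplified stand-in for the FTS engine, assuming an exact match between the keyword and an indexed token; the class `LocalAppIndex` and the second APP name are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


def substrings(name: str) -> List[str]:
    """Every contiguous substring of name (single-character segmentation)."""
    return [name[i:j] for i in range(len(name)) for j in range(i + 1, len(name) + 1)]


class LocalAppIndex:
    """Maps each token to the set of APP names that contain it."""

    def __init__(self) -> None:
        self._postings: DefaultDict[str, Set[str]] = defaultdict(set)

    def add(self, app_name: str) -> None:
        for token in substrings(app_name):
            self._postings[token].add(app_name)

    def search(self, keyword: str) -> List[str]:
        # One dictionary lookup; no scan over the stored names is needed.
        return sorted(self._postings.get(keyword, set()))


index = LocalAppIndex()
index.add("使用技巧")
index.add("计算器")
print(index.search("用技"))
```

Because every contiguous substring of a name is a key, a keyword taken from the middle of the name, such as "用技", still finds the APP, while a keyword whose characters are out of order does not.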

APP names are usually not very long (generally fewer than 8 Chinese characters), so character-sequence single-character segmentation (the single-character segmentation manner) does not produce a large number of tokens.
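The size of the token set can be bounded with a short count, assuming (as in the "使用技巧" example) that the tokens are exactly the contiguous substrings of a name of $n$ characters of the same type:

```latex
% Tokens starting at position i (1 <= i <= n) number n - i + 1, so
\[
  T(n) \;=\; \sum_{i=1}^{n} (n - i + 1) \;=\; \frac{n(n+1)}{2},
  \qquad T(4) = 10, \qquad T(8) = 36 .
\]
```

A name of 8 characters therefore contributes at most 36 tokens, which is consistent with the statement that the token count stays small.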

In the related art, an installed APP can be recalled using the query syntax of the SQLite engine with fuzzy matching, such as a LIKE query. As for query (recall) speed, in the same device environment and on an SQLite database containing 517430 records, a query against the database built with FTS takes 0.03 seconds, whereas a query against the database without full-text search, that is, a LIKE query, takes 22.5 seconds. This is a speedup of about 750 times, which the application describes as roughly 1000 times the original performance.
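The two query styles being compared can be sketched with Python's standard `sqlite3` module. This is an illustration only: to avoid depending on how SQLite was compiled, an ordinary indexed token table stands in for the FTS index, the table and function names are hypothetical, and the sketch does not reproduce the timings quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Stand-in for the FTS index: one row per (token, app) pair, indexed by token.
conn.execute("CREATE TABLE app_tokens (token TEXT NOT NULL, app_id INTEGER NOT NULL)")
conn.execute("CREATE INDEX idx_app_tokens_token ON app_tokens (token)")


def add_app(name):
    """Store the APP and every contiguous substring of its name."""
    app_id = conn.execute("INSERT INTO apps (name) VALUES (?)", (name,)).lastrowid
    rows = [
        (name[i:j], app_id)
        for i in range(len(name))
        for j in range(i + 1, len(name) + 1)
    ]
    conn.executemany("INSERT INTO app_tokens (token, app_id) VALUES (?, ?)", rows)


def search_like(keyword):
    """Related-art style: substring match, which scans the apps table."""
    # Note: % and _ in the keyword are not escaped in this sketch.
    rows = conn.execute(
        "SELECT name FROM apps WHERE name LIKE ?", ("%" + keyword + "%",)
    )
    return sorted(row[0] for row in rows)


def search_tokens(keyword):
    """Index style: exact match on a token, served by the index."""
    rows = conn.execute(
        "SELECT DISTINCT a.name FROM app_tokens t "
        "JOIN apps a ON a.id = t.app_id WHERE t.token = ?",
        (keyword,),
    )
    return sorted(row[0] for row in rows)


add_app("使用技巧")
add_app("计算器")
print(search_like("用技"), search_tokens("用技"))
```

Both functions return the same APPs for a keyword that is a contiguous part of a name; the difference is that the LIKE pattern with a leading wildcard must examine every row, while the token lookup uses the index.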

As the above description shows, the solution of the embodiments of the present invention greatly improves retrieval (recall) speed and retrieval (recall) accuracy, and thereby greatly improves the user experience.

To implement the method of the embodiments of the present invention, an embodiment of the present invention further provides a search apparatus arranged on a terminal. As shown in FIG. 4, the apparatus includes:

an acquiring unit 41, configured to acquire a keyword;

a lookup unit 42, configured to look up, in a local database, a retrieval index keyed by the acquired keyword, where the local database contains a retrieval index for at least one piece of data, and the retrieval index of a piece of data is created, based on FTS technology, from the character sequence obtained by segmenting the characteristic characters of that data in a single-character segmentation manner; and

a processing unit 43, configured to take the data in the local database that corresponds to the retrieval index found as the search result, and to output the search result.

In an embodiment, the apparatus may further include:

a word segmentation unit, configured to, for first data applied locally, determine the characteristic characters of the first data from the first data, and segment the extracted characteristic characters in the single-character segmentation manner to obtain a character sequence; and

an index building unit, configured to create a retrieval index for the first data from the obtained character sequence based on FTS technology, and to store the created retrieval index in the local database.

In an embodiment, the word segmentation unit is specifically configured to:

for multiple characters of the same type among the extracted characteristic characters, segment them into single characters, and also segment them by combining each single character with the other characters among them.

Segmenting by combining each single character with the other characters includes the following:

for the first of the multiple characters of the same type, the word segmentation unit appends the second character to the first character to obtain a first character sequence; appends the third character to the first character sequence to obtain a second character sequence; and so on, until the last of the multiple characters has been appended;

for the second of the multiple characters of the same type, the word segmentation unit appends the third character to the second character to obtain a third character sequence; appends the fourth character to the third character sequence to obtain a fourth character sequence; and so on, until the last of the multiple characters has been appended;

the word segmentation unit proceeds in the same way for each remaining character after the first and second characters, up to the last character of the same type;

all the characters and character sequences obtained by the segmentation together form the character sequence.
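The restriction to "characters of the same type" can be sketched by first splitting the characteristic characters into runs of one type and then enumerating the sequences within each run. The application does not define the character types, so the classification below (CJK ideographs, ASCII letters, ASCII digits, everything else ignored) is an assumption made for illustration, as are the names `char_type` and `tokens_by_type`.

```python
from itertools import groupby
from typing import List


def char_type(ch: str) -> str:
    """Classify a character; the categories are an illustrative assumption."""
    if "\u4e00" <= ch <= "\u9fff":
        return "cjk"
    if ch.isascii() and ch.isalpha():
        return "latin"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


def tokens_by_type(text: str) -> List[str]:
    """Enumerate character sequences within each run of same-type characters."""
    tokens = []
    for kind, run in groupby(text, key=char_type):
        chars = "".join(run)
        if kind == "other":
            continue  # separators and symbols are not indexed in this sketch
        # Start from each character in turn and append the following
        # characters one at a time, as in the procedure described above.
        for start in range(len(chars)):
            for end in range(start + 1, len(chars) + 1):
                tokens.append(chars[start:end])
    return tokens


mixed = tokens_by_type("QQ音乐")
print(mixed)
```

With this grouping, a mixed name such as "QQ音乐" produces sequences inside the Latin run and inside the Chinese run, but none that span both.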

In an embodiment, the index building unit is specifically configured to:

create a retrieval index for the first data from the obtained character sequences based on FTS5 technology.

In practice, the retrieval index may be created for the first data as soon as the first data is acquired.

Accordingly, in an embodiment, the word segmentation unit is specifically configured to:

at the time the first data is acquired locally, determine the characteristic characters of the first data from the first data, and segment the extracted characteristic characters in the single-character segmentation manner to obtain a character sequence.

Correspondingly, the index building unit is configured to create a retrieval index for the first data from the obtained character sequence based on FTS technology, and to store the created retrieval index in the local database.

In practice, the retrieval index may also be created for the first data after the terminal has applied the first data, while the terminal is idle (for example, when the processor load is below a preset load).

Accordingly, in an embodiment, the word segmentation unit is specifically configured to:

after the first data has been applied on the terminal, and upon determining that no retrieval index has been created for the first data, determine the characteristic characters of the first data from the first data, and segment the extracted characteristic characters in the single-character segmentation manner to obtain a character sequence.

Correspondingly, the index building unit is configured to create a retrieval index for the first data from the obtained character sequence based on FTS technology, and to store the created retrieval index in the local database.

In an embodiment, determining that no retrieval index has been created for the first data includes:

when no retrieval index of the first data is stored in the local database, the word segmentation unit determines that no retrieval index has been created for the first data.

In practice, the acquiring unit 41, the lookup unit 42, the processing unit 43, the word segmentation unit, and the index building unit may be implemented by a processor in the search apparatus. The processor runs a program stored in memory to implement the functions of these program modules.

It should be noted that the division into program modules described above is only an example of how the search apparatus of the above embodiment performs a search. In practice, the processing may be assigned to different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to perform all or part of the processing described above. In addition, the search apparatus of the above embodiment and the search method embodiments belong to the same concept; for the specific implementation, see the method embodiments, which are not repeated here.

Based on the hardware implementation of the above program modules, and to implement the method of the embodiments of the present invention, an embodiment of the present invention further provides a terminal. FIG. 5 is a schematic diagram of the hardware structure of the terminal according to an embodiment of the present invention. As shown in FIG. 5, the terminal 50 includes:

a communication interface 51, capable of exchanging information with other devices such as network devices; and

a processor 52, connected to the communication interface 51 to exchange information with other devices, and configured to execute, when running a computer program, the method provided by one or more of the terminal-side technical solutions described above. The computer program is stored in a memory 53.

In practice, the components of the terminal 50 are coupled together through a bus system 54. The bus system 54 implements connection and communication between these components. In addition to a data bus, the bus system 54 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 54 in FIG. 5.

The memory 53 in the embodiments of the present invention stores various types of data to support the operation of the terminal 50. Examples of such data include any computer program that operates on the terminal 50.

It will be understood that the memory 53 may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), ferromagnetic random access memory (FRAM), flash memory, magnetic surface memory, an optical disc, or compact disc read-only memory (CD-ROM); the magnetic surface memory may be magnetic disk storage or magnetic tape storage. The volatile memory may be random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory 53 described in the embodiments of the present invention is intended to include, without being limited to, these and any other suitable types of memory.

The methods disclosed in the above embodiments of the present invention may be applied in, or implemented by, the processor 52. The processor 52 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above methods may be completed by an integrated logic circuit of hardware in the processor 52 or by instructions in the form of software. The processor 52 may be a general-purpose processor, a DSP, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 52 may implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be embodied directly as being executed by a hardware decoding processor, or as being executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium, and the storage medium is located in the memory 53; the processor 52 reads the program in the memory 53 and completes the steps of the foregoing methods in combination with its hardware.

Optionally, when executing the program, the processor 52 implements the corresponding terminal-side procedures of the methods of the embodiments of the present invention; for brevity, they are not repeated here.

In an exemplary embodiment, an embodiment of the present invention further provides a storage medium, that is, a computer storage medium, specifically a computer-readable storage medium, for example the memory 53 storing a computer program, where the computer program can be executed by the processor 52 of the terminal to complete the steps of the foregoing method. The computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, an optical disc, or CD-ROM.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus, terminal, and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or of another form.

The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, each unit may serve as a unit on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.

A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware under the direction of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, ROM, RAM, a magnetic disk, or an optical disc.

Alternatively, if the integrated unit of the present invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable storage device, ROM, RAM, a magnetic disk, or an optical disc.

The above are only specific embodiments of the present invention, and the protection scope of the present invention is not limited to them. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.

Claims (12)

1. A search method, applied to a terminal, comprising: acquiring a keyword; looking up, in a local database, a retrieval index keyed by the acquired keyword, wherein the local database contains a retrieval index for at least one piece of data, and the retrieval index of a piece of data is created, based on full-text search (FTS) technology, from the character sequence obtained by segmenting the characteristic characters of that data in a single-character segmentation manner; taking the data in the local database that corresponds to the retrieval index found as a search result; and outputting the search result.

2. The method according to claim 1, further comprising: for first data applied locally, determining the characteristic characters of the first data from the first data; segmenting the extracted characteristic characters in the single-character segmentation manner to obtain a character sequence; creating a retrieval index for the first data from the obtained character sequence based on FTS technology; and storing the created retrieval index in the local database.

3. The method according to claim 2, wherein segmenting the extracted characteristic characters in the single-character segmentation manner comprises: for multiple characters of the same type among the extracted characteristic characters, segmenting them into single characters, and also segmenting them by combining each single character with the other characters among them.

4. The method according to claim 3, wherein segmenting by combining each single character with the other characters comprises: for the first of the multiple characters of the same type, appending the second character to the first character to obtain a first character sequence, appending the third character to the first character sequence to obtain a second character sequence, and so on, until the last of the multiple characters has been appended; for the second of the multiple characters of the same type, appending the third character to the second character to obtain a third character sequence, appending the fourth character to the third character sequence to obtain a fourth character sequence, and so on, until the last of the multiple characters has been appended; proceeding in the same way for each remaining character after the first and second characters, up to the last character of the same type; wherein all the characters and character sequences obtained by the segmentation together form the character sequence.

5. The method according to claim 2, wherein creating a retrieval index for the first data from the obtained character sequences based on full-text search technology comprises: creating the retrieval index for the first data from the obtained character sequences based on FTS5 technology.

6. The method according to claim 2, wherein the retrieval index is created for the first data at the time the first data is acquired locally.

7. The method according to claim 2, wherein, after the first data has been applied on the terminal, the retrieval index is created for the first data upon determining that no retrieval index has been created for the first data.

8. The method according to claim 7, wherein determining that no retrieval index has been created for the first data comprises: when no retrieval index of the first data is stored in the local database, determining that no retrieval index has been created for the first data.

9. The method according to any one of claims 1 to 8, wherein the data searched is a locally installed application (APP).

10. A search apparatus, arranged on a terminal, comprising: an acquiring unit, configured to acquire a keyword; a lookup unit, configured to look up, in a local database, a retrieval index keyed by the acquired keyword, wherein the local database contains a retrieval index for at least one piece of data, and the retrieval index of a piece of data is created, based on FTS technology, from the character sequence obtained by segmenting the characteristic characters of that data in a single-character segmentation manner; and a processing unit, configured to take the data in the local database that corresponds to the retrieval index found as a search result, and to output the search result.

11. A terminal, comprising a processor and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to execute the steps of the method according to any one of claims 1 to 9 when running the computer program.

12. A storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 9.
CN201910493017.8A 2019-06-06 2019-06-06 Searching method, searching device, terminal and storage medium Pending CN112052361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910493017.8A CN112052361A (en) 2019-06-06 2019-06-06 Searching method, searching device, terminal and storage medium

Publications (1)

Publication Number Publication Date
CN112052361A true CN112052361A (en) 2020-12-08

Family

ID=73609782

Country Status (1)

Country Link
CN (1) CN112052361A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102339294A (en) * 2010-07-27 2012-02-01 卓望数码技术(深圳)有限公司 Searching method and system for preprocessing keywords
CN107861753A (en) * 2017-06-26 2018-03-30 平安普惠企业管理有限公司 APP generations index, search method and system and readable storage medium storing program for executing
CN108241713A (en) * 2016-12-27 2018-07-03 南京烽火软件科技有限公司 A kind of inverted index search method based on polynary cutting

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114201532A (en) * 2021-12-09 2022-03-18 富途网络科技(深圳)有限公司 Data query method, device, medium and equipment based on ESOP system
CN117130981A (en) * 2023-01-04 2023-11-28 荣耀终端有限公司 Application content searching method and electronic equipment
CN117130981B (en) * 2023-01-04 2025-01-07 荣耀终端有限公司 Application content searching method and electronic equipment

Similar Documents

Publication Publication Date Title
CN104899220B (en) Application program recommendation method and system
CN112487150B (en) File management method, system, storage medium and electronic equipment
US9910932B2 (en) System and method for completing a user query and for providing a query response
CN110413738A (en) An information processing method, device, server and storage medium
US11526575B2 (en) Web browser with enhanced history classification
US11347815B2 (en) Method and system for generating an offline search engine result page
US20180218079A1 (en) Method and system for generating phrase blacklist to prevent certain content from appearing in a search result in response to search queries
CN111858831B (en) APT analysis-based elastic search index method, device and computer equipment
WO2015081848A1 (en) Socialized extended search method and corresponding device and system
CN115080514B (en) Index data generation method, information retrieval method, device and computer system
CN108959436A (en) Dictionary edit methods and system for voice dialogue platform
US20170357712A1 (en) Method and system for searching and identifying content items in response to a search query using a matched keyword whitelist
WO2019013833A1 (en) Multiple entity aware typeahead in searches
WO2019071907A1 (en) Method for identifying help information based on operation page, and application server
CN110750627A (en) Material retrieval method and device, electronic equipment and storage medium
CN112559913B (en) Data processing method, device, computing equipment and readable storage medium
CN114356968A (en) Query statement generation method and device, computer equipment and storage medium
CN112052361A (en) Searching method, searching device, terminal and storage medium
CN110489032B (en) Dictionary query method for electronic book and electronic equipment
US20220398291A1 (en) Smart browser history search
CN112115228A (en) Search method, device, terminal and storage medium
CN114297143A (en) A method for searching files, a method, device and mobile terminal for displaying files
US11836197B2 (en) Search processing method and apparatus based on clipboard data
US8892596B1 (en) Identifying related documents based on links in documents
CN113220992B (en) A method, system and medium for recommending information flow content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201208