CN105760399A - Data retrieval method and device - Google Patents

Publication number
CN105760399A
Authority
CN
China
Prior art keywords
text
call center
keyword
retrieval
center server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410799739.3A
Other languages
Chinese (zh)
Inventor
赵荣乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201410799739.3A
Publication of CN105760399A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a data retrieval method and device, which relate to the field of communication technology and are intended to improve both the accuracy of retrieval results and the efficiency of retrieval. The specific solution is as follows: a call center server receives voice information sent by a user terminal and obtains text information produced by converting the voice information into text; obtains a first word segment set identified by text analysis of the text information, the first word segment set containing at least one text word segment; searches a keyword index list of the retrieval data stored in a knowledge database for retrieval keywords that match the text word segments in the first word segment set, where the keyword index list contains at least one keyword index entry and each keyword index entry contains a retrieval keyword and the identifier of the retrieval data corresponding to that keyword; and retrieves, from the knowledge database, the retrieval data indicated by the identifier that corresponds to each retrieval keyword found. The invention is used in data retrieval.

Description

A data retrieval method and device

Technical field

The invention relates to the field of communication technology, and in particular to a data retrieval method and device.

Background

With the rapid development of information technology, the volume of information held in every industry and field keeps growing. For example, to offer users comprehensive, fast, and accurate services such as business consultation and tariff inquiry, a telecom operator that provides communication services builds a knowledge database that its customer service staff can search.

After a customer service agent is connected to a user's voice call through the call center server, the agent identifies retrieval keywords from the user's voice information based on personal business experience and skill. The agent then enters the retrieval keywords into the knowledge database, which matches them against its stored information and displays the matching information to the agent as retrieval data. The agent must then, again relying on personal experience and skill, pick out from the displayed retrieval data the items most relevant to the user's voice information, and either answer the user based on those items or pass them back to the user.

However, in this retrieval process, quickly finding retrieval data that closely matches and relates to the user's voice information requires the agent to have extensive business experience and strong business skills, so the accuracy of the results depends too heavily on the individual agent. In addition, agents inevitably make some retrieval errors through oversight or carelessness. Both factors lower the accuracy of the retrieval results and the efficiency of retrieval.

Summary of the invention

Embodiments of the present invention provide a data retrieval method and device that improve the accuracy of retrieval results and the efficiency of retrieval.

To achieve the above object, the embodiments of the present invention adopt the following technical solutions:

A first aspect of the embodiments of the present invention provides a data retrieval method, including:

a call center server receives voice information sent by a user terminal, and obtains text information produced by converting the voice information into text;

the call center server obtains a first word segment set identified by text analysis of the text information, the first word segment set containing at least one text word segment;

the call center server searches a keyword index list of the retrieval data stored in a knowledge database for retrieval keywords that match the text word segments in the first word segment set, where the keyword index list contains at least one keyword index entry, each keyword index entry contains a retrieval keyword and the identifier of the retrieval data corresponding to that retrieval keyword, and the knowledge database stores at least one item of retrieval data together with its identifier;

the call center server retrieves, from the knowledge database, the retrieval data indicated by the identifier that corresponds to each retrieval keyword found.
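The lookup steps of the first aspect can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory dictionaries `KNOWLEDGE_DB` and `KEYWORD_INDEX`, the sample entries, and the whitespace-based `segment` function are all assumptions made for illustration, and the speech-to-text step is left out (the sketch starts from text that has already been converted). Real Chinese text would need a proper word segmenter in place of `segment`.

```python
from typing import Dict, List, Set

# Hypothetical knowledge database: identifier -> retrieval data.
KNOWLEDGE_DB: Dict[str, str] = {
    "doc-1": "Monthly data plan tariffs",
    "doc-2": "How to activate international roaming",
}

# Hypothetical keyword index list: each entry pairs a retrieval keyword
# with the identifiers of the retrieval data it corresponds to.
KEYWORD_INDEX: Dict[str, List[str]] = {
    "tariff": ["doc-1"],
    "roaming": ["doc-2"],
    "plan": ["doc-1"],
}


def segment(text: str) -> Set[str]:
    """Stand-in for text analysis: lowercase and split on whitespace."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(text: str) -> List[str]:
    """Match word segments against the index, then fetch the data by identifier."""
    first_segment_set = segment(text)
    found_ids: List[str] = []
    for keyword, data_ids in KEYWORD_INDEX.items():
        if keyword in first_segment_set:
            for data_id in data_ids:
                if data_id not in found_ids:  # avoid returning an item twice
                    found_ids.append(data_id)
    return [KNOWLEDGE_DB[data_id] for data_id in found_ids]
```

For example, `retrieve("What is the tariff for my plan?")` matches the keywords `tariff` and `plan`, both of which point at `doc-1`, so a single item is returned.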

With reference to the first aspect, in a first possible implementation, before the call center server searches the keyword index entries in the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segment set, the method further includes:

the call center server reads first retrieval data stored in the knowledge database, the first retrieval data being any item of retrieval data stored in the knowledge database;

the call center server obtains a second word segment set identified by text analysis of the first retrieval data, the second word segment set containing at least one text word segment;

the call center server generates the retrieval keywords of the first retrieval data from the text word segments in the second word segment set;

the call center server generates a keyword index entry from the retrieval keywords of the first retrieval data and the identifier of the first retrieval data, and generates the keyword index list containing at least one such keyword index entry.
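The index-building steps above can be sketched as follows. This is a hedged illustration only: the text does not say how retrieval keywords are chosen from the second word segment set, so the sketch assumes that every segment outside a small, hypothetical `STOP_WORDS` set becomes a keyword. The function names and the whitespace-based `segment` are likewise assumptions.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical stop-word list; the source does not specify a selection rule.
STOP_WORDS = {"the", "a", "to", "how", "for", "of"}


def segment(text: str) -> List[str]:
    """Stand-in for text analysis: lowercase and split on whitespace."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]


def build_keyword_index(knowledge_db: Dict[str, str]) -> Dict[str, List[str]]:
    """Build keyword -> [identifier, ...] entries from every stored item."""
    index: Dict[str, List[str]] = defaultdict(list)
    for data_id, data in knowledge_db.items():
        second_segment_set = set(segment(data))
        # Assumption: each non-stop-word segment becomes a retrieval keyword.
        for keyword in sorted(second_segment_set - STOP_WORDS):
            index[keyword].append(data_id)
    return dict(index)
```

Grouping identifiers under each keyword keeps one index entry per keyword, which makes the later lookup a single dictionary access per word segment.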

With reference to the first aspect, in a second possible implementation, before the call center server searches the keyword index entries in the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segment set, the method further includes:

receiving the keyword index list sent by a text analysis server.

With reference to any one of the first aspect, the first possible implementation, and the second possible implementation, in a third possible implementation, the call center server obtaining the first word segment set identified by text analysis of the text information includes:

the call center server performs text analysis on the text information and identifies a third word segment set containing at least one text word segment;

the call center server filters out the meaningless text word segments contained in the third word segment set to obtain the first word segment set, the meaningless text word segments including interjections and modal particles.
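A minimal sketch of this filtering step, assuming the meaningless segments are listed in a fixed set. The contents of `MEANINGLESS_SEGMENTS` are hypothetical English stand-ins for the interjections and modal particles the text mentions.

```python
from typing import List

# Hypothetical list of interjections and filler words to discard.
MEANINGLESS_SEGMENTS = {"um", "uh", "oh", "ah", "hmm", "well"}


def filter_meaningless(third_segment_set: List[str]) -> List[str]:
    """Drop meaningless segments, keeping the remaining ones in order."""
    return [
        s for s in third_segment_set
        if s.lower() not in MEANINGLESS_SEGMENTS
    ]
```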

With reference to any one of the first aspect, the first possible implementation, and the second possible implementation, in a fourth possible implementation, the call center server obtaining the first word segment set identified by text analysis of the text information includes:

the call center server performs text analysis on the text information and identifies a third word segment set containing at least one text word segment;

the call center server filters out the first text word segments contained in the third word segment set to obtain the first word segment set, a first text word segment being a text word segment in the third word segment set that matches none of the retrieval keywords in the keyword index list.
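This variant filters against the index itself rather than a fixed word list. A short sketch, assuming the keyword index list is held as a dictionary keyed by retrieval keyword and that "matching" means exact string equality (both are assumptions; the text does not define the matching rule):

```python
from typing import Dict, List


def filter_unindexed(
    third_segment_set: List[str],
    keyword_index: Dict[str, List[str]],
) -> List[str]:
    """Keep only the segments that equal some retrieval keyword in the index."""
    return [s for s in third_segment_set if s in keyword_index]
```

One consequence of this design is that every segment surviving the filter is guaranteed to hit the index in the subsequent lookup.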

With reference to the first aspect or the second possible implementation, in a fifth possible implementation, the voice information consists of at least one voice fragment at sentence granularity, and the text information consists of at least one text fragment;

the call center server receiving the voice information sent by the user terminal, and obtaining the text information produced by converting the voice information into text, includes:

the call center server receives the voice fragments sent by the user terminal;

each time the call center server receives a voice fragment, it sends the voice fragment to a speech analysis server, so that the speech analysis server converts the voice fragment into a text fragment;

the call center server receives the text fragment sent by the speech analysis server.
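The per-sentence flow above can be sketched as a generator that forwards each voice fragment as soon as it arrives. The `speech_to_text` callable stands in for the round trip to the speech analysis server; its signature and the byte-string representation of a voice fragment are assumptions for illustration.

```python
from typing import Callable, Iterable, Iterator


def transcribe_stream(
    voice_fragments: Iterable[bytes],
    speech_to_text: Callable[[bytes], str],
) -> Iterator[str]:
    """Yield one text fragment per voice fragment, in arrival order."""
    for fragment in voice_fragments:
        # Forward each sentence-level fragment immediately instead of
        # waiting for the whole utterance to finish.
        yield speech_to_text(fragment)


def fake_speech_server(audio: bytes) -> str:
    """Hypothetical stub: pretends the audio bytes are already UTF-8 text."""
    return audio.decode("utf-8")
```

Because the generator is lazy, downstream text analysis can begin on the first sentence while later sentences are still being spoken.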

With reference to the first aspect or the second possible implementation, in a sixth possible implementation, the call center server obtaining the first word segment set identified by text analysis of the text information includes:

the call center server sends the text information to the text analysis server, so that the text analysis server performs text analysis on the text information and identifies the first word segment set;

the call center server receives the first word segment set sent by the text analysis server.

A second aspect of the embodiments of the present invention provides a data retrieval method, including:

a text analysis server receives text information sent by a call center server, the text information being obtained by the call center server through conversion of voice information;

the text analysis server performs text analysis on the text information and identifies a first word segment set containing at least one text word segment;

the text analysis server sends the first word segment set to the call center server, so that the call center server retrieves data from a knowledge database according to the text word segments in the first word segment set.

With reference to the second aspect, in a first possible implementation, the knowledge database stores at least one item of retrieval data together with its identifier;

the method further includes:

the text analysis server reads first retrieval data stored in the knowledge database, the first retrieval data being any item of retrieval data stored in the knowledge database;

the text analysis server performs text analysis on the first retrieval data and identifies a second word segment set containing at least one text word segment;

the text analysis server generates the retrieval keywords of the first retrieval data from the text word segments in the second word segment set;

the text analysis server generates a keyword index entry from the retrieval keywords of the first retrieval data and the identifier of the first retrieval data, and generates a keyword index list containing at least one such keyword index entry;

the text analysis server sends the keyword index list to the call center server.

A third aspect of the embodiments of the present invention provides a call center server, including:

a receiving unit, configured to receive voice information sent by a user terminal;

a text conversion unit, configured to obtain text information produced by converting the voice information received by the receiving unit into text;

a text analysis unit, configured to obtain a first word segment set identified by text analysis of the text information obtained by the text conversion unit, the first word segment set containing at least one text word segment;

a keyword search unit, configured to search a keyword index list of the retrieval data stored in a knowledge database for retrieval keywords that match the text word segments in the first word segment set obtained by the text analysis unit, where the keyword index list contains at least one keyword index entry, each keyword index entry contains a retrieval keyword and the identifier of the retrieval data corresponding to that retrieval keyword, and the knowledge database stores at least one item of retrieval data together with its identifier;

a retrieval unit, configured to retrieve, from the knowledge database, the retrieval data indicated by the identifier that corresponds to each retrieval keyword found by the keyword search unit.

With reference to the third aspect, in a first possible implementation, the call center server may further include:

a reading unit, configured to read first retrieval data stored in the knowledge database before the keyword search unit searches the keyword index entries in the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segment set, the first retrieval data being any item of retrieval data stored in the knowledge database;

the text analysis unit is further configured to obtain a second word segment set identified by text analysis of the first retrieval data read by the reading unit, the second word segment set containing at least one text word segment;

a generating unit, configured to generate the retrieval keywords of the first retrieval data from the text word segments in the second word segment set obtained by the text analysis unit, and to generate a keyword index entry from the retrieval keywords of the first retrieval data and the identifier of the first retrieval data.

With reference to the third aspect, in a second possible implementation, the receiving unit is further configured to receive the keyword index list sent by a text analysis server before the keyword search unit searches the keyword index entries in the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segment set.

With reference to any one of the third aspect, the first possible implementation, and the second possible implementation, in a third possible implementation, the text analysis unit includes:

a text analysis module, configured to perform text analysis on the text information and identify a third word segment set containing at least one text word segment;

a first filtering module, configured to filter out the meaningless text word segments contained in the third word segment set identified by the text analysis module to obtain the first word segment set, the meaningless text word segments including interjections and modal particles.

With reference to any one of the third aspect, the first possible implementation, and the second possible implementation, in a fourth possible implementation, the text analysis unit includes:

a text analysis module, configured to perform text analysis on the text information and identify a third word segment set containing at least one text word segment;

a second filtering module, configured to filter out the first text word segments contained in the third word segment set identified by the text analysis module to obtain the first word segment set, a first text word segment being a text word segment in the third word segment set that matches none of the retrieval keywords in the keyword index list.

With reference to the third aspect or the second possible implementation, in a fifth possible implementation, the voice information consists of at least one voice fragment at sentence granularity, and the text information consists of at least one text fragment;

the receiving unit is specifically configured to receive the voice fragments sent by the user terminal;

the text conversion unit is specifically configured to, each time the receiving unit receives a voice fragment, send the voice fragment to a speech analysis server, so that the speech analysis server converts the voice fragment into a text fragment, and to receive the text fragment sent by the speech analysis server.

With reference to the third aspect or the second possible implementation, in a sixth possible implementation, the text analysis unit is specifically configured to send the text information to the text analysis server, so that the text analysis server performs text analysis on the text information and identifies the first word segment set, and to receive the first word segment set sent by the text analysis server.

A fourth aspect of the embodiments of the present invention provides a text analysis server, including:

a receiving unit, configured to receive text information sent by a call center server, the text information being obtained by the call center server through conversion of voice information;

a text analysis unit, configured to perform text analysis on the text information received by the receiving unit and identify a first word segment set containing at least one text word segment;

a sending unit, configured to send the first word segment set obtained by the text analysis unit to the call center server, so that the call center server retrieves data from a knowledge database according to the text word segments in the first word segment set.

With reference to the fourth aspect, in a first possible implementation, the knowledge database stores at least one item of retrieval data together with its identifier;

the text analysis server further includes:

a reading unit, configured to read first retrieval data stored in the knowledge database, the first retrieval data being any item of retrieval data stored in the knowledge database;

the text analysis unit is further configured to perform text analysis on the first retrieval data read by the reading unit and identify a second word segment set containing at least one text word segment;

a generating unit, configured to generate the retrieval keywords of the first retrieval data from the text word segments in the second word segment set obtained by the text analysis unit, to generate a keyword index entry from the retrieval keywords of the first retrieval data and the identifier of the first retrieval data, and to generate a keyword index list containing at least one such keyword index entry;

the sending unit is further configured to send the keyword index list generated by the generating unit to the call center server.

In the data retrieval method and device provided by the embodiments of the present invention, the call center server receives voice information sent by a user terminal and obtains text information produced by converting the voice information into text; the call center server obtains a first word segment set identified by text analysis of the text information, the first word segment set containing at least one text word segment; the call center server searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segment set, where the keyword index list contains at least one keyword index entry, each keyword index entry contains a retrieval keyword and the identifier of the retrieval data corresponding to that retrieval keyword, and the knowledge database stores at least one item of retrieval data together with its identifier; and the call center server retrieves, from the knowledge database, the retrieval data indicated by the identifier that corresponds to each retrieval keyword found.

In the prior art, retrieving data from the knowledge database depends too heavily on the personal business experience and skills of customer service staff, so the accuracy of the results and the efficiency of retrieval are low. With this solution, after receiving the voice information sent by the user terminal, the call center server obtains an analysis result (the first word segment set) by converting the voice information to text and analyzing that text, and then searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segment set. Customer service staff no longer need to identify retrieval keywords from the user's voice information based on their own experience and skills. Compared with manual identification, automatic identification by the call center server is more efficient, avoids retrieval errors caused by oversight or carelessness, and is more accurate. Once the retrieval keywords have been obtained efficiently and accurately, the retrieval results can in turn be obtained quickly and accurately.

Description of drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a data retrieval method provided by Embodiment 1 of the present invention;

Fig. 2 is a flowchart of a data retrieval method provided by Embodiment 2 of the present invention;

Fig. 3 is a flowchart of a data retrieval method provided by Embodiment 3 of the present invention;

Fig. 4 is a flowchart of a data retrieval method provided by Embodiment 4 of the present invention;

Fig. 5 is a schematic structural diagram of a call center server provided by Embodiment 5 of the present invention;

Fig. 6 is a schematic structural diagram of another call center server provided by Embodiment 5 of the present invention;

Fig. 7 is a schematic structural diagram of another call center server provided by Embodiment 5 of the present invention;

Fig. 8 is a schematic structural diagram of another call center server provided by Embodiment 5 of the present invention;

Fig. 9 is a schematic structural diagram of a text analysis server provided by Embodiment 6 of the present invention;

Fig. 10 is a schematic structural diagram of another text analysis server provided by Embodiment 6 of the present invention;

Fig. 11 is a schematic structural diagram of a call center server provided by Embodiment 7 of the present invention;

Fig. 12 is a schematic structural diagram of another call center server provided by Embodiment 7 of the present invention;

Fig. 13 is a schematic structural diagram of another call center server provided by Embodiment 7 of the present invention;

Fig. 14 is a schematic structural diagram of a text analysis server provided by Embodiment 8 of the present invention;

Fig. 15 is a schematic structural diagram of a data retrieval system provided by Embodiment 8 of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

The techniques described herein can be used in various wireless communication systems, such as current 2G and 3G communication systems and next-generation communication systems, for example the Global System for Mobile Communications (GSM), Code Division Multiple Access (CDMA) systems, Time Division Multiple Access (TDMA) systems, Wideband Code Division Multiple Access (WCDMA) systems, Frequency Division Multiple Access (FDMA) systems, Orthogonal Frequency-Division Multiple Access (OFDMA) systems, single-carrier FDMA (SC-FDMA) systems, General Packet Radio Service (GPRS) systems, Long Term Evolution (LTE) systems, and other such communication systems.

Various aspects are described herein in connection with a terminal and/or a base station and/or a base station controller.

The user terminal may be a wireless terminal or a wired terminal. A wireless terminal may be a device that provides voice and/or data connectivity to a user, a handheld device with a wireless connection function, or another processing device connected to a wireless modem. A wireless terminal can communicate with one or more core networks via a radio access network (RAN). It may be a mobile terminal, such as a mobile phone (also called a "cellular" phone) or a computer with a mobile terminal, for example a portable, pocket-sized, handheld, computer-built-in, or vehicle-mounted mobile device that exchanges voice and/or data with the radio access network. Examples include Personal Communication Service (PCS) phones, cordless phones, Session Initiation Protocol (SIP) phones, Wireless Local Loop (WLL) stations, and Personal Digital Assistants (PDAs). A wireless terminal may also be called a system, a subscriber unit, a subscriber station, a mobile station, a mobile, a remote station, an access point, a remote terminal, an access terminal, a user terminal, a user agent, a user device, or user equipment.

In addition, the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.

The method of the embodiments of the present invention is applied in a call center system. A call center system is an operations system that makes full use of modern communication and computer technologies, such as Interactive Voice Response (IVR) and Automatic Call Distributor (ACD), to handle large volumes of varied inbound and outbound telephone business and services automatically and flexibly.

An existing call center system mainly consists of a call center server and a knowledge database. Take as an example the call center system of a communications operator (an operator that provides communication services to users) handling a user's inquiry call. After the call center server connects the voice call with the user, the customer service agent relies on personal business experience and skill to identify search keywords from the user's voice information. The agent then enters the search keywords into the knowledge database, which matches them against its contents and displays the matching information to the agent as retrieval data. The agent must again rely on personal business experience and skill to pick out, from the displayed retrieval data, the items most relevant to the user's voice information, and then either answers the user based on those items or feeds them back to the user.

Unlike the prior art, when the call center system in the embodiments of the present invention handles a user's inquiry call, the call center server performs text conversion on the voice information and text analysis on the resulting text information. This replaces the information identification and analysis that the customer service agent would otherwise perform while handling the inquiry, which can improve both the accuracy of the retrieval results and the retrieval efficiency. The data retrieval method provided by the present invention, and the devices that implement it, are described in the following specific embodiments:

Embodiment 1

An embodiment of the present invention provides a data retrieval method. As shown in FIG. 1, the method includes:

S101. The call center server receives voice information sent by a user terminal, and obtains text information produced by performing text conversion on the voice information.

The user may initiate a voice communication request to the call center server through the user terminal and, once a voice communication link between the user terminal and the call center server is established, send voice information to the call center server. The voice information sent by the user terminal consists of at least one voice segment, with one sentence as the unit of segmentation.

It should be noted that the call center server in this embodiment may be the call center management server with which an operator that provides communication services to users receives user calls.

S102. The call center server obtains a first word segment set identified by performing text analysis on the text information. The first word segment set contains at least one text word segment.

The method by which the call center server obtains the first word segment set may specifically include: the call center server performs sentence segmentation and word segmentation on the text information to obtain at least one text word segment.

For example, the call center server may perform text analysis on the text information to identify a third word segment set (which contains at least one text word segment), and then filter out the meaningless text word segments in the third word segment set to obtain the first word segment set. Meaningless text word segments include interjections and modal particles.

As another example, the call center server may perform text analysis on the text information to identify a third word segment set (which contains at least one text word segment), and then filter out the first text word segments in the third word segment set to obtain the first word segment set. A first text word segment is a text word segment in the third word segment set that matches none of the search keywords in the keyword index list.

Further, in one application scenario of this embodiment, the call center server may send the text information to a text analysis server, which performs text analysis on the text information and identifies the first word segment set. The call center server then receives the first word segment set sent by the text analysis server.

S103. The call center server searches the keyword index list of the retrieval data stored in the knowledge database for search keywords that match the text word segments in the first word segment set.

The keyword index list contains at least one keyword index item. Each keyword index item contains a search keyword and the identifier of the retrieval data corresponding to that search keyword. The knowledge database stores at least one item of retrieval data together with its identifier.
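The relationship between the keyword index list and the knowledge database can be pictured with a small data model. The following is a minimal sketch under stated assumptions, not the structure used by the invention: the class name `KeywordIndexItem`, the identifiers such as `doc-001`, the sample texts, and the choice to let one keyword point to several identifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class KeywordIndexItem:
    """One keyword index item: a search keyword plus the identifiers it points to."""
    keyword: str
    data_ids: Tuple[str, ...]  # assumption: a keyword may map to several identifiers


# Hypothetical knowledge database: identifier -> retrieval data.
knowledge_db: Dict[str, str] = {
    "doc-001": "Package A: monthly fee and included minutes",
    "doc-002": "Package B: monthly fee and included data",
}

# Hypothetical keyword index list over that knowledge database.
keyword_index: List[KeywordIndexItem] = [
    KeywordIndexItem("package A", ("doc-001",)),
    KeywordIndexItem("package B", ("doc-002",)),
    KeywordIndexItem("monthly fee", ("doc-001", "doc-002")),
]

# Consistency check: every identifier in the index should exist in the database.
dangling = [i for item in keyword_index for i in item.data_ids if i not in knowledge_db]
```

Keeping only identifiers in the index, rather than the retrieval data itself, means the index stays small and the data is fetched from the knowledge database only after a keyword has matched.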

In one application scenario of this embodiment, the keyword index items may be built by the call center server: it performs text analysis on each item of retrieval data stored in the knowledge database, and each keyword index item consists of a search keyword obtained for an item of retrieval data and the identifier of that retrieval data.

The specific method by which the call center server performs text analysis on each item of retrieval data stored in the knowledge database to obtain its search keywords may include: the call center server reads first retrieval data stored in the knowledge database, where the first retrieval data is any item of retrieval data stored in the knowledge database; obtains a second word segment set identified by performing text analysis on the first retrieval data, where the second word segment set contains at least one text word segment; and generates the search keywords of the first retrieval data from the text word segments in the second word segment set.

Further, in another application scenario of this embodiment, the call center server may receive the keyword index list sent by the text analysis server. The text analysis server generates the keyword index list in a way similar to the call center server, so the details are not repeated here.

S104. The call center server retrieves, from the knowledge database, the retrieval data indicated by the identifiers that correspond to the search keywords found.

After finding the search keywords that match the text word segments in the first word segment set, the call center server can retrieve from the knowledge database the retrieval data indicated by the identifiers corresponding to those search keywords.
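Steps S103 and S104 can be sketched together as a lookup followed by a fetch. This is an illustrative sketch only: the function names, the dictionary-based index, the sample data, and the use of exact string equality as the matching rule are assumptions, since the text does not define how a text word segment "matches" a search keyword.

```python
def find_matching_keywords(segments, keyword_index):
    """S103 (sketch): index keywords that exactly equal some text word segment."""
    wanted = set(segments)
    return [kw for kw in keyword_index if kw in wanted]


def retrieve(segments, keyword_index, knowledge_db):
    """S104 (sketch): fetch retrieval data by the identifiers under matched keywords."""
    results, seen = [], set()
    for kw in find_matching_keywords(segments, keyword_index):
        for data_id in keyword_index[kw]:
            # Skip duplicates and identifiers missing from the knowledge database.
            if data_id not in seen and data_id in knowledge_db:
                seen.add(data_id)
                results.append((data_id, knowledge_db[data_id]))
    return results


# Hypothetical sample data.
knowledge_db = {
    "d1": "Tariff details for package A",
    "d2": "How to switch to package A",
}
keyword_index = {"package A": ["d1", "d2"], "tariff": ["d1"]}

hits = retrieve(["package A", "tariff"], keyword_index, knowledge_db)
misses = retrieve(["weather"], keyword_index, knowledge_db)
```

Here `hits` holds both records once each, even though `d1` is listed under two matched keywords, and `misses` is empty because no keyword matches.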

In the data retrieval method provided by this embodiment, the call center server receives voice information sent by a user terminal and obtains text information produced by performing text conversion on the voice information. The call center server obtains a first word segment set identified by performing text analysis on the text information, where the first word segment set contains at least one text word segment. The call center server searches the keyword index list of the retrieval data stored in the knowledge database for search keywords that match the text word segments in the first word segment set; the keyword index list contains at least one keyword index item, each keyword index item contains a search keyword and the identifier of the retrieval data corresponding to that search keyword, and the knowledge database stores at least one item of retrieval data together with its identifier. The call center server then retrieves, from the knowledge database, the retrieval data indicated by the identifiers that correspond to the search keywords found.

In the prior art, retrieving data from the knowledge database depends heavily on the personal business experience and skill of the customer service agent, so both the accuracy of the retrieval results and the retrieval efficiency are low. With this solution, after receiving the voice information sent by the user terminal, the call center server can obtain an analysis result (the first word segment set) by performing text analysis on the text information converted from the voice information, and then search the keyword index list of the retrieval data stored in the knowledge database for search keywords that match the text word segments in the first word segment set. The agent no longer needs to identify search keywords from the user's voice information based on personal business experience and skill. Compared with identification by an agent, automatic identification by the call center server is more efficient and avoids retrieval errors caused by an agent's oversight or carelessness, which improves the accuracy of keyword identification. Once the search keywords are obtained efficiently and accurately, the retrieval results can in turn be obtained quickly and accurately.

Embodiment 2

An embodiment of the present invention provides a data retrieval method. As shown in FIG. 2, the method includes:

S201. The text analysis server receives text information sent by the call center server. The text information is obtained by the call center server by converting voice information.

The text information consists of at least one text segment, and the voice information consists of at least one voice segment, with one sentence as the unit of segmentation.

During voice communication with the user terminal, each time the call center server receives a voice segment from the user terminal, it may send the voice segment to a speech analysis server so that the speech analysis server converts the voice segment into a text segment. After receiving the text segment from the speech analysis server, the call center server may send it to the text analysis server.
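The segment-by-segment relay described above can be sketched as a simple loop in which the call center server forwards each voice segment for conversion and then forwards the resulting text segment for analysis. The sketch below is illustrative only: `relay_segments` and the two stand-in functions are hypothetical names, and the stand-ins do no real speech recognition or text analysis (the "audio" is UTF-8 text and the analysis is whitespace splitting).

```python
from typing import Callable, Iterable, List


def relay_segments(
    voice_segments: Iterable[bytes],
    speech_to_text: Callable[[bytes], str],
    analyze_text: Callable[[str], List[str]],
) -> List[List[str]]:
    """Relay each voice segment to speech analysis, then its text to text analysis."""
    results = []
    for voice in voice_segments:
        text_segment = speech_to_text(voice)         # stands in for the speech analysis server
        results.append(analyze_text(text_segment))   # stands in for the text analysis server
    return results


# Stand-in servers, for illustration only.
def fake_speech_to_text(voice: bytes) -> str:
    return voice.decode("utf-8")


def fake_analyze_text(text: str) -> List[str]:
    return text.split()


segment_sets = relay_segments(
    [b"check package tariff", b"thank you"], fake_speech_to_text, fake_analyze_text
)
```

Passing the two servers in as callables keeps the relay logic independent of how the servers are reached, which the text leaves open.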

S202. The text analysis server performs text analysis on the text information and identifies a first word segment set. The first word segment set contains at least one text word segment.

The method by which the text analysis server performs text analysis on the text information to identify the first word segment set may specifically include: the text analysis server performs sentence segmentation and word segmentation on the text information to identify at least one text word segment.

For example, the text analysis server may perform text analysis on the text information to identify a third word segment set (which contains at least one text word segment), and then filter out the meaningless text word segments in the third word segment set to obtain the first word segment set. Meaningless text word segments include interjections and modal particles.

As another example, the text analysis server may perform text analysis on the text information to identify a third word segment set (which contains at least one text word segment), and then filter out the first text word segments in the third word segment set to obtain the first word segment set. A first text word segment is a text word segment in the third word segment set that matches none of the search keywords in the keyword index list.

S203. The text analysis server sends the first word segment set to the call center server, so that the call center server retrieves data from the knowledge database according to the text word segments in the first word segment set.

Further, the text analysis server may perform text analysis on each item of retrieval data stored in the knowledge database to obtain the search keywords of each item, generate a keyword index item containing the search keywords of the first retrieval data and the identifier of the first retrieval data, generate a keyword index list containing at least one keyword index item, and send the keyword index list to the call center server. The call center server can then search the keyword index list of the retrieval data stored in the knowledge database for search keywords that match the text word segments in the first word segment set, and retrieve from the knowledge database the retrieval data indicated by the identifiers that correspond to the search keywords found.

It can be understood that the method of this embodiment may further include: the text analysis server reads first retrieval data stored in the knowledge database, where the first retrieval data is any item of retrieval data stored in the knowledge database; the text analysis server performs text analysis on the first retrieval data and identifies a second word segment set, which contains at least one text word segment; the text analysis server generates the search keywords of the first retrieval data from the text word segments in the second word segment set; the text analysis server generates a keyword index item from the search keywords of the first retrieval data and the identifier of the first retrieval data; the text analysis server generates a keyword index list containing at least one keyword index item; and the text analysis server sends the keyword index list to the call center server.

In the data retrieval method provided by this embodiment, the text analysis server receives text information sent by the call center server, where the text information is obtained by the call center server by converting voice information. The text analysis server performs text analysis on the text information and identifies the first word segment set, which contains at least one text word segment. The text analysis server sends the first word segment set to the call center server, so that the call center server retrieves data from the knowledge database according to the text word segments in the first word segment set.

In the prior art, retrieving data from the knowledge database depends heavily on the personal business experience and skill of the customer service agent, so both the accuracy of the retrieval results and the retrieval efficiency are low. With this solution, the text analysis server can perform text analysis on the text information sent by the call center server to obtain an analysis result (the first word segment set) and send the first word segment set to the call center server. The call center server then searches the keyword index list of the retrieval data stored in the knowledge database for search keywords that match the text word segments in the first word segment set. The agent no longer needs to identify search keywords from the user's voice information based on personal business experience and skill. Compared with identification by an agent, automatic identification is more efficient and avoids retrieval errors caused by an agent's oversight or carelessness, which improves the accuracy of keyword identification. Once the search keywords are obtained efficiently and accurately, the retrieval results can in turn be obtained quickly and accurately.

Embodiment 3

An embodiment of the present invention provides a data retrieval method that applies to the scenario in which the call center server integrates a text conversion module, capable of performing text conversion on voice information, and a text analysis module, capable of performing text analysis on the text information that the text conversion module produces. In other words, the call center server can itself convert voice information to text and then perform text analysis on the resulting text information.

In this application scenario, as shown in FIG. 3, the data retrieval method may include:

S301. The call center server reads first retrieval data stored in the knowledge database. The first retrieval data is any item of retrieval data stored in the knowledge database.

S302. The call center server obtains a second word segment set identified by performing text analysis on the first retrieval data. The second word segment set contains at least one text word segment.

The call center server may perform text analysis on any item of retrieval data stored in the knowledge database to identify a second word segment set containing at least one text word segment. The call center server analyzes the retrieval data stored in the knowledge database in a way similar to how it analyzes text information; for details, refer to the method for text analysis of text information described in the embodiments of the present invention, which is not repeated here.

S303. The call center server generates the search keywords of the first retrieval data from the text word segments in the second word segment set.

S304. The call center server generates a keyword index item from the search keywords of the first retrieval data and the identifier of the first retrieval data.

S305. The call center server generates a keyword index list containing at least one keyword index item.
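Steps S301 to S305 amount to building an inverted index over the knowledge database. The following is a minimal sketch under stated assumptions: the function names, the sample records, and the `analyze` stand-in (which simply reports which terms of a hypothetical vocabulary occur in a record) are not taken from the text, and every text word segment is promoted to a search keyword, which is only one possible reading of S303.

```python
def analyze(text, vocabulary):
    """Stand-in for text analysis: the vocabulary terms that occur in the text."""
    return [term for term in vocabulary if term in text]


def build_keyword_index(knowledge_db, vocabulary):
    """S301-S305 (sketch): collect (keyword, identifiers) index items."""
    index = {}
    for data_id, text in knowledge_db.items():             # S301: read each record
        segments = analyze(text, vocabulary)               # S302: second word segment set
        for keyword in segments:                           # S303: segment -> keyword
            index.setdefault(keyword, []).append(data_id)  # S304: keyword + identifier
    return sorted(index.items())                           # S305: keyword index list


# Hypothetical sample data.
knowledge_db = {"d1": "tariff for package A", "d2": "tariff for package B"}
vocabulary = ["package A", "package B", "tariff"]

keyword_index_list = build_keyword_index(knowledge_db, vocabulary)
```

With this sample data the list contains three items, and the keyword `tariff` points to both `d1` and `d2`.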

For example, Table 1 shows an instance of a keyword index list:

Table 1:

S306. The call center server receives voice information sent by the user terminal, and obtains text information produced by performing text conversion on the voice information.

The call center server may perform text conversion on the voice information to obtain the text information. For how to do so, refer to existing speech-to-text methods in the prior art; the details are not repeated here.

For example, an application with a speech-to-text function may be installed on the call center server to convert voice information into text information.

S307. The call center server obtains a first word segment set identified by performing text analysis on the text information. The first word segment set contains at least one text word segment.

The method by which the call center server obtains the first word segment set may specifically include: the call center server performs sentence segmentation and word segmentation on the text information to obtain at least one text word segment.

For example, after obtaining the text information "Please help me check the tariff of package A" (麻烦帮我查一下套餐A的资费情况) from text conversion of the voice information, the call center server can perform sentence segmentation and word segmentation on it and obtain a first word segment set consisting of the text word segments "check" (查), "package A" (套餐A), and "tariff" (资费情况).
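The text does not say which segmentation algorithm is used. As one possible illustration, the sketch below splits the text into sentences on punctuation and then applies a dictionary-based forward maximum matching variant that drops characters not covered by the vocabulary. The vocabulary, the function names, and the choice of algorithm are all assumptions made for this example.

```python
import re


def split_sentences(text):
    """Split on common Chinese and ASCII sentence punctuation, dropping empty pieces."""
    return [s for s in re.split(r"[，。！？；,.!?;]+", text) if s]


def segment(sentence, vocabulary):
    """Forward maximum matching: take the longest vocabulary word at each position."""
    max_len = max(len(w) for w in vocabulary)
    out, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            if sentence[i:i + size] in vocabulary:
                out.append(sentence[i:i + size])
                i += size
                break
        else:
            i += 1  # no vocabulary word starts here, so skip this character
    return out


# Hypothetical domain vocabulary: "check", "package A", "tariff".
vocabulary = {"查", "套餐A", "资费情况"}
text = "麻烦帮我查一下套餐A的资费情况"
first_set = [w for s in split_sentences(text) for w in segment(s, vocabulary)]
```

With this vocabulary, `first_set` contains the three word segments named in the example above. A production system would more likely rely on a full segmentation lexicon or a statistical segmenter than on a hand-picked vocabulary.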

Further, after performing text analysis on the text information to obtain text word segments, the call center server may also filter out the meaningless text word segments among them to improve data retrieval efficiency. In this embodiment, meaningless text word segments may include interjections and modal particles, and may also include greetings, polite expressions, and the like.

For example, suppose the text information obtained by the call center server is "Hello, please help me check the tariff of package A, thanks" (您好，请帮我查一下套餐A的资费情况，谢谢啦). The call center server can perform sentence segmentation and word segmentation on it and obtain a third word segment set consisting of the text word segments "hello" (您好, a greeting), "please" (请, a polite expression), "check" (查), "package A" (套餐A), "tariff" (资费情况), "thanks" (谢谢, a polite expression), and "la" (啦, an interjection/modal particle). The call center server can then filter out the greeting "hello", the polite expressions "please" and "thanks", and the interjection/modal particle "la" from the third word segment set, obtaining a first word segment set consisting of the text word segments "check", "package A", and "tariff".
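The filtering in this example can be sketched as a lookup against a list of meaningless word segments. The sketch is illustrative only: the `MEANINGLESS` table and the function name are hypothetical, and the text does not say how the call center server recognizes greetings, polite expressions, or interjections.

```python
# Hypothetical list of meaningless word segments and their categories.
MEANINGLESS = {
    "您好": "greeting",                      # "hello"
    "请": "polite expression",               # "please"
    "谢谢": "polite expression",             # "thanks"
    "啦": "interjection/modal particle",     # "la"
}


def filter_meaningless(third_set):
    """Keep only the word segments that are not listed as meaningless, in order."""
    return [w for w in third_set if w not in MEANINGLESS]


third_set = ["您好", "请", "查", "套餐A", "资费情况", "谢谢", "啦"]
first_set = filter_meaningless(third_set)
```

For the third word segment set of the example, `first_set` keeps "check" (查), "package A" (套餐A), and "tariff" (资费情况).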

Further, after obtaining the third word segmentation set, the call center server may also search the keyword index list for a retrieval keyword matching each text word segment in the third word segmentation set; if a text word segment in the third word segmentation set matches none of the retrieval keywords in the keyword index list, the call center server may filter out that text word segment.

The method of this embodiment may further include: the call center server performs text analysis on the text information to identify a third word segmentation set (containing at least one text word segment), and then filters out the first text word segments contained in the third word segmentation set to obtain the first word segmentation set, where a first text word segment is a text word segment in the third word segmentation set that matches none of the retrieval keywords in the keyword index list.

Continuing the example above, suppose the text word segment "check" matches none of the retrieval keywords in the keyword index list. The call center server may then filter out "check", leaving a first word segmentation set made up of the text word segments "package A" and "tariff".
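The index-based filter described above can be sketched as follows. The sketch assumes that "matching" means exact string equality, which the text does not specify; the function name `filter_unindexed` and the sample keywords are hypothetical.

```python
def filter_unindexed(third_set, index_keywords):
    """Keep only word segments that match some retrieval keyword.

    Matching is assumed to be exact equality; the source does not define it.
    """
    return [seg for seg in third_set if seg in index_keywords]


# Hypothetical retrieval keywords taken from a keyword index list.
index_keywords = {"package A", "tariff"}

first_set = filter_unindexed(["check", "package A", "tariff"], index_keywords)
print(first_set)
```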

It should be noted that, in this embodiment of the present invention, S301-S305 may be performed before S306-S307, or S306-S307 may be performed before S301-S305. This embodiment places no restriction on the order in which S301-S305 and S306-S307 are performed.

S308. The call center server searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segmentation set.

S309. The call center server retrieves, from the knowledge database, the retrieval data indicated by the identifiers of the retrieval data corresponding to the retrieval keywords found.

The knowledge database stores at least one piece of retrieval data and the identifier of that retrieval data.
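Steps S308 and S309 can be sketched as a two-stage lookup. This is an illustration under stated assumptions: the shape of a keyword index item (one retrieval keyword paired with one identifier), the sample entries, and the names `KEYWORD_INDEX`, `KNOWLEDGE_DB`, `find_data_ids` and `retrieve` are all hypothetical, and matching is taken to be exact equality.

```python
# Hypothetical keyword index list: each item pairs a retrieval keyword
# with the identifier of the retrieval data it points to.
KEYWORD_INDEX = [
    {"keyword": "package A", "data_id": 1},
    {"keyword": "tariff", "data_id": 1},
    {"keyword": "roaming", "data_id": 2},
]

# Hypothetical knowledge database: identifier -> retrieval data (placeholder text).
KNOWLEDGE_DB = {
    1: "placeholder entry describing the tariff of package A",
    2: "placeholder entry describing roaming",
}


def find_data_ids(first_set, keyword_index):
    """S308: collect identifiers whose keyword equals some word segment."""
    segments = set(first_set)
    ids = []
    for item in keyword_index:
        if item["keyword"] in segments and item["data_id"] not in ids:
            ids.append(item["data_id"])
    return ids


def retrieve(first_set, keyword_index, knowledge_db):
    """S309: fetch the retrieval data indicated by the identifiers found."""
    return [knowledge_db[i] for i in find_data_ids(first_set, keyword_index)]


results = retrieve(["package A", "tariff"], KEYWORD_INDEX, KNOWLEDGE_DB)
print(results)
```

Identifiers are de-duplicated so that two keywords pointing at the same retrieval data return it only once.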

For example, as illustrated in Table 2, the knowledge database may store multiple pieces of retrieval data and the identifier of each piece in the form of a list:

Table 2:

In the data retrieval method provided by this embodiment of the present invention, the call center server receives voice information sent by a user terminal and obtains text information produced by converting the voice information into text; the call center server obtains a first word segmentation set identified by text analysis of the text information, the first word segmentation set containing at least one text word segment; the call center server searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that keyword, and the knowledge database stores at least one piece of retrieval data and its identifier; and the call center server retrieves, from the knowledge database, the retrieval data indicated by the identifiers corresponding to the retrieval keywords found.

In the prior art, retrieving data from the knowledge database relies heavily on the individual business experience and skills of customer service staff, so the accuracy of the retrieval results and the retrieval efficiency are low. With this solution, after receiving the voice information sent by the user terminal, the call center server can obtain an analysis result (the first word segmentation set) by text analysis of the text information converted from the voice information, and then search the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set. Customer service staff no longer need to identify retrieval keywords from the user's voice information based on their own experience and skills. Compared with such manual identification, automatic identification by the call center server can improve the efficiency of identifying retrieval keywords, avoid retrieval errors caused by staff negligence or carelessness, and improve the accuracy of keyword identification. Once the retrieval keywords have been obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

Embodiment Four

This embodiment of the present invention provides a data retrieval method applied to the specific scenario in which the call center server cannot itself convert voice information into text or perform text analysis on text information.

In this application scenario, as shown in FIG. 4, the data retrieval method may include:

S401. The text analysis server reads first retrieval data stored in the knowledge database, the first retrieval data being any one piece of retrieval data stored in the knowledge database.

For example, the knowledge database may store multiple pieces of retrieval data and the identifier of each piece in the form of a list; for its specific form and content, see the list-form example of the knowledge database shown in Table 2 of the embodiments of the present invention.

S402. The text analysis server performs text analysis on the first retrieval data to identify a second word segmentation set, the second word segmentation set containing at least one text word segment.

The first retrieval data is any one piece of retrieval data stored in the knowledge database. The text analysis server may perform text analysis on any piece of retrieval data stored in the knowledge database to identify a second word segmentation set containing at least one text word segment. The way the text analysis server analyzes the retrieval data stored in the knowledge database is similar to the way the call center server analyzes text information; see the specific method described for the call center server in the embodiments of the present invention, which is not repeated here.

S403. The text analysis server generates the retrieval keywords of the first retrieval data from the text word segments in the second word segmentation set.

S404. The text analysis server generates a keyword index item from the retrieval keyword of the first retrieval data and the identifier of the first retrieval data.

S405. The text analysis server generates a keyword index list containing at least one keyword index item and sends that keyword index list to the call center server.

For the specific form and content of the keyword index list, see the keyword index list shown in Table 1 of the embodiments of the present invention. For the way the text analysis server generates the keyword index list, see the description of how the call center server generates the keyword index list in the embodiments of the present invention, which is not repeated here.
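Steps S401 to S405 can be sketched as an offline index build. This is only an illustration: Table 1 is not reproduced in this section, so the item shape (one keyword and one identifier per item) is an assumption, `analyze` is a whitespace-splitting stand-in for real text analysis, and `build_keyword_index` and the sample database are hypothetical.

```python
def analyze(text):
    """Stand-in for text analysis (S402); real segmentation is not shown here."""
    return text.split()


def build_keyword_index(knowledge_db, stop_words=frozenset()):
    """Build a keyword index list from every piece of retrieval data."""
    index = []
    for data_id, text in knowledge_db.items():       # S401: read retrieval data
        second_set = analyze(text)                    # S402: second segmentation set
        keywords = []
        for seg in second_set:                        # S403: derive retrieval keywords
            if seg not in stop_words and seg not in keywords:
                keywords.append(seg)
        for kw in keywords:                           # S404: one item per keyword
            index.append({"keyword": kw, "data_id": data_id})
    return index                                      # S405: list to be sent on


# Hypothetical knowledge database: identifier -> retrieval data.
knowledge_db = {1: "tariff of packageA", 2: "roaming tariff"}
keyword_index = build_keyword_index(knowledge_db, stop_words={"of"})
print(keyword_index)
```

Building the index ahead of time means a query only has to compare word segments against keywords instead of re-analyzing every stored entry.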

S406. The call center server receives the voice information sent by the user terminal and obtains the text information produced by converting the voice information into text.

The voice information consists of at least one voice segment, each voice segment corresponding to one sentence.

As shown in FIG. 4, S406 may specifically include S406a-S406d:

S406a. The call center server receives a voice segment sent by the user terminal.

S406b. After receiving a voice segment, the call center server sends the voice segment to the voice analysis server.

S406c. The voice analysis server converts the voice segment into a text segment.

The way the voice analysis server converts a voice segment into a text segment is similar to the way the call center server does so, and is not repeated here.

S406d. The call center server receives the text segment sent by the voice analysis server.

S407. The call center server sends the text information to the text analysis server.

The text information consists of at least one text segment. The call center server may forward each text segment to the text analysis server as soon as it receives that segment from the voice analysis server; alternatively, it may wait until it has received all the text segments that make up the text information and then send the complete text information to the text analysis server.
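The two forwarding strategies just described can be contrasted in a short sketch. Everything here is hypothetical: the servers are modelled as plain callables, `fake_speech_to_text` merely decodes bytes, and joining segments with a space is an assumption about how the full text is assembled.

```python
def forward_per_segment(voice_segments, speech_to_text, send_to_text_analysis):
    """Forward each text segment as soon as it comes back (S406b-S406d, then S407)."""
    for voice_segment in voice_segments:
        text_segment = speech_to_text(voice_segment)
        send_to_text_analysis(text_segment)


def forward_whole_message(voice_segments, speech_to_text, send_to_text_analysis):
    """Collect every text segment first, then send the complete text once."""
    text_segments = [speech_to_text(v) for v in voice_segments]
    send_to_text_analysis(" ".join(text_segments))


def fake_speech_to_text(voice_segment):
    """Stand-in for the voice analysis server; just decodes bytes."""
    return voice_segment.decode("utf-8")


voice_segments = [b"hello", b"check the tariff of package A"]

sent_incrementally = []
forward_per_segment(voice_segments, fake_speech_to_text, sent_incrementally.append)

sent_at_once = []
forward_whole_message(voice_segments, fake_speech_to_text, sent_at_once.append)

print(sent_incrementally)
print(sent_at_once)
```

Per-segment forwarding lets text analysis start before the caller has finished speaking, while whole-message forwarding gives the analyzer the full context in a single request.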

S408. The text analysis server performs text analysis on the text information to identify the first word segmentation set.

It should be noted that, for the way the text analysis server performs text analysis on the text information to identify the first word segmentation set, see the specific method by which the call center server does so; it is not repeated here.

S409. The text analysis server sends the first word segmentation set to the call center server.

It should be noted that, in this embodiment of the present invention, S401-S405 may be performed before S406-S409, or S406-S409 may be performed before S401-S405. This embodiment places no restriction on the order in which S401-S405 and S406-S409 are performed.

S410. The call center server searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segmentation set.

S411. The call center server retrieves, from the knowledge database, the retrieval data indicated by the identifiers of the retrieval data corresponding to the retrieval keywords found.

In the data retrieval method provided by this embodiment of the present invention, after receiving the voice information sent by the user terminal, the call center server may send it to the voice analysis server, which converts the voice information into text information; the call center server forwards the text information to the text analysis server, which performs text analysis on it to identify the first word segmentation set, the first word segmentation set containing at least one text word segment; the call center server then searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that keyword, and the knowledge database stores at least one piece of retrieval data and its identifier; and the call center server retrieves, from the knowledge database, the retrieval data indicated by the identifiers corresponding to the retrieval keywords found.

In the prior art, retrieving data from the knowledge database relies heavily on the individual business experience and skills of customer service staff, so the accuracy of the retrieval results and the retrieval efficiency are low. With this solution, after receiving the voice information sent by the user terminal, the call center server can obtain an analysis result (the first word segmentation set) by text analysis of the text information converted from the voice information, and then search the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set. Customer service staff no longer need to identify retrieval keywords from the user's voice information based on their own experience and skills. Compared with such manual identification, automatic identification by the call center server can improve the efficiency of identifying retrieval keywords, avoid retrieval errors caused by staff negligence or carelessness, and improve the accuracy of keyword identification. Once the retrieval keywords have been obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

Embodiment Five

This embodiment of the present invention provides a call center server configured to perform the data retrieval method shown in FIG. 1. As shown in FIG. 5, the call center server includes: a receiving unit 51, a text conversion unit 52, a text analysis unit 53, a keyword search unit 54 and a retrieval unit 55.

The receiving unit 51 is configured to receive voice information sent by a user terminal.

The text conversion unit 52 is configured to obtain text information produced by converting the voice information received by the receiving unit 51 into text.

The text analysis unit 53 is configured to obtain a first word segmentation set identified by text analysis of the text information obtained by the text conversion unit 52, the first word segmentation set containing at least one text word segment.

The keyword search unit 54 is configured to search the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set obtained by the text analysis unit 53, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that keyword, and the knowledge database stores at least one piece of retrieval data and its identifier.

The retrieval unit 55 is configured to retrieve, from the knowledge database, the retrieval data indicated by the identifiers of the retrieval data corresponding to the retrieval keywords found by the keyword search unit 54.

The voice information received by the receiving unit 51 is received from the user terminal after the user terminal has initiated a voice communication request to the call center server and a voice communication link between the user terminal and the call center server has been established. The voice information sent by the user terminal consists of at least one voice segment, each voice segment corresponding to one sentence.

The text analysis unit 53 may split the text information into sentences and segment it into words to obtain at least one text word segment.
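The way units 51 to 55 hand data to one another can be sketched as a small class. This is a structural illustration only: the class and field names are hypothetical, the conversion and analysis units are injected as plain callables, the keyword index is simplified to a keyword-to-identifier mapping, and matching is assumed to be exact equality.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CallCenterServer:
    to_text: Callable[[bytes], str]        # text conversion unit 52 (injected)
    segment: Callable[[str], List[str]]    # text analysis unit 53 (injected)
    keyword_index: Dict[str, int]          # retrieval keyword -> identifier
    knowledge_db: Dict[int, str]           # identifier -> retrieval data

    def handle(self, voice: bytes) -> List[str]:
        """Receiving unit 51: entry point for voice information."""
        text = self.to_text(voice)                 # unit 52
        first_set = self.segment(text)             # unit 53
        ids: List[int] = []
        for seg in first_set:                      # keyword search unit 54
            data_id = self.keyword_index.get(seg)
            if data_id is not None and data_id not in ids:
                ids.append(data_id)
        return [self.knowledge_db[i] for i in ids]  # retrieval unit 55


# Hypothetical wiring with trivial stand-ins for conversion and analysis.
server = CallCenterServer(
    to_text=lambda voice: voice.decode("utf-8"),
    segment=str.split,
    keyword_index={"tariff": 1},
    knowledge_db={1: "placeholder tariff entry"},
)
print(server.handle(b"check tariff"))
```

Injecting the conversion and analysis steps as callables mirrors the text's wording that the server "obtains" the converted text and the segmentation set, whether it computes them itself or delegates to other servers.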

Further, as shown in FIG. 6, the call center server may also include: a reading unit 56 and a generating unit 57.

The reading unit 56 is configured to read first retrieval data stored in the knowledge database before the keyword search unit 54 searches the keyword index items in the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set, the first retrieval data being any one piece of retrieval data stored in the knowledge database.

The text analysis unit 53 is further configured to obtain a second word segmentation set identified by text analysis of the first retrieval data read by the reading unit 56, the second word segmentation set containing at least one text word segment.

The generating unit 57 is configured to generate the retrieval keywords of the first retrieval data from the text word segments in the second word segmentation set obtained by the text analysis unit 53; to generate a keyword index item from the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and to generate the keyword index list containing at least one such keyword index item.

Further, the receiving unit 51 is also configured to receive the keyword index list sent by the text analysis server before the keyword search unit 54 searches the keyword index items in the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set.

Further, in one application scenario of this embodiment of the present invention, as shown in FIG. 7, the text analysis unit 53 includes: a text analysis module 531 and a first filtering module 532.

The text analysis module 531 is configured to perform text analysis on the text information to identify a third word segmentation set, the third word segmentation set containing at least one text word segment.

The first filtering module 532 is configured to filter out the meaningless text word segments contained in the third word segmentation set identified by the text analysis module 531 to obtain the first word segmentation set, the meaningless text word segments including interjections and modal particles.

Further, in another application scenario of this embodiment of the present invention, as shown in FIG. 8, the text analysis unit 53 includes: a text analysis module 531 and a second filtering module 533.

The text analysis module 531 is configured to perform text analysis on the text information to identify a third word segmentation set, the third word segmentation set containing at least one text word segment.

The second filtering module 533 is configured to filter out the first text word segments contained in the third word segmentation set identified by the text analysis module 531 to obtain the first word segmentation set, a first text word segment being a text word segment in the third word segmentation set that matches none of the retrieval keywords in the keyword index list.

Further, in yet another application scenario of this embodiment of the present invention, the text analysis unit 53 includes: a text analysis module 531, a first filtering module 532 and a second filtering module 533.

In this application scenario, while the first filtering module 532 filters out the meaningless text word segments contained in the third word segmentation set identified by the text analysis module 531, the second filtering module 533 filters out the first text word segments contained in that third word segmentation set. The intersection of the word segmentation set produced by the first filtering module 532 and the word segmentation set produced by the second filtering module 533 may then be used as the first word segmentation set.
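Combining the two filters by intersection can be sketched as follows. The function name `apply_both_filters` and the sample word lists are hypothetical, and matching against the keyword index is assumed to be exact equality.

```python
def apply_both_filters(third_set, meaningless, index_keywords):
    """Intersect the outputs of the two filters, preserving segment order."""
    kept_by_first = {s for s in third_set if s not in meaningless}   # module 532
    kept_by_second = {s for s in third_set if s in index_keywords}   # module 533
    kept_by_both = kept_by_first & kept_by_second
    return [s for s in third_set if s in kept_by_both]


third_set = ["hello", "check", "package A", "tariff", "la"]
first_set = apply_both_filters(
    third_set,
    meaningless={"hello", "la"},             # hypothetical meaningless segments
    index_keywords={"package A", "tariff"},  # hypothetical retrieval keywords
)
print(first_set)
```

Because each filter works only on the original third set, the two can run independently (for example in parallel) and be combined afterwards, which is consistent with the text's statement that they operate at the same time.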

Further, the voice information consists of at least one voice segment, each voice segment corresponding to one sentence, and the text information consists of at least one text segment.

The receiving unit 51 is specifically configured to receive the voice segment sent by the user terminal.

The text conversion unit 52 is specifically configured to, after the receiving unit 51 receives a voice segment, send the voice segment to the voice analysis server so that the voice analysis server converts the voice segment into a text segment, and to receive the text segment sent by the voice analysis server.

Further, the text analysis unit 53 is specifically configured to send the text information to the text analysis server so that the text analysis server performs text analysis on the text information and identifies the first word segmentation set, and to receive the first word segmentation set sent by the text analysis server.

It should be noted that the call center server in this embodiment of the present invention may be a call center management server with which an operator able to provide communication services to users receives user calls. For a specific description of some of the functional modules in the call center server provided by this embodiment, see the corresponding content in the method embodiments; it is not repeated in detail here.

The call center server provided by this embodiment of the present invention receives voice information sent by a user terminal and obtains text information produced by converting the voice information into text; the call center server obtains a first word segmentation set identified by text analysis of the text information, the first word segmentation set containing at least one text word segment; the call center server searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that keyword, and the knowledge database stores at least one piece of retrieval data and its identifier; and the call center server retrieves, from the knowledge database, the retrieval data indicated by the identifiers corresponding to the retrieval keywords found.

In the prior art, retrieving data from the knowledge database relies heavily on the individual business experience and skills of customer service staff, so the accuracy of the retrieval results and the retrieval efficiency are low. With this solution, after receiving the voice information sent by the user terminal, the call center server can obtain an analysis result (the first word segmentation set) by text analysis of the text information converted from the voice information, and then search the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords matching the text word segments in the first word segmentation set. Customer service staff no longer need to identify retrieval keywords from the user's voice information based on their own experience and skills. Compared with such manual identification, automatic identification by the call center server can improve the efficiency of identifying retrieval keywords, avoid retrieval errors caused by staff negligence or carelessness, and improve the accuracy of keyword identification. Once the retrieval keywords have been obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

Embodiment Six

This embodiment of the present invention provides a text analysis server configured to perform the data retrieval method shown in FIG. 2. As shown in FIG. 9, the text analysis server includes: a receiving unit 61, a text analysis unit 62 and a sending unit 63.

The receiving unit 61 is configured to receive text information sent by the call center server, the text information having been obtained by the call center server by conversion from voice information.

The text analysis unit 62 is configured to perform text analysis on the text information received by the receiving unit 61 to identify the first word segmentation set, the first word segmentation set containing at least one text word segment.

The sending unit 63 is configured to send the first word segmentation set obtained by the text analysis unit 62 to the call center server, so that the call center server retrieves data from the knowledge database according to the text word segments in the first word segmentation set.

The text analysis unit 62 may split the text information into sentences and segment it into words to identify at least one text word segment.

Further, the knowledge database stores at least one piece of retrieval data and the identifier of that retrieval data.

As shown in FIG. 10, the text analysis server may also include: a reading unit 64 and a generating unit 65.

The reading unit 64 is configured to read first retrieval data stored in the knowledge database, the first retrieval data being any one piece of retrieval data stored in the knowledge database.

The text analysis unit 62 is further configured to perform text analysis on the first retrieval data read by the reading unit 64 to identify a second word segmentation set, the second word segmentation set containing at least one text word segment.

The generating unit 65 is configured to generate the retrieval keywords of the first retrieval data from the text word segments in the second word segmentation set obtained by the text analysis unit 62; to generate a keyword index item from the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and to generate the keyword index list containing at least one such keyword index item.

The sending unit 63 is further configured to send the keyword index list generated by the generating unit 65 to the call center server.

It should be noted that, for a detailed description of some functional modules of the text analysis server provided by this embodiment of the present invention, reference may be made to the corresponding content in the method embodiments; details are not repeated here.

The text analysis server provided by this embodiment of the present invention can receive text information sent by the call center server, where the text information is obtained by the call center server through conversion of voice information. The text analysis server performs text analysis on the text information and identifies the first word segmentation set, which contains at least one text word segment. The text analysis server then sends the first word segmentation set to the call center server, so that the call center server retrieves data from the knowledge database according to the text word segments in the first word segmentation set.

In the prior art, retrieving data from the knowledge database relies heavily on the personal business experience and skills of customer service personnel, so the accuracy of retrieval results and the retrieval efficiency are low. With this solution, the text analysis server performs text analysis on the text information sent by the call center server to obtain an analysis result (the first word segmentation set) and sends the first word segmentation set to the call center server. The call center server then searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segmentation set, so customer service personnel no longer need to identify retrieval keywords from the user's voice information based on their personal business experience and skills. Compared with identification by customer service personnel, automatic identification by the call center server improves the efficiency of identifying retrieval keywords, avoids retrieval errors caused by negligence or carelessness, and improves the accuracy of the identified retrieval keywords. Once the retrieval keywords are obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

Embodiment seven

In terms of hardware implementation, each unit in Embodiment five may be embedded in, or independent of, the processor of the call center server in the form of hardware, or may be stored in the memory of the call center server in the form of software, so that the processor can invoke it to perform the operations corresponding to each unit. The processor may be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, or the like.

This embodiment of the present invention provides a call center server. As shown in FIG. 11, the call center server includes a receiver 71 and a processor 72.

The receiver 71 is configured to receive voice information sent by a user terminal.

The processor 72 is configured to: obtain text information produced by text conversion of the voice information received by the receiver 71; obtain a first word segmentation set identified by text analysis of the text information, where the first word segmentation set contains at least one text word segment; search the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that retrieval keyword, and the knowledge database stores at least one piece of retrieval data and the identifier of the retrieval data; and retrieve, from the knowledge database, the retrieval data indicated by the identifier that corresponds to the found retrieval keyword.
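
The lookup and retrieval steps performed by the processor 72 can be illustrated with a short sketch. It assumes that "matching" means exact string equality between a text word segment and a retrieval keyword, which the source does not specify; the function name `retrieve` and the sample data are hypothetical.

```python
def retrieve(first_segment_set, keyword_index, knowledge_db):
    """Return {data_id: retrieval data} for every index item whose keyword matches.

    Assumption: a match is exact string equality between segment and keyword.
    """
    segments = set(first_segment_set)
    results = {}
    for item in keyword_index:
        if item["keyword"] in segments:
            data_id = item["data_id"]
            # The identifier in the index item points at the retrieval data.
            results.setdefault(data_id, knowledge_db[data_id])
    return results


# Hypothetical sample data.
knowledge_db = {
    "kb-1": "How to reset a voicemail password",
    "kb-2": "How to check the remaining data balance",
}
keyword_index = [
    {"keyword": "voicemail", "data_id": "kb-1"},
    {"keyword": "password", "data_id": "kb-1"},
    {"keyword": "balance", "data_id": "kb-2"},
]
hits = retrieve(["forgot", "voicemail", "password"], keyword_index, knowledge_db)
```

Here both matching keywords point at the same entry, so `hits` contains only `kb-1`. A real deployment might rank entries by the number of matched keywords instead of returning them unordered.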

Further, in one application scenario of this embodiment of the present invention, the processor 72 is further configured to perform the following before searching the keyword index items of the keyword index list for a retrieval keyword that matches a text word segment in the first word segmentation set: read first retrieval data stored in the knowledge database, where the first retrieval data is any one piece of retrieval data stored in the knowledge database; obtain a second word segmentation set identified by text analysis of the first retrieval data, where the second word segmentation set contains at least one text word segment; generate a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set; generate one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and generate the keyword index list containing at least one keyword index item.

In another application scenario of this embodiment of the present invention, the receiver 71 is further configured to receive the keyword index list sent by the text analysis server before the processor 72 searches the keyword index items of the keyword index list for a retrieval keyword that matches a text word segment in the first word segmentation set.

In both of the above application scenarios, as shown in FIG. 12, the call center server may further include a memory 73.

The memory 73 is configured to store the keyword index list generated by the processor 72 or, alternatively, the keyword index list received by the receiver 71.

Further, the processor 72 is specifically configured to: perform text analysis on the text information and identify a third word segmentation set, where the third word segmentation set contains at least one text word segment; and filter out the meaningless text word segments contained in the third word segmentation set to obtain the first word segmentation set, where the meaningless text word segments include interjections and modal particles.

Further, the processor 72 is specifically configured to: perform text analysis on the text information and identify a third word segmentation set, where the third word segmentation set contains at least one text word segment; and filter out the first text word segments contained in the third word segmentation set to obtain the first word segmentation set, where a first text word segment is a text word segment in the third word segmentation set that matches none of the retrieval keywords in the keyword index list.
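
The two filtering variants described above, dropping meaningless segments and dropping segments that match no retrieval keyword, might look like the following sketch. The word list `MEANINGLESS` and both function names are illustrative assumptions, and matching is again assumed to be exact string equality.

```python
# Hypothetical list of interjections and modal particles (assumption).
MEANINGLESS = frozenset({"oh", "um", "ah", "wow", "啊", "呢", "吧"})


def filter_meaningless(third_segment_set):
    """Variant 1: drop interjections and modal particles."""
    return [w for w in third_segment_set if w.lower() not in MEANINGLESS]


def filter_unindexed(third_segment_set, keyword_index):
    """Variant 2: drop segments that match no retrieval keyword in the index."""
    keywords = {item["keyword"] for item in keyword_index}
    return [w for w in third_segment_set if w in keywords]


third = ["um", "my", "voicemail", "password", "啊"]
keyword_index = [
    {"keyword": "voicemail", "data_id": "kb-1"},
    {"keyword": "password", "data_id": "kb-1"},
]
first_a = filter_meaningless(third)
first_b = filter_unindexed(third, keyword_index)
```

Variant 2 yields a smaller first word segmentation set, because every segment it keeps is guaranteed to hit an index item; variant 1 needs no access to the index and can therefore run before the index is available.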

Further, the voice information consists of at least one speech segment, each speech segment corresponding to one sentence, and the text information consists of at least one text segment.

The receiver 71 is further configured to receive the speech segment sent by the user terminal.

As shown in FIG. 13, the call center server may further include a sender 74.

The sender 74 is configured to send a speech segment to the speech analysis server after the receiver 71 receives that speech segment, so that the speech analysis server converts the speech segment into a text segment.

The receiver 71 is further configured to receive the text segment sent by the speech analysis server.

Further, the sender 74 is further configured to send the text information to the text analysis server, so that the text analysis server performs text analysis on the text information and identifies the first word segmentation set.

The receiver 71 is further configured to receive the first word segmentation set sent by the text analysis server.

In this embodiment of the present invention, the receiver 71, the processor 72, the memory 73, and the sender 74 are connected by a bus and communicate with one another over it.

The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in FIG. 11 to FIG. 13, but this does not mean that there is only one bus or only one type of bus.

The processor 72 may be a central processing unit (CPU), or a combination of a CPU, a digital signal processor (DSP), and a control chip in a communication unit (for example, a baseband chip). In this embodiment of the present invention, the CPU may have a single computing core or multiple computing cores.

The receiver 71 and the sender 74 may be apparatuses that send and receive wireless signals through an antenna, or other apparatuses that provide interfaces for sending and receiving signals.

It should be noted that, for a detailed description of some functional modules of the call center server provided by this embodiment of the present invention, reference may be made to the corresponding content in the method embodiments; details are not repeated here.

The call center server provided by this embodiment of the present invention can receive voice information sent by a user terminal and obtain the text information produced by text conversion of the voice information. The call center server obtains a first word segmentation set identified by text analysis of the text information, where the first word segmentation set contains at least one text word segment. The call center server searches the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that retrieval keyword, and the knowledge database stores at least one piece of retrieval data and the identifier of the retrieval data. The call center server then retrieves, from the knowledge database, the retrieval data indicated by the identifier that corresponds to the found retrieval keyword.

In the prior art, retrieving data from the knowledge database relies heavily on the personal business experience and skills of customer service personnel, so the accuracy of retrieval results and the retrieval efficiency are low. With this solution, after receiving the voice information sent by the user terminal, the call center server can obtain an analysis result (the first word segmentation set) by having the voice information converted into text and the text analyzed, and then search the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segmentation set. Customer service personnel therefore no longer need to identify retrieval keywords from the user's voice information based on their personal business experience and skills. Compared with identification by customer service personnel, automatic identification by the call center server improves the efficiency of identifying retrieval keywords, avoids retrieval errors caused by negligence or carelessness, and improves the accuracy of the identified retrieval keywords. Once the retrieval keywords are obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

Embodiment eight

In terms of hardware implementation, each unit in Embodiment six may be embedded in, or independent of, the processor of the text analysis server in the form of hardware, or may be stored in the memory of the text analysis server in the form of software, so that the processor can invoke it to perform the operations corresponding to each unit. The processor may be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, or the like.

This embodiment of the present invention provides a text analysis server. As shown in FIG. 14, the text analysis server includes a receiver 81, a processor 82, and a sender 83.

The receiver 81 is configured to receive text information sent by the call center server, where the text information is obtained by the call center server through conversion of voice information.

The processor 82 is configured to perform text analysis on the text information received by the receiver 81 and identify the first word segmentation set, where the first word segmentation set contains at least one text word segment.

The sender 83 is configured to send the first word segmentation set identified by the processor 82 to the call center server, so that the call center server retrieves data from the knowledge database according to the text word segments in the first word segmentation set.

Further, the knowledge database stores at least one piece of retrieval data and the identifier of that retrieval data.

The processor 82 is further configured to: read first retrieval data stored in the knowledge database, where the first retrieval data is any one piece of retrieval data stored in the knowledge database; perform text analysis on the first retrieval data and identify a second word segmentation set, where the second word segmentation set contains at least one text word segment; generate a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set; generate one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and generate the keyword index list containing at least one keyword index item.

The sender 83 is further configured to send the keyword index list generated by the processor 82 to the call center server.

In this embodiment of the present invention, the receiver 81, the processor 82, and the sender 83 are connected by a bus and communicate with one another over it.

The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in FIG. 14, but this does not mean that there is only one bus or only one type of bus.

The processor 82 may be a central processing unit (CPU), or a combination of a CPU, a digital signal processor (DSP), and a control chip in a communication unit (for example, a baseband chip). In this embodiment of the present invention, the CPU may have a single computing core or multiple computing cores.

The receiver 81 and the sender 83 may be apparatuses that send and receive wireless signals through an antenna, or other apparatuses that provide interfaces for sending and receiving signals.

It should be noted that, for a detailed description of some functional modules of the text analysis server provided by this embodiment of the present invention, reference may be made to the corresponding content in the method embodiments; details are not repeated here.

The text analysis server provided by this embodiment of the present invention can receive text information sent by the call center server, where the text information is obtained by the call center server through conversion of voice information. The text analysis server performs text analysis on the text information and identifies the first word segmentation set, which contains at least one text word segment. The text analysis server then sends the first word segmentation set to the call center server, so that the call center server retrieves data from the knowledge database according to the text word segments in the first word segmentation set.

In the prior art, retrieving data from the knowledge database relies heavily on the personal business experience and skills of customer service personnel, so the accuracy of retrieval results and the retrieval efficiency are low. With this solution, the text analysis server performs text analysis on the text information sent by the call center server to obtain an analysis result (the first word segmentation set) and sends the first word segmentation set to the call center server. The call center server then searches the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segmentation set, so customer service personnel no longer need to identify retrieval keywords from the user's voice information based on their personal business experience and skills. Compared with identification by customer service personnel, automatic identification by the call center server improves the efficiency of identifying retrieval keywords, avoids retrieval errors caused by negligence or carelessness, and improves the accuracy of the identified retrieval keywords. Once the retrieval keywords are obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

It should be noted that, in the first application scenario of the embodiments of the present invention, the call center server may integrate a text conversion module, which converts voice information into text, and a text analysis module, which performs text analysis on the text information produced by the text conversion module. In other words, the call center server can itself convert the voice information into text and then analyze the resulting text information.

In the second application scenario of the embodiments of the present invention, the call center server sends the voice information to the speech analysis server, which converts the voice information into text information, and the call center server then receives the text information sent by the speech analysis server. The call center server next sends the text information to the text analysis server, which performs text analysis on the text information to obtain an analysis result, and the call center server then receives the analysis result sent by the text analysis server.

Corresponding to the second application scenario, an embodiment of the present invention provides a data retrieval system. As shown in FIG. 15, the data retrieval system includes a call center server 91, a speech analysis server 92, a text analysis server 93, and a knowledge database 94.

The call center server 91 is configured to receive voice information sent by a user terminal and send the voice information to the speech analysis server 92, where the voice information consists of at least one speech segment, each speech segment corresponding to one sentence.

The speech analysis server 92 is configured to convert each speech segment into a text segment and send text information to the call center server 91, where the text information consists of at least one text segment.

The call center server 91 is further configured to send the text information to the text analysis server 93.

The text analysis server 93 is configured to perform text analysis on the text information and identify a first word segmentation set, where the first word segmentation set contains at least one text word segment.

The call center server 91 is further configured to: search the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, where the keyword index list contains at least one keyword index item, each keyword index item contains a retrieval keyword and the identifier of the retrieval data corresponding to that retrieval keyword, and the knowledge database stores at least one piece of retrieval data and the identifier of the retrieval data; and retrieve, from the knowledge database, the retrieval data indicated by the identifier that corresponds to the found retrieval keyword.

Further, the text analysis server 93 is further configured to send the keyword index list to the call center server 91.

Further, the text analysis server 93 is further configured to perform the following before sending the keyword index list to the call center server 91: read first retrieval data stored in the knowledge database, where the first retrieval data is any one piece of retrieval data stored in the knowledge database; perform text analysis on the first retrieval data and identify a second word segmentation set, where the second word segmentation set contains at least one text word segment; generate a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set; generate one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and generate the keyword index list containing at least one keyword index item.
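
To make the division of labor in the system of FIG. 15 concrete, the sketch below wires stub versions of the three servers together in one process. It is an illustration under several assumptions rather than the claimed implementation: speech recognition is replaced by a lookup table of canned transcripts, text analysis is lowercased whitespace splitting, matching is exact string equality, and all class names, method names, and sample data are hypothetical.

```python
class SpeechAnalysisServer:
    """Stub: maps a speech segment to a canned transcript instead of running recognition."""

    def __init__(self, transcripts):
        self.transcripts = transcripts

    def to_text(self, speech_segment):
        return self.transcripts[speech_segment]


class TextAnalysisServer:
    """Stub: whitespace splitting stands in for sentence breaking and word segmentation."""

    def analyze(self, text):
        return text.lower().split()

    def build_index(self, knowledge_db):
        # One (keyword, data_id) index item per segment of each entry.
        return [(word, data_id)
                for data_id, text in knowledge_db.items()
                for word in self.analyze(text)]


class CallCenterServer:
    def __init__(self, speech_server, text_server, knowledge_db):
        self.speech_server = speech_server
        self.text_server = text_server
        self.knowledge_db = knowledge_db
        # The keyword index list is received from the text analysis server and kept locally.
        self.keyword_index = text_server.build_index(knowledge_db)

    def handle(self, speech_segments):
        # 1. Speech segments -> text segments (via the speech analysis server).
        text_info = " ".join(self.speech_server.to_text(s) for s in speech_segments)
        # 2. Text information -> first word segmentation set (via the text analysis server).
        segments = set(self.text_server.analyze(text_info))
        # 3. Match segments against the index, then fetch the data by identifier.
        matched_ids = []
        for keyword, data_id in self.keyword_index:
            if keyword in segments and data_id not in matched_ids:
                matched_ids.append(data_id)
        return [self.knowledge_db[data_id] for data_id in matched_ids]


knowledge_db = {"kb-1": "reset voicemail password", "kb-2": "roaming data plans"}
speech = SpeechAnalysisServer({b"audio-1": "I forgot my voicemail password"})
server = CallCenterServer(speech, TextAnalysisServer(), knowledge_db)
answers = server.handle([b"audio-1"])
```

In the first application scenario the same `handle` logic would run with the conversion and analysis modules integrated into the call center server; only the location of the stubs changes, not the order of the steps.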

In the prior art, retrieving data from the knowledge database relies heavily on the personal business experience and skills of customer service personnel, so the accuracy of retrieval results and the retrieval efficiency are low. With the data retrieval system provided by this embodiment of the present invention, after receiving the voice information sent by the user terminal, the call center server can obtain an analysis result (the first word segmentation set) by having the voice information converted into text and the text analyzed, and then search the keyword index list of the retrieval data stored in the knowledge database for retrieval keywords that match the text word segments in the first word segmentation set. Customer service personnel therefore no longer need to identify retrieval keywords from the user's voice information based on their personal business experience and skills. Compared with identification by customer service personnel, automatic identification by the call center server improves the efficiency of identifying retrieval keywords, avoids retrieval errors caused by negligence or carelessness, and improves the accuracy of the identified retrieval keywords. Once the retrieval keywords are obtained efficiently and accurately, the retrieval results can be obtained quickly and accurately.

From the description of the above implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, only the above division into functional modules is used as an example. In practical applications, the above functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. For the specific working processes of the system, apparatus, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments; details are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing is merely a specific implementation of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (18)

1. A data retrieval method, characterized by comprising:
a call center server receiving voice information sent by a user terminal, and obtaining text information produced by converting the voice information into text;
the call center server obtaining a first word segmentation set identified by performing text analysis on the text information, the first word segmentation set containing at least one text word segment;
the call center server searching a keyword index list of retrieval data stored in a knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, the keyword index list containing at least one keyword index item, the keyword index item containing a retrieval keyword and an identifier of the retrieval data corresponding to the retrieval keyword, and the knowledge database storing at least one piece of retrieval data and the identifier of the retrieval data;
the call center server retrieving, from the knowledge database, the retrieval data indicated by the identifier of the retrieval data corresponding to the found retrieval keyword.

2. The method according to claim 1, characterized in that, before the call center server searches the keyword index items in the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, the method further comprises:
the call center server reading first retrieval data stored in the knowledge database, the first retrieval data being any piece of retrieval data stored in the knowledge database;
the call center server obtaining a second word segmentation set identified by performing text analysis on the first retrieval data, the second word segmentation set containing at least one text word segment;
the call center server generating a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set;
the call center server generating one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data;
the call center server generating the keyword index list containing at least one keyword index item.

3. The method according to claim 1, characterized in that, before the call center server searches the keyword index items in the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, the method further comprises:
the call center server receiving the keyword index list sent by a text analysis server.

4. The method according to any one of claims 1 to 3, characterized in that the call center server obtaining the first word segmentation set identified by performing text analysis on the text information comprises:
the call center server performing text analysis on the text information and identifying a third word segmentation set, the third word segmentation set containing at least one text word segment;
the call center server filtering out meaningless text word segments contained in the third word segmentation set to obtain the first word segmentation set, the meaningless text word segments including interjections and modal particles.

5. The method according to any one of claims 1 to 3, characterized in that the call center server obtaining the first word segmentation set identified by performing text analysis on the text information comprises:
the call center server performing text analysis on the text information and identifying a third word segmentation set, the third word segmentation set containing at least one text word segment;
the call center server filtering out first text word segments contained in the third word segmentation set to obtain the first word segmentation set, a first text word segment being a text word segment in the third word segmentation set that matches none of the retrieval keywords in the keyword index list.

6. The method according to claim 1 or 3, characterized in that the voice information consists of at least one voice segment at sentence granularity, and the text information consists of at least one text segment;
the call center server receiving the voice information sent by the user terminal and obtaining the text information produced by converting the voice information into text comprises:
the call center server receiving the voice segment sent by the user terminal;
the call center server, after receiving one voice segment, sending the voice segment to a speech analysis server, so that the speech analysis server converts the voice segment into the text segment;
the call center server receiving the text segment sent by the speech analysis server.

7. The method according to claim 1 or 3, characterized in that the call center server obtaining the first word segmentation set identified by performing text analysis on the text information comprises:
the call center server sending the text information to the text analysis server, so that the text analysis server performs text analysis on the text information and identifies the first word segmentation set;
the call center server receiving the first word segmentation set sent by the text analysis server.

8. A data retrieval method, characterized by comprising:
a text analysis server receiving text information sent by a call center server, the text information having been obtained by the call center server through conversion of voice information;
the text analysis server performing text analysis on the text information and identifying the first word segmentation set, the first word segmentation set containing at least one text word segment;
the text analysis server sending the first word segmentation set to the call center server, so that the call center server retrieves data from a knowledge database according to the text word segments in the first word segmentation set.

9. The method according to claim 8, characterized in that the knowledge database stores at least one piece of retrieval data and an identifier of the retrieval data;
the method further comprises:
the text analysis server reading first retrieval data stored in the knowledge database, the first retrieval data being any piece of retrieval data stored in the knowledge database;
the text analysis server performing text analysis on the first retrieval data and identifying a second word segmentation set, the second word segmentation set containing at least one text word segment;
the text analysis server generating a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set;
the text analysis server generating one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data;
the text server generating the keyword index list containing at least one keyword index item;
the text analysis server sending the keyword index list to the call center server.

10. A call center server, characterized by comprising:
a receiving unit, configured to receive voice information sent by a user terminal;
a text conversion unit, configured to obtain text information produced by converting into text the voice information received by the receiving unit;
a text analysis unit, configured to obtain a first word segmentation set identified by performing text analysis on the text information obtained by the text conversion unit, the first word segmentation set containing at least one text word segment;
a keyword search unit, configured to search a keyword index list of retrieval data stored in a knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set obtained by the text analysis unit, the keyword index list containing at least one keyword index item, the keyword index item containing a retrieval keyword and an identifier of the retrieval data corresponding to the retrieval keyword, and the knowledge database storing at least one piece of retrieval data and the identifier of the retrieval data;
a retrieval unit, configured to retrieve, from the knowledge database, the retrieval data indicated by the identifier of the retrieval data corresponding to the retrieval keyword found by the keyword search unit.

11. The call center server according to claim 10, characterized by further comprising:
a reading unit, configured to read first retrieval data stored in the knowledge database before the keyword search unit searches the keyword index items in the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set, the first retrieval data being any piece of retrieval data stored in the knowledge database;
the text analysis unit being further configured to obtain a second word segmentation set identified by performing text analysis on the first retrieval data read by the reading unit, the second word segmentation set containing at least one text word segment;
a generating unit, configured to generate a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set obtained by the text analysis unit; to generate one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and to generate the keyword index list containing at least one keyword index item.

12. The call center server according to claim 10, characterized in that the receiving unit is further configured to receive the keyword index list sent by a text analysis server before the keyword search unit searches the keyword index items in the keyword index list of the retrieval data stored in the knowledge database for a retrieval keyword that matches a text word segment in the first word segmentation set.

13. The call center server according to any one of claims 10 to 12, characterized in that the text analysis unit comprises:
a text analysis module, configured to perform text analysis on the text information and identify a third word segmentation set, the third word segmentation set containing at least one text word segment;
a first filtering module, configured to filter out meaningless text word segments contained in the third word segmentation set identified by the text analysis module to obtain the first word segmentation set, the meaningless text word segments including interjections and modal particles.

14. The call center server according to any one of claims 10 to 12, characterized in that the text analysis unit comprises:
a text analysis module, configured to perform text analysis on the text information and identify a third word segmentation set, the third word segmentation set containing at least one text word segment;
a second filtering module, configured to filter out first text word segments contained in the third word segmentation set identified by the text analysis module to obtain the first word segmentation set, a first text word segment being a text word segment in the third word segmentation set that matches none of the retrieval keywords in the keyword index list.

15. The call center server according to claim 10 or 12, characterized in that the voice information consists of at least one voice segment at sentence granularity, and the text information consists of at least one text segment;
the receiving unit is specifically configured to receive the voice segment sent by the user terminal;
the text conversion unit is specifically configured to send the voice segment to a speech analysis server after the receiving unit receives one voice segment, so that the speech analysis server converts the voice segment into the text segment, and to receive the text segment sent by the speech analysis server.

16. The call center server according to claim 10 or 12, characterized in that the text analysis unit is specifically configured to send the text information to the text analysis server, so that the text analysis server performs text analysis on the text information and identifies the first word segmentation set, and to receive the first word segmentation set sent by the text analysis server.

17. A text analysis server, characterized by comprising:
a receiving unit, configured to receive text information sent by a call center server, the text information having been obtained by the call center server through conversion of voice information;
a text analysis unit, configured to perform text analysis on the text information received by the receiving unit and identify the first word segmentation set, the first word segmentation set containing at least one text word segment;
a sending unit, configured to send the first word segmentation set obtained by the text analysis unit to the call center server, so that the call center server retrieves data from a knowledge database according to the text word segments in the first word segmentation set.

18. The text analysis server according to claim 17, characterized in that the knowledge database stores at least one piece of retrieval data and an identifier of the retrieval data;
the text analysis server further comprises:
a reading unit, configured to read first retrieval data stored in the knowledge database, the first retrieval data being any piece of retrieval data stored in the knowledge database;
the text analysis unit being further configured to perform text analysis on the first retrieval data read by the reading unit and identify a second word segmentation set, the second word segmentation set containing at least one text word segment;
a generating unit, configured to generate a retrieval keyword of the first retrieval data according to the text word segments in the second word segmentation set obtained by the text analysis unit; to generate one keyword index item according to the retrieval keyword of the first retrieval data and the identifier of the first retrieval data; and to generate the keyword index list containing at least one keyword index item;
the sending unit being further configured to send the keyword index list generated by the generating unit to the call center server.

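
The index-building and lookup steps recited in the claims can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `KnowledgeDatabase`, `segment`, and `FILLER_WORDS` are hypothetical, the whitespace tokenizer merely stands in for a real word segmenter, and the index is assumed to map each retrieval keyword to a set of data identifiers. Speech-to-text conversion is omitted, so the input is already text information. The sketch also applies the filters of claims 4 and 5 together, although the claims present them as alternatives.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Hypothetical stand-ins for the "interjections and modal particles" of claim 4.
FILLER_WORDS = {"um", "uh", "oh", "well", "please"}


def segment(text: str) -> List[str]:
    """Toy text analysis: split on whitespace, strip punctuation, lowercase.

    A real system would use a proper word segmenter (essential for Chinese).
    """
    tokens = (t.strip(".,!?;:").lower() for t in text.split())
    return [t for t in tokens if t]


@dataclass
class KnowledgeDatabase:
    # identifier -> retrieval data
    records: Dict[str, str]
    # retrieval keyword -> identifiers of the retrieval data it indexes
    keyword_index: Dict[str, Set[str]] = field(default_factory=dict)

    def build_index(self) -> None:
        """Claim 2: segment each stored record and create keyword index items."""
        self.keyword_index.clear()
        for record_id, content in self.records.items():
            for word in segment(content):            # second word segmentation set
                if word not in FILLER_WORDS:
                    self.keyword_index.setdefault(word, set()).add(record_id)

    def retrieve(self, text_information: str) -> List[str]:
        """Claims 1, 4 and 5: filter the query segments, then look them up."""
        third_set = segment(text_information)
        # Drop filler words (claim 4) and segments matching no keyword (claim 5).
        first_set = [w for w in third_set
                     if w not in FILLER_WORDS and w in self.keyword_index]
        matched_ids: Set[str] = set()
        for word in first_set:
            matched_ids |= self.keyword_index[word]
        return [self.records[i] for i in sorted(matched_ids)]


db = KnowledgeDatabase({
    "kb-1": "Reset your broadband router password",
    "kb-2": "Check your mobile data balance",
})
db.build_index()
print(db.retrieve("Um, how do I reset my router?"))
# prints ['Reset your broadband router password']
```

In this sketch a record is returned when any one query segment matches one of its keywords; the claims do not specify how multiple matches are combined or ranked, so that choice is an assumption.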
CN201410799739.3A 2014-12-19 2014-12-19 Data retrieval method and device Pending CN105760399A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410799739.3A CN105760399A (en) 2014-12-19 2014-12-19 Data retrieval method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410799739.3A CN105760399A (en) 2014-12-19 2014-12-19 Data retrieval method and device

Publications (1)

Publication Number Publication Date
CN105760399A true CN105760399A (en) 2016-07-13

Family

ID=56339936

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410799739.3A Pending CN105760399A (en) 2014-12-19 2014-12-19 Data retrieval method and device

Country Status (1)

Country Link
CN (1) CN105760399A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510222A (en) * 2009-02-20 2009-08-19 北京大学 Multilayer index voice document searching method and system thereof
CN101996195A (en) * 2009-08-28 2011-03-30 中国移动通信集团公司 Searching method and device of voice information in audio files and equipment
CN102591932A (en) * 2011-12-23 2012-07-18 优视科技有限公司 Voice search method and system, mobile terminal, transfer server
CN102750365A (en) * 2012-06-14 2012-10-24 华为软件技术有限公司 Retrieval method and system of instant voice messages, user device and server
US20140172868A1 (en) * 2012-12-17 2014-06-19 General Electric Company System and method for storage, querying, and analysis service for time series data
CN103885949A (en) * 2012-12-19 2014-06-25 中国科学院声学研究所 Song searching system and method based on lyrics
CN103440253A (en) * 2013-07-25 2013-12-11 清华大学 Speech retrieval method and system

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108228657A (en) * 2016-12-22 2018-06-29 沈阳美行科技有限公司 The implementation method and device of a kind of key search
CN108228657B (en) * 2016-12-22 2022-05-27 沈阳美行科技股份有限公司 Method and device for realizing keyword retrieval
WO2018144465A1 (en) * 2017-01-31 2018-08-09 Interactive Intelligence Group, Inc. System and method for speech-based interaction resolution
CN108172304A (en) * 2017-12-18 2018-06-15 广州七乐康药业连锁有限公司 A kind of medical information visible processing method and system based on user's medical treatment feedback
CN108172304B (en) * 2017-12-18 2021-04-02 广州七乐康药业连锁有限公司 Medical information visualization processing method and system based on user medical feedback
CN108563788A (en) * 2018-04-27 2018-09-21 腾讯科技(深圳)有限公司 Blockchain-based data query method, device, server and storage medium
CN108831452A (en) * 2018-04-27 2018-11-16 合肥尧威智能科技有限公司 A kind of intelligent sound identification finishing service management system
CN110580276A (en) * 2018-06-08 2019-12-17 百度在线网络技术(北京)有限公司 method and apparatus for processing information
CN110580276B (en) * 2018-06-08 2022-06-28 百度在线网络技术(北京)有限公司 Method and apparatus for processing information
CN110020002A (en) * 2018-08-21 2019-07-16 平安普惠企业管理有限公司 Querying method, device, equipment and the computer storage medium of event handling scheme
CN110020002B (en) * 2018-08-21 2024-01-12 山西掌柜鼎科技有限公司 Query method, device, equipment and computer storage medium of event processing scheme
CN109376224A (en) * 2018-10-24 2019-02-22 深圳市壹鸽科技有限公司 Corpus filter method and device
CN111435376A (en) * 2019-01-15 2020-07-21 北京京东尚科信息技术有限公司 Information processing method and system, computer system, and computer-readable storage medium
CN110347784A (en) * 2019-05-23 2019-10-18 深圳壹账通智能科技有限公司 Report form inquiring method, device, storage medium and electronic equipment
CN110265032A (en) * 2019-06-05 2019-09-20 平安科技(深圳)有限公司 Conferencing data analysis and processing method, device, computer equipment and storage medium
CN110750702A (en) * 2019-09-11 2020-02-04 中国科学院上海微系统与信息技术研究所 Micro-service retrieval method and device, electronic equipment and storage medium
CN110750702B (en) * 2019-09-11 2023-03-31 中国科学院上海微系统与信息技术研究所 Micro-service retrieval method and device, electronic equipment and storage medium
CN110781676A (en) * 2019-09-30 2020-02-11 腾讯科技(深圳)有限公司 Text processing method, device and system, electronic equipment and storage medium
CN112650878A (en) * 2019-10-11 2021-04-13 北京声智科技有限公司 Retrieval method, system, device and medium
CN111306709B (en) * 2020-01-20 2021-09-07 海信视像科技股份有限公司 Air conditioner
CN111306709A (en) * 2020-01-20 2020-06-19 海信视像科技股份有限公司 Air conditioner
CN112735475B (en) * 2020-12-25 2023-02-21 北京博瑞彤芸科技股份有限公司 Method and system for searching disease knowledge through voice
CN112527973A (en) * 2020-12-25 2021-03-19 北京博瑞彤芸科技股份有限公司 Method and system for searching disease information
CN112735475A (en) * 2020-12-25 2021-04-30 北京博瑞彤芸科技股份有限公司 Method and system for searching disease knowledge through voice
CN112735413A (en) * 2020-12-25 2021-04-30 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN112735413B (en) * 2020-12-25 2024-05-31 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium
CN114697286A (en) * 2022-04-18 2022-07-01 上海迎盾科技有限公司 Method and device for processing instant messaging data and computer readable storage medium
CN114697286B (en) * 2022-04-18 2024-04-26 上海迎盾科技有限公司 Instant messaging data processing method and device and computer readable storage medium
CN116126896A (en) * 2022-09-30 2023-05-16 中电金信软件有限公司 Data retrieval method and device
CN116126896B (en) * 2022-09-30 2024-12-06 中电金信软件有限公司 Data retrieval method and device
CN119601018A (en) * 2025-02-07 2025-03-11 浙江阿甘数智信息科技有限公司 A multi-dimensional customer identification method and system based on online voice

Similar Documents

Publication Publication Date Title
CN105760399A (en) Data retrieval method and device
US8892572B2 (en) Video search system and method of use
KR101880521B1 (en) Matching users in a location-based service
US11275728B2 (en) Processing method and device of the user input information
CN110941708B (en) Intelligent question and answer library establishing method, intelligent question and answer method and device and computer equipment
CN108682421B (en) Voice recognition method, terminal equipment and computer readable storage medium
US10387460B2 (en) Method and apparatus for processing text information
US20160344581A9 (en) Text message including a contextual attribute of a mobile device
WO2015043392A1 (en) Method, apparatus and system for forwarding messages
CN105898085A (en) Harassment communication account identification method and device
CN105227736A (en) A kind of update method, Apparatus and system of reserved phone number
CN103002103A (en) Method and device for mass sending of short messages
EP2498523A1 (en) Short message forwarding method and mobile terminal
US9264870B2 (en) Mobile terminal, server and calling method based on cloud contact list
EP2466854A1 (en) Mobile search method and system thereof
CN109255023B (en) Clue information processing method and device
CN114222284A (en) Terminal equipment identification method and device and readable storage medium
CN1917510B (en) Method for accessing address list, and system of address list
CN104702758A (en) Terminal and method thereof for managing multimedia notepad
CN105681523A (en) Method and apparatus for sending birthday blessing short message automatically
CN110445934A (en) Call-information processing method, system, terminal and readable storage medium storing program for executing
CN205407889U (en) High in clouds voice service provides system
CN104243666B (en) language processing method and device
CN116414876A (en) Log processing method and communication equipment
CN107332967A (en) Contact person display method and device, mobile terminal and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160713
