
CN114969242B - Method and device for automatically completing query content - Google Patents

Method and device for automatically completing query content

Info

Publication number
CN114969242B
Authority
CN
China
Prior art keywords
word
entity
query
content
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210675071.6A
Other languages
Chinese (zh)
Other versions
CN114969242A (en)
Inventor
田有朋
李俊
黄亚东
王小卫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210675071.6A
Publication of CN114969242A
Application granted
Publication of CN114969242B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of this specification provide a method and a device for automatically completing query content. In the method, the natural language query content for target data currently input by a user is acquired, and the natural language query content is segmented to obtain a plurality of query words. Each of the query words is taken as the current query word to query a plurality of dictionary trees corresponding to different entity categories, so as to obtain, for each query word, candidate words corresponding to a plurality of entity categories, wherein the dictionary trees are pre-constructed according to data queries for the target data. Target candidate words are selected from the candidate words at least based on the entity category corresponding to each candidate word of each query word, and the target candidate words are determined as the completion content of the natural language query content.

Description

Method and device for automatically completing query content
This application is a divisional application of the invention patent application No. 202210058334.9, filed on January 19, 2022 and entitled "Method and device for automatically completing query content".
Technical Field
One or more embodiments of the present disclosure relate to the field of data analysis, and in particular, to a method and apparatus for automatic completion of query content.
Background
A natural language query (NLQ) refers to querying and analyzing data using natural language. The data here may be stored in a database, an Excel table, or a search engine.
When a user queries data in natural language, the user is usually prompted intelligently with content the user may want to input next, so as to improve input efficiency; that is, the user's natural language query content is completed once the user has input part of it.
Traditional completion methods usually complete at sentence granularity, i.e., the prompted content is usually a whole sentence. However, once a user has entered part of the content, the user usually expects to be prompted with words related to the natural language query content rather than an irrelevant sentence. Accordingly, it is desirable to provide a completion scheme that completes the user's natural language query content more accurately.
Disclosure of Invention
One or more embodiments of the present disclosure describe a method and an apparatus for automatically completing query content, which perform completion at word granularity, so that the accuracy of the completion content can be improved and user experience can be further improved.
In a first aspect, a method for automatically completing query content is provided, including:
acquiring natural language query content for target data, which is currently input by a user;
segmenting the natural language query content to obtain a plurality of query words;
querying a plurality of dictionary trees corresponding to different entity categories by taking each of the plurality of query words as the current query word, to obtain candidate words corresponding to a plurality of entity categories for each query word;
selecting target candidate words from the candidate words at least based on the entity category corresponding to each candidate word of each query word;
and determining the completion content of the natural language query content according to the target candidate words.
In a second aspect, an apparatus for query content automatic completion is provided, including:
an acquisition unit, configured to acquire natural language query content for target data, which is currently input by a user;
a segmentation unit, configured to segment the natural language query content to obtain a plurality of query words;
a query unit, configured to query a plurality of dictionary trees corresponding to different entity categories by taking each of the plurality of query words as the current query word, to obtain candidate words corresponding to a plurality of entity categories for each query word;
a selecting unit, configured to select target candidate words from the candidate words at least based on the entity category corresponding to each candidate word of each query word;
and a determining unit, configured to determine the completion content of the natural language query content according to the target candidate words.
In a third aspect, there is provided a computer storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method of the first aspect.
In a fourth aspect, there is provided a computing device comprising a memory having executable code stored therein and a processor which, when executing the executable code, implements the method of the first aspect.
According to the method and the device for automatically completing query content provided in one or more embodiments of this specification, for each query word obtained from the natural language query content, candidate words of that query word, each with a corresponding entity category, are obtained by querying dictionary trees. The candidate words are then screened based on their entity categories to obtain the candidate words that serve as the completion content. That is, the present scheme obtains the candidate words used as completion content based on entity categories. Because entity categories have conventional combination patterns, selecting candidate words based on entity categories can solve the problem that the completion content is unrelated to the natural language query content. In addition, because candidate words are used as the completion content, the natural language query content is completed at word granularity, so that the accuracy of the completion content can be improved and user experience can be further improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present description, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an implementation scenario disclosed in one embodiment of the present disclosure;
FIG. 2 illustrates a method flow diagram for query content auto-completion, according to one embodiment;
FIG. 3a illustrates a prefix tree schematic according to one embodiment;
FIG. 3b illustrates a suffix tree schematic diagram according to an embodiment;
FIG. 4a shows a state machine schematic according to one embodiment;
FIG. 4b shows a state machine schematic according to another embodiment;
FIG. 5 illustrates an apparatus for automatic completion of query content, according to one embodiment.
Detailed Description
The following describes the scheme provided in the present specification with reference to the drawings.
The field of data analysis typically involves data queries, i.e., reading data from databases, Excel tables, or search engines.
Conventionally, data queries are performed in a particular query language, for example reading data from a database with SQL statements, which raises the threshold for using the data. For this reason, the following two improvements have been proposed:
First, methods based on natural language processing (NLP). However, such methods cannot guarantee that the data read is completely accurate, i.e., they are only probabilistically accurate.
Second, seq2SQL-based methods, which read data by translating natural language directly into SQL statements. However, such methods achieve only about 80% accuracy under single-table, single-layer aggregation and cannot support the various complex data analysis requirements of real scenarios inside enterprises. That is, they have low accuracy and narrow coverage.
Because both schemes have certain defects, a further improved scheme queries data directly with natural language. When a user queries data in natural language, the user's natural language query content needs to be completed in order to improve input efficiency.
Currently, the completion methods used in search engines usually complete at sentence granularity, i.e., the prompted content is usually a whole sentence. However, once a user has entered part of the content, the user usually expects to be prompted with words related to the natural language query content rather than an irrelevant sentence. Therefore, the inventors of this application propose to complete at word granularity, i.e., to complete the user's natural language query content at a finer granularity, thereby improving the accuracy of the completion content and further improving user experience.
Fig. 1 is a schematic diagram of an implementation scenario disclosed in one embodiment of the present specification. In fig. 1, the natural language query content for target data currently input by a user is first acquired. The natural language query content is then segmented to obtain a plurality of query words W_1, W_2, ..., W_N, where N is the number of query words. Each query word is taken in turn as the current query word to query a plurality of dictionary trees corresponding to different entity categories, yielding candidate words W_11, W_12, W_21, W_22, W_23, ..., W_N1 and W_N2 for the query words, where the entity categories corresponding to these candidate words may be C_2, C_1, C_1, C_2, C_1, ..., C_2 and C_2, respectively. Finally, the target candidate words, here W_11, W_22, W_N1 and W_N2, are selected from the candidate words based on the entity category corresponding to each candidate word of each query word, and the completion content of the natural language query content is determined according to the target candidate words.
In one example, target candidate words may be selected from candidate words based on a state machine of a regular expression, described in detail below.
The scheme is described in detail below through embodiments of the present specification.
FIG. 2 illustrates a flow diagram of a method of query content auto-completion, according to one embodiment. The method may be performed by any apparatus, device, platform, or cluster of devices having computing and processing capabilities. As shown in fig. 2, the method may include at least the following steps.
Step 202, acquiring natural language query content aiming at target data, which is currently input by a user.
The entity categories may be divided into two categories, one of which is a public category, and may include at least one of time, operators, units, functions, intents, and the like. The other is a private category that may include at least one of dimensions, dimension values, metrics, and the like. In one example, the private category described above may be determined based on a key value of a key-value pair.
In one example, the entity words corresponding to time may be, for example, "XX year", "XX month", "XX day", "last N days", "last days", "last N years", "last year" and "today". The entity words corresponding to operators may be, for example, "greater than", "less than", "equal to" and "above". The entity words corresponding to units may be, for example, "years", "several" and "several people". The entity words corresponding to functions may be, for example, "maximum", "minimum" and "average". The entity words corresponding to dimensions may be, for example, "city", "sales amount" and "class". A dimension value is a value of a dimension; for example, the dimension values corresponding to the dimension "city" may be "Beijing", "Shanghai" and so on.
Specifically, the natural language query content currently input by the user can be acquired based on the position of the cursor. For example, the entire content before the cursor position in the input box is used as the natural language query content.
Step 204, segmenting the natural language query content to obtain a plurality of query words.
In one example, before the natural language query content is segmented, entity recognition may be performed on it to obtain the basic entity categories of the natural language query content.
For example, assuming that the natural language query content currently input by the user is "yesterday city payment", the basic entity categories time and dimension can be obtained through entity recognition. The word corresponding to time is "yesterday", and the word corresponding to dimension is "city".
After entity recognition, "yesterday city payment" can be segmented to obtain query words such as "yesterday", "city payment" and "payment".
Step 206, using the query words as current query words, querying dictionary trees corresponding to different entity categories to obtain candidate words corresponding to multiple entity categories.
The plurality of dictionary trees may be pre-constructed from data queries for the target data. The data queries here are also referred to as historical queries, and the corresponding historical natural language query content may include entity words corresponding to the public category and/or entity words corresponding to the private category.
Take as an example a first dictionary tree, among the plurality of dictionary trees, corresponding to a first entity category (any one of the public categories or any one of the private categories). The first dictionary tree may include a plurality of branches, each of which represents one entity word corresponding to the first entity category in the historical natural language query content. The inter-node paths in each branch respectively correspond to at least part of the characters of the represented entity word, and the value of the leaf node is the represented entity word. The value of a branch node is the string formed by combining the characters corresponding to the inter-node paths from the root node to that branch node.
The query process for the first dictionary tree may specifically include matching the current query word against each branch of the first dictionary tree in turn; if the characters covered by any branch (a first branch) contain the current query word, the value of the leaf node of that first branch is taken as a candidate word of the first entity category for the current query word.
Taking a certain branch of the first dictionary tree as an example, the matching may specifically proceed character by character: each character of the current query word is matched against the characters corresponding to the inter-node paths of the branch. If the characters corresponding to the inter-node paths of the branch contain every character of the current query word, the matching succeeds; otherwise, the matching fails.
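The query described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name DictionaryTree, the "$leaf" marker key, and the English entity words are all hypothetical, and "contains the current query word" is interpreted here as a character-by-character match starting from the root node.

```python
class DictionaryTree:
    """Character-level dictionary tree for one entity category (hypothetical sketch)."""

    def __init__(self, category):
        self.category = category
        self.root = {}  # nested dicts: one character per inter-node path

    def insert(self, entity_word):
        node = self.root
        for ch in entity_word:
            node = node.setdefault(ch, {})
        node["$leaf"] = entity_word  # the leaf value is the represented entity word

    def query(self, query_word):
        """Return the entity words of every branch that starts with query_word."""
        node = self.root
        for ch in query_word:  # character-by-character matching
            if ch not in node:
                return []  # matching failed
            node = node[ch]
        return sorted(self._leaves(node))

    def _leaves(self, node):
        for key, child in node.items():
            if key == "$leaf":
                yield child
            else:
                yield from self._leaves(child)


# One tree per entity category; the words are illustrative stand-ins.
dimension_tree = DictionaryTree("dimension")
for word in ["payment amount", "payment count", "transaction count"]:
    dimension_tree.insert(word)
time_tree = DictionaryTree("time")
for word in ["yesterday", "last year"]:
    time_tree.insert(word)
trees = [dimension_tree, time_tree]


def candidates(query_word):
    """Query every tree and tag each candidate word with the tree's entity category."""
    return [(w, t.category) for t in trees for w in t.query(query_word)]
```

Each candidate word carries the entity category of the tree it came from, which is what the later selection step relies on.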
In one example, the first dictionary tree may include a prefix tree and a suffix tree. The prefix tree may be constructed based on at least some of the characters, counted from the beginning, of each entity word corresponding to the first entity category in the historical natural language query content. The suffix tree may be constructed based on at least some of the characters, counted from the end, of each entity word corresponding to the first entity category in the historical natural language query content.
Fig. 3a shows a prefix tree schematic according to an embodiment. In fig. 3a, the prefix tree may include a plurality of branches. The entity word represented by the leftmost branch is "payment amount" (four characters in the original Chinese, 支付金额), and the corresponding entity category is dimension. The inter-node paths in this branch correspond in order to the characters 支, 付, 金 and 额 (i.e., the inter-node paths correspond to all characters of the represented entity word), and the value of the leaf node is "payment amount" (支付金额). The values of the 3 branch nodes are 支, 支付 and 支付金, respectively. Similarly, the entity word represented by the second branch from the left is "transaction count" (交易笔数), and the corresponding entity category is dimension. The inter-node paths in this branch correspond in order to the characters 交, 易, 笔 and 数, and the value of the leaf node is "transaction count" (交易笔数). The values of the 3 branch nodes are 交, 交易 and 交易笔, respectively.
It can be seen that the entity words represented by the branches in fig. 3a all have the same entity category, namely the entity category of the prefix tree.
Fig. 3b shows a suffix tree schematic according to an embodiment. In fig. 3b, the suffix tree may include a plurality of branches. The entity word represented by the leftmost branch is "payment amount" (支付金额), and the corresponding entity category is dimension. The inter-node paths in this branch correspond in order to the characters 付, 金 and 额 (i.e., the inter-node paths correspond to part of the characters of the represented entity word), and the value of the leaf node is "payment amount" (支付金额). The values of the two branch nodes are 付 and 付金, respectively. Similarly, the entity word represented by the second branch from the left is also "payment amount", and the corresponding entity category is dimension. The inter-node paths in this branch correspond to the characters 金 and 额, and the value of the leaf node is "payment amount" (支付金额). The value of the single branch node is 金.
It can be seen that the entity words represented by the branches in fig. 3b all have the same entity category, namely the entity category of the suffix tree. The prefix tree and the suffix tree also have the same entity category. Similarly, the plurality of dictionary trees described above may also include prefix trees and suffix trees corresponding to other entity categories.
When the first dictionary tree includes a prefix tree and a suffix tree, the query process for the first dictionary tree may specifically include querying the prefix tree with a current query word as a prefix word to obtain a first entity word of a first entity class of the current query word, and querying the suffix tree with the current query word as a suffix word to obtain a second entity word of the first entity class of the current query word. The first entity word and the second entity word constitute respective candidate words of the first entity class of the current query word.
The above query process for the prefix tree and the suffix tree is similar, and the detailed query process may refer to the above description of the query process for the first dictionary tree, only by replacing the first dictionary tree with the prefix tree or the suffix tree.
Taking the prefix tree shown in fig. 3a as an example, if the current query word is "payment" (支付), the candidate word obtained by the query may be "payment amount", and the corresponding entity category is dimension. Taking the suffix tree shown in fig. 3b as an example, if the current query word is "amount" (金额), the candidate word obtained by the query may be "payment amount", and the corresponding entity category is dimension.
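The combined prefix-tree and suffix-tree lookup can be sketched as below. This is only an illustration under assumptions: the helper names build_tree and lookup and the "$words" marker are hypothetical, English entity words stand in for the original ones, and the suffix tree is assumed to hold every proper suffix of each entity word, with the full entity word stored as the leaf value (the figures show only some of these branches).

```python
def build_tree(pairs):
    """Build a character tree from (key, entity_word) pairs; leaves store entity words."""
    root = {}
    for key, word in pairs:
        node = root
        for ch in key:
            node = node.setdefault(ch, {})
        node.setdefault("$words", set()).add(word)
    return root


def lookup(root, query):
    """Walk the tree along query, then collect every entity word below that point."""
    node = root
    for ch in query:
        if ch not in node:
            return set()  # matching failed
        node = node[ch]
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for key, value in current.items():
            if key == "$words":
                found |= value
            else:
                stack.append(value)
    return found


entity_words = ["payment amount", "transaction count"]  # illustrative dimension words

# Prefix tree: each branch spells the whole entity word from its first character.
prefix_tree = build_tree((w, w) for w in entity_words)
# Suffix tree (assumed construction): one branch per proper suffix of each word.
suffix_tree = build_tree((w[i:], w) for w in entity_words for i in range(1, len(w)))


def candidates(query):
    """Union of the first entity words (prefix hits) and second entity words (suffix hits)."""
    return lookup(prefix_tree, query) | lookup(suffix_tree, query)
```

With these trees, "payment" is found through the prefix tree and "amount" through the suffix tree, and both lead to the same candidate word.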
In the foregoing example where the natural language query content is "yesterday city payment", the obtained candidate words may be, for example, "payment amount", "payment count", "payment number", and "payment date", etc.
Furthermore, in practical applications, some words may correspond to multiple entity categories at the same time; e.g., the word "Beijing" may correspond to both the entity category dimension and the entity category dimension value. Such words are referred to as confusable words.
When the candidate words of the query words include a confusable word, the plurality of entity categories corresponding to the confusable word can be displayed to the user, and the final entity category of the confusable word is then determined according to the user's selection instruction.
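A rough sketch of this handling is given below. The function names find_confusable and resolve are hypothetical, and the user's selection instruction is modeled simply as a mapping from word to chosen entity category.

```python
from collections import defaultdict


def find_confusable(candidates):
    """Return {word: sorted categories} for words that carry more than one entity category."""
    categories = defaultdict(set)
    for word, category in candidates:
        categories[word].add(category)
    return {w: sorted(c) for w, c in categories.items() if len(c) > 1}


def resolve(candidates, user_choice):
    """Keep only the user-selected category for each confusable word (sketch)."""
    confusable = find_confusable(candidates)
    return [
        (word, category)
        for word, category in candidates
        if word not in confusable or user_choice.get(word) == category
    ]


cands = [
    ("Beijing", "dimension"),
    ("Beijing", "dimension value"),
    ("payment amount", "dimension"),
]
```

Here find_confusable(cands) gives the categories that would be shown to the user, and resolve applies the choice that comes back.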
Step 208, selecting each target candidate word from the candidate words at least based on the entity class corresponding to each candidate word of each query word.
Specifically, the candidate words may first be deduplicated. Then, for any first candidate word among the deduplicated candidate words, an entity category sequence is formed based on the basic entity categories and the target entity category of the first candidate word. The entity category sequence is checked using a regular expression, and if the check passes, the first candidate word is taken as a target candidate word.
Taking the foregoing natural language query content "yesterday cities" as an example, as described above, the basic entity categories obtained through entity recognition are time and dimension. Assuming that the first candidate word is "payment amount" and its corresponding entity category is dimension, the entity category sequence formed may be { time, dimension }.
In addition, a regular expression is a pattern that describes the features of a set of strings and is used to match particular strings. The pattern is described with special characters and ordinary characters, thereby achieving text matching.
The special characters may include, but are not limited to "\", "", "x" and "{ }", and the ordinary characters may be English letters, each representing one entity category.
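One way to picture this check is with Python's standard re module, as in the sketch below. The letter codes, the pattern t?d+m*, and the choice to append the candidate's category to the basic entity categories are all assumptions made for illustration; the specification does not disclose the actual expressions.

```python
import re

# Hypothetical one-letter codes for entity categories.
CATEGORY_CODE = {"time": "t", "dimension": "d", "metric": "m", "operator": "o"}

# Hypothetical rule: an optional time, then one or more dimensions, then any metrics.
PATTERN = re.compile(r"t?d+m*")


def passes_check(base_categories, candidate_category):
    """Encode the entity category sequence as letters and check it against the pattern."""
    sequence = list(base_categories) + [candidate_category]
    encoded = "".join(CATEGORY_CODE[c] for c in sequence)
    return PATTERN.fullmatch(encoded) is not None


def select_targets(base_categories, candidates):
    """Deduplicate (word, category) candidates, then keep those whose sequence passes."""
    seen, targets = set(), []
    for pair in candidates:
        if pair in seen:
            continue
        seen.add(pair)
        if passes_check(base_categories, pair[1]):
            targets.append(pair)
    return targets
```

For example, with basic entity categories time and dimension, a dimension candidate encodes to "tdd" and passes, while an operator candidate encodes to "tdo" and is rejected.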
In one example, checking the entity category sequence with the regular expression may include inputting the entity category sequence into a state machine corresponding to the regular expression and performing state transitions. A state transition includes comparing the current entity category in the entity category sequence with the labeled entity category of a transition edge of the current state; if they are consistent, the state machine transitions to the next state and the current entity category is updated, and if they are not consistent, the process ends. After the state transitions have ended, the check passes if the state machine is in a matching state; otherwise, the check fails.
FIG. 4a illustrates a state machine schematic according to one embodiment. In fig. 4a, the state machine may be obtained by converting the regular expression "a(bb)+a", where a and b represent two different entity categories. S_0 to S_4 in fig. 4a are the 5 states of the state machine, and S_4 is the matching state. Further, each one-way arrow leaving a state represents a transition edge of that state, and the character above or below the arrow represents the labeled entity category of the corresponding transition edge. For example, the labeled entity category of the transition edge of state S_0 is "a".
The above state transition procedure is described below with reference to fig. 4a.
Assume that the entity category sequence (hereinafter, the sequence) is abbbba. The 1st a in the sequence is first taken as the current entity category and state S_0 as the current state. Since the 1st a matches the labeled entity category of the transition edge of state S_0, the state machine transitions to the next state S_1; that is, state S_1 becomes the updated current state and the 1st b in the sequence becomes the updated current entity category. The 1st b is then compared with the labeled entity category of the transition edge of state S_1, namely "b", and so on until a transition end condition is met. The transition end conditions include, but are not limited to, a matching failure or completion of matching for every entity category in the sequence.
In this example, state S_4 is reached after every entity category in the sequence has been matched, so the sequence passes the check.
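The walk-through above can be reproduced with a small table-driven state machine. The transition table below is an assumption, since the figure itself is not reproduced here: it is one deterministic five-state machine that accepts exactly the strings matched by a(bb)+a, with state S3 given two outgoing edges so that the "bb" group can repeat.

```python
# Assumed transitions for the regular expression a(bb)+a; S4 is the matching state.
TRANSITIONS = {
    ("S0", "a"): "S1",
    ("S1", "b"): "S2",
    ("S2", "b"): "S3",
    ("S3", "b"): "S2",  # loop back to read another "bb" group
    ("S3", "a"): "S4",
}
MATCH_STATE = "S4"


def check(sequence, start="S0"):
    """Run the entity category sequence through the state machine."""
    state = start
    for category in sequence:
        next_state = TRANSITIONS.get((state, category))
        if next_state is None:
            return False  # no transition edge is labeled with this category
        state = next_state
    # The check passes only if every category was consumed and we ended in S4.
    return state == MATCH_STATE
```

With this table, check("abbbba") walks S0, S1, S2, S3, S2, S3, S4 and passes, whereas check("abbba") stops at S2 with no edge labeled "a" and fails.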
It should be understood that fig. 4a is only an exemplary illustration; in practical applications, a state may have multiple transition edges. For example, the state machine described in the embodiments of the present disclosure may also be as shown in fig. 4b.
In the example where the natural language query content is "yesterday city payment", the selected target candidate words may be, for example, "payment amount", "payment count" and "payment number".
It should be noted that the regular expressions described in this specification may be written based on the conventional combination patterns among entity categories. Therefore, the target candidate words screened with the regular expression are more strongly related to the user's natural language query content, which solves the problem that the completion content is unrelated to the natural language query content and also saves computing resources.
Step 210, determining the completion content of the natural language query content according to the target candidate words.
In one example, the target candidate words may first be ranked according to a ranking algorithm. The ranked target candidate words are then determined as the completion content of the natural language query content.
The sorting algorithm can comprise any one of a longest matching algorithm, a state priority algorithm, a dictionary base algorithm, a word combination heat algorithm, a custom priority algorithm and a word use frequency algorithm.
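As an illustration of one of the listed options, the sketch below ranks target candidate words by word use frequency. The function name rank_by_frequency and the usage counts are hypothetical; ties are broken alphabetically only to make the order deterministic.

```python
def rank_by_frequency(targets, usage_counts):
    """Order target candidate words by descending use frequency, then alphabetically."""
    return sorted(targets, key=lambda word: (-usage_counts.get(word, 0), word))


# Hypothetical usage counts gathered from historical queries.
usage_counts = {"payment amount": 12, "payment count": 7}
ranked = rank_by_frequency(
    ["payment date", "payment amount", "payment count"], usage_counts
)
```

Words with no recorded usage, such as "payment date" here, fall to the end of the list.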
In addition, it should be noted that the completion in this embodiment may change as the cursor moves. For example, when the cursor is detected to be located in the middle of the natural language query content, the part of the natural language query content up to that position is taken as the updated natural language query content, and the updated natural language query content is then completed. The completion method of this scheme is therefore more flexible.
The updated natural language query content may also be completed through steps 202-210, which is not repeated here.
For example, assuming that the natural language query content currently input by the user is "each city payment count", this natural language query content is completed first. Then, when the cursor moves to between "payment" and "count", "each city payment" is completed.
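The cursor-dependent truncation can be sketched in a few lines. The function name content_to_complete is hypothetical, the cursor is assumed to be a character offset into the input box, and the sample text is an illustrative English rendering of the query.

```python
def content_to_complete(text, cursor):
    """Return the part of the input that lies before the cursor position."""
    cursor = max(0, min(cursor, len(text)))  # clamp to a valid offset
    return text[:cursor]


text = "each city payment count"
at_end = content_to_complete(text, len(text))
in_middle = content_to_complete(text, len("each city payment"))
```

The truncated string would then be fed through the same steps 202-210 as any other natural language query content.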
In summary, the method for automatically completing query content provided in the embodiments of the present disclosure obtains the candidate words used as completion content based on entity categories. Because entity categories have conventional combination patterns, selecting candidate words based on entity categories can solve the problem that the completion content is unrelated to the natural language query content. In addition, because candidate words are used as the completion content, the natural language query content is completed at word granularity, so that the accuracy of the completion content can be improved and user experience can be further improved.
Corresponding to the above method for automatically completing query contents, an embodiment of the present disclosure further provides an apparatus for automatically completing query contents, as shown in fig. 5, where the apparatus may include:
The acquiring unit 502 is configured to acquire natural language query content for target data currently input by a user.
A segmentation unit 504, configured to segment the natural language query content to obtain a plurality of query words.
A query unit 506, configured to take each of the plurality of query words in turn as the current query word and, for the current query word, query a plurality of dictionary trees respectively corresponding to a plurality of entity categories, so as to obtain candidate words of the current query word corresponding to the plurality of entity categories, where the plurality of dictionary trees are pre-constructed according to historical natural language queries for the target data.
The entity category includes at least one of time, operator, unit, function, intention, dimension, dimension value, metric, and the like.
Optionally, the plurality of dictionary trees includes a first dictionary tree corresponding to a first entity category, the first dictionary tree including a prefix tree and a suffix tree. The prefix tree is constructed based on at least some of the characters, starting from the beginning, of each entity word of the first entity category, and the suffix tree is constructed based on at least some of the characters, ending at the end, of each entity word of the first entity category;
The query unit 506 is specifically configured to:
querying the prefix tree with the current query word as a prefix word to obtain a first entity word of the first entity category for the current query word, and querying the suffix tree with the current query word as a suffix word to obtain a second entity word of the first entity category for the current query word;
the first entity word and the second entity word constitute the candidate words of the first entity category for the current query word.
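The prefix-tree and suffix-tree lookup can be illustrated with a small character trie. This is a minimal sketch under stated assumptions: the `Trie` class, the helper names, and the sample entity words are hypothetical, and building the suffix tree from reversed entity words is one common way to realize it, not necessarily the construction used in the embodiment.

```python
class Trie:
    """Character trie; each node keeps the entity words whose key passes through it."""

    def __init__(self):
        self.children = {}  # character -> child Trie
        self.words = []     # entity words reachable through this node

    def insert(self, key, word):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, Trie())
            node.words.append(word)

    def lookup(self, key):
        node = self
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return []
        return list(node.words)


def build_trees(entity_words):
    """Build a prefix tree and a suffix tree for one entity category."""
    prefix_tree, suffix_tree = Trie(), Trie()
    for word in entity_words:
        prefix_tree.insert(word, word)
        # Assumption: the suffix tree indexes each word read from its end.
        suffix_tree.insert(word[::-1], word)
    return prefix_tree, suffix_tree


def candidates_for(query_word, prefix_tree, suffix_tree):
    """Combine first entity words (prefix hits) and second entity words (suffix hits)."""
    first = prefix_tree.lookup(query_word)
    second = suffix_tree.lookup(query_word[::-1])
    return list(dict.fromkeys(first + second))  # de-duplicate, keep order


# Hypothetical entity words for a single category.
METRICS = ["payment count", "payment amount", "refund amount"]
prefix_tree, suffix_tree = build_trees(METRICS)
```

With these sample words, the query word "payment" is matched through the prefix tree, while "amount" is matched only through the suffix tree.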
Optionally, the plurality of dictionary trees include a first dictionary tree corresponding to a first entity category, the first dictionary tree includes a plurality of branches, the paths between nodes in each branch respectively correspond to at least some of the characters of the represented entity word, and the value of the leaf node is the represented entity word;
The query unit 506 is specifically configured to:
sequentially matching the current query word against each branch in the first dictionary tree character by character, and, if the characters covered by any first branch contain the current query word, taking the value of the leaf node of that first branch as a candidate word of the first entity category for the current query word.
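The branch-by-branch matching can be sketched as a traversal that collects the characters covered by each branch and tests whether they contain the query word. The nested-dictionary layout, the `LEAF` marker key, and the sample entity words below are assumptions made for illustration only.

```python
LEAF = "$value"  # hypothetical marker key; holds the entity word a branch represents


def build_tree(entity_words):
    """Build a nested-dict dictionary tree: one edge per character."""
    root = {}
    for word in entity_words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[LEAF] = word
    return root


def match_branches(tree, query_word):
    """Return leaf values of every branch whose covered characters contain query_word."""
    candidates = []

    def walk(node, covered):
        for key, child in node.items():
            if key == LEAF:
                if query_word in covered:
                    candidates.append(child)
            else:
                walk(child, covered + key)

    walk(tree, "")
    return candidates


# Hypothetical entity words for a single category.
tree = build_tree(["payment count", "refund count", "payment amount"])
matches = match_branches(tree, "count")
```

Unlike a pure prefix lookup, this containment test also finds entity words in which the query word appears in the middle or at the end, at the cost of visiting every branch.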
A selecting unit 508, configured to select target candidate words from the candidate words based at least on the entity category corresponding to each candidate word of each query word.
A determining unit 510, configured to determine, according to the target candidate words, the completion content of the natural language query content.
Optionally, the apparatus further comprises:
an identifying unit 512, configured to perform entity recognition on the natural language query content to obtain a corresponding basic entity category.
The selection unit 508 includes:
A forming module 5082, configured to form, for any first candidate word among the candidate words, an entity category sequence based on the basic entity category and the target entity category of the first candidate word.
A verification module 5084, configured to verify the entity category sequence by using a regular expression and, if the verification passes, take the first candidate word as a target candidate word.
The verification module 5084 is specifically configured to:
inputting the entity category sequence into a state machine corresponding to the regular expression and performing state transitions, where a state transition includes comparing the current entity category in the entity category sequence with the labeled entity category of a transition edge of the current state, and, if the two are consistent, moving to the next state and updating the current entity category; otherwise, ending the state transitions;
after the state transitions end, if the state machine is in a matching state, the verification passes; otherwise, the verification fails.
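The state-machine check can be sketched with a hand-written transition table. Everything concrete here is an assumption for illustration: the pattern (an optional time, one or more dimensions, then a metric), the state names, and the function name are hypothetical. The sketch also treats an unmatched entity category as a failed verification, which is one reading of the step above.

```python
# Hypothetical state machine for the category pattern: time? dimension+ metric
TRANSITIONS = {
    "start": {"time": "after_time", "dimension": "dims"},
    "after_time": {"dimension": "dims"},
    "dims": {"dimension": "dims", "metric": "done"},
    "done": {},
}
MATCHING_STATES = {"done"}


def verify_sequence(categories, transitions=TRANSITIONS,
                    matching=MATCHING_STATES, start="start"):
    """Run the entity category sequence through the state machine."""
    state = start
    for category in categories:
        # Compare the current entity category with the labels on the
        # transition edges of the current state.
        next_state = transitions[state].get(category)
        if next_state is None:
            return False  # assumption: no matching edge means the check fails
        state = next_state
    # The check passes only if the machine stops in a matching state.
    return state in matching


# Basic entity categories of the typed query, followed by the candidate's category.
passes = verify_sequence(["time", "dimension", "metric"])
```

A candidate word whose category would produce a sequence such as metric followed by dimension is rejected under this hypothetical pattern, which is how category-based filtering removes unrelated completions.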
Optionally, the apparatus further comprises:
a ranking unit 514, configured to rank the target candidate words according to a ranking algorithm.
The determining unit 510 is specifically configured to:
determining the ranked target candidate words as the completion content of the natural language query content;
The ranking algorithm includes any one of a longest match algorithm, a state priority algorithm, a dictionary cardinality algorithm, a word combination heat algorithm, a custom priority algorithm, and a word usage frequency algorithm.
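As an illustration of one of the listed options, ranking by word usage frequency might look like the following. The specification does not define the algorithm, so the function name, the usage counts, and the tie-breaking rule are all assumptions.

```python
def rank_by_usage(target_candidates, usage_counts):
    """Order candidates by descending usage count (hypothetical counts).

    sorted() is stable, so candidates with equal counts keep their
    original relative order.
    """
    return sorted(target_candidates, key=lambda word: -usage_counts.get(word, 0))


# Hypothetical usage statistics gathered from historical queries.
usage = {"payment amount": 12, "payment count": 30}
ranked = rank_by_usage(["payment amount", "refund amount", "payment count"], usage)
```

The other listed algorithms would plug into the same place by swapping the sort key.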
Optionally, the apparatus further comprises a completing unit 516;
the acquiring unit 502 is further configured to, when the cursor is detected at a middle position of the natural language query content, take the content of the natural language query content up to the middle position as the updated natural language query content;
the completing unit 516 is configured to complete the updated natural language query content.
The functions of the functional modules of the apparatus in the foregoing embodiments of the present disclosure may be implemented by the steps of the foregoing method embodiments, so that the specific working process of the apparatus provided in one embodiment of the present disclosure is not repeated herein.
The apparatus for automatically completing query content provided by the embodiments of this specification can improve the accuracy of the completion content.
According to an embodiment of another aspect, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed in a computer, causes the computer to perform the method described in connection with fig. 2.
According to an embodiment of yet another aspect, there is also provided a computing device including a memory having executable code stored therein and a processor that, when executing the executable code, implements the method described in connection with fig. 2.
In this specification, the embodiments are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus embodiments are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied in hardware, or may be embodied in software instructions executed by a processor. The software instructions may be comprised of corresponding software modules that may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. In addition, the ASIC may reside in a server. The processor and the storage medium may also reside as discrete components in a server.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing detailed description further describes the objects, technical solutions, and advantages of the present specification. It should be understood that the foregoing is only a detailed description of embodiments of the present specification and is not intended to limit its scope; any modifications, equivalents, improvements, and the like made on the basis of the technical solutions of the present specification shall fall within the scope of the present specification.

Claims (18)

1. A method for automatically completing query content, comprising:
obtaining natural language query content currently input by a user for target data;
segmenting the natural language query content to obtain a plurality of query words;
taking each of the plurality of query words as a current query word, and querying, for the current query word, a plurality of dictionary trees respectively corresponding to a plurality of entity categories, to obtain candidate words of the current query word corresponding to the plurality of entity categories; the plurality of dictionary trees are pre-constructed according to historical natural language queries for the target data;
selecting target candidate words from the candidate words based at least on the entity categories corresponding to the candidate words of the query words;
determining completion content of the natural language query content according to the target candidate words, the completion content being used to prompt the user with content that the user may want to input next.

2. The method according to claim 1, wherein the plurality of dictionary trees include a first dictionary tree corresponding to a first entity category; the first dictionary tree includes a prefix tree and a suffix tree; the prefix tree is constructed based on at least some of the characters, starting from the beginning, of each entity word of the first entity category; the suffix tree is constructed based on at least some of the characters, ending at the end, of each entity word of the first entity category;
the querying of the plurality of dictionary trees respectively corresponding to the plurality of entity categories comprises:
querying the prefix tree with the current query word as a prefix word to obtain a first entity word of the first entity category for the current query word, and querying the suffix tree with the current query word as a suffix word to obtain a second entity word of the first entity category for the current query word;
the first entity word and the second entity word constitute the candidate words of the first entity category for the current query word.

3. The method according to claim 1, wherein the plurality of dictionary trees include a first dictionary tree corresponding to a first entity category, the first dictionary tree includes a plurality of branches, the paths between nodes in each branch respectively correspond to at least some of the characters of the represented entity word, and the value of the leaf node is the represented entity word;
the querying of the plurality of dictionary trees respectively corresponding to the plurality of entity categories comprises:
sequentially matching the current query word against each branch in the first dictionary tree character by character, and, if the characters covered by any first branch contain the current query word, taking the value of the leaf node of that first branch as a candidate word of the first entity category for the current query word.

4. The method according to claim 1, wherein before the segmenting of the natural language query content, the method further comprises:
performing entity recognition on the natural language query content to obtain a corresponding basic entity category;
the selecting of target candidate words from the candidate words comprises:
for any first candidate word among the candidate words, forming an entity category sequence based on the basic entity category and the target entity category of the first candidate word;
verifying the entity category sequence by using a regular expression, and, if the verification passes, taking the first candidate word as a target candidate word.

5. The method according to claim 4, wherein the verifying of the entity category sequence comprises:
inputting the entity category sequence into a state machine corresponding to the regular expression, and performing state migration; the state migration includes: comparing the current entity category in the entity category sequence with the labeled entity category corresponding to the migration edge of the current state, and, if they are consistent, migrating to the next state and updating the current entity category; otherwise, ending;
after the state migration is completed, if the state of the state machine is a matching state, the verification passes; otherwise, the verification fails.

6. The method according to claim 1, wherein before the target candidate words are determined as the completion content of the natural language query content, the method further comprises:
sorting the target candidate words according to a sorting algorithm;
determining the sorted target candidate words as the completion content of the natural language query content;
wherein the sorting algorithm includes any one of the following: a longest match algorithm, a state priority algorithm, a dictionary cardinality algorithm, a word combination heat algorithm, a custom priority algorithm, and a word usage frequency algorithm.

7. The method according to claim 1, further comprising:
when it is detected that the cursor is located at a middle position of the natural language query content, taking the content of the natural language query content up to the middle position as updated natural language query content;
completing the updated natural language query content.

8. The method of claim 1, wherein the entity category comprises at least one of: time, operator, unit, function, intent, dimension, dimension value, and measure.

9. A device for automatically completing query content, comprising:
an acquisition unit, configured to acquire natural language query content currently input by a user for target data;
a segmentation unit, configured to segment the natural language query content to obtain a plurality of query words;
a query unit, configured to take each of the plurality of query words as a current query word, and query, for the current query word, a plurality of dictionary trees respectively corresponding to a plurality of entity categories, so as to obtain candidate words of the current query word corresponding to the plurality of entity categories; the plurality of dictionary trees are pre-constructed according to historical natural language queries for the target data;
a selection unit, configured to select target candidate words from the candidate words based at least on the entity categories corresponding to the candidate words of the query words;
a determination unit, configured to determine completion content of the natural language query content according to the target candidate words, the completion content being used to prompt the user with content that the user may want to input next.

10. The apparatus according to claim 9, wherein the plurality of dictionary trees include a first dictionary tree corresponding to a first entity category; the first dictionary tree includes a prefix tree and a suffix tree; the prefix tree is constructed based on at least some of the characters, starting from the beginning, of each entity word of the first entity category; the suffix tree is constructed based on at least some of the characters, ending at the end, of each entity word of the first entity category;
the query unit is specifically configured for:
querying the prefix tree with the current query word as a prefix word to obtain a first entity word of the first entity category for the current query word, and querying the suffix tree with the current query word as a suffix word to obtain a second entity word of the first entity category for the current query word;
the first entity word and the second entity word constitute the candidate words of the first entity category for the current query word.

11. The device according to claim 9, wherein the plurality of dictionary trees include a first dictionary tree corresponding to a first entity category, the first dictionary tree includes a plurality of branches, the paths between nodes in each branch respectively correspond to at least some of the characters of the represented entity word, and the value of the leaf node is the represented entity word;
the query unit is specifically configured for:
sequentially matching the current query word against each branch in the first dictionary tree character by character, and, if the characters covered by any first branch contain the current query word, taking the value of the leaf node of that first branch as a candidate word of the first entity category for the current query word.

12. The apparatus according to claim 9, further comprising:
an identification unit, configured to perform entity recognition on the natural language query content to obtain a corresponding basic entity category;
the selection unit comprises:
a forming module, configured to form, for any first candidate word among the candidate words, an entity category sequence based on the basic entity category and the target entity category of the first candidate word;
a verification module, configured to verify the entity category sequence by using a regular expression, and, if the verification passes, take the first candidate word as a target candidate word.

13. The device according to claim 12, wherein the verification module is specifically configured for:
inputting the entity category sequence into a state machine corresponding to the regular expression, and performing state migration; the state migration includes: comparing the current entity category in the entity category sequence with the labeled entity category corresponding to the migration edge of the current state, and, if they are consistent, migrating to the next state and updating the current entity category; otherwise, ending;
after the state migration is completed, if the state of the state machine is a matching state, the verification passes; otherwise, the verification fails.

14. The apparatus according to claim 9, further comprising:
a sorting unit, configured to sort the target candidate words according to a sorting algorithm;
the determination unit is specifically configured for:
determining the sorted target candidate words as the completion content of the natural language query content;
wherein the sorting algorithm includes any one of the following: a longest match algorithm, a state priority algorithm, a dictionary cardinality algorithm, a word combination heat algorithm, a custom priority algorithm, and a word usage frequency algorithm.

15. The apparatus according to claim 9, further comprising: a completion unit;
the acquisition unit is further configured to, when it is detected that the cursor is located at a middle position of the natural language query content, take the content of the natural language query content up to the middle position as updated natural language query content;
the completion unit is configured to complete the updated natural language query content.

16. The apparatus of claim 9, wherein the entity category comprises at least one of: time, operator, unit, function, intent, dimension, dimension value, and measure.

17. A computer-readable storage medium having a computer program stored thereon, wherein when the computer program is executed in a computer, the computer is caused to execute the method according to any one of claims 1 to 8.

18. A computing device comprising a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method according to any one of claims 1 to 8 is implemented.
CN202210675071.6A 2022-01-19 2022-01-19 Method and device for automatically completing query content Active CN114969242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210675071.6A CN114969242B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210058334.9A CN114090722B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content
CN202210675071.6A CN114969242B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202210058334.9A Division CN114090722B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content

Publications (2)

Publication Number Publication Date
CN114969242A CN114969242A (en) 2022-08-30
CN114969242B true CN114969242B (en) 2025-04-08

Family

ID=80308602

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202210058334.9A Active CN114090722B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content
CN202210675071.6A Active CN114969242B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202210058334.9A Active CN114090722B (en) 2022-01-19 2022-01-19 Method and device for automatically completing query content

Country Status (1)

Country Link
CN (2) CN114090722B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115495624A (en) * 2022-09-30 2022-12-20 魔狸科技(上海)有限公司 A user guidance method for Chinese natural language query method
CN115757699B (en) * 2022-11-19 2023-07-25 深圳市宁远科技股份有限公司 Medical platform intelligent user entity searching system based on fuzzy matching
CN117312485A (en) * 2023-09-27 2023-12-29 东北大学 Regular expression matching method of log data oriented to database management system
CN119003571B (en) * 2024-07-29 2025-04-08 朴道征信有限公司 Query statement generation method, device, electronic equipment and computer readable medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111063447A (en) * 2019-12-17 2020-04-24 北京懿医云科技有限公司 Query and text processing method and device, electronic equipment and storage medium

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4701292B2 (en) * 2009-01-05 2011-06-15 インターナショナル・ビジネス・マシーンズ・コーポレーション Computer system, method and computer program for creating term dictionary from specific expressions or technical terms contained in text data
CN103198149B (en) * 2013-04-23 2017-02-08 中国科学院计算技术研究所 Method and system for query error correction
US9740736B2 (en) * 2013-09-19 2017-08-22 Maluuba Inc. Linking ontologies to expand supported language
US9208204B2 (en) * 2013-12-02 2015-12-08 Qbase, LLC Search suggestions using fuzzy-score matching and entity co-occurrence
CN105808688B (en) * 2016-03-02 2021-02-05 百度在线网络技术(北京)有限公司 Complementary retrieval method and device based on artificial intelligence
KR101908073B1 (en) * 2017-01-26 2018-12-18 주식회사 마이셀럽스 Sentence completion type search system and method that recommends words of high interest as search words
CN108563637A (en) * 2018-04-13 2018-09-21 北京理工大学 A kind of sentence entity complementing method of fusion triple knowledge base
CN110750704B (en) * 2019-10-23 2022-03-11 深圳计算科学研究院 Method and device for automatically completing query
KR102330494B1 (en) * 2019-12-23 2021-11-24 이혜연 Method of searching video clips to make contents for learning korean language
CN113946719A (en) * 2020-07-15 2022-01-18 华为技术有限公司 Word completion method and device
CN112287680B (en) * 2020-10-23 2024-04-09 微医云(杭州)控股有限公司 Entity extraction method, device and equipment of inquiry information and storage medium
CN112560477B (en) * 2020-12-09 2024-04-16 科大讯飞(北京)有限公司 Text completion method, electronic equipment and storage device
CN113779176B (en) * 2020-12-14 2025-07-18 北京沃东天骏信息技术有限公司 Query request completion method, device, electronic equipment and storage medium
CN112800769B (en) * 2021-02-20 2024-06-14 深圳追一科技有限公司 Named entity recognition method, named entity recognition device, named entity recognition computer equipment and named entity recognition storage medium
CN113821592B (en) * 2021-06-23 2024-06-28 腾讯科技(深圳)有限公司 Data processing method, device, equipment and storage medium
CN113495900B (en) * 2021-08-12 2024-03-15 国家电网有限公司大数据中心 Method and device for obtaining structured query language statements based on natural language

Also Published As

Publication number Publication date
CN114969242A (en) 2022-08-30
CN114090722B (en) 2022-04-22
CN114090722A (en) 2022-02-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant