CN116701565B - Methods, apparatus, electronic devices and storage media for determining search results - Google Patents
- Publication number
- CN116701565B (application CN202210176275.5A)
- Authority
- CN
- China
- Prior art keywords
- matching
- sample
- information
- dimension
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3334—Selection or weighting of terms from queries, including natural language queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9532—Query formulation
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to the technical field of information retrieval, and in particular to a retrieval result determining method, a retrieval result determining device, electronic equipment and a storage medium. The method comprises: determining first association characteristic information of a target retrieval word and a candidate object based on semantic information of the target retrieval word and semantic information of the candidate object; inputting the first association characteristic information into a matching data prediction network to obtain candidate matching data of the target retrieval word and the candidate object, wherein the matching data prediction network is trained based on a data matching sample containing sample tag data, and the sample tag data is obtained by carrying out information matching on a sample retrieval word and a sample object; and determining, based on the candidate matching data, a plurality of target retrieval objects corresponding to the target retrieval word from the candidate objects. The application can improve the accuracy of sample construction, thereby improving the accuracy of the retrieval result.
Description
Technical Field
The present application relates to the field of information retrieval technologies, and in particular, to a method and apparatus for determining a retrieval result, an electronic device, and a storage medium.
Background
Information retrieval is the process of retrieving, according to keywords, results that match the user's intention. It is applied in different scenarios, such as internet retrieval, paper retrieval, and e-commerce retrieval.
For example, in the early stage of an e-commerce business, user operation data is scarce and of low quality. Samples constructed from such user operation data are therefore inaccurate, a model built on those samples inherits the inaccuracy, and predictions made with that model yield inaccurate search results.
Disclosure of Invention
The application aims to solve the technical problem of providing a search result determining method, a search result determining device, electronic equipment and a storage medium, which can improve the accuracy of sample construction and further improve the accuracy of search results.
In order to solve the above technical problems, in one aspect, a method for determining a search result is provided, including:
determining first association characteristic information of a target search term and a candidate object based on semantic information of the target search term and semantic information of the candidate object;
inputting the first association characteristic information into a matching data prediction network to obtain candidate matching data of the target search term and the candidate object, wherein the matching data prediction network is trained based on a data matching sample containing sample tag data, and the sample tag data is obtained by carrying out information matching on a sample search term and a sample object; and
determining, based on the candidate matching data, a plurality of target retrieval objects corresponding to the target search term from the candidate objects.
In another aspect, there is provided a search result determination apparatus including:
a first associated feature determining module, configured to determine first associated feature information of the target search term and the candidate object based on semantic information of the target search term and semantic information of the candidate object;
a matching data prediction module, configured to input the first associated feature information into a matching data prediction network to obtain candidate matching data of the target search term and the candidate object, wherein the matching data prediction network is obtained by training on a data matching sample containing sample tag data; and
a search result determining module, configured to determine, based on the candidate matching data, a plurality of target search objects corresponding to the target search term from the candidate objects.
In another aspect, the present application provides an electronic device, where the device includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, where the at least one instruction or the at least one program is loaded and executed by the processor to implement a search result determining method as described above.
In another aspect, the present application provides a computer storage medium having at least one instruction or at least one program stored therein, the at least one instruction or the at least one program being loaded and executed by a processor to implement the search result determining method as described above.
The embodiment of the application has the following beneficial effects:
First association characteristic information of a target search term and a candidate object is determined based on semantic information of the target search term and semantic information of the candidate object. The first association characteristic information is input into a matching data prediction network to obtain candidate matching data of the target search term and the candidate object, where the matching data prediction network is trained based on data matching samples containing sample tag data, and the sample tag data is obtained through information matching of a sample search term and a sample object. A plurality of target search objects corresponding to the target search term are then determined from the candidate objects based on the candidate matching data. Because the sample tag is obtained by information matching between the sample search term and the sample object, rather than being determined from user operation data, the problem of inaccurate samples caused by low-quality user operation data is avoided and the accuracy of sample construction is improved. Training the matching data prediction network on the samples so constructed improves its prediction accuracy. In the information retrieval process, the matching data prediction network predicts the candidate matching data, and the target search objects are determined from the candidate objects based on that data, so the accuracy of the search result can be improved.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic illustration of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of a search result determining method according to an embodiment of the present application;
FIG. 3 is a flowchart of a method for generating sample tag data based on information matching according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for determining dimension matching information for each matching dimension according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for determining sub-match information for multiple sub-match dimensions according to an embodiment of the present application;
FIG. 6 is a flowchart of a method for obtaining sample tag data based on weight calculation according to an embodiment of the present application;
FIG. 7 is a flowchart of a method for training a matching data prediction network according to an embodiment of the present application;
FIG. 8 is a flowchart of a method for constructing associated feature information provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of a matching data prediction network implementation provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a search result determining apparatus according to an embodiment of the present application;
FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the accompanying drawings, for the purpose of making the objects, technical solutions and advantages of the present application more apparent. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed or inherent to such process, method, article, or apparatus, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for presentation, analyzed data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party.
Referring to fig. 1, a schematic diagram of an implementation environment provided by an embodiment of the present application is shown, where the implementation environment may include at least one first terminal 110 and a second terminal 120, and the first terminal 110 and the second terminal 120 may perform data communication through a network.
Specifically, the first terminal 110 sends an information retrieval request to the second terminal 120, the information retrieval request includes a target retrieval word, the second terminal 120 receives the information retrieval request, performs information matching based on the target retrieval word and the candidate object, determines a target retrieval object from the candidate object based on the information matching result, generates a retrieval result based on the target retrieval object, and returns the retrieval result to the first terminal 110.
The first terminal 110 may communicate with the second terminal 120 based on Browser/Server (B/S) or Client/Server (C/S) mode. The first terminal 110 may include a smart phone, a tablet computer, a notebook computer, a digital assistant, a smart wearable device, an in-vehicle terminal, a server, or other type of physical device, and may also include software running in the physical device, such as an application program. The operating system running on the first terminal 110 in the embodiment of the present application may include, but is not limited to, Android, iOS, Linux, Windows, and the like.
The second terminal 120 may establish a communication connection with the first terminal 110 through a wire or wirelessly, and the second terminal 120 may include a server that operates independently, or a distributed server, or a server cluster formed by a plurality of servers, where the servers may be cloud servers.
In the prior art, constructing samples based on user operation data may make the samples inaccurate, and a model built on such samples may in turn produce inaccurate search results. To solve this problem, an embodiment of the present application provides a search result determining method, whose execution subject may be the above-mentioned second terminal. Specifically, referring to fig. 2, the method may include:
S210, determining first association characteristic information of a target search term and a candidate object based on semantic information of the target search term and semantic information of the candidate object.
In this embodiment, the target search term may include one or more keywords. When the target search term includes a plurality of keywords, the keywords may be connected through preset search operators to obtain the corresponding target search term. The preset search operators may include, but are not limited to, AND, OR, and NOT, and search operators known in the prior art may be applied in this embodiment.
Specifically, the target search term may include multilingual keywords. When the target search term includes a plurality of keywords, the keywords may be in the same language or in different languages; for example, the brand keyword and the keyword "milk" may both be written in one language, or the brand keyword may be written in one language and "milk" in another.
The semantic information of the target search term can be obtained by performing semantic analysis on the target search term. The semantic analysis specifically comprises word segmentation, semantic recognition and the like, and can be realized through a preset semantic analysis model. The semantic analysis model segments the input target search term, adopting a word segmentation method suited to the grammar of each language, and then performs semantic recognition on the segmented words. For search terms containing different languages, the semantic recognition also comprises translating the multilingual segmented words into segmented words of a target language, which facilitates the subsequent information matching and improves the information matching efficiency.
The semantic information of the candidate object can be obtained by analysis through a semantic analysis model in advance and stored, so that the semantic information of the candidate object is analyzed once, repeated semantic analysis is avoided, the analysis efficiency of the semantic information of the candidate object can be improved, and the system computing resources are saved.
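The analyze-once behavior described above can be sketched as a small cache. This is a minimal illustration, not the patented implementation: `SemanticCache`, `analyze_semantics` and the object identifier `sku-1` are hypothetical names, and the tokenizer merely stands in for the semantic analysis model.

```python
from typing import Callable, Dict, List


class SemanticCache:
    """Stores semantic information per candidate object so each object is analyzed once."""

    def __init__(self, analyze: Callable[[str], List[str]]) -> None:
        self._analyze = analyze            # hypothetical semantic analysis model
        self._store: Dict[str, List[str]] = {}
        self.analysis_calls = 0            # counts real analyses, for illustration only

    def get(self, object_id: str, text: str) -> List[str]:
        # Run the analysis only the first time an object is seen, then reuse the result.
        if object_id not in self._store:
            self.analysis_calls += 1
            self._store[object_id] = self._analyze(text)
        return self._store[object_id]


def analyze_semantics(text: str) -> List[str]:
    # Placeholder: lowercase whitespace tokenization, not a real semantic model.
    return text.lower().split()


cache = SemanticCache(analyze_semantics)
first = cache.get("sku-1", "First Brand Milk 1L")
second = cache.get("sku-1", "First Brand Milk 1L")  # served from the cache
```

The second lookup returns the stored result without calling the analysis function again, which is the saving in computing resources that the paragraph refers to.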
The first associated feature information can represent the matching feature information of the target retrieval word and the candidate object, can be determined through semantic information of the target retrieval word and a matching result of the semantic information of the candidate object, and can be specifically a feature sequence or a feature vector and the like, wherein each dimension of the feature sequence or the feature vector can be used for representing the matching feature information of the target retrieval word and the candidate object.
S220, inputting the first association characteristic information into a matching data prediction network to obtain candidate matching data of the target search word and the candidate object, wherein the matching data prediction network is trained based on a data matching sample containing sample tag data, and the sample tag data is obtained by carrying out information matching on the sample search word and the sample object.
The matching data prediction network can predict the matching degree of the target search term and the candidate object based on the input association characteristic information of the two, so as to obtain candidate matching data of the target search term and the candidate object. The candidate matching data may specifically be expressed as a numerical value or a grade, and the matching degree of the target search term and the candidate object can be determined from the candidate matching data.
The matching data prediction network can be obtained by training a data matching sample containing sample tag data, the sample tag data in the embodiment of the application is obtained by carrying out information matching on sample retrieval words and sample objects, and for different sample retrieval word-sample object matching pairs, matching can be carried out respectively to obtain corresponding sample tag data, so that the obtained sample tag data can represent the information of the corresponding sample retrieval word-sample object matching pair, and the problem of inaccurate sample tag data caused by directly using user operation data to construct a sample is avoided.
In addition, the sample search term may be selected from user operation data or selected randomly, and the sample object may be selected from recall results or selected randomly; this embodiment does not specifically limit the selection.
S230, determining a plurality of target retrieval objects corresponding to the target retrieval words from the candidate objects based on the candidate matching data.
As described above, the candidate matching data can represent the matching degree of the target search term and the candidate object, so when determining the search result, candidate objects with a higher matching degree with the target search term can be selected from the candidate objects as the target search objects.
Specifically, when determining the target search objects, the candidate objects whose candidate matching data is greater than or equal to preset matching data may be determined as the target search objects; alternatively, the candidate objects may be sorted by candidate matching data in descending order and the top N candidate objects determined as the target search objects.
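The two selection strategies can be sketched as follows. This is a minimal illustration that assumes the candidate matching data is a single floating-point score per candidate; the function names, candidate identifiers and scores are hypothetical.

```python
from typing import Dict, List


def select_by_threshold(scores: Dict[str, float], min_score: float) -> List[str]:
    """Keep candidates whose matching score is >= min_score, best first."""
    kept = [c for c, s in scores.items() if s >= min_score]
    return sorted(kept, key=lambda c: scores[c], reverse=True)


def select_top_n(scores: Dict[str, float], n: int) -> List[str]:
    """Keep the n candidates with the largest matching scores."""
    return sorted(scores, key=lambda c: scores[c], reverse=True)[:n]


# Hypothetical candidate matching data for one target search term.
candidate_scores = {"milk_a": 0.92, "milk_b": 0.75, "juice_c": 0.31, "soap_d": 0.05}

above_threshold = select_by_threshold(candidate_scores, 0.75)
top_three = select_top_n(candidate_scores, 3)
```

The threshold variant returns a variable number of objects, which may be empty, whereas the top-N variant always returns up to N objects regardless of how well they match; which one is preferable depends on the retrieval scenario.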
In the embodiment of the application, the sample tag is obtained by carrying out information matching on the sample retrieval word and the sample object, rather than being determined from user operation data. This avoids the problem of inaccurate samples caused by low-quality user operation data and improves the accuracy of sample construction. Training the matching data prediction network on the samples so constructed improves its prediction accuracy. In the information retrieval process, the candidate matching data is predicted by the matching data prediction network, and the target retrieval objects are then determined from the candidate objects based on the candidate matching data, so the accuracy of the retrieval result can be improved.
Referring to fig. 3, a method for generating sample tag data based on information matching is shown, which may include:
S310, determining semantic information of a plurality of matching dimensions of the sample retrieval word and semantic information of a plurality of matching dimensions of the sample object.
S320, matching the sample retrieval word with the semantic information of the sample object in the plurality of matching dimensions respectively to obtain dimension matching information corresponding to the plurality of matching dimensions respectively.
S330, obtaining the sample tag data based on dimension matching information corresponding to the matching dimensions.
The semantic information of the sample retrieval word in the plurality of matching dimensions can be extracted from the semantic information of the sample retrieval word, and the semantic information of the plurality of matching dimensions of the sample object can be extracted from the semantic information of the sample object in advance and stored.
In the process of generating the sample tag data, the semantic information of the sample search term and of the sample object is matched separately in each of the plurality of matching dimensions. The semantic information in a matching dimension represents the characteristic information of that dimension, so matching the semantic information of the sample search term and the sample object in the same matching dimension yields their matching information in that dimension.
Taking e-commerce retrieval as an example, the object may specifically be a commodity, so the plurality of matching dimensions may comprise a category matching dimension, a brand matching dimension, a product word matching dimension, a commodity description matching dimension and the like. Category matching information of the sample retrieval word and the sample commodity can be determined by matching their semantic information in the category matching dimension; brand matching information can be determined by matching their semantic information in the brand matching dimension; product word matching information can be determined by matching their semantic information in the product word matching dimension; and commodity description matching information can be determined by matching their semantic information in the commodity description matching dimension.
Therefore, when information matching is carried out on the sample retrieval words and the sample objects, semantic information under a plurality of matching dimensions can be respectively matched, corresponding sample tag data are determined based on matching results of the plurality of matching dimensions, so that comprehensive matching of the sample retrieval words and the sample objects in the plurality of matching dimensions can be realized, the matching comprehensiveness of the sample retrieval words and the sample objects is further improved, and the accuracy of determining the sample tag data is improved.
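Steps S310 to S320 can be sketched for the e-commerce example as follows. This is a minimal illustration under stated assumptions: the semantic information of each dimension is reduced to a single string, matching is taken to be exact equality, and the dimension keys and sample values are hypothetical. The text does not prescribe a particular matching rule.

```python
from typing import Dict

# Matching dimensions named in the e-commerce example (keys are hypothetical).
MATCHING_DIMENSIONS = ("category", "brand", "product_word", "description")


def match_dimensions(query_info: Dict[str, str], object_info: Dict[str, str]) -> Dict[str, int]:
    """Return 1 for each dimension where both sides carry the same value, else 0."""
    result = {}
    for dim in MATCHING_DIMENSIONS:
        q = query_info.get(dim)
        g = object_info.get(dim)
        # A dimension missing from the query never counts as a match.
        result[dim] = int(q is not None and q == g)
    return result


sample_query = {"category": "cola", "brand": "first brand", "product_word": "cola"}
sample_object = {
    "category": "cola",
    "brand": "second brand",
    "product_word": "cola",
    "description": "sugar free",
}
dimension_matches = match_dimensions(sample_query, sample_object)
```

Here the category and product word dimensions match while the brand and description dimensions do not, so the resulting sample tag data reflects a partial match rather than a single yes-or-no signal.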
Further, each matching dimension includes a plurality of sub-matching dimensions, and accordingly, referring to FIG. 4, a method for determining dimension matching information for each matching dimension is shown, the method may include:
S410, under each matching dimension, matching the sample retrieval word with the semantic information of the sample object in the sub-matching dimensions respectively to obtain sub-matching information corresponding to the sub-matching dimensions respectively.
S420, summing sub-matching information corresponding to the sub-matching dimensions respectively to obtain dimension matching information of each matching dimension.
Specifically, each matching dimension can further include a plurality of sub-matching dimensions, so that when semantic information in each matching dimension is matched, the semantic information in the plurality of sub-matching dimensions in the matching dimension can be matched to obtain sub-matching information corresponding to the plurality of sub-matching dimensions respectively.
Taking the above category matching dimension as an example, the category matching dimension may include multi-level category matching dimensions, that is, each level of category corresponds to one sub-matching dimension, and the categories may include a first-level category, a second-level category, a third-level category, and a fourth-level category. Specifically, if the sample search term is "cola", it may be determined that the first-level category of the sample search term is beverage, the second-level category is carbonated beverage, and the third-level category is cola. Correspondingly, the multi-level category information of the sample search term and the multi-level category information of the sample object are matched respectively to obtain matching information in each sub-matching dimension, and the matching information in each sub-matching dimension is then summed to obtain the dimension matching information in the matching dimension.
Taking the brand matching dimension as an example, the brand matching dimension may include multilingual brand matching dimensions, where the brand matching dimension of each language corresponds to one sub-matching dimension. For example, a first brand may be expressed as "first brand" in Chinese and as "A" in English. Correspondingly, the semantic information of the sample search term in the multilingual brand matching dimensions is matched respectively with the semantic information of the sample object in the multilingual brand matching dimensions to obtain matching information in each sub-matching dimension, and the matching information in each sub-matching dimension is then summed to obtain the dimension matching information in the matching dimension.
Taking the above product word matching dimension as an example, the product word matching dimension may include multilingual product word matching dimensions, where the product word matching dimension of each language corresponds to one sub-matching dimension. For example, the product word "cola" is expressed as "cola" in Chinese and as "cowe" in English. Correspondingly, the semantic information of the sample search term in the multilingual product word matching dimensions is matched respectively with the semantic information of the sample object in the multilingual product word matching dimensions to obtain matching information in each sub-matching dimension, and the matching information in each sub-matching dimension is then summed to obtain the dimension matching information in the matching dimension.
Taking the commodity description matching dimension as an example, the commodity description generally contains attribute information such as the efficacy, characteristics and specifications of the commodity, so each attribute information dimension can correspond to one sub-matching dimension. Correspondingly, the semantic information of the sample retrieval word in the plurality of attribute information dimensions is matched respectively with the semantic information of the sample object in the plurality of attribute information dimensions to obtain matching information in each sub-matching dimension, and the matching information in each sub-matching dimension is then summed to obtain the dimension matching information in the matching dimension.
Specifically, the matching information corresponding to a matching dimension is calculated according to formula (1), wherein I(x) is an indicator function, Q_i corresponds to each sub-matching dimension of the sample search term, and G_j corresponds to each sub-matching dimension of the sample object.
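The expression of formula (1) does not survive in this text. One reading that is consistent with steps S410 and S420 and with the symbols defined above is given here as an assumption, not as the patent's exact expression. The symbol M_d for the dimension matching information of matching dimension d is introduced only for illustration, and the indicator is assumed to take the usual values 1 and 0:

```latex
M_d = \sum_{i} \sum_{j} I\left(Q_i = G_j\right),
\qquad
I(x) =
\begin{cases}
1, & \text{if } x \text{ holds} \\
0, & \text{otherwise}
\end{cases}
```

If each sub-matching dimension of the sample search term is compared only with the same sub-matching dimension of the sample object, the double sum reduces to a single sum over paired terms with j = i.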
Each matching dimension thus further comprises a plurality of sub-matching dimensions. The semantic information in the sub-matching dimensions of each matching dimension is matched respectively to obtain the sub-matching information corresponding to the sub-matching dimensions, and the sub-matching information is summed to obtain the dimension matching information of each matching dimension. On the one hand, this achieves comprehensive matching of the sample retrieval word and the sample object within each matching dimension; on the other hand, it avoids matching failures that would arise if multilingual sub-matching dimensions were not set, which avoids inaccurate matching information and further improves the accuracy of determining the sample tag data.
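The summation over sub-matching dimensions can be sketched with the multi-level category example. This is a minimal illustration that assumes sub-matching dimensions are compared pairwise (level against level) and that a match means exact equality; the function names and the category values of the sample object are hypothetical.

```python
from typing import Optional, Sequence


def indicator(condition: bool) -> int:
    """I(x): 1 when the condition holds, 0 otherwise."""
    return 1 if condition else 0


def dimension_match(
    query_subs: Sequence[Optional[str]],
    object_subs: Sequence[Optional[str]],
) -> int:
    """Sum sub-matching information over paired sub-matching dimensions (S410, S420)."""
    return sum(
        indicator(q is not None and q == g)
        for q, g in zip(query_subs, object_subs)
    )


# Category levels for the search term "cola", ordered from first to third level.
query_categories = ["beverage", "carbonated beverage", "cola"]
# Hypothetical sample object that shares the first two category levels only.
object_categories = ["beverage", "carbonated beverage", "lemon soda"]

category_score = dimension_match(query_categories, object_categories)
```

Summing rather than requiring every level to agree gives a graded signal: an object that shares two of three category levels scores higher than one that shares none.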
Referring to fig. 5, a method for determining sub-match information of multiple sub-match dimensions is shown, which may include:
S510, under each sub-matching dimension, judging whether the semantic information of the sample search term contains semantic information corresponding to the sub-matching dimension; if so, executing step S520, and if not, executing step S530.
S520, matching the semantic information of the sample retrieval word in the sub-matching dimension with the semantic information of the sample object in the sub-matching dimension to obtain sub-matching information of the sample retrieval word and the sample object in the sub-matching dimension.
S530, determining preset matching information as the sub-matching information.
Taking the brand matching dimension as an example, the brand matching dimension may include multilingual brand matching dimensions, where the brand matching dimension of each language corresponds to one sub-matching dimension. For example, suppose a first brand is expressed in Chinese as "first brand" and in English as "A". If the sample search term contains "first brand", the semantic information of the sample search term includes semantic information corresponding to the Chinese brand matching dimension, so "first brand" in the sample search term can be matched with the semantic information of the sample object in the Chinese brand matching dimension to obtain the sub-matching information of "first brand" and the sample object in that dimension. If the sample search term does not contain "A", the semantic information of the sample search term includes no semantic information corresponding to the English brand matching dimension, so the preset matching information is determined as the corresponding sub-matching information; the preset matching information may be "NULL" or "0".
Therefore, when matching semantic information in the sub-matching dimensions, it can first be judged whether the sample search word has corresponding semantic information under each sub-matching dimension. If so, the corresponding matching is performed to obtain the sub-matching information; if not, the preset matching information is determined as the sub-matching information, which improves the efficiency of information matching.
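Steps S510 to S530, followed by the summation over sub-matching dimensions, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the dictionary layout keyed by language code, the use of exact string equality as the matching rule, and the choice of 0 as the preset matching information are all assumptions made for the example.

```python
from typing import Dict, Iterable, Optional

# Assumption: the preset matching information is 0 (the text also allows "NULL").
PRESET_MATCH = 0


def sub_match(query_value: Optional[str], object_value: Optional[str]) -> int:
    """Return the sub-matching information for one sub-matching dimension."""
    if query_value is None:
        # S510 answered "no": the search term has nothing in this sub-dimension,
        # so S530 applies and the preset value is used.
        return PRESET_MATCH
    # S520: compare the two pieces of semantic information.
    # Exact equality is a hypothetical matching rule chosen for illustration.
    return 1 if query_value == object_value else 0


def dimension_match(query: Dict[str, str], obj: Dict[str, str],
                    sub_dims: Iterable[str]) -> int:
    """Sum the sub-matching information over all sub-matching dimensions."""
    return sum(sub_match(query.get(d), obj.get(d)) for d in sub_dims)


# Brand example from the text: the search term carries only the Chinese brand name.
query_brand = {"zh": "first brand"}
object_brand = {"zh": "first brand", "en": "A"}
brand_score = dimension_match(query_brand, object_brand, ["zh", "en"])
# 1 (Chinese sub-dimension matches) + 0 (preset for the missing English term) = 1
```

Skipping the comparison whenever the search term has no value in a sub-dimension is what saves work here: only sub-dimensions that the search term actually populates are compared.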
Additionally, referring to fig. 6, a method for obtaining sample tag data based on a weighted calculation is shown, which may include:
S610, determining matching weights respectively corresponding to the dimension matching information of the plurality of matching dimensions.
S620, carrying out weighted summation on the dimension matching information of the plurality of matching dimensions based on the matching weight to obtain the sample tag data.
The matching weight corresponding to each matching dimension can be set according to the needs of different scenarios. After the dimension matching information of each matching dimension is calculated, the dimension matching information is weighted and summed to obtain the corresponding sample tag data. Because the matching weights are set per scenario, the calculated sample tag data fits the application scenario, and because the weights can be set flexibly, the flexibility of determining the sample tag data is improved.
The specific calculation of the sample tag data is shown in formula (3):
where α, β, γ, and δ are the matching weights; for example, the weights may take the values α = 0.4, β = 0.25, γ = 0.2, and δ = 0.15.
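The weighted summation of steps S610 and S620 can be illustrated with a short sketch. The four weight values are the example values given in the text; the dimension names `dim_1` to `dim_4`, the function name, and the sample scores are hypothetical, since the text does not state which matching dimension each weight belongs to.

```python
from typing import Dict

# Example weights from the text (alpha, beta, gamma, delta).
# The keys are placeholder names; the mapping to real dimensions is an assumption.
MATCH_WEIGHTS: Dict[str, float] = {
    "dim_1": 0.4,
    "dim_2": 0.25,
    "dim_3": 0.2,
    "dim_4": 0.15,
}


def sample_tag(dim_scores: Dict[str, float],
               weights: Dict[str, float] = MATCH_WEIGHTS) -> float:
    """Weighted sum of the dimension matching information (S620)."""
    # A dimension missing from dim_scores contributes 0 to the sum.
    return sum(w * dim_scores.get(name, 0.0) for name, w in weights.items())


# Hypothetical dimension matching information for one <search term, object> pair.
label = sample_tag({"dim_1": 1, "dim_2": 1, "dim_3": 0, "dim_4": 2})
```

With these example inputs the label is 0.4 + 0.25 + 0 + 0.30, that is, approximately 0.95. Changing the entries of `MATCH_WEIGHTS` is how a different scenario would be configured in this sketch.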
Referring to fig. 7, a method for training the matching data prediction network is shown, which may include:
S710, determining second association feature information of the sample retrieval word and the sample object based on semantic information of the sample retrieval word and semantic information of the sample object.
S720, inputting the second associated characteristic information into a preset data prediction network to obtain prediction matching data of the sample retrieval word and the sample object.
S730, training the preset data prediction network based on the sample tag data and the predicted matching data to obtain the matching data prediction network.
The method for determining the second association characteristic information is similar to the method for determining the first association characteristic information, and will not be described in detail herein.
The parameters of the preset data prediction network are adjusted based on the sample tag data and the predicted matching data to obtain the matching data prediction network. The matching data prediction network in this embodiment may specifically be a regression model, whose predicted value is a continuous value. The regression model may be a conventional machine learning model, such as linear regression (LR) or XGBoost regression, or a deep learning regression model.
The sample tag data is obtained by performing information matching on the sample search terms and the sample objects, and in particular can be obtained by matching the semantic information of a plurality of matching dimensions of the sample search terms against the semantic information of a plurality of matching dimensions of the sample objects. This improves the accuracy of the sample tag data, and training the model on this sample tag data in turn improves the prediction accuracy of the matching data prediction network.
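As a rough illustration of steps S710 to S730 with linear regression as the regression model, the sketch below fits a model to feature vectors and sample tag data using plain gradient descent on the mean squared error. It is only one possible realization under stated assumptions: the function names, learning rate, epoch count, and toy training data are invented for the example, and a production system would more likely rely on an existing library.

```python
from typing import List, Sequence, Tuple


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Predicted matching data: a continuous value (S720)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def train_linear_regression(samples: List[Sequence[float]], labels: List[float],
                            lr: float = 0.1, epochs: int = 2000
                            ) -> Tuple[List[float], float]:
    """Adjust parameters from sample tag data and predictions (S730)."""
    n, n_features = len(samples), len(samples[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y  # prediction error vs. tag data
            grad_b += err
            for k in range(n_features):
                grad_w[k] += err * x[k]
        # Gradient step on the mean squared error.
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


# Toy second associated feature information and sample tag data (hypothetical).
features = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
tags = [0.65, 0.4, 0.25, 0.0]
model_w, model_b = train_linear_regression(features, tags)
```

Because the toy labels are an exact linear function of the features, the fitted weights approach 0.4 and 0.25 with a bias near 0; real sample tag data would not fit this cleanly.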
For a specific method for constructing the associated feature information, referring to fig. 8, the method may include:
S810, determining matching features of the target retrieval word and the candidate object under a plurality of feature dimensions based on semantic information of the target retrieval word and semantic information of the candidate object.
S820, carrying out feature construction based on the matching features in the feature dimensions to obtain the first associated feature information.
The matching feature information of the target search term and the candidate object for each feature information item is obtained by matching the semantic information of the target search term with the semantic information of the candidate object. For example, category matching feature information items, brand word matching feature information items, product word matching feature information items, search term and commodity title matching feature information items, and the like can be preset. Each matching feature information item can be represented by one or more feature dimensions, and the feature value of each feature dimension can be a decimal between 0 and 1, or 0 or 1. The feature values of the feature dimensions are spliced in sequence to obtain the first associated feature information, which is applied in the form of a feature vector or feature sequence.
For the category matching feature information item, it can be judged whether the target search term matches the multi-level category of the candidate object: a match is 1 and a non-match is 0. For the brand word matching feature information item, it can be judged whether the target search term matches the brand of the candidate object: a match, which includes both an exact match and an alias match, is 1, and a non-match is 0. For the product word matching feature information item, the number of product words in the target search term that match or are identical to those of the candidate object can be calculated as a proportion of the product words segmented from the target search term, and so on.
Matching the semantic information of the target search term with that of the candidate object yields matching features under a plurality of feature dimensions, which improves the convenience and efficiency of determining the matching features. Feature construction is then performed based on the matching features under the plurality of feature dimensions to obtain the associated feature information. Because the feature value of each feature dimension has a corresponding feature information item, the interpretability of the associated feature information is improved.
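The feature construction of steps S810 and S820 can be sketched as below for three of the feature information items named above. The dictionary keys, the function name, and the example values are assumptions introduced for illustration; the text does not prescribe a data layout.

```python
from typing import Any, Dict, List


def build_features(query: Dict[str, Any], candidate: Dict[str, Any]) -> List[float]:
    """Splice per-item feature values into one associated feature vector."""
    # Category item: 1 if the query category appears in the candidate's
    # multi-level category path, otherwise 0.
    category_match = 1.0 if query.get("category") in candidate["categories"] else 0.0

    # Brand item: an exact match or an alias match both count as 1.
    brand_names = {candidate["brand"], *candidate.get("brand_aliases", [])}
    brand_match = 1.0 if query.get("brand") in brand_names else 0.0

    # Product word item: share of the query's segmented product words
    # that also appear among the candidate's product words.
    q_words = query.get("product_words", [])
    matched = sum(1 for w in q_words if w in candidate["product_words"])
    product_ratio = matched / len(q_words) if q_words else 0.0

    # Splicing order is fixed, so each position keeps a known meaning.
    return [category_match, brand_match, product_ratio]


query = {"category": "skincare", "brand": "A", "product_words": ["cream", "serum"]}
candidate = {
    "categories": ["beauty", "skincare"],
    "brand": "first brand",
    "brand_aliases": ["A"],
    "product_words": {"cream"},
}
feature_vector = build_features(query, candidate)  # [1.0, 1.0, 0.5]
```

Keeping one position per feature information item is what makes the vector interpretable: a value can be traced back to the category, brand, or product word comparison that produced it.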
The associated feature information can also be determined through a deep network model. Specifically, the semantic information of the target search term and the semantic information of the candidate object are input into the deep network model to obtain feature word vectors of the target search term and the candidate object, and the feature word vectors can be used as the associated feature vectors.
Specifically, referring to fig. 9, when training the matching data prediction network, matching features are first extracted from <query, commodity> samples and input into the matching data prediction network to obtain the corresponding predicted matching data. When predicting matching data with the matching data prediction network, matching features of <target search term, commodity> are first extracted and input into the matching data prediction network to obtain the corresponding predicted matching data.
The search result determining method provided by this embodiment can be applied to fields such as e-commerce search or paper search, and is particularly suited to scenarios in the initial stage of a project where user operation data is scarce and data quality is low.
The present embodiment also provides a search result determining apparatus, referring to fig. 10, the apparatus may include:
A first associated feature determining module 1010, configured to determine first associated feature information of a target term and a candidate object based on semantic information of the target term and semantic information of the candidate object;
The matching data prediction module 1020 is configured to input the first associated feature information to a matching data prediction network to obtain candidate matching data of the target search term and the candidate object, where the matching data prediction network is obtained by training based on a data matching sample including sample tag data, and the sample tag data is obtained by performing information matching on the sample search term and the sample object;
and the search result determining module 1030 is configured to determine, from the candidate objects, a plurality of target search objects corresponding to the target search term based on the candidate matching data.
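One simple way the search result determining module could select target search objects from the candidate matching data is to rank candidates by predicted score and keep the highest ones. The text only says the selection is "based on the candidate matching data", so the top-k rule, the function name, and the example identifiers and scores below are assumptions.

```python
from typing import List, Sequence


def select_targets(candidates: Sequence[str], scores: Sequence[float],
                   top_k: int = 3) -> List[str]:
    """Keep the top_k candidates with the highest candidate matching data."""
    # sorted() is stable, so candidates with equal scores keep their input order.
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [candidate for candidate, _ in ranked[:top_k]]


# Hypothetical candidate objects and their predicted matching data.
targets = select_targets(["g1", "g2", "g3", "g4"], [0.2, 0.9, 0.5, 0.7], top_k=2)
# targets == ["g2", "g4"]
```

A score threshold could be used instead of, or together with, a fixed `top_k`; either choice fits the description equally well.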
Further, the apparatus further comprises:
the first determining module is used for determining semantic information of a plurality of matching dimensions of the sample retrieval word and semantic information of a plurality of matching dimensions of the sample object;
The first matching module is used for respectively matching the sample retrieval word with the semantic information of the sample object in the plurality of matching dimensions to obtain dimension matching information respectively corresponding to the plurality of matching dimensions;
and the sample tag data generation module is used for obtaining the sample tag data based on dimension matching information corresponding to the plurality of matching dimensions respectively.
Further, the apparatus further comprises:
The second associated feature information determining module is used for determining second associated feature information of the sample retrieval word and the sample object based on semantic information of the sample retrieval word and semantic information of the sample object;
The first prediction module is used for inputting the second associated characteristic information into a preset data prediction network to obtain prediction matching data of the sample retrieval word and the sample object;
And the training module is used for training the preset data prediction network based on the sample tag data and the predicted matching data to obtain the matching data prediction network.
Further, each matching dimension includes a plurality of sub-matching dimensions, and the first matching module includes:
The second matching module is used for respectively matching the semantic information of the sample retrieval word and the sample object in the plurality of sub-matching dimensions under each matching dimension to obtain sub-matching information respectively corresponding to the plurality of sub-matching dimensions;
And the summation module is used for summing the sub-matching information corresponding to the plurality of sub-matching dimensions respectively to obtain the dimension matching information of each matching dimension.
Further, the second matching module includes:
The second determining module is used for matching the semantic information of the sample retrieval word in the sub-matching dimension with the semantic information of the sample object in the sub-matching dimension under each sub-matching dimension if the semantic information of the sample retrieval word has the semantic information corresponding to the sub-matching dimension, so as to obtain the sub-matching information of the sample retrieval word and the sample object in the sub-matching dimension;
A third determining module, configured to determine preset matching information as the sub-matching information if the semantic information of the sample search term does not have the semantic information corresponding to the sub-matching dimension.
Further, the sample tag data generation module includes:
The matching weight determining module is used for determining matching weights respectively corresponding to the dimension matching information of the plurality of matching dimensions;
And the weighted summation module is used for carrying out weighted summation on the dimension matching information of the plurality of matching dimensions based on the matching weight to obtain the sample tag data.
Further, the first association characteristic determining module includes:
a fourth determining module, configured to determine, based on semantic information of the target term and semantic information of the candidate object, matching features of the target term and the candidate object under multiple feature dimensions;
And the feature construction module is used for carrying out feature construction based on the matching features in the feature dimensions to obtain the first associated feature information.
The device provided in the above embodiment can execute the method provided in any embodiment of the present application, and has the corresponding functional modules and beneficial effects of executing the method. Technical details not described in detail in the above embodiments may be found in the methods provided by any of the embodiments of the present application.
The present embodiment also provides a computer readable storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the method according to any of the above embodiments.
According to one aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the computer device to perform any of the methods described above.
Fig. 11 is a block diagram of an electronic device, which may be a server, for a search result determination method according to an exemplary embodiment, and an internal structure diagram thereof may be as shown in fig. 11. The electronic device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the electronic device is configured to provide computing and control capabilities. The memory of the electronic device includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the electronic device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a search result determination method.
The present specification provides method operational steps as described in the examples or flowcharts, but may include more or fewer operational steps based on conventional or non-inventive labor. The steps and sequences recited in the embodiments are merely one manner of performing the sequence of steps and are not meant to be exclusive of the sequence of steps performed. In actual system or terminal product execution, the methods illustrated in the embodiments or figures may be performed sequentially or in parallel (e.g., in the context of parallel processors or multi-threaded processing).
The structures shown in this embodiment are only partial structures related to the present application and do not constitute limitations of the apparatus to which the present application is applied, and a specific apparatus may include more or less components than those shown, or may combine some components, or may have different arrangements of components. It should be understood that the methods, apparatuses, etc. disclosed in the embodiments may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, and the division of the modules is merely a division of one logic function, and may be implemented in other manners, such as multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or unit modules.
Based on such understanding, the technical solution of the present application may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
While the application has been described in detail with reference to the foregoing embodiments, it will be understood by those skilled in the art that the foregoing embodiments may be modified or equivalents may be substituted for some of the features thereof, and that the modifications or substitutions do not depart from the spirit and scope of the embodiments of the application.
Claims (11)
1. A search result determining method, characterized by comprising:
determining first association characteristic information of a target search term and a candidate object based on semantic information of the target search term and semantic information of the candidate object;
inputting the first association characteristic information into a matching data prediction network to obtain candidate matching data of the target retrieval word and the candidate object, wherein the matching data prediction network is trained based on a data matching sample containing sample tag data, the sample tag data is obtained by carrying out information matching on the sample retrieval word and the sample object, and the method for determining the sample tag data comprises: determining semantic information of a plurality of matching dimensions of the sample retrieval word and semantic information of a plurality of matching dimensions of the sample object, wherein at least one matching dimension comprises multilingual sub-matching dimensions, and each matching dimension comprises a plurality of sub-matching dimensions;
under each sub-matching dimension, if the semantic information of the sample retrieval word has the semantic information corresponding to the sub-matching dimension, matching the semantic information of the sample retrieval word in the sub-matching dimension with the semantic information of the sample object in the sub-matching dimension to obtain the sub-matching information of the sample retrieval word and the sample object in the sub-matching dimension; respectively matching the multilingual sub-matching dimensions corresponding to the sample retrieval word in the at least one matching dimension with the multilingual sub-matching dimensions corresponding to the sample object in the at least one matching dimension to obtain matching information in each language sub-matching dimension;
if the semantic information of the sample search word does not have the semantic information corresponding to the sub-matching dimension, determining preset matching information as the sub-matching information;
summing the sub-matching information respectively corresponding to the plurality of sub-matching dimensions to obtain dimension matching information of each matching dimension;
obtaining the sample tag data based on dimension matching information respectively corresponding to the plurality of matching dimensions;
And determining a plurality of target retrieval objects corresponding to the target retrieval words from the candidate objects based on the candidate matching data.
2. The method according to claim 1, wherein the method further comprises:
determining second associated feature information of the sample retrieval word and the sample object based on the semantic information of the sample retrieval word and the semantic information of the sample object;
Inputting the second associated feature information into a preset data prediction network to obtain prediction matching data of the sample retrieval word and the sample object;
And training the preset data prediction network based on the sample tag data and the predicted matching data to obtain the matching data prediction network.
3. The method according to claim 1, wherein the obtaining the sample tag data based on dimension matching information corresponding to the plurality of matching dimensions respectively includes:
Determining matching weights respectively corresponding to the dimension matching information of the plurality of matching dimensions;
And carrying out weighted summation on the dimension matching information of the plurality of matching dimensions based on the matching weight to obtain the sample tag data.
4. The method of claim 1, wherein the determining the first associated feature information of the target term and the candidate object based on the semantic information of the target term and the semantic information of the candidate object comprises:
Determining matching features of the target retrieval word and the candidate object under a plurality of feature dimensions based on semantic information of the target retrieval word and semantic information of the candidate object;
And carrying out feature construction based on the matching features in the feature dimensions to obtain the first associated feature information.
5. A search result determining apparatus, comprising:
The first associated feature determining module is used for determining first associated feature information of the target search word and the candidate object based on semantic information of the target search word and semantic information of the candidate object;
The matching data prediction module is used for inputting the first associated feature information into a matching data prediction network to obtain candidate matching data of the target search word and the candidate object, wherein the matching data prediction network is trained based on data matching samples containing sample tag data, and the sample tag data is obtained through information matching of the sample search word and the sample object; the method for determining the sample tag data comprises: determining semantic information of a plurality of matching dimensions of the sample search word and semantic information of a plurality of matching dimensions of the sample object, wherein at least one matching dimension comprises multilingual sub-matching dimensions and each matching dimension comprises a plurality of sub-matching dimensions; under each sub-matching dimension, if the semantic information of the sample search word has the semantic information corresponding to the sub-matching dimension, matching the semantic information of the sample search word in the sub-matching dimension with the semantic information of the sample object in the sub-matching dimension to obtain sub-matching information of the sample search word and the sample object in the sub-matching dimension; respectively matching the multilingual sub-matching dimensions corresponding to the sample search word in the at least one matching dimension with the multilingual sub-matching dimensions corresponding to the sample object in the at least one matching dimension to obtain matching information in each language sub-matching dimension; if the semantic information of the sample search word does not have the semantic information corresponding to the sub-matching dimension, determining preset matching information as the sub-matching information; summing the sub-matching information corresponding to the plurality of sub-matching dimensions respectively to obtain dimension matching information of each matching dimension;
and the search result determining module is used for determining a plurality of target search objects corresponding to the target search words from the candidate objects based on the candidate matching data.
6. The apparatus of claim 5, wherein the apparatus further comprises:
The second associated feature information determining module is used for determining second associated feature information of the sample retrieval word and the sample object based on semantic information of the sample retrieval word and semantic information of the sample object;
The first prediction module is used for inputting the second associated characteristic information into a preset data prediction network to obtain prediction matching data of the sample retrieval word and the sample object;
And the training module is used for training the preset data prediction network based on the sample tag data and the predicted matching data to obtain the matching data prediction network.
7. The apparatus of claim 5, wherein the sample tag data generation module comprises:
The matching weight determining module is used for determining matching weights respectively corresponding to the dimension matching information of the plurality of matching dimensions;
And the weighted summation module is used for carrying out weighted summation on the dimension matching information of the plurality of matching dimensions based on the matching weight to obtain the sample tag data.
8. The apparatus of claim 5, wherein the first associated feature determination module comprises:
a fourth determining module, configured to determine, based on semantic information of the target term and semantic information of the candidate object, matching features of the target term and the candidate object under multiple feature dimensions;
And the feature construction module is used for carrying out feature construction based on the matching features in the feature dimensions to obtain the first associated feature information.
9. An electronic device comprising a processor and a memory, wherein the memory has stored therein at least one instruction or at least one program that is loaded and executed by the processor to implement the retrieval result determination method of any one of claims 1 to 4.
10. A computer storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the retrieval result determination method according to any one of claims 1 to 4.
11. A computer program product, characterized in that the computer program product comprises computer instructions stored in a computer-readable storage medium, which computer instructions are read from the computer-readable storage medium by a processor of a computer device, which computer instructions are executed by the processor, so that the computer device performs the retrieval result determination method according to any one of claims 1 to 4.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202210176275.5A CN116701565B (en) | 2022-02-25 | 2022-02-25 | Methods, apparatus, electronic devices and storage media for determining search results |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN116701565A CN116701565A (en) | 2023-09-05 |
| CN116701565B CN116701565B (en) | 2025-11-04 |
Family
ID=87837990
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210176275.5A Active CN116701565B (en) | 2022-02-25 | 2022-02-25 | Methods, apparatus, electronic devices and storage media for determining search results |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116701565B (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111460264A (en) * | 2020-03-30 | 2020-07-28 | 口口相传(北京)网络技术有限公司 | Training method and device of semantic similarity matching model |
| CN111506596A (en) * | 2020-04-21 | 2020-08-07 | 腾讯科技(深圳)有限公司 | Information retrieval method, information retrieval device, computer equipment and storage medium |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR101469526B1 (en) * | 2014-08-29 | 2014-12-05 | 한국지질자원연구원 | Web-based semantic information retrieval system using context awareness ontology |
| CN107169052B (en) * | 2017-04-26 | 2019-03-05 | 北京小度信息科技有限公司 | Recommended method and device |
| CN107491518B (en) * | 2017-08-15 | 2020-08-04 | 北京百度网讯科技有限公司 | Search recall method and device, server and storage medium |
| CN110377831B (en) * | 2019-07-25 | 2022-05-17 | 拉扎斯网络科技(上海)有限公司 | Retrieval method, apparatus, readable storage medium and electronic device |
| CN111813930B (en) * | 2020-06-15 | 2024-02-20 | 语联网(武汉)信息技术有限公司 | Similar document retrieval method and device |
| CN112579870B (en) * | 2020-12-22 | 2025-06-27 | 北京三快在线科技有限公司 | Retrieval matching model training method, device, equipment and storage medium |
| CN113761124B (en) * | 2021-05-25 | 2024-04-26 | 腾讯科技(深圳)有限公司 | Text encoding model training method, information retrieval method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN116701565A (en) | 2023-09-05 |
Similar Documents
| Publication | Title |
|---|---|
| CN110069709B (en) | Intention recognition method, device, computer readable medium and electronic equipment |
| CN112732870B (en) | Word vector based search method, device, equipment and storage medium |
| EP3937029A2 (en) | Method and apparatus for training search model, and method and apparatus for searching for target object |
| CN113569578B (en) | A user intention recognition method, device and computer equipment |
| JP2023516209A (en) | METHOD, APPARATUS, APPARATUS AND COMPUTER-READABLE STORAGE MEDIUM FOR SEARCHING CONTENT |
| CN112818091A (en) | Object query method, device, medium and equipment based on keyword extraction |
| CN113806631B (en) | Recommendation method, training method, device, equipment and news recommendation system |
| US20190155913A1 (en) | Document search using grammatical units |
| CN111666766A (en) | Data processing method, device and equipment |
| CN115203514A (en) | Commodity query redirection method and device, equipment, medium and product thereof |
| WO2019093172A1 (en) | Similarity index computation device, similarity search device, and similarity index computation program |
| CN114595313B (en) | Information retrieval result processing method, device, server and storage medium |
| CN111198949B (en) | A text label determination method and system |
| CN116701565B (en) | Methods, apparatus, electronic devices and storage media for determining search results |
| CN115238676A (en) | Method and device for identifying hot spots of bidding demands, storage medium and electronic equipment |
| CN120163632A (en) | Product recommendation strategy generation method, device, computer equipment and storage medium |
| CN117591662B (en) | Digital enterprise service data mining method and system based on artificial intelligence |
| CN119621944A (en) | Data retrieval method, device, electronic device and medium |
| CN113807920A (en) | Artificial intelligence based product recommendation method, device, equipment and storage medium |
| CN112988971A (en) | Word vector-based search method, terminal, server and storage medium |
| KR102215259B1 (en) | Method of analyzing relationships of words or documents by subject and device implementing the same |
| WO2024129366A1 (en) | Model pre-training for user interface navigation |
| CN117909560A (en) | Search method, training device, training equipment, training medium and training program product |
| CN117633358A (en) | Content recommendation method, content recommendation device, and storage medium |
| CN117216256A (en) | Intent recognition methods, devices, equipment and computer storage media |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |