
CN114663164A - E-commerce site promotion configuration method and its device, equipment, medium and product - Google Patents


Info

Publication number
CN114663164A
Authority
CN
China
Prior art keywords
tail
word
search
long
product
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210383174.5A
Other languages
Chinese (zh)
Other versions
CN114663164B (en)
Inventor
方兵
叶朝鹏
王�锋
郭东波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huanju Shidai Information Technology Co Ltd
Original Assignee
Guangzhou Huanju Shidai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huanju Shidai Information Technology Co Ltd
Priority to CN202210383174.5A
Publication of CN114663164A
Application granted
Publication of CN114663164B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Electronic shopping [e-shopping] by investigating goods or services
    • G06Q30/0625 Electronic shopping [e-shopping] by investigating goods or services by formulating product or service queries, e.g. using keywords or predefined options

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an e-commerce site promotion configuration method and a corresponding device, equipment, medium and product. The method comprises the following steps: constructing a search text from the product words and attribute words of the product title in a product display page; acquiring statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indicators; determining, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word; and configuring the target long-tail word into the page title of the product display page. The application can thus determine a long-tail keyword from the product title of a product display page and automatically configure it as that page's title, achieving search engine keyword optimization. By exploiting the long-tail effect, the long-tail keywords spread across a large number of individual product display pages help those pages rank higher in search results, which improves the ability of the e-commerce platform's whole independent site to attract search engine traffic.

Description

E-commerce site promotion configuration method and its device, equipment, medium and product

Technical Field

The present application relates to the field of e-commerce information technology, and in particular to an e-commerce site promotion configuration method and its corresponding apparatus, computer device, computer-readable storage medium, and computer program product.

Background

With the rapid development of cross-border e-commerce and of e-commerce websites operated as independent sites, it is becoming increasingly difficult for products to be found by search engines and ranked near the top of the results. Search engine optimization (SEO) of independent e-commerce sites is therefore becoming increasingly important.

A web page title is a highly condensed summary of the content a web page provides and is one of the main bases for search engine retrieval. A well-designed page title can greatly improve the page's ranking in search results. For an e-commerce website, this means designing a sound, multi-dimensional SEO page title based on search volume, competition, product content and other factors, which can greatly increase the probability that a product is retrieved and thereby raise its exposure and conversion rates.

At present, SEO page titles for product pages are produced mainly by the seller's staff, who set them using various query tools and their own experience. This is inefficient, and the results are hard to guarantee.

Summary of the Invention

The primary purpose of the present application is to solve at least one of the above problems by providing an e-commerce site promotion configuration method and its corresponding apparatus, computer device, computer-readable storage medium, and computer program product.

To achieve these purposes, the present application adopts the following technical solutions:

An e-commerce site promotion configuration method provided to meet one of the purposes of the present application includes the following steps:

constructing a search text from the product words and attribute words of the product title in a product display page;

acquiring statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indicators;

determining, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word;

configuring the target long-tail word into the page title of the product display page.

In some further embodiments, constructing the search text from the product words and attribute words of the product title in the product display page includes the following steps:

obtaining the product title that has been entered in the product display page;

performing word segmentation and part-of-speech recognition on the product title to obtain a token set composed of multiple tokens, including both tokens that are product words and tokens that are attribute words;

performing keyword extraction on the token set with reference to the product title, and determining a search weight for each token, the search weight representing the potential search value of the corresponding token;

selecting the single product word with the highest search weight and a predetermined number of attribute words, and concatenating them to form the search text.
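
The selection and concatenation step can be illustrated with a minimal Python sketch. It assumes that segmentation, part-of-speech recognition and keyword extraction have already produced tokens carrying a kind and a search weight; the `Token` class, the `build_search_text` function, the sample weights, and the choice to take attribute words in descending weight order and to place them before the product word are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    kind: str      # "product" or "attribute" (assumed result of POS recognition)
    weight: float  # search weight (assumed result of keyword extraction)


def build_search_text(tokens, num_attributes=2):
    """Join the top-weighted product word with the top-N attribute words."""
    products = [t for t in tokens if t.kind == "product"]
    if not products:
        raise ValueError("the title contains no product word")
    # Exactly one product word: the one with the highest search weight.
    product = max(products, key=lambda t: t.weight)
    # Assumption: attribute words are chosen by descending search weight.
    attributes = sorted(
        (t for t in tokens if t.kind == "attribute"),
        key=lambda t: t.weight,
        reverse=True,
    )[:num_attributes]
    # Assumption: attribute words precede the product word in the search text.
    return " ".join([a.text for a in attributes] + [product.text])


# Hypothetical tokens for a title such as "women red cotton dress belt".
tokens = [
    Token("dress", "product", 0.9),
    Token("women", "attribute", 0.4),
    Token("red", "attribute", 0.7),
    Token("cotton", "attribute", 0.6),
    Token("belt", "product", 0.3),
]
search_text = build_search_text(tokens, num_attributes=2)
print(search_text)
```

Keeping a single product word stops the search text from drifting toward a secondary item named in the title, here the belt.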

In some further embodiments, acquiring statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indicators, includes the following step:

calling a search interface to obtain statistical data for candidate long-tail words matching the search text, the statistical data comprising keywords compiled from the historical search behavior data of a large number of users together with their statistical indicators, the statistical indicators including each keyword's average search volume and its degree of competition, that is, how widely it is used by different websites, the keywords being long-tail words.

In some further embodiments, determining, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word includes the following steps:

performing data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words;

quantifying the semantic similarity between each valid candidate long-tail word and the product title;

calculating a composite score for each valid candidate long-tail word, using the average search volume as the matching weight of the semantic similarity;

selecting the valid candidate long-tail word with the highest composite score as the target long-tail word.
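
As an illustration of the scoring and selection steps, the following Python sketch treats the composite score as the plain product of average search volume and semantic similarity. The application only states that the search volume serves as the matching weight, so the exact formula, the `pick_target` function and the sample figures are assumptions.

```python
def composite_score(avg_search_volume, similarity):
    # Assumption: the weight is applied by simple multiplication.
    return avg_search_volume * similarity


def pick_target(candidates):
    """Return the keyword with the highest composite score.

    candidates: list of (keyword, avg_search_volume, similarity) tuples.
    """
    if not candidates:
        raise ValueError("no valid candidate long-tail words")
    best = max(candidates, key=lambda c: composite_score(c[1], c[2]))
    return best[0]


# Hypothetical valid candidates after data cleaning.
candidates = [
    ("red cotton dress for women", 1200, 0.82),  # score about 984
    ("cotton dress", 5000, 0.15),                # score about 750
    ("red summer dress", 900, 0.90),             # score about 810
]
target = pick_target(candidates)
print(target)
```

With a multiplicative score, a popular keyword that barely matches the title can lose to a less searched keyword that matches it well, as the second candidate does here.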

In some specific embodiments, performing data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words includes any one or more of the following steps:

deleting from the statistical data the candidate long-tail words whose word count is less than a preset value;

deleting from the statistical data the candidate long-tail words that fall within a preset time range;

deleting from the statistical data the candidate long-tail words whose degree of competition is higher than a preset level;

deleting from the statistical data the candidate long-tail words whose average search volume is higher than a preset threshold.
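
Three of these cleaning rules can be sketched as a single filter in Python. The `Candidate` class, the ordinal competition scale and all threshold values are hypothetical. The time-range rule is left out because this passage does not say which timestamp it applies to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    keyword: str
    avg_search_volume: int
    competition: int  # assumed ordinal scale: 1 = low, 2 = medium, 3 = high


def clean(candidates, min_words=3, max_competition=2, max_volume=10_000):
    """Keep only candidates that pass every configured rule."""
    kept = []
    for c in candidates:
        if len(c.keyword.split()) < min_words:
            continue  # too short to count as a long-tail word
        if c.competition > max_competition:
            continue  # competition above the preset level
        if c.avg_search_volume > max_volume:
            continue  # search volume above the preset threshold
        kept.append(c)
    return kept


# Hypothetical statistical data returned for one search text.
raw = [
    Candidate("dress", 50_000, 3),
    Candidate("red cotton dress", 800, 1),
    Candidate("cheap red dress online", 20_000, 2),
    Candidate("red summer dress women", 1_500, 3),
]
valid = clean(raw)
print([c.keyword for c in valid])
```

Dropping the highest-volume and most contested keywords fits the long-tail idea: the surviving keywords are searched less often but are easier to rank for.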

In some specific embodiments, quantifying the semantic similarity between each valid candidate long-tail word and the product title includes the following steps:

encoding the valid candidate long-tail words and the product title to obtain their embedding vectors;

using a pre-trained text feature extraction model to extract high-level semantic information from the embedding vectors of the valid candidate long-tail words and the product title, obtaining their respective semantic feature vectors;

using a preset data distance algorithm to calculate the data distance between the semantic feature vector of the product title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity.
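
The distance step can be sketched in Python with cosine similarity, which is one common choice. The passage above does not name the "preset data distance algorithm", so cosine similarity is an assumption here, and the short vectors below are toy stand-ins for the output of a pre-trained text feature extraction model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_u * norm_v)


# Toy semantic feature vectors (hypothetical; a real model yields far more dimensions).
title_vec = [1.0, 0.0, 1.0]
candidate_vecs = {
    "same direction": [2.0, 0.0, 2.0],
    "orthogonal": [0.0, 1.0, 0.0],
}
similarities = {
    name: cosine_similarity(title_vec, vec)
    for name, vec in candidate_vecs.items()
}
print(similarities)
```

Cosine similarity ignores vector length, so a short keyword and a long title can still score highly when their feature vectors point the same way.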

In some further embodiments, configuring the target long-tail word into the page title of the product display page includes the following steps:

displaying a search optimization page corresponding to the product display page so as to present a page title input box;

setting the target long-tail word as the content of the page title input box;

in response to a submit instruction from the user, publishing the product display page and the search optimization page.
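
A minimal Python sketch of the prefill-then-publish flow follows. It assumes that the published page title ends up in an HTML `<title>` element, which this passage does not state; `SeoForm`, `prefill` and `render_title_tag` are hypothetical names.

```python
import html
from dataclasses import dataclass


@dataclass
class SeoForm:
    """State of the search optimization page (hypothetical model)."""
    page_title: str = ""


def prefill(form, target_keyword):
    # Put the target long-tail word into the page title input box.
    form.page_title = target_keyword
    return form


def render_title_tag(form):
    # On submit, emit the title; escaping keeps the markup well formed.
    return f"<title>{html.escape(form.page_title)}</title>"


form = prefill(SeoForm(), "red cotton dress for women & girls")
title_tag = render_title_tag(form)
print(title_tag)
```

Because the value is only prefilled, the user can still edit the input box before submitting.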

An e-commerce site promotion configuration apparatus provided to meet one of the purposes of the present application includes a search construction module, an indicator acquisition module, a target determination module, and a search optimization module. The search construction module is used to construct a search text from the product words and attribute words of the product title in a product display page; the indicator acquisition module is used to acquire statistical data matching the search text, the statistical data comprising a plurality of candidate long-tail words and their historical search statistical indicators; the target determination module is used to determine, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word; and the search optimization module is used to configure the target long-tail word into the page title of the product display page.

In some further embodiments, the search construction module includes: a title acquisition unit, used to obtain the product title that has been entered in the product display page; a segmentation and recognition unit, used to perform word segmentation and part-of-speech recognition on the product title to obtain a token set composed of multiple tokens, including both tokens that are product words and tokens that are attribute words; a weight quantification unit, used to perform keyword extraction on the token set with reference to the product title and determine a search weight for each token, the search weight representing the potential search value of the corresponding token; and a search expression unit, used to select the single product word with the highest search weight and a predetermined number of attribute words and concatenate them to form the search text.

In some further embodiments, the indicator acquisition module is used to call a search interface to obtain statistical data for candidate long-tail words matching the search text, the statistical data comprising keywords compiled from the historical search behavior data of a large number of users together with their statistical indicators, the statistical indicators including each keyword's average search volume and its degree of competition, that is, how widely it is used by different websites, the keywords being long-tail words.

In some further embodiments, the target determination module includes: a data cleaning unit, used to perform data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words; a similarity quantification unit, used to quantify the semantic similarity between each valid candidate long-tail word and the product title; a score quantification unit, used to calculate a composite score for each valid candidate long-tail word, using the average search volume as the matching weight of the semantic similarity; and a target selection unit, used to select the valid candidate long-tail word with the highest composite score as the target long-tail word.

In some specific embodiments, the data cleaning unit includes any one or more of the following submodules: a word-count cleaning submodule, used to delete from the statistical data the candidate long-tail words whose word count is less than a preset value; a time cleaning submodule, used to delete from the statistical data the candidate long-tail words that fall within a preset time range; a competition cleaning submodule, used to delete from the statistical data the candidate long-tail words whose degree of competition is higher than a preset level; and a search volume cleaning submodule, used to delete from the statistical data the candidate long-tail words whose average search volume is higher than a preset threshold.

In some specific embodiments, the similarity quantification unit includes: a vector encoding subunit, used to encode the valid candidate long-tail words and the product title to obtain their embedding vectors; a semantic extraction subunit, used to extract, with a pre-trained text feature extraction model, high-level semantic information from the embedding vectors of the valid candidate long-tail words and the product title, obtaining their respective semantic feature vectors; and a similarity calculation subunit, used to calculate, with a preset data distance algorithm, the data distance between the semantic feature vector of the product title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity.

In some further embodiments, the search optimization module includes: a page display unit, used to display a search optimization page corresponding to the product display page so as to present a page title input box; an automatic editing unit, used to set the target long-tail word as the content of the page title input box; and an optimization publishing unit, used to publish the product display page and the search optimization page in response to a submit instruction from the user.

A computer device provided to meet one of the purposes of the present application includes a central processing unit and a memory, the central processing unit being used to call and run a computer program stored in the memory so as to perform the steps of the e-commerce site promotion configuration method described in the present application.

A computer-readable storage medium provided to meet another purpose of the present application stores, in the form of computer-readable instructions, a computer program implemented according to the e-commerce site promotion configuration method; when called and run by a computer, the computer program performs the steps included in the method.

A computer program product provided to meet another purpose of the present application includes a computer program/instructions which, when executed by a processor, implement the steps of the method described in any embodiment of the present application.

Compared with the prior art, the technical solution of the present application has at least the following technical advantages:

First, the present application constructs a search text from the product words and attribute words in the product title used by a product display page, retrieves candidate long-tail words and their statistical indicators with that search text, determines the best candidate long-tail word as the target long-tail word according to the statistical indicators, and uses the target long-tail word as the page title of the product display page. Search engine optimization of product display pages is thereby achieved in a per-product, individualized manner. This serves the production of product display pages for independent e-commerce sites, requires no manual involvement, and improves the efficiency of configuring the information needed for a massive number of product display pages.

Second, the long-tail words in the present application are, as the name suggests, keywords that exhibit the long-tail effect, also called long-tail keywords. Although their search volume is relatively small, they have the advantage of being highly targeted. An e-commerce platform has a massive number of products, each with its own product display page; configuring a preferred long-tail keyword that matches the product title as the page title of each product display page helps increase the probability that each page ranks near the top of search engine results, thereby increasing the overall traffic of the e-commerce platform's whole independent site.

In addition, when determining the target long-tail word for the page title of a product display page, the present application considers not only the semantic match between the candidate long-tail words and the product title but also the historical search statistical indicators of the candidate long-tail words themselves. The statistical indicators are popularity information compiled over the history of searches for a long-tail word, and they help in selecting among the candidates. A long-tail word determined in this way, with the semantics of the product title taken as reference, is not only highly consistent with the product title semantically but also more effective.

Brief Description of the Drawings

The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic flowchart of a typical embodiment of the e-commerce site promotion configuration method of the present application;

FIG. 2 is a schematic flowchart of the process of constructing a search text from a product title in an embodiment of the present application;

FIG. 3 is a graphical user interface provided by way of example in the present application, showing the product display page and the search engine optimization page side by side;

FIG. 4 is a schematic flowchart of the specific process of determining the target long-tail word in an embodiment of the present application;

FIG. 5 is a schematic flowchart of the process of performing data cleaning on the statistical data in an embodiment of the present application;

FIG. 6 is a schematic flowchart of the process of calculating the semantic similarity between the product title and the valid candidate long-tail words in an embodiment of the present application;

FIG. 7 is a schematic flowchart of configuring the page title and completing the publishing of the product display page in an embodiment of the present application;

FIG. 8 is a functional block diagram of the e-commerce site promotion configuration apparatus of the present application;

FIG. 9 is a schematic structural diagram of a computer device used in the present application.

具体实施方式Detailed ways

下面详细描述本申请的实施例,所述实施例的示例在附图中示出,其中自始至终相同或类似的标号表示相同或类似的元件或具有相同或类似功能的元件。下面通过参考附图描述的实施例是示例性的,仅用于解释本申请,而不能解释为对本申请的限制。The following describes in detail the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein the same or similar reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present application, but not to be construed as a limitation on the present application.

本技术领域技术人员可以理解,除非特意声明,这里使用的单数形式“一”、“一个”、“所述”和“该”也可包括复数形式。应该进一步理解的是,本申请的说明书中使用的措辞“包括”是指存在所述特征、整数、步骤、操作、元件和/或组件,但是并不排除存在或添加一个或多个其他特征、整数、步骤、操作、元件、组件和/或它们的组。应该理解,当我们称元件被“连接”或“耦接”到另一元件时,它可以直接连接或耦接到其他元件,或者也可以存在中间元件。此外,这里使用的“连接”或“耦接”可以包括无线连接或无线耦接。这里使用的措辞“和/或”包括一个或更多个相关联的列出项的全部或任一单元和全部组合。It will be understood by those skilled in the art that the singular forms "a", "an", "the" and "the" as used herein can include the plural forms as well, unless expressly stated otherwise. It should be further understood that the word "comprising" used in the specification of this application refers to the presence of stated features, integers, steps, operations, elements and/or components, but does not preclude the presence or addition of one or more other features, Integers, steps, operations, elements, components and/or groups thereof. It will be understood that when we refer to an element as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Furthermore, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combination of one or more of the associated listed items.

本技术领域技术人员可以理解,除非另外定义,这里使用的所有术语(包括技术术语和科学术语),具有与本申请所属领域中的普通技术人员的一般理解相同的意义。还应该理解的是,诸如通用字典中定义的那些术语,应该被理解为具有与现有技术的上下文中的意义一致的意义,并且除非像这里一样被特定定义,否则不会用理想化或过于正式的含义来解释。It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. It should also be understood that terms, such as those defined in a general dictionary, should be understood to have meanings consistent with their meanings in the context of the prior art and, unless specifically defined as herein, should not be interpreted in idealistic or overly formal meaning to explain.

Those skilled in the art will understand that the terms "client", "terminal", and "terminal device" as used herein include both devices that have only a wireless signal receiver with no transmission capability and devices with receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such devices may include: cellular or other communication devices, such as personal computers and tablet computers, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "client", "terminal", or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea, and/or land), or adapted and/or configured to operate locally and/or to operate in distributed form at any other location on Earth and/or in space. A "client", "terminal", or "terminal device" as used herein may also be a communication terminal, an Internet access terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback functions, or a device such as a smart TV or set-top box.

The hardware referred to in this application by names such as "server", "client", and "service node" is essentially an electronic device with capabilities equivalent to those of a personal computer. It is a hardware apparatus having the necessary components disclosed by the von Neumann principle, such as a central processing unit (including an arithmetic unit and a controller), memory, input devices, and output devices. A computer program is stored in its storage; the central processing unit loads the program from external storage into internal memory, runs it, executes its instructions, and interacts with the input and output devices to accomplish specific functions.

It should be noted that the concept of a "server" in this application also extends to server clusters. According to the network deployment principles understood by those skilled in the art, the servers are a logical division; physically, they may be independent of one another yet callable through interfaces, or they may be integrated into a single physical computer or a computer cluster. Those skilled in the art should understand this variation, and it should not be taken to restrict how the network deployment of this application is implemented.

Unless expressly specified otherwise, one or more technical features of this application may either be deployed on a server, with the client accessing them by remotely calling an online service interface provided by the server, or be deployed and run directly on the client.

Unless expressly specified otherwise, a neural network model cited or possibly cited in this application may either be deployed on a remote server and called remotely from the client, or be deployed on and called directly by a client with sufficient device capability. In some embodiments, when the model runs on the client, its corresponding intelligence may be obtained through transfer learning, which lowers the demand on the client's hardware resources and avoids occupying them excessively.

Unless expressly specified otherwise, the various data involved in this application may be stored remotely on a server or locally on a terminal device, as long as the data can be called by the technical solution of this application.

Those skilled in the art should be aware that although the various methods of this application are described on the basis of the same concept and therefore share common features, each method can be executed independently unless otherwise stated. Likewise, the embodiments disclosed in this application are all proposed on the basis of the same inventive concept. Therefore, concepts expressed identically, and concepts whose expression differs only because it was adapted for convenience, should be understood as equivalent.

Unless a mutually exclusive relationship between them is expressly stated, the technical features of the embodiments disclosed below may be cross-combined to flexibly construct new embodiments, provided that the combination does not depart from the inventive spirit of this application and can meet a need in the prior art or remedy a deficiency of the prior art. Those skilled in the art should be aware of this variation.

The e-commerce site promotion configuration method of this application can be programmed as a computer program product and deployed to run on a client or a server. For example, in the e-commerce platform application scenarios of this application, including live-streaming e-commerce, it is generally deployed on a server. The method can then be executed by accessing the interface that the computer program product exposes once running and interacting with its process through a graphical user interface.

Referring to FIG. 1, in a typical embodiment, the e-commerce site promotion configuration method of this application includes the following steps:

Step S1100: Construct a search text from the product words and attribute words of the product title in the product display page:

When an independent site of an e-commerce platform needs to publish a product, product information must be entered, typically including but not limited to the product title, product summary, and product description. This information is later used to generate the product display page that end users request and browse. To raise the probability that the product is found by search, search engine optimization parameters may also be configured at the same time, so that the product display page is more readily indexed and displayed by search engines.

Once the content of the product title has been entered in the product title input box of the product display page, the product title can be obtained. Various conventional word segmentation methods can then be used to segment the product title and determine the part of speech of each word segment, yielding a segment set. In terms of part of speech, some word segments in the set are usually nouns and others adjectives and/or adverbs. The nouns generally include product words, which indicate what the product is, while adjectives and/or adverbs usually describe an attribute of some aspect of the product and are therefore called attribute words. In this way, the product words and attribute words in the product title are determined.

To obtain candidate long-tail words from the product title, each product word can be spliced with the attribute words in any combination to form one or more search texts, which can be submitted to a search interface to retrieve the candidate long-tail words. When constructing the search texts, the product words and attribute words can also be filtered and the best ones selected, which reduces the number of search texts and speeds up computation. Likewise, a unique splicing rule can be set so that each product word and the attribute words are spliced in exactly one way.
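The splicing described above can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `build_search_texts`, the space-joined output, and the rule "attribute words first, product word last" are assumptions chosen for the example.

```python
from itertools import permutations


def build_search_texts(product_words, attribute_words, unique_rule=True):
    """Splice each product word with the attribute words into search texts.

    Hypothetical helper: the join format and ordering rule are assumptions.
    """
    texts = []
    for product in product_words:
        if unique_rule:
            # One fixed rule: attribute words in the given order, product word last.
            texts.append(" ".join([*attribute_words, product]))
        else:
            # Arbitrary splicing: one search text per ordering of the attribute words.
            for order in permutations(attribute_words):
                texts.append(" ".join([*order, product]))
    return texts


single = build_search_texts(["dress"], ["red", "summer"])
many = build_search_texts(["dress"], ["red", "summer"], unique_rule=False)
```

With a unique rule the number of search texts equals the number of product words; without it the count grows factorially with the number of attribute words, which is why filtering the words first reduces the work.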

Step S1200: Obtain statistical data matching the search text, the statistical data including multiple candidate long-tail words and their historical search statistical indicators:

Based on the search text, the corresponding candidate long-tail words, together with the statistical indicators formed from their historical searches, can be obtained in any of several ways.

In one approach, the candidate long-tail words and their statistical indicators are obtained directly through a search interface provided by a search engine. The interface is implemented to search a preset long-tail word ranking table for long-tail words that semantically match the search text, return them as the candidate long-tail words, and provide their corresponding statistical indicators. Generally, the search engine is the target search engine in which the product display page is expected to be found.

In another approach, the long-tail word ranking table is compiled by the e-commerce platform itself or obtained by other conventional data retrieval means, and a search interface provided by the e-commerce platform is then called to obtain the candidate long-tail words and their statistical indicators from this custom ranking table.

A long-tail word, that is, a long-tail keyword, generally refers to a search string that contains multiple words and has a low average search volume per unit time. Long-tail words exhibit the long-tail effect: although their average search volume per unit time is low, they have the advantage of precise hits. When a search engine user enters a long-tail word, a page that uses that long-tail word as its page title can obtain a high ranking. Because each independent site of an e-commerce platform often contains a massive number of product display pages, configuring long-tail words as the page titles required for search engine optimization across those pages lets each page rank highly in precise searches; the resulting scale effect increases user traffic to the whole independent site and promotes the site as a whole.

The statistical indicators generally include the average search volume and the competition degree of each long-tail word. Typically, each search engine analyzes users' search expressions on its own, counting how often users use each keyword in search expressions and how often website pages use that keyword. On the user side, the user usage frequency can be aggregated periodically, for example monthly, to give the average search volume of a keyword. On the website side, the page usage frequency can be aggregated periodically, for example monthly, to give the competition degree of a keyword. The average search volume may be numeric data, and the competition degree may be ordinal data, for example the options {1, 2, 3} representing high, medium, and low, depending on how each search engine quantifies the data. The average search volume and competition degree of a keyword constitute its statistical indicators. In other embodiments, the statistical indicators may further include other information, such as the time at which the long-tail word was used. Organizing each keyword together with its average search volume and competition degree as mapping data yields a keyword ranking table. Keywords containing more than a preset number of words, for example more than two words, are treated as long-tail words, and filtering them out of the keyword ranking table yields the long-tail word ranking table.
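The mapping data and the word-count filter described above can be sketched as follows. The `KeywordStats` record, its field names, and the sample rows are illustrative assumptions; only the "more than two words" threshold and the {1, 2, 3} competition encoding come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordStats:
    """Hypothetical record for one row of the keyword ranking table."""
    keyword: str
    avg_monthly_searches: float  # average search volume, aggregated monthly
    competition: int             # 1 = high, 2 = medium, 3 = low


def build_long_tail_table(rows, max_short_words=2):
    """Keep keywords with more than `max_short_words` words, keyed by keyword."""
    return {
        row.keyword: row
        for row in rows
        if len(row.keyword.split()) > max_short_words
    }


# Made-up sample data for illustration only.
rows = [
    KeywordStats("dress", 50000, 1),
    KeywordStats("red dress", 8000, 1),
    KeywordStats("red summer dress", 900, 2),
    KeywordStats("red summer dress for women", 120, 3),
]
table = build_long_tail_table(rows)
```

Splitting on whitespace is adequate for space-delimited languages; titles in languages without word delimiters would need a word segmenter before counting.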

Thus, the long-tail word ranking table in effect stores statistical data compiled from the long-tail words found in the historical search behavior data of a massive number of users. The statistical data contains each long-tail word and its statistical indicators, which include the average search volume of the long-tail word and its competition degree, that is, the extent to which different websites use it.

Following the principle by which a search engine constructs the long-tail word ranking table, the e-commerce platform can also obtain the relevant data and construct the table itself, or it can retrieve the long-tail word rankings published by a search engine and store them locally for later use.

When long-tail words matching the search text are obtained from the long-tail word ranking table as candidate long-tail words, their statistical indicators are obtained along with them. The search text can be matched against the long-tail words by rule-based matching, including fuzzy or exact matching, or by semantic matching; those skilled in the art can apply these flexibly according to the principles disclosed here. The matched candidate long-tail words and their statistical indicators constitute a candidate long-tail word subset.
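One possible rule-based matcher is sketched below. It assumes a simple word-overlap rule that the source does not prescribe: a long-tail word matches when the fraction of search-text words it contains reaches `min_overlap` (1.0 requires every word, lower values give a fuzzier match). The table layout and sample values are hypothetical.

```python
def match_candidates(search_text, table, min_overlap=1.0):
    """Return the subset of `table` whose keywords cover enough query words.

    `table` maps a long-tail word to its (avg_search_volume, competition) pair.
    """
    query = set(search_text.lower().split())
    matched = {}
    for keyword, stats in table.items():
        words = set(keyword.lower().split())
        overlap = len(query & words) / len(query) if query else 0.0
        if overlap >= min_overlap:
            # The statistical indicators travel with the matched keyword.
            matched[keyword] = stats
    return matched


# Made-up sample data for illustration only.
table = {
    "red summer dress for women": (120, 3),
    "blue summer dress for women": (300, 2),
    "red leather shoes for men": (80, 3),
}
strict = match_candidates("red summer dress", table)
fuzzy = match_candidates("red summer dress", table, min_overlap=0.6)
```

A semantic matcher would replace the overlap test with a similarity score between embeddings, leaving the rest of the function unchanged.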

Step S1300: According to the statistical indicators, determine the single candidate long-tail word that semantically matches the product title as the target long-tail word:

Because each product display page needs only one page title, one candidate long-tail word must be selected as the target long-tail word from the candidate subset determined in the previous step. To do so, the candidates can be ranked by reference to their statistical indicators, including the average search volume and/or competition degree, and to the degree to which each candidate semantically matches the product title.

For example, the average search volume and the semantic similarity between the candidate long-tail word and the product title can serve as the primary and secondary sort fields, respectively, for a multi-field descending sort, and the candidate ranked first is selected as the target long-tail word to be used as the page title.

As another example, the competition degree and the semantic similarity between the candidate long-tail word and the product title can serve as the primary and secondary sort fields, respectively, for a multi-field descending sort, and the candidate with a lower competition degree and the highest similarity is selected as the target long-tail word to be used as the page title.

As a further example, the average search volume, the competition degree, and the semantic similarity between the candidate long-tail word and the product title can be combined in a comprehensive multi-field ranking, and the candidate that ranks first overall, with a higher average search volume, lower competition degree, and higher similarity, is selected as the target long-tail word to be used as the page title.
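The first of these examples, a multi-field descending sort, can be sketched as follows. The dictionary keys and sample values are assumptions made for illustration; the sort uses average search volume as the primary field and semantic similarity as the secondary field.

```python
def rank_candidates(candidates):
    """Sort descending by (volume, similarity); similarity breaks volume ties."""
    return sorted(
        candidates,
        key=lambda c: (c["volume"], c["similarity"]),
        reverse=True,
    )


# Made-up sample data for illustration only.
candidates = [
    {"word": "a", "volume": 500, "similarity": 0.6},
    {"word": "b", "volume": 900, "similarity": 0.7},
    {"word": "c", "volume": 900, "similarity": 0.9},
]
ranked = rank_candidates(candidates)
target = ranked[0]["word"]
```

Swapping the key to `(c["competition"], c["similarity"])` would give the second example, provided a larger competition value denotes lower competition, as in the {1, 2, 3} encoding for high, medium, and low.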

Other approaches can also be used flexibly, such as semantic judgment based on a deep learning model pre-trained to convergence, to determine the single candidate long-tail word that is highly similar to the product title and has a higher average search volume and/or lower competition degree as the target long-tail word.

Thus, with the help of the statistical indicators, a single target long-tail word can be determined for the product title in several ways. The target long-tail word semantically matches the product title and is associated with one or more of the statistical indicators, which achieves the selection of the best candidate.

Step S1400: Configure the target long-tail word into the page title of the product display page:

Once the target long-tail word has been determined, it can be configured as the page title of the product display page. After the product display page is published, each search engine crawls and indexes it according to its own implementation logic. Subsequent searches for the target long-tail word in the search engine can then hit the product display page more precisely, achieving the goal of page promotion.

From the typical embodiment of this application and its variants, it can be seen that, compared with the prior art, the technical solution of this application has at least the following technical advantages:

First, this application constructs a search text from the product words and attribute words in the product title used on a product display page, retrieves candidate long-tail words and their statistical indicators with the search text, determines the best candidate as the target long-tail word according to the statistical indicators, and uses the target long-tail word as the page title of the product display page. Search engine optimization is thus carried out product by product in a personalized manner. This serves the production of product display pages for independent e-commerce sites, requires no manual involvement, and improves the efficiency of configuring the information needed for massive numbers of product display pages.

Second, the long-tail words in this application are, as the name suggests, keywords that exhibit the long-tail effect, also called long-tail keywords. Although their search volume is relatively low, they are highly targeted. On an e-commerce platform with a massive number of products, each with its own product display page, configuring a selected long-tail keyword that matches the product title as the page title helps raise the probability that each product display page ranks near the top of search engine results, and thereby increases the overall traffic of the whole independent site.

In addition, when determining the target long-tail word for the page title of a product display page, this application considers not only the semantic match between the candidate long-tail words and the product title but also the candidates' own historical search statistical indicators. These indicators are popularity information compiled over the history of searches for each long-tail word and help in selecting the best candidate. A long-tail word determined this way, with the semantics of the product title taken as a reference, is not only highly consistent with the product title semantically but also more effective.

Referring to FIG. 2, in some further embodiments, step S1100 of constructing a search text from the product words and attribute words of the product title in the product display page includes the following steps:

Step S1110: Obtain the product title that has been entered in the product display page:

As shown in FIG. 3, a search engine optimization area is displayed while the user edits the product display page. When the user enters a product title in the product title input box on the left side of the figure, the backend obtains the text data of that title.

Step S1120: Perform word segmentation and part-of-speech recognition on the product title to obtain a segment set composed of multiple word segments, including both word segments that are product words and word segments that are attribute words:

To segment the product title, any conventional statistics-based word segmentation algorithm, such as N-Gram, can be applied to obtain the corresponding segment set. The segment set usually includes multiple product words and multiple attribute words. Product words usually describe or refer to the product name and are mostly nouns; attribute words usually describe product attributes and are mostly adjectives or adverbs.

To analyze the part of speech of each word segment in the segment set, a neural network model pre-trained to convergence can be used. Recommended models may adopt a base architecture such as LSTM+CRF or BERT+CRF, selected flexibly by those skilled in the art. Such a model can be fine-tuned from a pre-trained state to convergence so that it learns, from the embedding vectors of the segment set of the input product title, to perform word segmentation and part-of-speech analysis and thereby identify the product words and attribute words in the product title.

Step S1130: Perform keyword extraction on the segment set with reference to the product title, and determine the search weight of each word segment, the search weight representing the potential search value of that word segment:

As noted above, the segment set may contain multiple product words and multiple attribute words. However, the syntax of a search text generally follows natural language usage, so the search text should be kept as concise as possible. For example, a single representative product word can be combined with several representative attribute words to construct the search text. This requires quantifying the search weights of the product words and attribute words, that is, determining the potential search value of each word segment in the segment set.

In one embodiment, the TF-IDF algorithm can be used to compute the TF-IDF value of each word segment in the segment set as its search weight. In another embodiment, the TextRank algorithm can be used to construct a segment graph over the word segments, and the weight of each word segment determined from the graph serves as its search weight. Those skilled in the art can also adopt other alternatives according to the principles disclosed here, as long as the potential search value of each word segment can be quantified.
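The TF-IDF variant can be sketched in plain Python as below. The source does not say which corpus supplies the document frequencies or which IDF formula is used; this sketch assumes a small corpus of already-segmented product titles and one common smoothed IDF, ln((1 + N) / (1 + df)) + 1.

```python
import math
from collections import Counter


def tfidf_weights(title_segments, corpus):
    """Return a TF-IDF search weight for each word segment of one title.

    `corpus` is a list of segmented titles (an assumption for this sketch).
    """
    counts = Counter(title_segments)
    n_docs = len(corpus)
    weights = {}
    for segment, count in counts.items():
        tf = count / len(title_segments)
        df = sum(1 for doc in corpus if segment in doc)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
        weights[segment] = tf * idf
    return weights


# Made-up sample data for illustration only.
corpus = [["red", "dress"], ["blue", "dress"], ["red", "shoes"]]
weights = tfidf_weights(["red", "linen", "dress"], corpus)
top_segment = max(weights, key=weights.get)
```

Word segments that are rare across the corpus, such as "linen" here, receive the largest weight, which matches the intent of favoring words with more distinguishing search value.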

Step S1140: Determine the single product word with the highest search weight and a predetermined number of attribute words, and splice them into a search text:

The word segments in the segment set are sorted in descending order of search weight, so that the product words and attribute words are ordered by their search weights. The single product word with the highest search weight is taken as the core product word of the product title. At the same time, a preset number, for example 3, of the highest-ranked attribute words are selected. The core product word and the selected attribute words are then spliced, for example in order according to a preset rule that places the attribute words first and the core product word last, to construct the search text. The splicing order can of course also be varied to obtain multiple search texts, which those skilled in the art can implement flexibly.
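The selection and splicing in this step can be sketched as follows. The function name, the space-joined output, and the sample weights are assumptions for illustration; the rule of three attribute words first and the core product word last follows the example in the text.

```python
def build_core_search_text(weights, product_words, attribute_words, n_attrs=3):
    """Splice the top-weighted attribute words with the core product word."""
    # Core product word: the single product word with the highest search weight.
    core = max(product_words, key=lambda w: weights[w])
    # Attribute words in descending order of search weight, truncated to n_attrs.
    top_attrs = sorted(attribute_words, key=lambda w: weights[w], reverse=True)[:n_attrs]
    # Preset rule assumed here: attribute words first, core product word last.
    return " ".join([*top_attrs, core])


# Made-up sample data for illustration only.
weights = {"dress": 0.9, "skirt": 0.4, "red": 0.7, "summer": 0.8, "cotton": 0.5, "new": 0.1}
search_text = build_core_search_text(
    weights,
    product_words=["dress", "skirt"],
    attribute_words=["red", "summer", "cotton", "new"],
)
```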

In this embodiment, word segmentation and part-of-speech recognition are performed on the product title, a search weight representing potential search value is computed for each word segment, the core product word and several preferred attribute words are selected according to the search weights, and a search text is constructed from them. This search text has higher potential search value, and using it for the preliminary screening of candidate long-tail words improves matching precision.

Referring to FIG. 4, in some further embodiments, step S1300 of determining, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word includes the following steps:

Step S1310: Clean the statistical data according to the statistical indicators to obtain valid candidate long-tail words:

As noted above, the statistical data contains statistical indicators, which include the average search volume and/or competition degree and may also include other information. The statistical data can therefore be cleaned by applying constraints on the average search volume and/or competition degree of the candidate long-tail words, filtering out invalid candidates; the remaining candidates are confirmed as valid candidate long-tail words. Those skilled in the art can set the constraints flexibly according to the principles disclosed here.
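A minimal sketch of such cleaning is shown below. The specific constraints, a minimum average search volume of 10 and a competition level restricted to medium or low, are hypothetical thresholds chosen only to make the example concrete; the source leaves them to the implementer.

```python
def clean_candidates(candidates, min_volume=10, allowed_competition=(2, 3)):
    """Keep candidates that satisfy the volume and competition constraints.

    Competition uses the encoding 1 = high, 2 = medium, 3 = low.
    """
    return [
        c for c in candidates
        if c["volume"] >= min_volume and c["competition"] in allowed_competition
    ]


# Made-up sample data for illustration only.
candidates = [
    {"word": "red summer dress for women", "volume": 120, "competition": 3},
    {"word": "red summer dress cheap", "volume": 4, "competition": 3},
    {"word": "red summer dress sale", "volume": 900, "competition": 1},
]
valid = clean_candidates(candidates)
```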

Step S1320: Quantify the semantic similarity between each valid candidate long-tail word and the product title:

To assess how closely each valid candidate long-tail word relates semantically to the product title, the semantic similarity between the two can be computed. To do so, deep semantic information is first determined for the valid candidate long-tail word and for the product title; a data distance algorithm is then applied to determine the distance between each candidate and the product title, and the distance is normalized into a semantic similarity value.
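The distance-and-normalization step can be sketched as follows. The source does not name the distance algorithm or the encoder; this sketch assumes the deep semantic information is already available as embedding vectors and uses cosine similarity rescaled from [-1, 1] to [0, 1] as one possible choice.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_similarity(title_vec, candidate_vec):
    """Normalize cosine similarity from [-1, 1] to [0, 1]."""
    return (cosine_similarity(title_vec, candidate_vec) + 1.0) / 2.0


# Toy vectors standing in for encoder output (hypothetical values).
same = semantic_similarity([1.0, 2.0], [1.0, 2.0])
orthogonal = semantic_similarity([1.0, 0.0], [0.0, 1.0])
opposite = semantic_similarity([1.0, 0.0], [-1.0, 0.0])
```

Identical directions map to a similarity near 1, orthogonal vectors to 0.5, and opposite directions to 0.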

Step S1330: Using the average search volume as the matching weight of the semantic similarity, compute the comprehensive score of each valid candidate long-tail word:

To facilitate selecting the best valid candidate long-tail word, a comprehensive score can first be determined for each from its statistical indicators and its similarity to the product title. An exemplary formula is as follows:

final_score = ln(2.001 - (search_score / max_search_score)) * sim_score

Here, search_score is the average search volume of the current valid candidate long-tail word, and max_search_score is the largest average search volume among all valid candidate long-tail words. The ratio of the two is subtracted from the constant 2.001, the natural logarithm of the result is taken, and this is multiplied by the corresponding semantic similarity sim_score. This applies the weight, keeps the data smoother while still distinguishing different values, and yields the comprehensive score final_score.
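The exemplary formula can be evaluated directly, as in the sketch below. The function mirrors the formula term by term; the `score_all` helper, the dictionary keys, and the sample values are assumptions added for illustration.

```python
import math


def final_score(search_score, max_search_score, sim_score):
    """final_score = ln(2.001 - search_score / max_search_score) * sim_score."""
    return math.log(2.001 - search_score / max_search_score) * sim_score


def score_all(candidates):
    """Score every valid candidate against the largest average search volume."""
    max_volume = max(c["volume"] for c in candidates)
    return {
        c["word"]: final_score(c["volume"], max_volume, c["similarity"])
        for c in candidates
    }


# Made-up sample data for illustration only.
scores = score_all([
    {"word": "a", "volume": 100, "similarity": 0.9},
    {"word": "b", "volume": 10, "similarity": 0.8},
])
```

Because the ratio search_score / max_search_score lies between 0 and 1, the argument of the logarithm stays between 1.001 and 2.001, so the logarithmic factor is always positive. Note that, as the formula is written, this factor shrinks from about ln(2.001) toward ln(1.001) as search_score approaches max_search_score, so an implementer may want to check that this direction matches the intended weighting before adopting it.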

It should be understood that the above formula is only an example. It illustrates the principle of quantification by weighting a valid candidate long-tail word's semantic similarity to the product title with its average search volume. Those skilled in the art may construct other formulas based on the principle disclosed here, as long as the same purpose is achieved, and all such variants shall be regarded as falling within the scope embodied by the inventive spirit of the present application.

Once a comprehensive score has been determined for every valid candidate long-tail word, that score reflects both the word's average search popularity and its closeness to the product title, bringing the merits of the valid candidate long-tail words onto a single scale. The valid candidate long-tail words can therefore be ranked by comprehensive score.

Step S1340: determine the valid candidate long-tail word with the highest comprehensive score as the target long-tail word:

To obtain the best valid candidate long-tail word as the target long-tail word, the valid candidate long-tail words can be sorted in descending order of comprehensive score. The word ranked first, that is, the one with the highest comprehensive score, is then taken as the target long-tail word that the present application needs for generating the page title.
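A minimal sketch of this selection step, assuming each valid candidate already carries its comprehensive score. The `ScoredCandidate` type and `pick_target` function are hypothetical names introduced for illustration only.

```python
from typing import NamedTuple, Sequence


class ScoredCandidate(NamedTuple):
    phrase: str          # the valid candidate long-tail word
    final_score: float   # its comprehensive score


def pick_target(candidates: Sequence[ScoredCandidate]) -> ScoredCandidate:
    """Sort in descending order of comprehensive score and return the first entry."""
    if not candidates:
        raise ValueError("no valid candidate long-tail words to choose from")
    ranked = sorted(candidates, key=lambda c: c.final_score, reverse=True)
    return ranked[0]
```

Sorting the full list keeps the ranking available if an operator wants to inspect the runners-up; calling `max` with the same key would return the same winner.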

In this embodiment, after the statistical data matched by the search text is cleaned to obtain valid candidate long-tail words, a preset formula is applied to each valid candidate long-tail word. The formula uses the word's closeness to the product title together with its average search volume, the statistical indicator that reflects average search popularity, to determine a comprehensive score. This quantifies the quality of each valid candidate long-tail word. The valid candidate long-tail word with the highest comprehensive score is then selected as the target long-tail word for the page title, achieving a best-match selection among the valid candidates.

The target long-tail word determined in this embodiment is semantically closer to the product title and also has comparatively favorable search popularity. Because the valid candidate long-tail words are themselves the result of cleaning the statistical data, some extreme cases have already been filtered out. The target long-tail word is therefore the best overall choice. It helps the product display page, once published, gain an advantage in long-tail searches, which in turn helps raise traffic across the whole site.

Referring to FIG. 5, in some specific embodiments, in order to clean the data effectively and filter out valid candidate long-tail words, step S1310, performing data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words, includes any one or more of the following steps:

Step S1311: delete from the statistical data the candidate long-tail words whose word count is below a preset value:

A long-tail word, as the name suggests, is generally a phrase of more than two or three words. For example, a long-tail word can be defined as containing no fewer than 3 words. The statistical data matched by the search text is filtered accordingly, and candidate long-tail words with fewer than 3 words are deleted, which constrains the definition of a long-tail word.

Step S1312: delete from the statistical data the candidate long-tail words that fall within a preset time range:

The statistical data returned by the search interface sometimes contains candidate long-tail words generated over a long period, yet some products are time-sensitive in their sales. Accordingly, when the statistical data is obtained, the time information of each candidate long-tail word is obtained with it. The candidate long-tail words are then filtered against a preset time range, and those falling within that range are deleted from the statistical data, which cleans out expired candidate long-tail words.

Step S1313: delete from the statistical data the candidate long-tail words whose competition degree is above a preset level:

The competition degree is generally expressed as a level such as "high", "medium", or "low", although it may also be expressed numerically, depending on the situation. In either case, an excessively high competition degree means that a large number of websites and pages already use the candidate long-tail word. If such a word were adopted, the product display page would be unlikely to rank near the top even when a search finds it. Candidate long-tail words with a high competition degree can therefore be deleted from the statistical data to moderately avoid competition.

Step S1314: delete from the statistical data the candidate long-tail words whose average search volume is above a preset threshold:

An excessively high average search volume means that the candidate long-tail word is popular, which likewise means that a large number of websites and pages may use it, so the same competition problem arises. Even if such a word is used in the product display page, it is unlikely to improve the page's search ranking. On this basis, a preset threshold for the average search volume can be set in advance, and candidate long-tail words whose average search volume exceeds it are deleted from the statistical data to moderately avoid competition.

The embodiments disclosed here may be combined in any way by those skilled in the art, for example by adopting all of the embodiments corresponding to steps S1311 to S1314 together. In any case, those skilled in the art can apply these embodiments flexibly for different purposes to clean the statistical data matched by the search text. This ensures that the valid candidate long-tail words relied on when the target long-tail word is later determined are indeed valid, and thus ensures the quality of the target long-tail word finally determined.
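The four cleaning steps S1311 to S1314 can be sketched as a single filter. Everything named here is an assumption made for illustration: the `Candidate` record and its fields, the three-level competition scale, the default thresholds, and counting words by splitting on whitespace (which suits space-delimited languages only).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Sequence, Tuple

# Hypothetical ordering of the "high / medium / low" competition levels.
COMPETITION_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class Candidate:
    phrase: str          # candidate long-tail word
    seen_on: date        # time information attached to the candidate
    competition: str     # "low", "medium", or "high"
    avg_searches: float  # average search volume


def clean(
    candidates: Sequence[Candidate],
    *,
    min_words: int = 3,
    stale_range: Optional[Tuple[date, date]] = None,
    max_competition: str = "medium",
    max_avg_searches: float = float("inf"),
) -> List[Candidate]:
    """Return the candidates that survive all enabled cleaning rules."""
    kept = []
    for c in candidates:
        if len(c.phrase.split()) < min_words:                    # S1311: too few words
            continue
        if stale_range and stale_range[0] <= c.seen_on <= stale_range[1]:
            continue                                             # S1312: inside the preset time range
        if COMPETITION_RANK[c.competition] > COMPETITION_RANK[max_competition]:
            continue                                             # S1313: competition too high
        if c.avg_searches > max_avg_searches:                    # S1314: search volume too high
            continue
        kept.append(c)
    return kept
```

Each rule is independent, so leaving a parameter at its default effectively disables or relaxes that rule, matching the "any one or more" wording of the embodiment.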

Referring to FIG. 6, in some specific embodiments, step S1320, quantitatively determining the semantic similarity between each valid candidate long-tail word and the product title, includes the following steps:

Step S1321: encode the valid candidate long-tail words and the product title to obtain their embedding vectors:

To compute the semantic similarity between the valid candidate long-tail words and the product title, the product title and each valid candidate long-tail word are first segmented into words in the conventional way. Each resulting word set is then converted into an embedding vector according to a vocabulary, which completes the encoding.

Step S1322: use a pre-trained text feature extraction model to extract high-level semantic information from the embedding vectors of the valid candidate long-tail words and the product title, obtaining their respective semantic feature vectors:

Next, a text feature extraction model performs representation learning on the embedding vectors of the product title and of the valid candidate long-tail words involved in the calculation, extracts their deep semantic information, and obtains the corresponding semantic feature vectors, that is, sentence vectors. The text feature extraction model may be a pre-trained model, or a model obtained by those skilled in the art by fine-tuning a pre-trained model, as long as it can extract the corresponding semantic feature vector from the embedding vector of a text.

The text feature extraction model may be based on any commonly used base model, such as LSTM, BERT, or Sentence-Transformer, and may be chosen flexibly by those skilled in the art.

Step S1323: use a preset data distance algorithm to calculate the data distance between the semantic feature vector of the product title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity:

Once the semantic feature vectors of the product title and of each valid candidate long-tail word have been determined, a preset data distance algorithm can be used to calculate the similarity between the product title's semantic feature vector and that of each valid candidate long-tail word. The data distance algorithm may be any traditional algorithm for calculating the distance between data points, such as cosine similarity, Euclidean distance, Minkowski distance, the Pearson correlation coefficient, or the Jaccard coefficient.

Taking the cosine similarity algorithm as an example, the corresponding formula is as follows:

$$\cos(A, B) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

where A is the semantic feature vector of the product title, B is the semantic feature vector of a single valid candidate long-tail word, and n is the total number of elements in a semantic feature vector.

After the preset data distance algorithm has been applied to calculate the data distance between the product title and each valid candidate long-tail word, the distances can be normalized to a common scale as needed, so that a higher value indicates a higher semantic similarity. This gives the numerical semantic similarity of each valid candidate long-tail word, on which the later selection of the target long-tail word is based.
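As an illustration of the cosine similarity option and the normalization step, the following sketch works on plain lists of floats standing in for semantic feature vectors. The function names and the choice to map cosine values from [-1, 1] onto [0, 1] are assumptions; the source only requires that a higher normalized value mean a higher similarity.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Standard cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a zero vector as unrelated
    return dot / (norm_a * norm_b)


def normalized_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map the cosine value from [-1, 1] onto [0, 1]; higher means more similar."""
    return (cosine_similarity(a, b) + 1.0) / 2.0
```

If a distance such as Euclidean distance were used instead, the normalization would need to invert the value, since a smaller distance means a closer match.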

In this embodiment, a deep-learning text feature model is used to calculate the semantic similarity between the valid candidate long-tail words and the product title. The resulting semantic similarity better represents how closely each valid candidate long-tail word relates to the product title, so the later selection among the valid candidates is more accurate.

Referring to FIG. 7, in some further embodiments, step S1400, configuring the target long-tail word into the page title of the product display page, includes the following steps:

Step S1410: display the search optimization page corresponding to the product display page so as to present the page title input box:

Referring again to FIG. 3, the interface shown there displays in advance a search optimization page corresponding to the product display page. The search optimization page contains a page title input box, which receives the target long-tail word determined by any of the above embodiments of the present application.

Step S1420: configure the target long-tail word as the content data of the page title input box:

After the present application obtains the target long-tail word through any of the above embodiments, the word can be filled automatically into the page title input box of the search optimization page and thereby configured as the content data of the page title. The operating user may edit it further, or simply accept the automatically generated target long-tail word.

Step S1430: in response to a user submission instruction, publish the product display page and the search optimization page:

Once the operating user has entered all the data required by the product display page and the search optimization page, operating the submit control triggers a user submission instruction. The product display page and the search optimization page are then submitted to the backend of the independent site, which publishes the product display page and at the same time submits the data entered in the search optimization page, including the page title. This completes the publishing process for one product display page.

The page title of every product display page can be determined in this way, so that the long-tail effect of keywords draws traffic to the independent e-commerce site.

This embodiment illustrates a usage scenario for the target long-tail word automatically generated by the present application. With the help of this technical solution, an independent site does not need to pay undue attention to search engine optimization when publishing each product. It can simply use the target long-tail word that the present application generates automatically from the given product title as the product's page title. This greatly improves the efficiency of product publishing on the independent site and ensures that the page titles generated automatically for a large number of product display pages help raise the site's overall traffic through the long-tail effect.

Referring to FIG. 8, an e-commerce site promotion configuration device, provided to serve one of the purposes of the present application, is a functional embodiment of the e-commerce site promotion configuration method of the present application. The device includes a search construction module 1100, an indicator acquisition module 1200, a target determination module 1300, and a search optimization module 1400. The search construction module 1100 is configured to construct a search text from the product word and attribute words of the product title in a product display page. The indicator acquisition module 1200 is configured to obtain statistical data matching the search text, the statistical data including a plurality of candidate long-tail words and their historical search statistical indicators. The target determination module 1300 is configured to determine, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word. The search optimization module 1400 is configured to configure the target long-tail word into the page title of the product display page.
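The way the four modules hand data to one another can be sketched as a small pipeline. This is a hypothetical composition for illustration only: the function `configure_page_title` and its callable parameters are assumed names, and each callable stands in for one of modules 1100 to 1400 without implementing it.

```python
from typing import Callable, Sequence


def configure_page_title(
    product_title: str,
    build_search_text: Callable[[str], str],            # stands in for module 1100
    fetch_statistics: Callable[[str], Sequence[str]],   # stands in for module 1200
    choose_target: Callable[[str, Sequence[str]], str], # stands in for module 1300
    apply_title: Callable[[str], None],                 # stands in for module 1400
) -> str:
    """Run the four stages in order and return the chosen target long-tail word."""
    search_text = build_search_text(product_title)
    statistics = fetch_statistics(search_text)
    target = choose_target(product_title, statistics)
    apply_title(target)
    return target


# Usage with trivial stand-ins for each module.
applied = []
result = configure_page_title(
    "Red Cotton Summer Dress",
    build_search_text=lambda title: title.lower(),
    fetch_statistics=lambda query: [query + " for women", query + " sale"],
    choose_target=lambda title, stats: stats[0],
    apply_title=applied.append,
)
```

Passing the stages in as callables keeps each module replaceable, which matches the source's statement that the modules are a functional embodiment of the method steps.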

In some further embodiments, the search construction module 1100 includes: a title acquisition unit, configured to obtain the product title that has been entered in the product display page; a word segmentation and recognition unit, configured to perform word segmentation and part-of-speech recognition on the product title to obtain a word set made up of multiple segmented words, including both product words and attribute words; a weight quantification unit, configured to perform keyword extraction on the word set in association with the product title and determine a search weight for each segmented word, the search weight representing the potential search value of that word; and a search expression unit, configured to determine the single product word with the highest search weight and a predetermined number of attribute words and concatenate them into the search text.
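The final step of the search construction module, picking the top product word plus a fixed number of attribute words, might look like the sketch below. The `Token` record, the role labels, and the ordering of the concatenation (attribute words first, product word last, joined by spaces) are assumptions; the source only says the words are concatenated into a search text.

```python
from typing import List, NamedTuple, Sequence


class Token(NamedTuple):
    text: str
    role: str      # "product" or "attribute" (hypothetical labels)
    weight: float  # search weight from keyword extraction


def build_search_text(tokens: Sequence[Token], num_attributes: int = 2) -> str:
    """Join the top attribute words with the single highest-weighted product word."""
    products = [t for t in tokens if t.role == "product"]
    if not products:
        raise ValueError("the title contains no product word")
    top_product = max(products, key=lambda t: t.weight)
    attributes: List[Token] = sorted(
        (t for t in tokens if t.role == "attribute"),
        key=lambda t: t.weight,
        reverse=True,
    )[:num_attributes]
    return " ".join([a.text for a in attributes] + [top_product.text])
```

For instance, with product words "dress" (0.9) and "skirt" (0.4) and attribute words "cotton" (0.7), "summer" (0.6), and "red" (0.5), a call with `num_attributes=2` keeps "cotton", "summer", and "dress".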

In some further embodiments, the indicator acquisition module 1200 is configured to call a search interface to obtain statistical data for candidate long-tail words that match the search text. The statistical data contains keywords compiled from the historical search behavior data of a large number of users, together with their statistical indicators. The statistical indicators include the average search volume of each keyword and the competition degree reflecting its use by different websites, and the keywords are long-tail words.

In some further embodiments, the target determination module 1300 includes: a data cleaning unit, configured to perform data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words; a similarity quantification unit, configured to quantitatively determine the semantic similarity between each valid candidate long-tail word and the product title; a score quantification unit, configured to use the average search volume as the matching weight of the semantic similarity and calculate the comprehensive score of each valid candidate long-tail word; and a target selection unit, configured to determine the valid candidate long-tail word with the highest comprehensive score as the target long-tail word.

In some specific embodiments, the data cleaning unit includes any one or more of the following submodules: a word count cleaning submodule, configured to delete from the statistical data the candidate long-tail words whose word count is below a preset value; a time cleaning submodule, configured to delete from the statistical data the candidate long-tail words that fall within a preset time range; a competition degree cleaning submodule, configured to delete from the statistical data the candidate long-tail words whose competition degree is above a preset level; and a search volume cleaning submodule, configured to delete from the statistical data the candidate long-tail words whose average search volume is above a preset threshold.

In some specific embodiments, the similarity quantification unit includes: a vector encoding subunit, configured to encode the valid candidate long-tail words and the product title to obtain their embedding vectors; a semantic extraction subunit, configured to use a pre-trained text feature extraction model to extract high-level semantic information from the embedding vectors of the valid candidate long-tail words and the product title, obtaining their respective semantic feature vectors; and a similarity calculation subunit, configured to use a preset data distance algorithm to calculate the data distance between the semantic feature vector of the product title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity.

In some further embodiments, the search optimization module 1400 includes: a page display unit, configured to display the search optimization page corresponding to the product display page so as to present the page title input box; an automatic editing unit, configured to configure the target long-tail word as the content data of the page title input box; and an optimized publishing unit, configured to publish the product display page and the search optimization page in response to a user submission instruction.

To solve the above technical problems, the embodiments of the present application also provide a computer device. FIG. 9 is a schematic diagram of the internal structure of the computer device. The computer device includes a processor, a computer-readable storage medium, a memory, and a network interface connected by a system bus. The computer-readable storage medium of the computer device stores an operating system, a database, and computer-readable instructions. The database may store sequences of control information, and the computer-readable instructions, when executed by the processor, cause the processor to implement an e-commerce site promotion configuration method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer-readable instructions which, when executed by the processor, cause the processor to execute the e-commerce site promotion configuration method of the present application. The network interface of the computer device is used to connect to and communicate with terminals. Those skilled in the art will understand that the structure shown in FIG. 9 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In this embodiment, the processor executes the specific functions of each module and submodule in FIG. 8, and the memory stores the program code and the various data required to execute those modules and submodules. The network interface is used for data transmission with user terminals or servers. The memory in this embodiment stores the program code and data required to execute all modules and submodules of the e-commerce site promotion configuration device of the present application, and the server can call its program code and data to execute the functions of all submodules.

The present application further provides a storage medium storing computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors execute the steps of the e-commerce site promotion configuration method of any embodiment of the present application.

The present application also provides a computer program product, including a computer program or instructions that, when executed by one or more processors, implement the steps of the method described in any embodiment of the present application.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments of the present application can be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a computer-readable storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random access memory (RAM).

In summary, the present application can determine a long-tail keyword from the product title of the product shown on a product display page and automatically configure it as the page title of that page, achieving search engine keyword optimization. By exploiting the long-tail effect, the long-tail keywords of a large number of individual product display pages help those pages rank near the top of search results, which improves the ability of the e-commerce platform's independent site as a whole to draw traffic from search engines.

Those skilled in the art will understand that the steps, measures, and solutions in the various operations, methods, and processes discussed in the present application may be alternated, modified, combined, or deleted. Further, other steps, measures, and solutions in the various operations, methods, and processes discussed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted. Further, steps, measures, and solutions in the prior art that correspond to the various operations, methods, and processes disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.

The above are only some of the embodiments of the present application. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements shall also be regarded as falling within the protection scope of the present application.

Claims (10)

1. An e-commerce site promotion configuration method, characterized by comprising the following steps:
constructing a search text from the product word and attribute words of the product title in a product display page;
obtaining statistical data matching the search text, the statistical data including a plurality of candidate long-tail words and their historical search statistical indicators;
determining, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word;
configuring the target long-tail word into the page title of the product display page.

2. The e-commerce site promotion configuration method according to claim 1, characterized in that constructing a search text from the product word and attribute words of the product title in a product display page comprises the following steps:
obtaining the product title that has been entered in the product display page;
performing word segmentation and part-of-speech recognition on the product title to obtain a word set made up of multiple segmented words, including both product words and attribute words;
performing keyword extraction on the word set in association with the product title to determine a search weight for each segmented word, the search weight representing the potential search value of that word;
determining the single product word with the highest search weight and a predetermined number of attribute words, and concatenating them into the search text.

3. The e-commerce site promotion configuration method according to claim 1, characterized in that obtaining statistical data matching the search text, the statistical data including a plurality of candidate long-tail words and their historical search statistical indicators, comprises the following step:
calling a search interface to obtain statistical data for candidate long-tail words that match the search text, the statistical data containing keywords compiled from the historical search behavior data of a large number of users together with their statistical indicators, the statistical indicators including the average search volume of each keyword and the competition degree reflecting its use by different websites, the keywords being long-tail words.

4. The e-commerce site promotion configuration method according to claim 3, characterized in that determining, according to the statistical indicators, the single candidate long-tail word that semantically matches the product title as the target long-tail word comprises the following steps:
performing data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words;
quantitatively determining the semantic similarity between each valid candidate long-tail word and the product title;
using the average search volume as the matching weight of the semantic similarity and calculating the comprehensive score of each valid candidate long-tail word;
determining the valid candidate long-tail word with the highest comprehensive score as the target long-tail word.

5. The e-commerce site promotion configuration method according to claim 4, characterized in that performing data cleaning on the statistical data according to the statistical indicators to obtain valid candidate long-tail words comprises any one or more of the following steps:
deleting from the statistical data the candidate long-tail words whose word count is below a preset value;
deleting from the statistical data the candidate long-tail words that fall within a preset time range;
deleting from the statistical data the candidate long-tail words whose competition degree is above a preset level;
deleting from the statistical data the candidate long-tail words whose average search volume is above a preset threshold.

6. The e-commerce site promotion configuration method according to claim 4, characterized in that quantitatively determining the semantic similarity between each valid candidate long-tail word and the product title comprises the following steps:
encoding the valid candidate long-tail words and the product title to obtain their embedding vectors;
using a pre-trained text feature extraction model to extract high-level semantic information from the embedding vectors of the valid candidate long-tail words and the product title, obtaining their respective semantic feature vectors;
using a preset data distance algorithm to calculate the data distance between the semantic feature vector of the product title and the semantic feature vector of each valid candidate long-tail word as the corresponding semantic similarity.

7. The e-commerce site promotion configuration method according to any one of claims 1 to 6, characterized in that configuring the target long-tail word into the page title of the product display page comprises the following steps:
displaying the search optimization page corresponding to the product display page so as to present the page title input box;
configuring the target long-tail word as the content data of the page title input box;
publishing the product display page and the search optimization page in response to a user submission instruction.

8. An e-commerce site promotion configuration device, characterized by comprising:
An e-commerce site promotion configuration device, characterized in that, comprising: 搜索构造模块,用于根据商品展示页面中商品标题的产品词和属性词构造搜索文本;The search construction module is used to construct the search text according to the product word and attribute word of the product title in the product display page; 指标获取模块,用于获取与搜索文本相匹配的统计数据,所述统计数据包括多个候选长尾词及其历史搜索统计指标;an indicator acquisition module, used for acquiring statistical data matching the search text, the statistical data including a plurality of candidate long-tail words and their historical search statistical indicators; 目标确定模块,用于根据所述统计指标确定与所述商品标题在语义上相匹配的唯一一个候选长尾词作为目标长尾词;A target determination module, configured to determine, according to the statistical index, the only candidate long-tail word that semantically matches the commodity title as a target long-tail word; 搜索优化模块,用于将所述目标长尾词配置到所述商品展示页面的页面标题中。A search optimization module, configured to configure the target long-tail keyword into the page title of the product display page. 9.一种计算机设备,包括中央处理器和存储器,其特征在于,所述中央处理器用于调用运行存储于所述存储器中的计算机程序以执行如权利要求1至7中任意一项所述的方法的步骤。9. A computer device comprising a central processing unit and a memory, wherein the central processing unit is used to call and run a computer program stored in the memory to execute the computer program according to any one of claims 1 to 7 steps of the method. 10.一种计算机可读存储介质,其特征在于,其以计算机可读指令的形式存储有依据权利要求1至7中任意一项所述的方法所实现的计算机程序,该计算机程序被计算机调用运行时,执行相应的方法所包括的步骤。10. A computer-readable storage medium, characterized in that it stores a computer program implemented by the method according to any one of claims 1 to 7 in the form of computer-readable instructions, and the computer program is called by a computer At runtime, the steps included in the corresponding method are executed.
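
The selection steps of claims 4 and 5 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Candidate`, `clean`, `cosine` and `pick_target`, the threshold values, and the integer competition scale are all hypothetical. A bag-of-words cosine similarity stands in for the pre-trained text feature extraction model of claim 6, and the composite score is assumed to be the product of similarity and average search volume, which is one plausible reading of "matching weight".

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Candidate:
    phrase: str          # candidate long-tail word
    avg_searches: float  # average search volume
    competition: int     # competition degree (assumed scale: 1 = low, 3 = high)


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for a pre-trained model)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def clean(cands, min_words=3, max_competition=2):
    """Drop candidates that are too short or too contested (assumed thresholds)."""
    return [
        c for c in cands
        if len(c.phrase.split()) >= min_words and c.competition <= max_competition
    ]


def pick_target(title, cands):
    """Return the valid candidate with the highest composite score, or None."""
    valid = clean(cands)
    if not valid:
        return None
    # Assumed scoring: similarity weighted by average search volume.
    return max(valid, key=lambda c: cosine(title, c.phrase) * c.avg_searches)


if __name__ == "__main__":
    title = "women red cotton summer dress"
    cands = [
        Candidate("red dress", 9000, 3),
        Candidate("red cotton summer dress", 400, 1),
        Candidate("summer dress for women", 500, 2),
        Candidate("blue winter wool coat", 800, 1),
    ]
    print(pick_target(title, cands))
```

In this toy data the two-word, high-competition phrase is removed during cleaning, and the unrelated phrase scores zero on similarity despite its larger search volume. A production system would replace `cosine` with distances between model-derived semantic feature vectors, as claim 6 describes.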

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210383174.5A CN114663164B (en) 2022-04-12 2022-04-12 E-commerce site promotion configuration method and device, equipment, medium and product thereof

Publications (2)

Publication Number Publication Date
CN114663164A true CN114663164A (en) 2022-06-24
CN114663164B CN114663164B (en) 2025-07-22

Family

ID=82035724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210383174.5A Active CN114663164B (en) 2022-04-12 2022-04-12 E-commerce site promotion configuration method and device, equipment, medium and product thereof

Country Status (1)

Country Link
CN (1) CN114663164B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104391983A (en) * 2014-12-10 2015-03-04 郑州悉知信息技术有限公司 Method and system for releasing product information in batch
CN106021562A (en) * 2016-05-31 2016-10-12 北京京拍档科技有限公司 Method for recommending E-commerce platform based on theme relevance
CN110502687A (en) * 2019-08-22 2019-11-26 山东开创云软件有限公司 A kind of web information flow method and apparatus
CN113449165A (en) * 2020-03-25 2021-09-28 腾讯科技(深圳)有限公司 Search engine optimization method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
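王艳丽; 王静婷: "基于长尾关键词的搜索引擎优化研究" [Research on search engine optimization based on long-tail keywords], 计算机光盘软件与应用 [Computer CD Software and Applications], no. 24, 15 December 2012 (2012-12-15) *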

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115545832A (en) * 2022-10-08 2022-12-30 广州欢聚时代信息科技有限公司 Commodity search recommendation method and its device, equipment, medium
CN115470322A (en) * 2022-10-21 2022-12-13 深圳市快云科技有限公司 Keyword generation system and method based on artificial intelligence
CN115719066A (en) * 2022-11-18 2023-02-28 北京百度网讯科技有限公司 Search text understanding method, device, equipment and medium based on artificial intelligence
CN117151082A (en) * 2023-10-30 2023-12-01 量子数科科技有限公司 Commodity title SPU keyword extraction method based on large language model
CN117151082B (en) * 2023-10-30 2024-01-02 量子数科科技有限公司 Commodity title SPU keyword extraction method based on large language model

Also Published As

Publication number Publication date
CN114663164B (en) 2025-07-22

Similar Documents

Publication Publication Date Title
CN110162593B (en) Search result processing and similarity model training method and device
CN111737559B (en) Resource sorting method, method for training sorting model and corresponding device
CN108733766B (en) Data query method and device and readable medium
CN119631069A (en) Systems and methods for real-time search-based generative artificial intelligence
CN118103834A (en) Information acquisition method and device
CN114663164A (en) E-commerce site promotion configuration method and its device, equipment, medium and product
CN110619051B (en) Question sentence classification method, device, electronic equipment and storage medium
CN110442777A (en) Pseudo-linear filter model information search method and system based on BERT
US20230386238A1 (en) Data processing method and apparatus, computer device, and storage medium
CN113505196B (en) Text retrieval method, device, electronic equipment and storage medium based on part of speech
CN113806588B (en) Method and device for searching videos
CN111414471B (en) Method and device for outputting information
CN111274366B (en) Search recommendation method, device, equipment, and storage medium
CN114971730A (en) Method for extracting file material, device, equipment, medium and product thereof
CN111125528A (en) Information recommendation method and device
CN111813993A (en) Video content expanding method and device, terminal equipment and storage medium
WO2025007478A1 (en) Recommendation words generation method for content recommendation, related apparatus, and medium
CN108959413A (en) A kind of topical webpage clawing method and Theme Crawler of Content system
CN114782142A (en) Commodity information matching method and device, equipment, medium and product thereof
CN115129819A (en) Method for producing text abstract model and its device, equipment and medium
CN115129913A (en) Sensitive word mining method and its device, equipment and medium
CN118551759A (en) Keyword generation method, device, equipment and medium based on hybrid expert model
CN117290482A (en) Knowledge base retrieval method and device
CN112347289B (en) Image management method and terminal
CN111078849A (en) Method and apparatus for outputting information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant