
WO2020052059A1 - Method and apparatus for generating information - Google Patents

Method and apparatus for generating information

Info

Publication number
WO2020052059A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
target
search
search term
term
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2018/115951
Other languages
English (en)
Chinese (zh)
Inventor
邓江东
李磊
马维英
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-09-14
Filing date
2018-11-16
Publication date
2020-03-19
Application filed by Beijing ByteDance Network Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking

Definitions

  • An embodiment of the present application provides a method for generating information.
  • The method includes: obtaining a target search term set; for a target search term in the target search term set, determining whether a preset word set includes a corresponding term having a pre-established correspondence relationship with the target search term; in response to determining that at least one corresponding term is included, determining the similarity between each corresponding term in the at least one corresponding term and the target search term; extracting, in descending order of similarity, a target number of corresponding terms from the at least one corresponding term as a corresponding term set for the target search term; and generating at least one search term set based on the obtained corresponding term set.
  • At least one subset included in the preset word set is obtained in advance by the following steps: obtaining a target text set; performing word segmentation on each target text in the target text set to obtain a word set; and clustering the words in the word set by synonymy to obtain at least one subset. For each subset in the at least one subset, the similarity between the words it includes is greater than or equal to a preset similarity threshold.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating information according to an embodiment of the present application.
  • The above-mentioned execution subject may extract a target number of corresponding terms from the at least one corresponding term, in descending order of similarity, as the corresponding term set for the target search term.
  • The target number may be a preset number, or a number determined from how many corresponding terms the at least one corresponding term for the target search term contains. For example, when the number of corresponding terms is greater than or equal to the preset number, the target number is the preset number; otherwise, the target number is the number of corresponding terms.
  • Step 402: For a target search term in the target search term set, determine whether a preset word set includes a corresponding term having a pre-established correspondence relationship with the target search term; and in response to determining that it includes at least one corresponding term, determine at least one corresponding term.
  • Step 402 is substantially the same as step 202 in the embodiment corresponding to FIG. 2, and details are not described herein again.
  • The process 400 of the method for generating information in this embodiment highlights the steps of searching with the generated at least one search term set and outputting the search results. The solution described in this embodiment can therefore use the generated search term sets to obtain more comprehensive and targeted search results.
  • This application provides an embodiment of an apparatus for generating information.
  • The apparatus embodiment corresponds to the method embodiment shown in FIG. 2.
  • The apparatus can be applied to various electronic devices.
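
The extraction rule in the excerpts above (rank by similarity from large to small, then keep a target number of entries) can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names `target_number` and `extract_corresponding_words`, the dictionary input, and the sample similarity scores are all assumptions made for illustration.

```python
from typing import Dict, List


def target_number(num_corresponding: int, preset_number: int) -> int:
    """Pick the target number as described in the excerpt: the preset number
    when enough corresponding words exist, otherwise all of them."""
    if num_corresponding >= preset_number:
        return preset_number
    return num_corresponding


def extract_corresponding_words(
    similarities: Dict[str, float], preset_number: int
) -> List[str]:
    """Return corresponding words ordered by similarity, largest first,
    truncated to the target number."""
    k = target_number(len(similarities), preset_number)
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:k]]


# Hypothetical similarity scores for the target search term "laptop".
scores = {"notebook": 0.91, "tablet": 0.55, "ultrabook": 0.84, "computer": 0.78}
print(extract_corresponding_words(scores, preset_number=3))
print(extract_corresponding_words(scores, preset_number=10))
```

With a preset number of 3 the three most similar words are kept; with a preset number of 10 the candidate list is shorter than the preset, so all four words are returned in ranked order.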

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a method and an apparatus for generating information. The method comprises: obtaining at least one search term set (201); for a target search term in a target search term set, determining whether a preset word set includes a corresponding term having a pre-established correspondence with the target search term; in response to determining that at least one corresponding term is included, determining the degree of similarity between each corresponding term in the at least one corresponding term and the target search term; extracting, in descending order of similarity, a target number of corresponding terms from the at least one corresponding term as a corresponding term set for the target search term (202); and generating at least one search term set based on the obtained corresponding term set (203). The method helps improve the intelligibility and relevance of information search.
PCT/CN2018/115951 2018-09-14 2018-11-16 Method and apparatus for generating information Ceased WO2020052059A1 (fr)
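
Steps 201 to 203 of the abstract can be sketched end to end as follows. This is only an illustration under stated assumptions: the abstract does not say how search term sets are built from the corresponding term sets, so the Cartesian product used here, the choice to keep each original term as one option, the `PRESET` table, and all function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical preset word set: each term maps to its corresponding terms
# together with a precomputed similarity score.
PRESET: Dict[str, Dict[str, float]] = {
    "cheap": {"affordable": 0.90, "budget": 0.80, "inexpensive": 0.85},
    "laptop": {"notebook": 0.92},
}


def corresponding_term_set(
    term: str, preset: Dict[str, Dict[str, float]], target_number: int
) -> List[str]:
    """Step 202: look up corresponding terms and keep the most similar ones."""
    candidates = preset.get(term)
    if not candidates:
        return []  # no pre-established correspondence for this term
    ranked = sorted(candidates, key=candidates.get, reverse=True)
    return ranked[:target_number]


def generate_search_term_sets(
    target_terms: List[str],
    preset: Dict[str, Dict[str, float]],
    target_number: int,
) -> List[Tuple[str, ...]]:
    """Step 203 (assumed strategy): combine one option per target term,
    where the options are the term itself plus its corresponding terms."""
    options = [
        [term] + corresponding_term_set(term, preset, target_number)
        for term in target_terms
    ]
    return [tuple(combo) for combo in product(*options)]


# Step 201: a target search term set, here taken from a hypothetical query.
for term_set in generate_search_term_sets(["cheap", "laptop"], PRESET, 2):
    print(term_set)
```

Each generated tuple could then be submitted as a separate query, which is one way the expanded sets might broaden the results; a real system might also cap the number of combinations.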

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811075006.XA CN109213916A (zh) 2018-09-14 2018-09-14 Method and apparatus for generating information
CN201811075006.X 2018-09-14

Publications (1)

Publication Number Publication Date
WO2020052059A1 (fr) 2020-03-19

Family

ID=64984182

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/115951 Ceased WO2020052059A1 (fr) 2018-09-14 2018-11-16 Method and apparatus for generating information

Country Status (2)

Country Link
CN (1) CN109213916A (fr)
WO (1) WO2020052059A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
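CN112307281B (zh) * 2019-07-25 2024-10-29 北京搜狗科技发展有限公司 Entity recommendation method and apparatus
CN110688837B (zh) * 2019-09-27 2023-10-31 北京百度网讯科技有限公司 Data processing method and apparatus
CN111078849B (zh) * 2019-12-02 2023-07-25 百度在线网络技术(北京)有限公司 Method and apparatus for outputting information
CN112347365B (zh) * 2020-11-25 2025-02-07 腾讯科技(深圳)有限公司 Method and apparatus for determining target search information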

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102855252A (zh) * 2011-06-30 2013-01-02 北京百度网讯科技有限公司 Demand-based data retrieval method and apparatus
US20140358879A1 (en) * 2012-05-31 2014-12-04 International Business Machines Corporation Search engine suggestion
CN107544982A (zh) * 2016-06-24 2018-01-05 中兴通讯股份有限公司 Text information processing method, apparatus, and terminal
CN108491387A (zh) * 2018-03-20 2018-09-04 百度在线网络技术(北京)有限公司 Method and apparatus for outputting information

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180101606A1 (en) * 2016-10-07 2018-04-12 Abel Torres Montoya Method and system for searching for relevant items in a collection of documents given user defined documents
CN106547732A (zh) * 2016-10-14 2017-03-29 深圳中兴网信科技有限公司 Near-synonym recognition method and near-synonym recognition system
CN107451126B (zh) * 2017-08-21 2020-07-28 广州多益网络股份有限公司 Near-synonym screening method and system
CN108509474B (zh) * 2017-09-15 2022-01-07 腾讯科技(深圳)有限公司 Synonym expansion method and apparatus for search information
CN107766498B (zh) * 2017-10-19 2022-01-07 北京百度网讯科技有限公司 Method and apparatus for generating information

Also Published As

Publication number Publication date
CN109213916A (zh) 2019-01-15

Similar Documents

Publication Publication Date Title
CN106383875B (zh) Artificial intelligence-based human-computer interaction method and apparatus
CN111522927B (zh) Knowledge graph-based entity query method and apparatus
CN107679039A (zh) Method and apparatus for determining sentence intent
CN107944025A (zh) Information push method and apparatus
US10108602B2 (en) Dynamic portmanteau word semantic identification
US20180329985A1 (en) Method and Apparatus for Compressing Topic Model
WO2020103899A1 (fr) Method for generating infographic information and method for generating an image database
CN107526718B (zh) Method and apparatus for generating text
CN110263142A (zh) Method and apparatus for outputting information
US10572601B2 (en) Unsupervised template extraction
WO2020042377A1 (fr) Method and apparatus for outputting information
US9830316B2 (en) Content availability for natural language processing tasks
CN117312641A (zh) Method, apparatus, device, and storage medium for intelligently acquiring information
WO2020052069A1 (fr) Method and apparatus for word segmentation
WO2020052059A1 (fr) Method and apparatus for generating information
WO2020052061A1 (fr) Method and device for processing information
US20150169539A1 (en) Adjusting Time Dependent Terminology in a Question and Answer System
CN112182255A (zh) Method and apparatus for storing media files and for retrieving media files
CN114579703A (zh) Text search intent recognition method and apparatus, electronic device, and storage medium
CN114298007A (zh) Text similarity determination method, apparatus, device, and medium
CN110275962A (zh) Method and apparatus for outputting information
JP2023002690A (ja) Semantics recognition method, apparatus, electronic device, and storage medium
CN108038172A (zh) Artificial intelligence-based search method and apparatus
CN113343664B (zh) Method and apparatus for determining the degree of matching between image and text
WO2020052060A1 (fr) Method and apparatus for generating a correction instruction

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 18933655

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN EP: public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 24/06/2021)

122 EP: PCT application non-entry in European phase

Ref document number: 18933655

Country of ref document: EP

Kind code of ref document: A1