CN104424252A - Verbal information processing method based on extensive markup language and verbal content server - Google Patents

Verbal information processing method based on extensive markup language and verbal content server

Info

Publication number
CN104424252A
CN104424252A CN201310381678.4A CN201310381678A CN104424252A CN 104424252 A CN104424252 A CN 104424252A CN 201310381678 A CN201310381678 A CN 201310381678A CN 104424252 A CN104424252 A CN 104424252A
Authority
CN
China
Prior art keywords
information
text
evaluation
text information
content server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310381678.4A
Other languages
Chinese (zh)
Other versions
CN104424252B (en)
Inventor
毕继安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING BEIDA FOUNDER ELECTRONICS Co Ltd
New Founder Holdings Development Co ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd and Beijing Founder Electronics Co Ltd
Priority to CN201310381678.4A
Publication of CN104424252A
Application granted
Publication of CN104424252B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/83 Querying
    • G06F16/835 Query processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an XML-based text information processing method and a text content server. The method comprises: a text content server obtains a piece of text information to be processed; the text content server uses at least two pre-stored evaluation rule description files to determine evaluation information for that piece of text information, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information and is a file described in XML; and the text content server marks the piece of text information with each item of evaluation information obtained. When searching, the user can thus obtain an overall evaluation of the text from the evaluation information and find the required content as quickly as possible.

Description

Text information processing method and text content server based on XML

Technical field

The present invention relates to computer technology, and in particular to a text information processing method and a text content server based on Extensible Markup Language (hereinafter XML).

Background

With the continuous development of Internet technology, humanity is entering an era of information explosion. People can obtain large amounts of text information over the network, such as news, papers, and microblog (Weibo) posts.

When users search this mass of text for the information they need, they are often disturbed by invalid information, information they are not interested in, and even illegal information, which makes searching inconvenient.

Therefore, given the large amount of text information that now exists, how to let users conveniently learn the overall evaluation of a piece of text information, so that they can find the content they need more quickly, has become a problem that urgently needs to be solved.

Summary of the invention

The invention provides an XML-based text information processing method and a text content server.

The invention provides an XML-based text information processing method, comprising:

a text content server obtains a piece of text information to be processed;

the text content server uses at least two pre-stored evaluation rule description files to determine the evaluation information of the piece of text information to be processed, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information, and each evaluation rule description file is a file described in XML;

the text content server marks the piece of text information with each item of evaluation information obtained.

The invention provides a text content server, comprising:

an acquisition module, configured to determine a piece of text information to be processed;

a determination module, configured to use at least two pre-stored evaluation rule description files to determine the evaluation information of the piece of text information to be processed, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information, and each evaluation rule description file is a file described in XML;

a marking module, configured to mark the piece of text information with each item of evaluation information obtained.

In the invention, the text content server can pre-store evaluation rule description files that separately describe the evaluation information corresponding to at least two types of characteristic content of a piece of text information. Using these files, each type of characteristic content of a piece of text information can be evaluated, so that the text is evaluated comprehensively from several angles, and the text is then marked with the resulting evaluation information. When searching, the user can obtain an overall evaluation of the text from this evaluation information and find the required content as quickly as possible. Moreover, because the evaluation rule description files are described in XML, they are general-purpose and easy to extend.

Brief description of the drawings

Fig. 1 is a flowchart of an embodiment of the XML-based text information processing method of the invention;

Fig. 2 is a schematic structural diagram of an embodiment of the text content server of the invention.

Detailed description

Fig. 1 is a flowchart of an embodiment of the XML-based text information processing method of the invention. As shown in Fig. 1, the method of this embodiment may comprise:

Step 101: the text content server obtains a piece of text information to be processed;

Step 102: the text content server uses at least two pre-stored evaluation rule description files to determine the evaluation information of the text information to be processed, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information, and each evaluation rule description file is a file described in XML;

Step 103: the text content server marks the piece of text information with each item of evaluation information obtained.
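Steps 101 to 103 can be sketched roughly as follows. This is only an illustration under stated assumptions: the two scoring functions stand in for evaluators that would be derived from the XML rule description files, and their names, thresholds, and point values (`score_length`, `score_keyword`, `RULES`, `process`) are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict

# Stand-ins for evaluators that would be built from the XML rule files.
def score_length(text: str) -> int:
    # Hypothetical rule: 80 points for at least 10 whitespace-separated words.
    return 80 if len(text.split()) >= 10 else 40

def score_keyword(text: str, keyword: str = "XML") -> int:
    # Hypothetical rule: 20 points per keyword occurrence, capped at 100.
    return min(100, 20 * text.count(keyword))

# At least two rules, one per type of characteristic content.
RULES: Dict[str, Callable[[str], int]] = {
    "word_count": score_length,
    "keyword_frequency": score_keyword,
}

def process(text: str) -> dict:
    # Step 101: the text to be processed has been obtained (passed in here).
    # Step 102: apply every stored rule to get one evaluation per feature.
    evaluations = {name: rule(text) for name, rule in RULES.items()}
    # Step 103: mark the text by attaching the evaluations to it.
    return {"text": text, "evaluations": evaluations}

record = process(
    "XML rules describe how an XML server scores each piece of text it stores"
)
print(record["evaluations"])
```

Keeping the rules in a mapping keyed by feature name means a further type of characteristic content can be added without changing `process`.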

Specifically, the text content server may generate and store at least two evaluation rule description files in advance. Each evaluation rule description file is a file described in XML and can be used to describe the evaluation information corresponding to one type of characteristic content of a piece of text information.

For example, the word count, sensitive words, and the frequency with which keywords occur can each serve as characteristic content of a piece of text information. Correspondingly, the text content server can store an evaluation rule description file for each type of characteristic content.

Taking word count as the characteristic content, the corresponding evaluation rule may, for example, be:

an article whose overall word count is 0-100 and whose title exceeds 30 words scores 60 points;

an article whose overall word count is 100-200 scores 70 points;

an article whose overall word count is 200-300 scores 80 points;

an article whose overall word count is 300-400 scores 90 points;

an article whose overall word count is 500 or more scores 100 points;

all other articles score 0 points.
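The word-count rule above can be sketched in code as follows. This is a minimal illustration, not the listing from the patent: the XML element and attribute names (`rule`, `band`, `min`, `max`, `minTitle`, `score`, `default`) and the function `score_word_count` are hypothetical. The band boundaries are assumed to be half-open (`min` inclusive, `max` exclusive) because the ranges in the rule share their endpoints, and the title condition is read as strictly more than 30.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file; element and attribute names are illustrative only.
WORD_COUNT_RULE = """
<rule id="word_count" default="0">
  <band min="0" max="100" minTitle="30" score="60"/>
  <band min="100" max="200" score="70"/>
  <band min="200" max="300" score="80"/>
  <band min="300" max="400" score="90"/>
  <band min="500" score="100"/>
</rule>
"""

def score_word_count(rule_xml: str, body_len: int, title_len: int) -> int:
    """Return the score of the first band that the lengths fall into."""
    root = ET.fromstring(rule_xml)
    for band in root.findall("band"):
        lower = int(band.get("min"))
        upper = band.get("max")  # a missing max means "no upper bound"
        if body_len < lower or (upper is not None and body_len >= int(upper)):
            continue
        min_title = band.get("minTitle")
        # Assumed reading: the title must be strictly longer than minTitle.
        if min_title is not None and title_len <= int(min_title):
            continue
        return int(band.get("score"))
    # No band matched: fall back to the "all other articles" score.
    return int(root.get("default"))

print(score_word_count(WORD_COUNT_RULE, body_len=250, title_len=10))
```

Under these assumptions a text with a count of 400 to 499 falls into no band and receives the default score, which corresponds to "all other articles" in the rule.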

To describe this evaluation rule, this embodiment uses an evaluation rule description file described in XML, for example a schema file, whose descriptive framework may, for example, be as follows:

For an explanation of the above source listing, refer to the relevant standards of the XML language; it is not repeated here.

On the basis of the above framework, the evaluation rule description file for the above word-count rule is as follows:

In addition, further evaluation rules can be defined. For example, the following file provides evaluation rule descriptions for three types of characteristic content: the content with id 1 describes the evaluation rule for the characteristic content "sensitive words", the content with id 2 describes the evaluation rule for the characteristic content "five-character quatrain", and the content with id 3 describes the evaluation rule for the characteristic content "paragraph". The file takes the following form:

Therefore, once the text content server has obtained a piece of text information to be processed, it can use its pre-stored evaluation rule description files to determine the evaluation information of that text, obtaining evaluation information for each of its different types of characteristic content. For example, if the overall word count of the text is 250, the evaluation information for the word-count characteristic is 80; if the sensitive-word level of the text is 2, the corresponding evaluation information is 60; and so on. The various evaluation rule description files thus yield evaluation information that describes the text from the angle of each characteristic.

The text content server can then mark the piece of text information with each item of evaluation information obtained, for example by storing each item of evaluation information together with the text. Other marking methods may also be used; this embodiment places no limitation on them.
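One possible way to store the evaluation information together with the text is a single serialized record. The sketch below assumes a JSON representation; the function `mark`, the record field names, and the identifier `doc-1` are hypothetical, while the two scores reuse the example values given above (80 for word count, 60 for sensitive words).

```python
import json

def mark(text_id: str, text: str, evaluations: dict) -> str:
    """Bundle a text with its evaluation information as one JSON record."""
    record = {"id": text_id, "text": text, "evaluations": evaluations}
    # ensure_ascii=False keeps non-ASCII text (e.g. Chinese) readable.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)

stored = mark("doc-1", "sample text", {"word_count": 80, "sensitive_words": 60})
restored = json.loads(stored)
print(restored["evaluations"])
```

Storing the scores alongside the text means a later search can return or filter on them without re-evaluating the rules.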

After completing the above process, the text content server can provide search services to users in either of the following two modes.

Mode one:

the text content server receives a search request input by the user, the search request containing a keyword;

the text content server determines the text information corresponding to the keyword;

the text content server pushes the search results to the user, the search results comprising the text information and each item of evaluation information corresponding to that text information.

In mode one, the user receives each item of evaluation information together with the text information found by the search. The user can therefore quickly and conveniently judge the overall evaluation of the text information from the evaluation information and find the required content faster.
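Mode one can be sketched as a keyword lookup that returns each hit with its stored evaluation information. The in-memory `STORE`, its sample records, and the plain substring match are assumptions made for illustration; the patent does not specify how the keyword is matched against the text.

```python
from typing import List

# Hypothetical store of texts already marked with their evaluations.
STORE: List[dict] = [
    {"text": "XML rule files for scoring news",
     "evaluations": {"word_count": 80, "sensitive_words": 100}},
    {"text": "Notes on classical poetry",
     "evaluations": {"word_count": 60, "sensitive_words": 100}},
    {"text": "XML basics",
     "evaluations": {"word_count": 0, "sensitive_words": 60}},
]

def search(keyword: str) -> List[dict]:
    """Mode one: return each matching text together with its evaluations."""
    return [record for record in STORE if keyword in record["text"]]

results = search("XML")
for record in results:
    print(record["text"], record["evaluations"])
```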

Mode two:

the text content server receives a search request input by the user, the search request containing a keyword and user requirement information;

the text content server determines the text information corresponding to the keyword and, according to each item of evaluation information corresponding to that text information, determines the text information that satisfies the user requirement information;

the text content server pushes the text information that satisfies the user's requirements to the user.

In mode two, the user can send user requirement information to the text content server along with the search request, for example a requirement that the score for a certain characteristic content be higher than a preset value. The text content server then pushes to the user only that part of the retrieved text information which satisfies the user requirement information. Compared with mode one, mode two lets the user find the required content faster.
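Mode two adds a filtering step on the evaluation information. In the sketch below the user requirement is modelled as a mapping from feature name to a preset value that the score must exceed; this representation, the function `search_with_requirements`, and the sample `STORE` are hypothetical.

```python
from typing import Dict, List

# Hypothetical store of texts already marked with their evaluations.
STORE: List[dict] = [
    {"text": "XML rule files for scoring news",
     "evaluations": {"word_count": 80, "sensitive_words": 100}},
    {"text": "XML basics",
     "evaluations": {"word_count": 0, "sensitive_words": 60}},
]

def search_with_requirements(keyword: str,
                             min_scores: Dict[str, int]) -> List[dict]:
    """Mode two: keep only hits whose scores exceed every preset value."""
    hits = [record for record in STORE if keyword in record["text"]]
    return [
        record for record in hits
        # A feature with no stored score is treated as 0 (an assumption).
        if all(record["evaluations"].get(feature, 0) > threshold
               for feature, threshold in min_scores.items())
    ]

matches = search_with_requirements("XML", {"word_count": 70})
print([record["text"] for record in matches])
```

With an empty requirement mapping the function behaves like mode one, since every hit trivially satisfies the filter.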

In this embodiment, the text content server can pre-store evaluation rule description files that separately describe the evaluation information corresponding to at least two types of characteristic content of a piece of text information. Using these files, each type of characteristic content of a piece of text information can be evaluated, so that the text is evaluated comprehensively from several angles, and the text is then marked with the resulting evaluation information. When searching, the user can obtain an overall evaluation of the text from this evaluation information and find the required content as quickly as possible. Moreover, because the evaluation rule description files are described in XML, they are general-purpose and easy to extend.

Fig. 2 is a schematic structural diagram of an embodiment of the text content server of the invention. As shown in Fig. 2, the server of this embodiment may comprise an acquisition module 11, a determination module 12, and a marking module 13, where:

the acquisition module 11 is configured to determine a piece of text information to be processed;

the determination module 12 is configured to use at least two pre-stored evaluation rule description files to determine the evaluation information of the piece of text information to be processed, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information, and each evaluation rule description file is a file described in XML;

the marking module 13 is configured to mark the piece of text information with each item of evaluation information obtained.

Further, the server may also comprise a generation module 14 and a search processing module 15, where:

the generation module 14 is configured to generate and store the at least two evaluation rule description files before the acquisition module 11 obtains a piece of text information to be processed;

the search processing module 15 is configured to receive a search request input by a user, the search request containing a keyword, determine the text information corresponding to the keyword, and push search results to the user, the search results comprising the text information and each item of evaluation information corresponding to that text information; or it is configured to receive a search request input by a user, the search request containing a keyword and user requirement information, determine the text information corresponding to the keyword, determine, according to each item of evaluation information corresponding to that text information, the text information that satisfies the user requirement information, and push the text information that satisfies the user's requirements to the user.

The text content server of this embodiment can be used to carry out the technical solution of the method embodiment shown in Fig. 1. Its implementation principle and technical effect are similar and are not repeated here.
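The division into modules 11 to 15 can be mirrored loosely in a single class, with one method per responsibility. This is a sketch only: the class `TextContentServer`, its method names, the callable stand-ins for XML rule files, and the substring keyword match are all hypothetical and chosen for brevity.

```python
from typing import Callable, Dict, List, Optional

class TextContentServer:
    """Illustrative composition of the modules described above."""

    def __init__(self) -> None:
        self.rules: Dict[str, Callable[[str], int]] = {}
        self.store: List[dict] = []

    def generate_rule(self, name: str, rule: Callable[[str], int]) -> None:
        # Generation module 14: create and store a rule before any text arrives.
        self.rules[name] = rule

    def ingest(self, text: str) -> dict:
        # Acquisition module 11 receives the text; the method requires >= 2 rules.
        if len(self.rules) < 2:
            raise ValueError("at least two evaluation rules are required")
        # Determination module 12: one evaluation per characteristic content.
        evaluations = {name: rule(text) for name, rule in self.rules.items()}
        # Marking module 13: keep the evaluations together with the text.
        record = {"text": text, "evaluations": evaluations}
        self.store.append(record)
        return record

    def search(self, keyword: str,
               min_scores: Optional[Dict[str, int]] = None) -> List[dict]:
        # Search processing module 15: mode one if min_scores is None, else mode two.
        hits = [r for r in self.store if keyword in r["text"]]
        if min_scores is None:
            return hits
        return [r for r in hits
                if all(r["evaluations"].get(f, 0) > s
                       for f, s in min_scores.items())]

server = TextContentServer()
server.generate_rule("word_count", lambda t: 80 if len(t.split()) >= 3 else 0)
server.generate_rule("keyword_frequency", lambda t: min(100, 20 * t.count("XML")))
server.ingest("XML files describe rules")
server.ingest("XML")
print(server.search("XML", {"word_count": 50}))
```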

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as ROM, RAM, a magnetic disk, or an optical disc.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in those embodiments, or replace some or all of their technical features with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the invention.

Claims (7)

1. A text information processing method based on Extensible Markup Language (XML), characterized by comprising:

a text content server obtains a piece of text information to be processed;

the text content server uses at least two pre-stored evaluation rule description files to determine the evaluation information of the piece of text information to be processed, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information, and each evaluation rule description file is a file described in XML;

the text content server marks the piece of text information with each item of evaluation information obtained.

2. The method according to claim 1, characterized in that, before the text content server obtains a piece of text information to be processed, the method further comprises:

generating and storing the at least two evaluation rule description files.

3. The method according to claim 1 or 2, characterized in that, after the text content server marks the piece of text information with each item of evaluation information obtained, the method further comprises:

the text content server receives a search request input by a user, the search request containing a keyword;

the text content server determines the text information corresponding to the keyword;

the text content server pushes search results to the user, the search results comprising the text information and each item of evaluation information corresponding to that text information.

4. The method according to claim 1 or 2, characterized in that, after the text content server marks the piece of text information with each item of evaluation information obtained, the method further comprises:

the text content server receives a search request input by a user, the search request containing a keyword and user requirement information;

the text content server determines the text information corresponding to the keyword and, according to each item of evaluation information corresponding to that text information, determines the text information that satisfies the user requirement information;

the text content server pushes the text information that satisfies the user's requirements to the user.

5. A text content server, characterized by comprising:

an acquisition module, configured to determine a piece of text information to be processed;

a determination module, configured to use at least two pre-stored evaluation rule description files to determine the evaluation information of the piece of text information to be processed, where each evaluation rule description file describes the evaluation information corresponding to one type of characteristic content of a piece of text information, and each evaluation rule description file is a file described in XML;

a marking module, configured to mark the piece of text information with each item of evaluation information obtained.

6. The server according to claim 5, characterized by further comprising:

a generation module, configured to generate and store the at least two evaluation rule description files before the acquisition module obtains a piece of text information to be processed.

7. The server according to claim 5 or 6, characterized by further comprising:

a search processing module, configured to receive a search request input by a user, the search request containing a keyword, determine the text information corresponding to the keyword, and push search results to the user, the search results comprising the text information and each item of evaluation information corresponding to that text information; or configured to receive a search request input by a user, the search request containing a keyword and user requirement information, determine the text information corresponding to the keyword, determine, according to each item of evaluation information corresponding to that text information, the text information that satisfies the user requirement information, and push the text information that satisfies the user's requirements to the user.
CN201310381678.4A 2013-08-28 2013-08-28 Literal information processing method and word content server based on XML Active CN104424252B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310381678.4A CN104424252B (en) 2013-08-28 2013-08-28 Literal information processing method and word content server based on XML

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310381678.4A CN104424252B (en) 2013-08-28 2013-08-28 Literal information processing method and word content server based on XML

Publications (2)

Publication Number Publication Date
CN104424252A true CN104424252A (en) 2015-03-18
CN104424252B CN104424252B (en) 2017-12-15

Family

ID=52973238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310381678.4A Active CN104424252B (en) 2013-08-28 2013-08-28 Literal information processing method and word content server based on XML

Country Status (1)

Country Link
CN (1) CN104424252B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7370061B2 (en) * 2005-01-27 2008-05-06 Siemens Corporate Research, Inc. Method for querying XML documents using a weighted navigational index
CN101599069A (en) * 2009-07-10 2009-12-09 腾讯科技(深圳)有限公司 The searching method of electronic document and system
CN102163187A (en) * 2010-02-21 2011-08-24 国际商业机器公司 Document marking method and device
CN102208992A (en) * 2010-06-13 2011-10-05 天津海量信息技术有限公司 Internet-facing filtration system of unhealthy information and method thereof
CN102855243A (en) * 2011-06-28 2013-01-02 北大方正集团有限公司 Method and device for extracting document structure
CN103064892A (en) * 2012-12-13 2013-04-24 北京海量融通软件技术有限公司 Network post indexing system and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
乔华等: ""文献信息标引自动化的发展 "", 《现代情报》 *
刘丹: ""基于XML的中文博硕士论文检索系统设计及实现"", 《现代图书情报技术》 *
周晓红: ""网络信息检索系统中信息自动标引方法的设计与实现"", 《情报杂志》 *

Also Published As

Publication number Publication date
CN104424252B (en) 2017-12-15

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220620

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: Beijing Beida Founder Electronics Co., Ltd.

Address before: 100871, Beijing, Haidian District, Cheng Fu Road, No. 298, Zhongguancun Fangzheng building, 9 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: Beijing Beida Founder Electronics Co., Ltd.

TR01 Transfer of patent right