
CN108073563A - Method and device for generating data - Google Patents

Method and device for generating data

Info

Publication number
CN108073563A
Authority
CN
China
Prior art keywords
keyword
data
column heading
associated data
row headers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610981154.2A
Other languages
Chinese (zh)
Inventor
樊思国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201610981154.2A
Publication of CN108073563A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for generating data and relates to the field of data processing. Its main purpose is to solve the problem that most commercial software does not present the association relationships between keywords intuitively enough. The main technical solution of the invention is: obtaining the keywords in a keyword set and the associated data between different keywords; configuring row headers and column headers according to the keywords in the keyword set; and generating association relationship data according to the row headers, the column headers, and the associated data. The invention is mainly used for generating data.

Description

Method and device for generating data
Technical Field
The present invention relates to the field of data processing, and in particular to a method and device for generating data.
Background
Data analysis is the process of analyzing large amounts of collected data with appropriate statistical methods in order to extract useful information, so that technical staff can study the data in detail and draw conclusions from it. Association analysis is a frequently used analysis method, and the association analysis of keywords is an important form of it.
At present, users generally rely on large commercial data analysis software to perform keyword association analysis. However, most large commercial software installs auxiliary software during installation, which wastes resources, and different commercial software packages come with different operating instructions and learning costs, so the operating cost for technical staff is too high.
Summary of the Invention
In view of the above problems, the present invention is proposed to provide a method and device for generating data. Its main purpose is to solve the problem that the operating cost is too high when most commercial software is used for keyword association analysis.
To this end, the present invention provides a method for generating data, comprising:
obtaining the keywords in a keyword set and the associated data between different keywords;
configuring row headers and column headers according to the keywords in the keyword set;
generating association relationship data according to the row headers, the column headers, and the associated data.
The present invention further provides a device for generating data, comprising:
an acquiring unit, configured to obtain the keywords in a keyword set and the associated data between different keywords;
a configuration unit, configured to configure row headers and column headers according to the keywords in the keyword set;
a generation unit, configured to generate association relationship data according to the row headers, the column headers, and the associated data.
With the above technical solution, the technical solution provided by the embodiments of the present invention has at least the following advantages:
In the method and device for generating data provided by the embodiments of the present invention, the keywords in a keyword set and the associated data between different keywords are first obtained; row headers and column headers are then configured according to the keywords in the keyword set; and association relationship data is generated according to the row headers, the column headers, and the associated data. In the prior art, the operating cost is too high when most commercial software is used for keyword association analysis. By comparison, the present invention generates association relationship data in which the keywords serve as the row and column headers and the associated data serves as the cell data. This avoids the cost of performing keyword association analysis with commercial software and presents the association relationships between keywords intuitively, thereby improving the efficiency of analyzing them.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the present invention are more readily apparent, specific embodiments of the present invention are set out below.
Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a method for generating data provided by an embodiment of the present invention;
Fig. 2 shows a flowchart of another method for generating data provided by an embodiment of the present invention;
Fig. 3 shows a schematic diagram of keyword data provided by an embodiment of the present invention;
Fig. 4 shows a schematic diagram of generated association relationship data provided by an embodiment of the present invention;
Fig. 5 shows a block diagram of a device for generating data provided by an embodiment of the present invention;
Fig. 6 shows a block diagram of another device for generating data provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and so that its scope can be fully conveyed to those skilled in the art.
An embodiment of the present invention provides a method for generating data. As shown in Fig. 1, the method includes:
101. Obtain the keywords in a keyword set and the associated data between different keywords.
Wherein, the keywords are the keywords whose association relationships are to be analyzed; the keyword set is the set that stores all keywords having association relationships; and the associated data is the numerical value corresponding to the association relationship that exists between different keywords. The relationship may be represented by the number of times the keywords occur together, which is not specifically limited in the embodiment of the present invention. For example, the obtained keywords are "government, growth, economy, environment, variation, inhibition, growth, government"; the associated data of "government" and "environment" is 234, that of "growth" and "economy" is 214, that of "environment" and "variation" is 332, and that of "inhibition" and "economy" is 112.
It should be noted that the keywords may be obtained from a data list. The data list contains two columns of keywords, an association relationship exists between the keywords of the two columns, and the data representing the association relationship is stored in a third column. The data list may be stored as an Excel sheet. Further, the keywords in the embodiment of the present invention are obtained by crawling and processing documents on websites, and the association relationship may be the association between keywords within a sentence or within an article. In addition, all programs in the embodiment of the present invention may be written in VBA (Visual Basic for Applications), a programming language developed by Microsoft for performing general automation (OLE) tasks in its applications. For example, step 101 may be implemented as follows. Define a dictionary d, a temporary variable temp, column names colName and colName1, row numbers r and r1, and arrays arr and arr1: Dim d, temp, colName, colName1; Dim r, r1; Dim arr, arr1. Get the number of the last row containing data in column A and in column B: r = Range("A65536").End(xlUp).Row; r1 = Range("B65536").End(xlUp).Row. Set d to a dictionary: Set d = CreateObject("Scripting.Dictionary"). Get the total number of rows: iRows = ActiveSheet.UsedRange.Rows.Count. Store the keywords of columns A and B in the arrays arr and arr1 respectively: arr = Range("A2:A" & r).Value; arr1 = Range("B2:B" & r1).Value. The keywords are thereby obtained.
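The data list described in step 101 (two keyword columns plus a value column) can be read in a few lines. The following is a minimal Python sketch, not the VBA of the embodiment; the function name read_data_list, the tab-separated layout, and the presence of a header row are assumptions made for illustration, and the values are those of the step 101 example.

```python
import csv
import io


def read_data_list(text):
    """Parse a three-column data list: keyword A, keyword B, associated value."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row (assumed to be present)
    return [(a, b, int(value)) for a, b, value in reader]


# Hypothetical data list holding the values from the step 101 example.
SAMPLE = (
    "keyword_a\tkeyword_b\tvalue\n"
    "government\tenvironment\t234\n"
    "growth\teconomy\t214\n"
    "environment\tvariation\t332\n"
    "inhibition\teconomy\t112\n"
)

pairs = read_data_list(SAMPLE)
print(pairs[0])  # ('government', 'environment', 234)
```

Each tuple keeps the two keywords together with their value, so later steps can build headers from the first two fields and cell data from the third.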
102. Configure row headers and column headers according to the keywords in the keyword set.
Wherein, the row headers and the column headers each contain all of the keywords.
It should be noted that the configured row headers and column headers are stored in another storage location, which may be another sheet of the Excel workbook. For example, store each keyword in dictionary d as a key with its value set to 1: For Each temp In arr; d(temp) = 1; Next; For Each temp In arr1; d(temp) = 1; Next. Clear the contents of sheet2: Sheet2.UsedRange.Clear. Fill the keys of dictionary d into the column starting at A2 and the row starting at B1 to form the row and column headers: Sheet2.Range("A2").Resize(d.Count, 1) = Application.Transpose(d.keys); Sheet2.Range("B1").Resize(1, d.Count) = Application.Transpose(Application.Transpose(d.keys)). Get the number of columns containing data in sheet2: cols = Sheet2.UsedRange.Columns.Count. Get the column name: colName = Col_Letter(cols).
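The text calls a helper named Col_Letter without showing its body. Judging by how its result is concatenated into range addresses, it presumably converts a column number into an Excel column name; that reading is an assumption. A minimal Python sketch of such a conversion, with the hypothetical name col_letter:

```python
def col_letter(n):
    """Convert a 1-based column number to an Excel column name (1 -> A, 27 -> AA)."""
    if n < 1:
        raise ValueError("column numbers start at 1")
    letters = ""
    while n > 0:
        # Column names are bijective base 26, so shift to 0-based before dividing.
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


print(col_letter(1), col_letter(26), col_letter(27))  # A Z AA
```

With nine keywords the headers occupy columns A through J, so col_letter(10) would give the last column of the matrix area.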
103. Generate association relationship data according to the row headers, the column headers, and the associated data.
Wherein, the association relationship data may be in matrix form or in table form, which is not specifically limited in the embodiment of the present invention.
It should be noted that, in the generated association relationship data, if associated data exists for the position defined by a row header and a column header, the associated data is displayed at that position. For example, if the row headers are "government, growth, economy", the column headers are "inhibition, growth, government", and the associated data of the keywords "government" and "economy" is 234, then the generated association relationship data is "0,0,0; 0,0,0; 0,0,234".
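The small example in step 103 can be reproduced with a short Python sketch. The function name build_matrix and its missing parameter are hypothetical, and treating an association as undirected (so that either keyword may be the row) is an assumption; it is what places the value 234 in the "economy" row under the "government" column.

```python
def build_matrix(row_headers, col_headers, associations, missing=0):
    """Place each associated value at the cell where its two keywords intersect."""
    row_index = {kw: i for i, kw in enumerate(row_headers)}
    col_index = {kw: j for j, kw in enumerate(col_headers)}
    matrix = [[missing] * len(col_headers) for _ in row_headers]
    for a, b, value in associations:
        # Assumed undirected: try the pair in both orientations.
        for row_kw, col_kw in ((a, b), (b, a)):
            if row_kw in row_index and col_kw in col_index:
                matrix[row_index[row_kw]][col_index[col_kw]] = value
    return matrix


rows = ["government", "growth", "economy"]
cols = ["inhibition", "growth", "government"]
matrix = build_matrix(rows, cols, [("government", "economy", 234)])
print(matrix)  # [[0, 0, 0], [0, 0, 0], [0, 0, 234]]
```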
In the method for generating data provided by the embodiment of the present invention, the keywords in a keyword set and the associated data between different keywords are first obtained; row headers and column headers are then configured according to the keywords in the keyword set; and association relationship data is generated according to the row headers, the column headers, and the associated data. In the prior art, the operating cost is too high when most commercial software is used for keyword association analysis. By comparison, the present invention generates association relationship data in which the keywords serve as the row and column headers and the associated data serves as the cell data. This avoids the cost of performing keyword association analysis with commercial software and presents the association relationships between keywords intuitively, thereby improving the efficiency of analyzing them.
An embodiment of the present invention provides another method for generating data. As shown in Fig. 2, the method includes:
201. Receive a data generation instruction.
Wherein, the data generation instruction is used to instruct the generation of the association relationship data.
It should be noted that the data generation instruction may be configured in an Excel sheet by means of a VBA program. The event that triggers data generation may take the form of a button or of a shortcut statement, which is not specifically limited in the embodiment of the present invention. Generating the association relationship data may mean generating a table or generating a matrix, and the data may take the form of 1 and 0, or the form of null and association frequency, which is not specifically limited in the embodiment of the present invention. By receiving the data generation instruction, the association relationship data is generated automatically, so that the association relationships between keywords are displayed more intuitively.
202. Obtain the keywords in a keyword set and the associated data between different keywords.
This step is the same as step 101 shown in Fig. 1 and is not described again here.
203. Perform a deduplication operation on the keywords in the keyword set to obtain the deduplicated keywords.
Wherein, the deduplication operation means that, if a keyword occurs more than once among the keywords, only one occurrence of it is retained. For example, if the keywords are "government, growth, economy, environment, variation, inhibition, growth, government", the keywords after deduplication are "government, growth, economy, environment, variation, inhibition". By deduplicating the keywords, the row and column headers are built from the smallest possible number of keywords and repeated counting of keywords is avoided, which improves the efficiency of data generation.
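The deduplication in step 203 can be sketched in Python as follows. This is an illustration rather than the embodiment's code (the VBA above achieves the same effect by using the keywords as dictionary keys); the function name deduplicate is hypothetical, and keeping the first occurrence of each keyword is an assumption that happens to match the order shown in the example.

```python
def deduplicate(keywords):
    """Keep the first occurrence of each keyword and preserve the original order."""
    # dict keys are unique and remember insertion order (Python 3.7+).
    return list(dict.fromkeys(keywords))


keywords = ["government", "growth", "economy", "environment",
            "variation", "inhibition", "growth", "government"]
unique_keywords = deduplicate(keywords)
print(unique_keywords)
# ['government', 'growth', 'economy', 'environment', 'variation', 'inhibition']
```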
204. Establish a data list according to the row headers and the column headers.
Wherein, the data list may be in matrix form or in table form. In matrix form, the row headers and column headers may appear as vectors; in table form, the row headers and column headers are placed directly in the header positions of the table, which is not specifically limited in the embodiment of the present invention. Establishing the data list according to the row headers and column headers allows the data to be displayed in several forms, which improves the efficiency of analyzing the associated data.
205. Add the associated data at the preset positions of the data list according to a preset loop function to obtain the association relationship data.
Wherein, the associated data includes an association frequency or an association relationship data value. The association frequency is the number of times, or the frequency with which, the keywords occur together. The association relationship data value is a value indicating whether a relationship exists between keywords; if a relationship exists, it may be represented by 1, which is not specifically limited in the embodiment of the present invention. The preset position is the position corresponding to a keyword in the row headers and a keyword in the column headers that are associated with each other. The preset loop function adds the corresponding associated data at the position corresponding to each keyword, so that corresponding data exists at every position defined by a keyword in the row headers and a keyword in the column headers.
It should be noted that, if a keyword in the row headers is not associated with a keyword in the column headers, a number indicating that no association exists may be added at the corresponding position, or nothing may be added, which is not specifically limited in the embodiment of the present invention. For example, set all cells of the matrix area of sheet2 to the default value 0: Sheet2.Range("B2:" & colName & d.Count + 1) = 0. By adding the associated data at the preset positions of the data list according to the preset loop function, the association relationship data is obtained and the associated data becomes more intuitive.
For the embodiment of the present invention, step 205 may specifically be: judging whether associated data exists between the keywords corresponding to a row header and a column header; if so, adding the associated data at the position corresponding to the row header and the column header; if not, adding a preset threshold at the position corresponding to the row header and the column header.
Wherein, the preset threshold is used to indicate that no association exists between keywords and may be 0 or NULL, which is not specifically limited in the embodiment of the present invention. The associated data may include an association relationship data value or an association frequency. It should be noted that the judgment step is the specific method of the preset loop function in step 205. A specific program implementation may be as follows. Use a nested loop to read the keyword associations in sheet1, locate the row and column of the association matrix in sheet2 according to each association, and set the corresponding cells to 1: For i = 2 To iRows; w1 = Sheet1.Cells(i, "A").Value; w2 = Sheet1.Cells(i, "B").Value. Get the row numbers in column A of sheet2 that hold the keywords w1 and w2 taken from columns A and B: For Each Rng In Sheet2.Range("a2:a" & d.Count + 1); If Rng = w1 Then; r1 = Rng.Row; End If; If Rng = w2 Then; r2 = Rng.Row; End If; Next. Get the number of columns in sheet2: col = Sheet2.UsedRange.Columns.Count. Get the column name: colName1 = Col_Letter(col). Get the column numbers at which w1 and w2 appear in row 1 of sheet2: For Each Rng1 In Sheet2.Range("b1:" & colName1 & "1"); If Rng1 = w2 Then; c1 = Rng1.Column; End If; If Rng1 = w1 Then; c2 = Rng1.Column; End If; Next. Set the cells of sheet2 at which the rows and columns of w1 and w2 intersect to 1: Sheet2.Cells(r1, c1) = 1; Sheet2.Cells(r2, c2) = 1. The outer loop over i is then closed with Next. By judging whether associated data exists between the keywords corresponding to a row header and a column header and adding the corresponding data at the corresponding position, the associated data of each keyword and its associated keywords is configured, which improves the efficiency of data generation.
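The judgment step and the preset threshold can also be expressed as a lookup with a default. The Python sketch below is an illustration under stated assumptions, not the embodiment's VBA: the names make_lookup and cell_value are hypothetical, and the pair of keywords is treated as unordered, in line with the VBA above, which sets both mirrored cells.

```python
def make_lookup(associations):
    """Index associated data by the unordered pair of keywords."""
    return {frozenset((a, b)): value for a, b, value in associations}


def cell_value(lookup, row_kw, col_kw, missing=0):
    """Return the associated data for a cell, or the preset threshold if none exists."""
    return lookup.get(frozenset((row_kw, col_kw)), missing)


lookup = make_lookup([("government", "environment", 234),
                      ("growth", "economy", 214)])
print(cell_value(lookup, "environment", "government"))    # 234
print(cell_value(lookup, "government", "economy"))        # 0
print(cell_value(lookup, "government", "economy", None))  # None
```

Passing None as the threshold corresponds to the NULL option mentioned in the text, while the default 0 corresponds to the pre-filled matrix area.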
For the embodiment of the present invention, a specific application scenario may be as follows, without being limited thereto. As shown in Fig. 3, the "generate matrix" button is clicked, and all keywords in the keyword set and the association frequencies between the keywords are obtained: the keywords in column A are "government, consideration, this, economic, needs", the keywords in column B are "monitoring, needs, serious, growth, examination", and the frequencies are "123, 563, 514, 315, 421". The obtained keywords are deduplicated to give "government, consideration, this, economic, needs, monitoring, serious, growth, examination", and the headers shown in Fig. 4 are generated. The data is then added at the positions corresponding to the keywords to generate the association relationship data shown in Fig. 4: where associated data exists for the position defined by a row header and a column header, the associated data is added at that position. Since the associated data may include an association frequency or an association relationship data value, Fig. 4 displays the associated data as association relationship data values, which are represented by the value 1 in this example.
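The application scenario can be traced end to end with a short Python sketch. The function name association_flags is hypothetical, and pairing the column A and column B keywords row by row with the listed frequencies is an assumption about the layout of Fig. 3, which is not reproduced in the text.

```python
def association_flags(data_list):
    """Build deduplicated headers and a 0/1 association matrix from a data list."""
    col_a = [a for a, _, _ in data_list]
    col_b = [b for _, b, _ in data_list]
    keywords = list(dict.fromkeys(col_a + col_b))  # column A first, then column B
    index = {kw: i for i, kw in enumerate(keywords)}
    matrix = [[0] * len(keywords) for _ in keywords]
    for a, b, _frequency in data_list:
        # Mark both mirrored cells, as the nested-loop VBA does.
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return keywords, matrix


# Assumed row-wise pairing of the column A and column B keywords and frequencies.
FIG3 = [
    ("government", "monitoring", 123),
    ("consideration", "needs", 563),
    ("this", "serious", 514),
    ("economic", "growth", 315),
    ("needs", "examination", 421),
]

headers, flags = association_flags(FIG3)
for keyword, row in zip(headers, flags):
    print(f"{keyword:<14}", *row)
```

Under these assumptions the headers come out in the order given in the scenario, and "needs" is the only keyword with two associations because it appears in both columns.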
In the other method for generating data provided by the embodiment of the present invention, the keywords in a keyword set and the associated data between different keywords are first obtained; row headers and column headers are then configured according to the keywords in the keyword set; and association relationship data is generated according to the row headers, the column headers, and the associated data. In the prior art, the operating cost is too high when most commercial software is used for keyword association analysis. By comparison, the present invention generates association relationship data in which the keywords serve as the row and column headers and the associated data serves as the cell data. This avoids the cost of performing keyword association analysis with commercial software and presents the association relationships between keywords intuitively, thereby improving the efficiency of analyzing them.
The device embodiment corresponds to the foregoing method embodiment. For ease of reading, the device embodiment does not repeat the details of the method embodiment one by one, but it should be clear that the device in this embodiment can implement everything in the foregoing method embodiment.
Further, as a specific implementation of the method shown in Fig. 1, an embodiment of the present invention provides a device for generating data. As shown in Fig. 5, the device may include an acquiring unit 31, a configuration unit 32, and a generation unit 33.
The acquiring unit 31 is configured to obtain the keywords in a keyword set and the associated data between different keywords.
Wherein, the keywords are the keywords whose association relationships are to be analyzed; the keyword set is the set that stores all keywords having association relationships; and the associated data is the numerical value corresponding to the association relationship that exists between different keywords. The relationship may be represented by the number of times the keywords occur together, which is not specifically limited in the embodiment of the present invention.
The configuration unit 32 is configured to configure row headers and column headers according to the keywords in the keyword set.
Wherein, the row headers and the column headers each contain all of the keywords.
The generation unit 33 is configured to generate association relationship data according to the row headers, the column headers, and the associated data.
Wherein, the association relationship data may be in matrix form or in table form, which is not specifically limited in the embodiment of the present invention.
In the device for generating data provided by the embodiment of the present invention, the keywords in a keyword set and the associated data between different keywords are first obtained; row headers and column headers are then configured according to the keywords in the keyword set; and association relationship data is generated according to the row headers, the column headers, and the associated data. In the prior art, the operating cost is too high when most commercial software is used for keyword association analysis. By comparison, the present invention generates association relationship data in which the keywords serve as the row and column headers and the associated data serves as the cell data. This avoids the cost of performing keyword association analysis with commercial software and presents the association relationships between keywords intuitively, thereby improving the efficiency of analyzing them.
The device embodiment corresponds to the foregoing method embodiment. For ease of reading, the device embodiment does not repeat the details of the method embodiment one by one, but it should be clear that the device in this embodiment can implement everything in the foregoing method embodiment.
Further, as a specific implementation of the method shown in Fig. 2, an embodiment of the present invention provides another device for generating data. As shown in Fig. 6, the device may include an acquiring unit 41, a configuration unit 42, a generation unit 43, an operation unit 44, and a receiving unit 45.
The acquiring unit 41 is configured to obtain the keywords in a keyword set and the associated data between different keywords.
The configuration unit 42 is configured to configure row headers and column headers according to the keywords in the keyword set.
The generation unit 43 is configured to generate association relationship data according to the row headers, the column headers, and the associated data.
Further, the device also includes:
The operation unit 44, configured to perform a deduplication operation on the keywords in the keyword set to obtain the deduplicated keywords.
Wherein, the deduplication operation means that, if a keyword occurs more than once among the keywords, only one occurrence of it is retained.
Further, the generation unit 43 includes:
An establishing module 4301, configured to establish a data list according to the row headers and the column headers.
Wherein, the data list may be in matrix form or in table form. In matrix form, the row headers and column headers may appear as vectors; in table form, the row headers and column headers are placed directly in the header positions of the table, which is not specifically limited in the embodiment of the present invention.
An adding module 4302, configured to add the associated data at the preset positions of the data list according to a preset loop function to obtain the association relationship data, where the associated data includes an association frequency or an association relationship data value.
Wherein, the association frequency is the number of times, or the frequency with which, the keywords occur together. The association relationship data value indicates whether a relationship exists between keywords; if a relationship exists, it may be represented by 1. The preset position is the position corresponding to a keyword in the row headers and a keyword in the column headers that are associated with each other. The preset loop function adds the corresponding associated data at the position corresponding to each keyword, so that corresponding data exists at every position defined by a keyword in the row headers and a keyword in the column headers.
Further, the adding module 4302 includes:
A judging submodule 430201, configured to judge whether associated data exists between the keywords corresponding to a row header and a column header.
An adding submodule 430202, configured to add the associated data at the position corresponding to the row header and the column header if the judging submodule 430201 judges that associated data exists between the keywords corresponding to the row header and the column header.
The adding submodule 430202 is further configured to add a preset threshold at the position corresponding to the row header and the column header if the judging submodule 430201 judges that no associated data exists between the keywords corresponding to the row header and the column header.
Wherein, the preset threshold is used to indicate that no association exists between keywords and may be 0 or NULL, which is not specifically limited in the embodiment of the present invention.
Further, the device also includes:
The receiving unit 45, configured to receive a data generation instruction, where the data generation instruction is used to instruct the generation of the association relationship data.
Wherein, the data generation instruction is used to instruct the generation of the association relationship data.
In the other device for generating data provided by the embodiment of the present invention, the keywords in a keyword set and the associated data between different keywords are first obtained; row headers and column headers are then configured according to the keywords in the keyword set; and association relationship data is generated according to the row headers, the column headers, and the associated data. In the prior art, the operating cost is too high when most commercial software is used for keyword association analysis. By comparison, the present invention generates association relationship data in which the keywords serve as the row and column headers and the associated data serves as the cell data. This avoids the cost of performing keyword association analysis with commercial software and presents the association relationships between keywords intuitively, thereby improving the efficiency of analyzing them.
The generating means of the data include processor and memory, above-mentioned acquiring unit, dispensing unit and generation unit Deng be used as program unit storage in memory, above procedure unit stored in memory is performed by processor to realize Corresponding function.
Comprising kernel in processor, gone in memory to transfer corresponding program unit by kernel.Kernel can set one Or more, it is solved by adjusting kernel parameter when carrying out the association analysis of keyword using most of business software, operation The problem of cost is excessive.
Memory may include computer-readable medium in volatile memory, random access memory (RAM) and/ Or the forms such as Nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM), memory includes at least one deposit Store up chip.
It is first when being performed on data processing equipment, being adapted for carrying out present invention also provides a kind of computer program product The program code of beginningization there are as below methods step:Obtain the incidence number between keyword and the different keywords in keyword set According to;Keyword configuration row headers and column heading in the keyword set;According to the row headers and column heading and institute State associated data generation incidence relation data.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include computer-readable media in the form of volatile memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The above are merely embodiments of the present application and are not intended to limit the present application. Various modifications and variations of the present application are possible for those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (10)

1. A data generation method, characterized by comprising:
obtaining keywords in a keyword set and associated data between different keywords;
configuring row headers and column headers according to the keywords in the keyword set;
generating association relationship data according to the row headers, the column headers, and the associated data.
2. The method according to claim 1, characterized in that, before the configuring of row headers and column headers according to the keywords, the method further comprises:
performing a deduplication operation on the keywords in the keyword set to obtain deduplicated keywords.
3. The method according to claim 2, characterized in that the generating of association relationship data according to the row headers, the column headers, and the associated data comprises:
establishing a data list according to the row headers and the column headers;
adding the associated data at preset positions of the data list according to a preset loop function to obtain the association relationship data, wherein the associated data comprises an association frequency or an association relationship data value.
4. The method according to claim 3, characterized in that the adding of the associated data at preset positions of the data list according to a preset loop function to obtain the association relationship data comprises:
judging whether associated data exists between the keywords corresponding to a row header and a column header;
if so, adding the associated data at the position corresponding to the row header and the column header;
if not, adding a preset threshold at the position corresponding to the row header and the column header.
5. The method according to claim 4, characterized in that, before the obtaining of keywords in a keyword set and associated data between different keywords, the method further comprises:
receiving a data generation instruction, the data generation instruction being used to instruct generation of the association relationship data.
6. A data generation apparatus, characterized by comprising:
an acquiring unit, configured to obtain keywords in a keyword set and associated data between different keywords;
a configuration unit, configured to configure row headers and column headers according to the keywords in the keyword set;
a generation unit, configured to generate association relationship data according to the row headers, the column headers, and the associated data.
7. The apparatus according to claim 6, characterized in that the apparatus further comprises:
an operation unit, configured to perform a deduplication operation on the keywords in the keyword set to obtain deduplicated keywords.
8. The apparatus according to claim 7, characterized in that the generation unit comprises:
an establishing module, configured to establish a data list according to the row headers and the column headers;
an adding module, configured to add the associated data at preset positions of the data list according to a preset loop function to obtain the association relationship data, wherein the associated data comprises an association frequency or an association relationship data value.
9. The apparatus according to claim 8, characterized in that the adding module comprises a judging submodule and an adding submodule, wherein:
the judging submodule is configured to judge whether associated data exists between the keywords corresponding to a row header and a column header;
the adding submodule is configured to, if the judging submodule determines that associated data exists between the keywords corresponding to the row header and the column header, add the associated data at the position corresponding to the row header and the column header;
the adding submodule is further configured to, if the judging submodule determines that no associated data exists between the keywords corresponding to the row header and the column header, add a preset threshold at the position corresponding to the row header and the column header.
10. The apparatus according to claim 9, characterized in that the apparatus further comprises:
a receiving unit, configured to receive a data generation instruction, the data generation instruction being used to instruct generation of the association relationship data.
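
As a non-limiting illustration, the deduplication of claim 2, the loop-based filling of claims 3 and 4, and the instruction check of claim 5 could be sketched in Python as follows. The names `deduplicate`, `fill_data_list`, and `handle_instruction`, the instruction string "generate", the value 0 used as the preset threshold, and the lookup of a keyword pair in either order are all hypothetical choices for this sketch; the claims do not fix any of them.

```python
PRESET_THRESHOLD = 0  # hypothetical value for pairs with no associated data


def deduplicate(keywords):
    """Claim 2: drop repeated keywords, keeping first-seen order."""
    return list(dict.fromkeys(keywords))


def fill_data_list(keywords, associated, default=PRESET_THRESHOLD):
    """Claims 3 and 4: establish the data list and fill it cell by cell."""
    headers = deduplicate(keywords)
    # Data list: one inner dict per row header, keyed by column header.
    data_list = {row: {} for row in headers}
    for row in headers:
        for col in headers:
            # Judge whether associated data exists for this pair
            # (assumption: a pair may be stored in either order).
            if (row, col) in associated:
                data_list[row][col] = associated[(row, col)]
            elif (col, row) in associated:
                data_list[row][col] = associated[(col, row)]
            else:
                data_list[row][col] = default
    return data_list


def handle_instruction(instruction, keywords, associated):
    """Claim 5: generate only after a data generation instruction arrives."""
    if instruction != "generate":  # hypothetical instruction name
        return None
    return fill_data_list(keywords, associated)


result = handle_instruction(
    "generate",
    ["brand", "price", "brand", "quality"],
    {("brand", "price"): 3},
)
print(result["price"]["brand"], result["quality"]["brand"])
```

A nested dictionary is used here so that a cell can be addressed directly by its row header and column header; a two-dimensional array with separate header lists would serve equally well.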
CN201610981154.2A 2016-11-08 2016-11-08 The generation method and device of data Pending CN108073563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610981154.2A CN108073563A (en) 2016-11-08 2016-11-08 The generation method and device of data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610981154.2A CN108073563A (en) 2016-11-08 2016-11-08 The generation method and device of data

Publications (1)

Publication Number Publication Date
CN108073563A true CN108073563A (en) 2018-05-25

Family

ID=62153268

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610981154.2A Pending CN108073563A (en) 2016-11-08 2016-11-08 The generation method and device of data

Country Status (1)

Country Link
CN (1) CN108073563A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0373023A (en) * 1989-08-14 1991-03-28 Toshiba Corp Knowledge registering device for expert system
US20060026498A1 (en) * 2004-07-30 2006-02-02 Microsoft Corporation Systems and methods for controlling report properties based on aggregate scope
CN101223525A (en) * 2005-06-06 2008-07-16 加利福尼亚大学董事会 Relationship networks
CN103593445A (en) * 2013-11-15 2014-02-19 北京国双科技有限公司 Data filling method and device
CN103765422A (en) * 2012-07-31 2014-04-30 乐天株式会社 Information processing device, information processing method, and information processing system
CN104123349A (en) * 2014-07-09 2014-10-29 昆明理工大学 A Method of Feature Extraction Based on Correlation Knowledge
CN104169948A (en) * 2012-03-15 2014-11-26 赛普特系统有限公司 Method, device and product for text semantic processing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘玉琴等: "学术关联关系可视化系统设计与实现", 《图书情报工作》 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100083 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing

Applicant after: Beijing Guoshuang Technology Co.,Ltd.

Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A

Applicant before: Beijing Guoshuang Technology Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20180525
