CN105430504A - Method and System for Family Member Structure Recognition Based on TV Watching Log Mining - Google Patents
- Publication number
- CN105430504A (application CN201510852355.8A)
- Authority
- CN
- China
- Prior art keywords
- log
- log data
- program category
- program
- viewing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
- H04N21/4508—Management of client data or end-user data
- H04N21/4532—Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
- H04N21/466—Learning process for intelligent management, e.g. learning user preferences for recommending movies
- H04N21/4661—Deriving a combined profile for a plurality of end-users of the same client, e.g. for family members within a home
- H04N21/4662—Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms
- H04N21/4665—Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms involving classification methods, e.g. Decision trees
- H04N21/4667—Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
- H04N21/4668—Learning process for intelligent management, e.g. learning user preferences for recommending movies for recommending content, e.g. movies
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention discloses a method and system for recognizing family member structure based on TV viewing log mining. The method comprises: acquiring log data of the TV programs watched by users, the log data comprising the user operation time, the program type, and the viewing time of each program type; dividing the log data by time period; extracting, from each segment of log data, the program types and the proportion of viewing time of each program type as the feature of one family member; clustering according to the family member features; and determining the distribution of the family member structure from the clustering result. The method solves the prior-art problem that, in the smart TV field, user preferences cannot be analyzed through a single account. The invention also discloses a family member structure recognition system based on TV viewing log mining.
Description
Technical field
The application relates to the smart home field, and in particular to a method and system for recognizing family member structure based on TV viewing log mining.
Background
In the current Internet era, one person's identity usually corresponds to exactly one account, as in online shopping, bank accounts, and games. In some special fields, however, a strict one-to-one account model greatly affects how users use and experience a service: some users create several different accounts on one website, and in other cases several users share the same account. Smart home products such as smart TVs are typical shared-account products. A TV is generally operated by all members of the household through the same account, each selecting the programs he or she likes, and this account sharing problem is widespread in the Internet era. Family members of different ages often have different interests and like different TV programs. Because several users share one account, content and service providers cannot assess a user's interests simply from the account, nor can they infer the user's behavioral habits from account information and recommend better services.
To solve this problem, it has been proposed to cluster, identify, and manage the users of a shared account by the characters they type and their typing frequency when entering the account name and password at login. The approach assumes that each user's input is continuous and that each user's typing frequency differs from that of other people, so that the keystroke behavior of one user forms one cluster and the keystroke behavior of different users forms different clusters. Whether an account is shared, and by how many sub-users, can then be roughly estimated from the number of clusters produced: if more than one cluster forms, the account is judged to be shared. However, for a smart home device such as a smart TV, which does not require logging in with a user name and password at all, a method based on typing frequency is clearly unsuitable.
Summary of the invention
The embodiments of the application provide a method and system for recognizing family member structure based on TV viewing log mining, to solve the prior-art problem that a smart TV, which requires no account input, cannot identify family members or the programs they like.
The embodiments of the application adopt the following technical solutions:
A family member structure recognition method based on TV viewing log mining, comprising:
Acquiring log data of the TV programs watched by users, the log data comprising the following data content: user operation time, program type, and program type viewing time;
Dividing the log data by time period;
Extracting the program types in each segment of log data and the viewing time ratio of each program type as the feature of one family member;
Clustering according to the family member features;
Determining the distribution of the family member structure from the clustering result.
A family member structure recognition system based on TV viewing log mining, comprising:
An acquiring unit, for acquiring log data of the TV programs watched by users, the log data comprising the following data content: user operation time, program type, and program type viewing time;
A segmentation unit, for dividing the log data by time period;
An extraction unit, for extracting the program types in each segment of log data and the viewing time ratio of each program type as the feature of one family member;
A clustering unit, for clustering according to the family member features;
A recognition unit, for determining the distribution of the family member structure from the clustering result.
At least one of the above technical solutions adopted by the embodiments of the application can achieve the following beneficial effect: by cluster analysis of the TV viewing log data of a household, the program types that the individual family members prefer are identified, giving content providers a basis for recommending programs. This solves the prior-art problem that, in the special field of smart TV, user preferences cannot be analyzed through a unique account or an account login.
Brief description of the drawings
The drawings described here provide a further understanding of the application and form part of it. The illustrative embodiments of the application and their descriptions explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a flow chart of the family member structure recognition method based on TV viewing log mining provided by Embodiment 1 of the application;
Fig. 2 is a schematic diagram of the family member structure recognition system based on TV viewing log mining provided by Embodiment 2 of the application.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. The described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in the application without creative work fall within the scope of protection of the application.
The technical solutions provided by the embodiments of the application are described in detail below with reference to the drawings.
With the rapid development of the Internet, the traditional TV industry is no longer limited to the content broadcast by TV stations. With the spread of TV boxes, the TV has become fully integrated into the Internet as part of the smart home, and users can freely choose the programs they like, changing the traditional model in which users watch whatever the TV station broadcasts.
Embodiment 1
Fig. 1 is a flow chart of the family member structure recognition method based on TV viewing log mining provided by Embodiment 1 of the application. The method organizes and analyzes program viewing logs to identify the family member structure and to recommend to users the programs they like within specific time periods. It comprises the following steps:
S101: Acquire the log data of the TV programs watched by users.
The log data in this step are the log records produced within a certain period as users in a household watch TV. They are obtained, by some index rule, from the log database of the smart TV content and service provider, or from the log database of a local device such as a TV box. The index rule is a key search term that identifies the household's TV, such as its home address or IP address.
The log data comprise the user operation time, program viewing time, program name, program type, and similar content. A household comprises at least one family member.
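The acquisition step can be illustrated with a minimal sketch. It assumes an in-memory list of records rather than a real provider database; the `LogRecord` fields, the `operation` values, and the function name `fetch_household_logs` are hypothetical and only mirror the data content listed above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    household_key: str        # hypothetical index key, e.g. an IP or home address
    operation_time: datetime  # user operation time
    operation: str            # hypothetical values: "on", "off", "watch"
    program_name: str
    program_type: str
    viewing_seconds: float    # program viewing time


def fetch_household_logs(records, household_key):
    """Return the records of one household, ordered by operation time."""
    selected = [r for r in records if r.household_key == household_key]
    return sorted(selected, key=lambda r: r.operation_time)
```

Sorting by operation time here is a design choice that makes the later division by on/off periods straightforward.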
S102: Preprocess the log data.
Preprocessing in this step means retaining the data content related to user operation time, program viewing time, program name, and program type, and deleting the remaining unrelated data content. Log records that contain missing items or erroneous information are also deleted.
In other embodiments this step can be omitted, for example when the log data acquired in step S101 already conform to the established rules.
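A minimal sketch of this preprocessing, assuming each raw record is a dict. The field names in `REQUIRED_FIELDS` are hypothetical, and "erroneous information" is interpreted here, as an assumption, as a viewing time that is non-numeric or negative.

```python
REQUIRED_FIELDS = ("operation_time", "viewing_seconds", "program_name", "program_type")


def preprocess(raw_records):
    """Keep only the four relevant fields and drop incomplete or invalid records."""
    cleaned = []
    for raw in raw_records:
        # Retain the relevant fields; everything else is discarded.
        record = {field: raw.get(field) for field in REQUIRED_FIELDS}
        # Drop records with a missing item.
        if any(value is None or value == "" for value in record.values()):
            continue
        # Drop records whose viewing time is not a non-negative number.
        try:
            seconds = float(record["viewing_seconds"])
        except (TypeError, ValueError):
            continue
        if seconds < 0:
            continue
        record["viewing_seconds"] = seconds
        cleaned.append(record)
    return cleaned
```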
S103: Divide the log data by time period.
This step divides the log data of watched TV programs into time periods according to the user operation times in the log data, specifically according to the intervals between the user's operations of turning the TV on and off. The log data for the period between one power-on and the following power-off are treated as the viewing log of one family member. Log data from different on/off periods are treated as the viewing logs of different family members, and log data from different periods are assumed to be unrelated. In this way the log data are divided by user operation time into a number of mutually unrelated segments, each representing one family member.
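The division by on/off intervals can be sketched as follows. The event layout `(timestamp, action, payload)` and the action names `"on"`, `"off"`, and `"watch"` are assumptions made for illustration; the text does not specify a log format.

```python
def split_sessions(events):
    """Split time-ordered events into one segment per power-on/power-off period.

    events: iterable of (timestamp, action, payload) sorted by timestamp,
    where action is "on", "off", or "watch" (hypothetical labels).
    """
    sessions = []
    current = None  # None means the TV is currently off
    for _timestamp, action, payload in events:
        if action == "on":
            current = []  # start a new segment; an unclosed one is discarded
        elif action == "off":
            if current:  # ignore empty segments and stray "off" events
                sessions.append(current)
            current = None
        elif current is not None:
            current.append(payload)  # a viewing record inside an on/off period
    return sessions
```

Each returned segment stands for one family member in the sense of step S103; viewing records that fall outside any on/off pair are ignored.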
S104: Extract the program types in each segment of log data and the viewing time ratio of each program type as the feature of one family member.
Specifically, within each segment of log data, the viewing times of the same program type are summed to obtain the total viewing time of that program type. Dividing this total by the total viewing time of the segment gives the proportion of the segment spent watching programs of that type. The program types in the segment and their viewing time ratios are expressed as a keyword-set vector that serves as the feature of one family member:
$$t_i = \{(e_1, n_1), (e_2, n_2), \ldots\}$$
where $e_i$ denotes a program type and $n_i$ is the viewing time ratio of that program type within the log segment.
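The feature extraction can be sketched as below, assuming a segment is a list of `(program_type, viewing_seconds)` pairs; the function name `viewing_time_ratios` is illustrative.

```python
from collections import defaultdict


def viewing_time_ratios(segment):
    """Map each program type in a segment to its share of the total viewing time."""
    totals = defaultdict(float)
    for program_type, seconds in segment:
        totals[program_type] += seconds  # sum viewing time per program type
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no viewing time recorded in this segment
    return {ptype: total / grand_total for ptype, total in totals.items()}
```

For example, a segment with 900 seconds of news and 300 seconds of sports yields the pairs (news, 0.75) and (sports, 0.25).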
S105: Cluster according to the family member features.
The term frequency-inverse document frequency (TF-IDF) formula is used to calculate the weight of each program type in a document, generating the following feature vector:
$$d_i = \{(e_1, w_{1,i}), (e_2, w_{2,i}), \ldots\}$$
where $e_i$ denotes a TV program type and $w_{i,j}$ is the weight of TV program type $e_i$ in document $d_j$:
$$w_{i,j} = TF_{i,j} \times IDF_i$$
where $TF_{i,j}$ is the frequency with which program type feature $e_i$ appears in document $d_j$, and $IDF_i$ is a measure of the general importance of a term.
Here $n_{i,j}$ denotes the frequency with which program type feature $e_i$ appears in $d_j$.
Here $|D|$ denotes the total number of documents, and $N(i)$ denotes the number of documents that contain program type feature $e_i$.
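The expressions for TF and IDF are not written out in this text. A minimal sketch, assuming the standard forms $TF_{i,j} = n_{i,j} / \sum_k n_{k,j}$ and $IDF_i = \log(|D| / N(i))$ with a natural logarithm, and assuming that each log segment is one document whose viewing time ratios play the role of $n_{i,j}$. The function name `tf_idf_weights` and the dict layout are illustrative.

```python
import math


def tf_idf_weights(documents):
    """documents: list of dicts mapping program type -> n_{i,j} (assumed non-negative)."""
    doc_count = len(documents)  # |D|
    containing = {}             # N(i): number of documents containing each type
    for doc in documents:
        for ptype, n in doc.items():
            if n > 0:
                containing[ptype] = containing.get(ptype, 0) + 1
    weighted = []
    for doc in documents:
        total = sum(doc.values())
        vector = {}
        for ptype, n in doc.items():
            if n > 0 and total > 0:
                tf = n / total                                # assumed TF form
                idf = math.log(doc_count / containing[ptype])  # assumed IDF form
                vector[ptype] = tf * idf                      # w_{i,j} = TF * IDF
        weighted.append(vector)
    return weighted
```

Under these assumed forms, a program type that appears in every segment receives weight zero, so it does not help to distinguish family members.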
The similarity of program viewing behavior between different family members is calculated as the cosine similarity between their feature vectors, as follows:
$$W_{ij} = \cos(d_i, d_j) = \frac{d_i \cdot d_j}{\lVert d_i \rVert \, \lVert d_j \rVert}$$
where $W_{ij}$ denotes the cosine similarity between feature vector $d_i$ and feature vector $d_j$.
This step is not limited to cosine similarity for calculating the similarity between different users; other similarity measures and distance measures are also applicable to the application.
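Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. A minimal sketch for sparse feature vectors stored as dicts from program type to weight; the dict representation and the choice to return 0.0 for an all-zero vector are assumptions made here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for a zero vector; 0.0 is a convention
    return dot / (norm_a * norm_b)
```

Two segments with no program type in common have similarity 0, and two segments with proportional weights have similarity 1.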
A text clustering method based on the vector space model (such as the KMeans method) is used to place family members with the same or similar program viewing behavior in one class, treated as the same class of family member. The application is not limited to text clustering methods based on the vector space model; other clustering methods are also applicable.
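As an illustration of the KMeans option, here is a minimal sketch of Lloyd's k-means on dense vectors (one list of floats per log segment, one position per program type). It uses Euclidean distance and takes the first `k` points as initial centroids, which is a simplification chosen here for determinism; the text does not state how `k` or the initial centroids are chosen.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Return (labels, centroids) for equal-length lists of floats."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]  # assumed deterministic initialization
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster became empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Segments that receive the same label are treated as belonging to the same class of family member.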
S106: Determine the distribution of the family member structure from the clustering result.
After the clustering in step S105, if the number of clusters in the clustering result $W_{ij}$ is greater than 1, the household contains family members with different program viewing behaviors, and the number of clusters is taken as the number of family members in the household with different TV program type preferences. If the number of clusters in the clustering result $W_{ij}$ equals 1, the household has only one class of family members, or only one family member, with similar TV program type preferences.
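The decision rule of this step can be sketched as follows, assuming the clustering step returns one cluster label per log segment. The function name and the returned keys are hypothetical. Note that a method such as k-means fixes the number of clusters in advance, so the count is only informative if that number was chosen by some criterion, which the text does not specify.

```python
from collections import Counter


def summarize_family_structure(labels):
    """labels: one cluster label per log segment."""
    segments_per_cluster = Counter(labels)
    member_classes = len(segments_per_cluster)  # number of non-empty clusters
    return {
        "member_classes": member_classes,
        # More than one cluster: members with different viewing preferences.
        "multiple_preferences": member_classes > 1,
        "segments_per_cluster": dict(segments_per_cluster),
    }
```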
TV program content providers can recommend to the household the TV programs its members like according to the family member structure.
Embodiment 2
Fig. 2 is a schematic diagram of the family member structure recognition system based on TV viewing log mining provided by Embodiment 2 of the application. The system organizes and analyzes program viewing logs to identify the family member structure and to recommend to users the programs they like within specific time periods. It specifically comprises:
An acquiring unit 201, for acquiring log data of the TV programs watched by users;
A preprocessing unit 202, for preprocessing the log data;
A segmentation unit 203, for dividing the log data by time period;
An extraction unit 204, for extracting the program types in each segment of log data and the viewing time ratio of each program type as the feature of one family member;
A clustering unit 205, for clustering according to the family member features;
A recognition unit 206, for determining the distribution of the family member structure from the clustering result.
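One way to picture how the six units fit together is a pipeline in which each unit is a pluggable function. This is only a sketch under that assumption: the class name `FamilyStructureRecognizer`, the attribute names, and the `run` method are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class FamilyStructureRecognizer:
    acquire: Callable[[str], Any]        # acquiring unit 201
    preprocess: Callable[[Any], Any]     # preprocessing unit 202
    segment: Callable[[Any], List[Any]]  # segmentation unit 203
    extract: Callable[[Any], Any]        # extraction unit 204
    cluster: Callable[[List[Any]], Any]  # clustering unit 205
    recognize: Callable[[Any], Any]      # recognition unit 206

    def run(self, household_key: str) -> Any:
        """Pass one household's logs through the units in order."""
        logs = self.acquire(household_key)
        cleaned = self.preprocess(logs)
        segments = self.segment(cleaned)
        features = [self.extract(s) for s in segments]
        labels = self.cluster(features)
        return self.recognize(labels)
```

Because the preprocessing unit may be omitted in some embodiments, it can be supplied as an identity function such as `lambda logs: logs`.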
The log data of watched TV programs are the log records produced within a certain period as users in a household watch TV. The acquiring unit 201 obtains them, by some index rule, from the log database of the smart TV content and service provider, or from the log database of a local device such as a TV box. The index rule is a key search term that identifies the household's TV, such as its home address or IP address.
The log data comprise the user operation time, program viewing time, program name, program type, and similar content. A household comprises at least one family member.
The preprocessing unit 202 retains the data content related to user operation time, program viewing time, program name, and program type, and deletes the remaining unrelated data content. Log records that contain missing items or erroneous information are also deleted.
In other embodiments the preprocessing unit can be omitted or merged into the acquiring unit 201; for example, when the log data obtained by the acquiring unit 201 already conform to the established rules, the preprocessing unit 202 is omitted.
The segmentation unit 203 divides the log data of watched TV programs into time periods according to the user operation times in the log data, specifically according to the intervals between the user's operations of turning the TV on and off. The log data for the period between one power-on and the following power-off are treated as the viewing log of one family member. Log data from different on/off periods are treated as the viewing logs of different family members, and log data from different periods are assumed to be unrelated. In this way the log data are divided by user operation time into a number of mutually unrelated segments, each representing one family member.
The extraction unit 204 extracts the program types in each segment of log data and the viewing time ratio of each program type as the feature of one family member. Specifically, within each segment of log data, the viewing times of the same program type are summed to obtain the total viewing time of that program type. Dividing this total by the total viewing time of the segment gives the proportion of the segment spent watching programs of that type. The program types in the segment and their viewing time ratios are expressed as a keyword-set vector that serves as the feature of one family member:
$$t_i = \{(e_1, n_1), (e_2, n_2), \ldots\}$$
where $e_i$ denotes a program type and $n_i$ is the viewing time ratio of that program type within the log segment.
The clustering unit 205 clusters according to the family member features. It uses the term frequency-inverse document frequency (TF-IDF) formula to calculate the weight of each program type in a document, generating the following feature vector:
$$d_i = \{(e_1, w_{1,i}), (e_2, w_{2,i}), \ldots\}$$
where $e_i$ denotes a TV program type and $w_{i,j}$ is the weight of TV program type $e_i$ in document $d_j$:
$$w_{i,j} = TF_{i,j} \times IDF_i$$
where $TF_{i,j}$ is the frequency with which program type feature $e_i$ appears in document $d_j$, and $IDF_i$ is a measure of the general importance of a term.
Here $n_{i,j}$ denotes the frequency with which program type feature $e_i$ appears in $d_j$.
Here $|D|$ denotes the total number of documents, and $N(i)$ denotes the number of documents that contain program type feature $e_i$.
The similarity of program viewing behavior between different family members is calculated as the cosine similarity between their feature vectors, as follows:
$$W_{ij} = \cos(d_i, d_j) = \frac{d_i \cdot d_j}{\lVert d_i \rVert \, \lVert d_j \rVert}$$
where $W_{ij}$ denotes the cosine similarity between feature vector $d_i$ and feature vector $d_j$.
The application is not limited to cosine similarity for calculating the similarity between different users; other similarity measures and distance measures are also applicable.
A text clustering method based on the vector space model (such as the KMeans method) is used to place family members with the same or similar program viewing behavior in one class, treated as the same class of family member. The application is not limited to text clustering methods based on the vector space model; other clustering methods are also applicable.
The recognition unit 206 determines the distribution of the family member structure from the clustering result. If the number of clusters in the clustering result $W_{ij}$ is greater than 1, the household contains family members with different program viewing behaviors, and the number of clusters is taken as the number of family members in the household with different TV program type preferences. If the number of clusters in the clustering result $W_{ij}$ equals 1, the household has only one class of family members, or only one family member, with similar TV program type preferences.
TV program content providers can recommend to the household the TV programs its members like according to the family member structure.
It should be noted that the steps of the method provided by Embodiment 1 may all be executed by the same device, or the method may be executed by different devices.
Those skilled in the art should understand that embodiments of the invention may be provided as a method, a system, or a computer program product. The invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. The invention may also take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The invention is described with reference to flow charts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific way, so that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device comprises one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
Memory may include volatile memory in computer-readable media, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can store information accessible by a computing device. As defined here, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprise", "include", or any other variant are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, commodity, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, commodity, or device that comprises the element.
The above are only embodiments of the application and do not limit it. For those skilled in the art, the application may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principle of the application shall fall within the scope of the claims of the application.
Claims (10)
1. A method for family member structure recognition based on TV watching log mining, characterized by comprising:
Obtaining log data of a user watching TV programs, the log data comprising the following data content: user operation time, program category, and viewing time per program category;
Dividing the log data by time period;
Extracting the program categories in each segment of log data, together with the viewing-time ratio of each program category, as the features of one family member;
Clustering according to the family member features;
Using the clustering result to determine the distribution of the member structure within the family.
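The three data fields named in claim 1 can be pictured as a simple record type. The following is a minimal sketch only: the class name `LogRecord`, the `action` field and its values, and the sample entries are illustrative assumptions that do not appear in the patent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    """One viewing-log entry (hypothetical layout, not taken from the patent)."""
    operation_time: datetime  # user operation time
    action: str               # assumed values: "power_on", "watch", "power_off"
    program_category: str     # e.g. "news"; empty string for power events
    viewing_seconds: float    # viewing time for this category; 0 for power events


# A made-up evening of viewing, used only to show the shape of the data.
sample_log = [
    LogRecord(datetime(2015, 11, 27, 19, 0), "power_on", "", 0.0),
    LogRecord(datetime(2015, 11, 27, 19, 0), "watch", "news", 1800.0),
    LogRecord(datetime(2015, 11, 27, 19, 30), "watch", "sports", 600.0),
    LogRecord(datetime(2015, 11, 27, 19, 40), "power_off", "", 0.0),
]
```

A real set-top-box log would likely carry more fields (device identifier, channel, program title); only the fields the claim lists are modeled here.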
2. The method of claim 1, characterized in that dividing the log data by time period comprises:
According to the user operation times in the log data, dividing the log data into several mutually unrelated segments of log data, where the user operation time refers to the time at which the user turns the TV on or off, and each segment of log data is the log data for the period from the user turning the TV on to the user turning it off;
Representing each segment of log data as one family member.
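The segmentation in claim 2 can be sketched as a single pass over time-ordered events. This is an illustrative sketch under stated assumptions: the event tuple layout and the action names `"power_on"` and `"power_off"` are hypothetical, and events outside an on/off pair, as well as a session that never sees a power-off, are simply dropped.

```python
from typing import List, Optional, Tuple

# (action, program_category, viewing_seconds), assumed sorted by operation time.
Event = Tuple[str, str, float]


def split_sessions(events: List[Event]) -> List[List[Event]]:
    """Split a log into segments, one per power-on ... power-off period."""
    sessions: List[List[Event]] = []
    current: Optional[List[Event]] = None  # None means the TV is off
    for event in events:
        action = event[0]
        if action == "power_on":
            current = []                   # start a new segment
        elif action == "power_off":
            if current is not None:
                sessions.append(current)   # close the open segment
            current = None
        elif current is not None:
            current.append(event)          # viewing event inside a segment
    return sessions


events = [
    ("power_on", "", 0.0),
    ("watch", "news", 1800.0),
    ("watch", "sports", 600.0),
    ("power_off", "", 0.0),
    ("power_on", "", 0.0),
    ("watch", "cartoon", 1200.0),
    ("power_off", "", 0.0),
]
sessions = split_sessions(events)
```

Each returned segment then stands in for one (provisional) family member, as the claim describes.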
3. The method of claim 2, characterized in that extracting the program categories in each segment of log data, together with the viewing-time ratio of each program category, as the features of one family member comprises:
Summing the viewing times of the same program category within each segment of log data to obtain the total viewing time of that program category, and dividing the total viewing time of that program category by the total viewing time of the segment to obtain the proportion of time spent watching that program category within the time period of the segment;
Representing the features of one family member as a keyword-set vector made up of the program categories in the time period of the segment and the viewing-time ratio of each program category:
$T_i = \{(e_1, n_1), (e_2, n_2), \ldots\}$, where $e_i$ denotes a program category and $n_i$ is the viewing-time ratio of that program category within the time period of the log data segment.
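As a worked illustration of the ratio feature in claim 3, the sketch below sums viewing time per category within one segment and divides by the segment total. The function name `category_ratios` and the sample numbers are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def category_ratios(session: List[Tuple[str, float]]) -> Dict[str, float]:
    """Map each program category to its share of the segment's viewing time."""
    totals: Dict[str, float] = defaultdict(float)
    for category, seconds in session:
        totals[category] += seconds        # sum viewing time per category
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}                          # empty or zero-length segment
    return {category: t / grand_total for category, t in totals.items()}


# News is watched twice in the same segment: (1800 + 600) / 3000 = 0.8.
session = [("news", 1800.0), ("sports", 600.0), ("news", 600.0)]
ratios = category_ratios(session)
```

The resulting dictionary plays the role of the set of pairs $(e_i, n_i)$ for one segment.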
4. The method of claim 3, characterized in that clustering according to the family member features comprises:
Using the term frequency-inverse document frequency (TF-IDF) formula to calculate the weight of each program category in a document, and generating the feature vector as follows:
$d_i = \{(e_1, w_{1,i}), (e_2, w_{2,i}), \ldots\}$, where $e_i$ denotes a TV program category and $w_{i,j}$ is the weight of TV program category $e_i$ in document $d_j$; where:
$w_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_i$, where $\mathrm{TF}_{i,j}$ denotes the frequency with which program category feature $e_i$ appears in document $d_j$, and $\mathrm{IDF}_i$ is a measure of the general importance of a term;
$\mathrm{TF}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$, where $n_{i,j}$ denotes the frequency with which program category feature $e_i$ appears in $d_j$;
$\mathrm{IDF}_i = \log \dfrac{|D|}{N(i)}$, where $|D|$ denotes the total number of documents of program category features, and $N(i)$ denotes the number of documents containing program category feature $e_i$;
Calculating, from the weight feature vectors of the program categories, the similarity of program viewing behavior between different family members by cosine similarity:
$W_{ij} = \dfrac{d_i \cdot d_j}{\lVert d_i \rVert \, \lVert d_j \rVert}$, where $W_{ij}$ denotes the cosine similarity between feature vector $d_i$ and feature vector $d_j$.
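The weighting and similarity steps of claim 4 can be sketched in a few lines. This sketch assumes the textbook TF-IDF form (term share within the document times the logarithm of total documents over documents containing the term), which matches the symbol definitions in the claim but is not confirmed by the source text; the function names and sample documents are hypothetical. Each "document" is one segment's category-to-ratio mapping.

```python
import math
from typing import Dict, List


def tfidf_vectors(docs: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Weight each category as TF * IDF (assumed unsmoothed textbook form)."""
    num_docs = len(docs)
    doc_freq: Dict[str, int] = {}
    for doc in docs:
        for category, n in doc.items():
            if n > 0:
                doc_freq[category] = doc_freq.get(category, 0) + 1  # N(i)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vec = {}
        for category, n in doc.items():
            if n > 0 and total > 0:
                tf = n / total
                idf = math.log(num_docs / doc_freq[category])
                vec[category] = tf * idf
        vectors.append(vec)
    return vectors


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse weight vectors."""
    dot = sum(w * b.get(category, 0.0) for category, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


docs = [
    {"news": 0.8, "sports": 0.2},
    {"news": 0.7, "sports": 0.3},
    {"cartoon": 1.0},
]
v0, v1, v2 = tfidf_vectors(docs)
```

One consequence of the unsmoothed IDF assumed here: a category that appears in every segment gets weight zero, so it contributes nothing to the similarity. A smoothed variant would behave differently.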
5. The method of claim 4, characterized in that using the clustering result to determine the distribution of the member structure within the family comprises:
If the number of clusters in the clustering result $W_{ij}$ is greater than 1, this indicates that the family contains members with different program viewing behaviors, and the number of clusters is taken as the number of family members in the family with different TV program category viewing preferences;
If the number of clusters in the clustering result $W_{ij}$ equals 1, this indicates that the family has only one type of family member, or only one family member, whose viewing preferences for TV program categories are similar.
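The claims do not name a clustering algorithm, so the following sketch swaps in a deliberately simple one as an assumption: segments whose pairwise cosine similarity is at or above a threshold are linked, and each connected group counts as one cluster. The function name `count_clusters`, the threshold of 0.5, and the sample matrices are all hypothetical.

```python
from typing import List


def count_clusters(similarity: List[List[float]], threshold: float = 0.5) -> int:
    """Count connected groups of segments linked by similarity >= threshold."""
    n = len(similarity)
    parent = list(range(n))  # union-find forest, one node per segment

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                parent[find(i)] = find(j)  # merge the two groups
    return len({find(i) for i in range(n)})


# Segments 0 and 1 look alike; segment 2 is unlike both: two viewer types.
two_types = [
    [1.0, 0.98, 0.0],
    [0.98, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
one_type = [
    [1.0, 0.9],
    [0.9, 1.0],
]
```

With these inputs, `count_clusters(two_types)` corresponds to the "greater than 1" branch of claim 5 and `count_clusters(one_type)` to the "equals 1" branch. The result depends heavily on the chosen threshold, which would need tuning on real logs.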
6. A system for family member structure recognition based on TV watching log mining, characterized by comprising:
An acquiring unit, configured to obtain log data of a user watching TV programs, the log data comprising the following data content: user operation time, program category, and viewing time per program category;
A segmentation unit, configured to divide the log data by time period;
An extraction unit, configured to extract the program categories in each segment of log data, together with the viewing-time ratio of each program category, as the features of one family member;
A clustering unit, configured to cluster according to the family member features;
A recognition unit, configured to use the clustering result to determine the distribution of the member structure within the family.
7. The system of claim 6, characterized in that the segmentation unit, configured to divide the log data by time period, is configured for:
According to the user operation times in the log data, dividing the log data into several mutually unrelated segments of log data, where the user operation time refers to the time at which the user turns the TV on or off, and each segment of log data is the log data for the period from the user turning the TV on to the user turning it off;
Representing each segment of log data as one family member.
8. The system of claim 7, characterized in that the extraction unit, configured to extract the program categories in each segment of log data, together with the viewing-time ratio of each program category, as the features of one family member, is configured for:
Summing the viewing times of the same program category within each segment of log data to obtain the total viewing time of that program category, and dividing the total viewing time of that program category by the total viewing time of the segment to obtain the proportion of time spent watching that program category within the time period of the segment;
Representing the features of one family member as a keyword-set vector made up of the program categories in the time period of the segment and the viewing-time ratio of each program category:
$T_i = \{(e_1, n_1), (e_2, n_2), \ldots\}$, where $e_i$ denotes a program category and $n_i$ is the viewing-time ratio of that program category within the time period of the log data segment.
9. The system of claim 8, characterized in that the clustering unit, configured to cluster according to the family member features, is configured for:
Using the term frequency-inverse document frequency (TF-IDF) formula to calculate the weight of each program category in a document, and generating the feature vector as follows:
$d_i = \{(e_1, w_{1,i}), (e_2, w_{2,i}), \ldots\}$, where $e_i$ denotes a TV program category and $w_{i,j}$ is the weight of TV program category $e_i$ in document $d_j$; where:
$w_{i,j} = \mathrm{TF}_{i,j} \times \mathrm{IDF}_i$, where $\mathrm{TF}_{i,j}$ denotes the frequency with which program category feature $e_i$ appears in document $d_j$, and $\mathrm{IDF}_i$ is a measure of the general importance of a term;
$\mathrm{TF}_{i,j} = \dfrac{n_{i,j}}{\sum_k n_{k,j}}$, where $n_{i,j}$ denotes the frequency with which program category feature $e_i$ appears in $d_j$;
$\mathrm{IDF}_i = \log \dfrac{|D|}{N(i)}$, where $|D|$ denotes the total number of documents of program category features, and $N(i)$ denotes the number of documents containing program category feature $e_i$;
Calculating, from the weight feature vectors of the program categories, the similarity of program viewing behavior between different family members by cosine similarity:
$W_{ij} = \dfrac{d_i \cdot d_j}{\lVert d_i \rVert \, \lVert d_j \rVert}$, where $W_{ij}$ denotes the cosine similarity between feature vector $d_i$ and feature vector $d_j$.
10. The system of claim 9, characterized in that the recognition unit, configured to use the clustering result to determine the distribution of the member structure within the family, is configured for:
If the number of clusters in the clustering result $W_{ij}$ is greater than 1, this indicates that the family contains members with different program viewing behaviors, and the number of clusters is taken as the number of family members in the family with different TV program category viewing preferences;
If the number of clusters in the clustering result $W_{ij}$ equals 1, this indicates that the family has only one type of family member, or only one family member, whose viewing preferences for TV program categories are similar.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510852355.8A CN105430504B (en) | 2015-11-27 | 2015-11-27 | Method and System for Family Member Structure Recognition Based on TV Watching Log Mining |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510852355.8A CN105430504B (en) | 2015-11-27 | 2015-11-27 | Method and System for Family Member Structure Recognition Based on TV Watching Log Mining |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN105430504A (en) | 2016-03-23 |
| CN105430504B (en) | 2019-04-02 |
Family
ID=55508387
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201510852355.8A Active CN105430504B (en) | 2015-11-27 | 2015-11-27 | Kinsfolk's structural recognition method and system based on television-viewing Web log mining |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN105430504B (en) |
Cited By (23)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106028126A (en) * | 2016-05-17 | 2016-10-12 | Tcl集团股份有限公司 | Program pushing method and system |
| CN106791983A (en) * | 2016-12-23 | 2017-05-31 | Tcl集团股份有限公司 | A kind of intelligent television user behavior analysis method and system |
| CN107071578A (en) * | 2017-05-24 | 2017-08-18 | 中国科学技术大学 | IPTV program commending methods |
| CN107230098A (en) * | 2016-03-25 | 2017-10-03 | 阿里巴巴集团控股有限公司 | Method and system is recommended in a kind of timesharing of business object |
| CN107741958A (en) * | 2017-09-20 | 2018-02-27 | 上海斐讯数据通信技术有限公司 | A kind of data processing method and system |
| WO2018196553A1 (en) * | 2017-04-27 | 2018-11-01 | 腾讯科技(深圳)有限公司 | Method and apparatus for obtaining identifier, storage medium, and electronic device |
| CN109429104A (en) * | 2017-09-04 | 2019-03-05 | 北京国双科技有限公司 | The analysis method and relevant apparatus of kinsfolk |
| CN109636481A (en) * | 2018-12-19 | 2019-04-16 | 未来电视有限公司 | User's portrait construction method and device towards domestic consumer |
| CN109977265A (en) * | 2019-03-30 | 2019-07-05 | 华南理工大学 | A kind of IPTV log user identification method based on user behavior characteristics |
| CN110020162A (en) * | 2017-12-14 | 2019-07-16 | 北京京东尚科信息技术有限公司 | User identification method and device |
| CN110227268A (en) * | 2018-03-06 | 2019-09-13 | 腾讯科技(深圳)有限公司 | A kind of method and device detecting violation game account number |
| CN110263196A (en) * | 2019-05-10 | 2019-09-20 | 南京旷云科技有限公司 | Image search method, device, electronic equipment and storage medium |
| CN111556369A (en) * | 2020-05-21 | 2020-08-18 | 四川省有线广播电视网络股份有限公司 | Television-based family classification method |
| CN111601166A (en) * | 2020-05-21 | 2020-08-28 | 广州欢网科技有限责任公司 | Method, device, storage medium and server for determining family member composition |
| CN111612538A (en) * | 2020-05-21 | 2020-09-01 | 广州欢网科技有限责任公司 | Advertisement valuation method, device, storage medium and server |
| CN112770181A (en) * | 2021-01-12 | 2021-05-07 | 贵州省广播电视信息网络股份有限公司 | Quick verification system and method for recommended content of family group |
| CN113553426A (en) * | 2021-03-11 | 2021-10-26 | 上海淘景立画信息技术有限公司 | Method, system, terminal and medium for representing sub-users of shared account |
| CN113569063A (en) * | 2021-07-28 | 2021-10-29 | 深圳Tcl新技术有限公司 | User analysis method, system, storage medium and terminal device |
| CN114095786A (en) * | 2021-11-17 | 2022-02-25 | 四川长虹电器股份有限公司 | Smart television user family member identification method based on community discovery algorithm |
| CN114268838A (en) * | 2021-12-15 | 2022-04-01 | 深圳市酷开网络科技股份有限公司 | Method and device for processing family member portrait based on OTT user portrait |
| CN115134668A (en) * | 2022-03-14 | 2022-09-30 | 深圳市酷开网络科技股份有限公司 | Method and device for dividing the age group and family structure of family members based on OTT |
| CN115988244A (en) * | 2022-12-20 | 2023-04-18 | 广州欢网科技有限责任公司 | Multimedia resource recommendation method and device |
| CN118200672A (en) * | 2024-04-23 | 2024-06-14 | 河北祥辉电子科技有限公司 | Intelligent recommendation method, device, equipment and storage medium for film and television programs |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120167141A1 (en) * | 2010-12-23 | 2012-06-28 | Microsoft Corporation | Electronic programming guide (epg) affinity clusters |
| CN103106615A (en) * | 2013-01-28 | 2013-05-15 | 上海交通大学 | Excavated user behavior analysis method based on television watching log |
| CN103533393A (en) * | 2013-09-17 | 2014-01-22 | 上海交通大学 | Family analyzing and program recommending method based on family watching records |
| CN103546773A (en) * | 2013-08-15 | 2014-01-29 | Tcl集团股份有限公司 | Television program recommendation method and system |
| CN104661055A (en) * | 2013-11-21 | 2015-05-27 | 中兴通讯股份有限公司 | Business recommendation method and device |
Cited By (33)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107230098A (en) * | 2016-03-25 | 2017-10-03 | 阿里巴巴集团控股有限公司 | Method and system is recommended in a kind of timesharing of business object |
| CN106028126A (en) * | 2016-05-17 | 2016-10-12 | Tcl集团股份有限公司 | Program pushing method and system |
| CN106791983A (en) * | 2016-12-23 | 2017-05-31 | Tcl集团股份有限公司 | A kind of intelligent television user behavior analysis method and system |
| CN106791983B (en) * | 2016-12-23 | 2020-06-23 | Tcl科技集团股份有限公司 | Smart television user behavior analysis method and system |
| WO2018196553A1 (en) * | 2017-04-27 | 2018-11-01 | 腾讯科技(深圳)有限公司 | Method and apparatus for obtaining identifier, storage medium, and electronic device |
| CN107071578B (en) * | 2017-05-24 | 2019-11-22 | 中国科学技术大学 | IPTV program recommendation method |
| CN107071578A (en) * | 2017-05-24 | 2017-08-18 | 中国科学技术大学 | IPTV program commending methods |
| CN109429104B (en) * | 2017-09-04 | 2021-01-26 | 北京国双科技有限公司 | Family member analysis method and related device |
| CN109429104A (en) * | 2017-09-04 | 2019-03-05 | 北京国双科技有限公司 | The analysis method and relevant apparatus of kinsfolk |
| CN107741958A (en) * | 2017-09-20 | 2018-02-27 | 上海斐讯数据通信技术有限公司 | A kind of data processing method and system |
| CN110020162A (en) * | 2017-12-14 | 2019-07-16 | 北京京东尚科信息技术有限公司 | User identification method and device |
| CN110020162B (en) * | 2017-12-14 | 2021-09-03 | 北京京东尚科信息技术有限公司 | User identification method and device |
| CN110227268A (en) * | 2018-03-06 | 2019-09-13 | 腾讯科技(深圳)有限公司 | A kind of method and device detecting violation game account number |
| CN110227268B (en) * | 2018-03-06 | 2022-06-07 | 腾讯科技(深圳)有限公司 | Method and device for detecting illegal game account |
| CN109636481A (en) * | 2018-12-19 | 2019-04-16 | 未来电视有限公司 | User's portrait construction method and device towards domestic consumer |
| CN109977265A (en) * | 2019-03-30 | 2019-07-05 | 华南理工大学 | A kind of IPTV log user identification method based on user behavior characteristics |
| CN109977265B (en) * | 2019-03-30 | 2022-12-16 | 华南理工大学 | An IPTV log user identification method based on user behavior characteristics |
| CN110263196A (en) * | 2019-05-10 | 2019-09-20 | 南京旷云科技有限公司 | Image search method, device, electronic equipment and storage medium |
| CN110263196B (en) * | 2019-05-10 | 2022-05-06 | 南京旷云科技有限公司 | Image retrieval method, image retrieval device, electronic equipment and storage medium |
| CN111612538A (en) * | 2020-05-21 | 2020-09-01 | 广州欢网科技有限责任公司 | Advertisement valuation method, device, storage medium and server |
| CN111612538B (en) * | 2020-05-21 | 2021-07-20 | 广州欢网科技有限责任公司 | Advertisement valuation method, device, storage medium and server |
| CN111556369A (en) * | 2020-05-21 | 2020-08-18 | 四川省有线广播电视网络股份有限公司 | Television-based family classification method |
| CN111601166A (en) * | 2020-05-21 | 2020-08-28 | 广州欢网科技有限责任公司 | Method, device, storage medium and server for determining family member composition |
| CN112770181A (en) * | 2021-01-12 | 2021-05-07 | 贵州省广播电视信息网络股份有限公司 | Quick verification system and method for recommended content of family group |
| CN113553426A (en) * | 2021-03-11 | 2021-10-26 | 上海淘景立画信息技术有限公司 | Method, system, terminal and medium for representing sub-users of shared account |
| WO2023005445A1 (en) * | 2021-07-28 | 2023-02-02 | 深圳Tcl新技术有限公司 | User analysis method and system, storage medium, and terminal device |
| CN113569063A (en) * | 2021-07-28 | 2021-10-29 | 深圳Tcl新技术有限公司 | User analysis method, system, storage medium and terminal device |
| CN114095786A (en) * | 2021-11-17 | 2022-02-25 | 四川长虹电器股份有限公司 | Smart television user family member identification method based on community discovery algorithm |
| CN114268838A (en) * | 2021-12-15 | 2022-04-01 | 深圳市酷开网络科技股份有限公司 | Method and device for processing family member portrait based on OTT user portrait |
| CN114268838B (en) * | 2021-12-15 | 2023-12-26 | 深圳市酷开网络科技股份有限公司 | Family member portrait processing method and device based on OTT user portrait |
| CN115134668A (en) * | 2022-03-14 | 2022-09-30 | 深圳市酷开网络科技股份有限公司 | Method and device for dividing the age group and family structure of family members based on OTT |
| CN115988244A (en) * | 2022-12-20 | 2023-04-18 | 广州欢网科技有限责任公司 | Multimedia resource recommendation method and device |
| CN118200672A (en) * | 2024-04-23 | 2024-06-14 | 河北祥辉电子科技有限公司 | Intelligent recommendation method, device, equipment and storage medium for film and television programs |
Also Published As
| Publication number | Publication date |
|---|---|
| CN105430504B (en) | 2019-04-02 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN105430504A (en) | Method and System for Family Member Structure Recognition Based on TV Watching Log Mining | |
| JP6855595B2 (en) | Using machine learning to recommend live stream content | |
| Berjani et al. | A recommendation system for spots in location-based online social networks | |
| Zhang et al. | Cold-start recommendation using bi-clustering and fusion for large-scale social recommender systems | |
| CN103106285B (en) | Recommendation algorithm based on information security professional social network platform | |
| CN104111941A (en) | Method and equipment for information display | |
| CN108462888A (en) | The intelligent association analysis method and system of user's TV and internet behavior | |
| CN104615775A (en) | User recommendation method and device | |
| Agreste et al. | Analysis of a heterogeneous social network of humans and cultural objects | |
| Gao et al. | SeCo-LDA: Mining service co-occurrence topics for composition recommendation | |
| Villanueva et al. | SMORE: Towards a semantic modeling for knowledge representation on social media | |
| Dharmawan et al. | Book recommendation using Neo4j graph database in BibTeX book metadata | |
| Wang et al. | CROWN: a context-aware recommender for web news | |
| O'Doherty et al. | Towards trust inference from bipartite social networks | |
| CN105426744A (en) | Method and apparatus for setting password protection question | |
| Jansen et al. | Viewed by too many or viewed too little: Using information dissemination for audience segmentation | |
| Cremonesi et al. | Time-evolution of IPTV recommender systems | |
| Kanoje et al. | User profiling for recommendation system | |
| Perera et al. | Exploring the use of time-dependent cross-network information for personalized recommendations | |
| Tiroshi et al. | Graph-based recommendations: Make the most out of social data | |
| CN111831890B (en) | User similarity generation method, device, storage medium and computer equipment | |
| Sansonetti et al. | Dynamic social recommendation | |
| Zhao et al. | Exploiting homophily-based implicit social network to improve recommendation performance | |
| Amati et al. | Twitter: temporal events analysis | |
| Liu | Personalized recommendation algorithm for movie data combining rating matrix and user subjective preference |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | GR01 | Patent grant | |