CN114154166A - Abnormal data identification method, device, equipment and storage medium - Google Patents

Info

Publication number
CN114154166A
Authority
CN
China
Prior art keywords
data
abnormal
information
identified
account
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111403923.8A
Other languages
Chinese (zh)
Inventor
康焰龙
苏航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bigo Technology Pte Ltd
Original Assignee
Bigo Technology Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bigo Technology Pte Ltd filed Critical Bigo Technology Pte Ltd
Priority to CN202111403923.8A priority Critical patent/CN114154166A/en
Publication of CN114154166A publication Critical patent/CN114154166A/en
Priority to PCT/CN2022/132864 priority patent/WO2023093638A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a storage medium for identifying abnormal data, wherein the method comprises the following steps: acquiring structured data and relation data stored in a graph database, wherein the structured data and the relation data comprise account information, equipment information and associated attribute information; calculating and determining the equipment to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data; and if the attribute information of the equipment to be identified meets a preset rule, determining that the equipment to be identified is abnormal equipment. According to the scheme, the abnormal data can be identified quickly and accurately.

Description

Abnormal data identification method, device, equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of computers, in particular to an abnormal data identification method, an abnormal data identification device, abnormal data identification equipment and a storage medium.
Background
On most apps and platforms in the Internet field, user accounts are stolen by account-theft groups in various ways. Most platforms identify a user by verifying the account name and password and then provide the corresponding services. Once a lawbreaker has obtained a user's account name and password by some means and logged in, the lawbreaker can misappropriate property in the account, change the password to take over the account, or send advertisements, pornography or virus links to others, causing the platform to ban the user's account. In such cases, the user can only file a complaint with the platform to recover or unban the account.
In the prior art, complaints about stolen accounts are mostly handled manually: staff search the relevant logs according to the complaint information provided by the user to obtain multi-dimensional account information, compare that information with preset rules, and finally judge from past experience whether the account was stolen. This approach has the following defects. First, account information is generally stored, after structured design, as data of different dimensions in different two-dimensional tables. Account information of multiple dimensions is difficult to correlate directly during retrieval, and the association between dimensions can only be established through repeated cross searches. Retrieval and analysis are therefore inefficient and time-consuming, demand a certain level of professional skill from staff, and yield judgments whose accuracy depends on staff experience. Second, when multi-level associations are mined along a relationship link, the link can only be extended through repeated data queries, which is slow, and abnormal associations are hard to spot directly when the data is presented as two-dimensional tables. As a result, most cases can only be handled one stolen account at a time after a complaint, that is, as after-the-fact remediation. It is difficult to identify abnormal data, such as abnormal login devices or abnormal accounts, in advance by mining a multi-level relationship network, so large-scale account-theft activity and groups cannot be efficiently uncovered, prevented, or dealt with ahead of time.
Disclosure of Invention
The embodiment of the invention provides an abnormal data identification method, an abnormal data identification device, abnormal data identification equipment and a storage medium, solves the problem that the identification efficiency of abnormal data such as a stolen account and abnormal login equipment is low in the prior art, and can quickly and accurately identify the abnormal data.
In a first aspect, an embodiment of the present invention provides an abnormal data identification method, where the method includes:
acquiring structured data and relation data stored in a graph database, wherein the structured data and the relation data comprise account information, equipment information and associated attribute information;
calculating and determining the equipment to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data;
and if the attribute information of the equipment to be identified meets a preset rule, determining that the equipment to be identified is abnormal equipment.
In a second aspect, an embodiment of the present invention further provides an abnormal data identification apparatus, including:
the data acquisition module is used for acquiring structured data and relationship data stored in a graph database, wherein the structured data and the relationship data comprise account information, equipment information and associated attribute information;
the device to be identified determining module is used for calculating and determining the device to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data;
and the abnormal equipment determining module is used for determining the equipment to be identified as abnormal equipment if the attribute information of the equipment to be identified meets a preset rule.
In a third aspect, an embodiment of the present invention further provides an abnormal data identification device, where the device includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for identifying abnormal data according to the embodiment of the present invention.
In a fourth aspect, the embodiment of the present invention further provides a storage medium storing computer-executable instructions, which are used for executing the abnormal data identification method according to the embodiment of the present invention when executed by a computer processor.
In the embodiment of the invention, structured data and relational data stored in a graph database are obtained, wherein the structured data and the relational data comprise account information, equipment information and associated attribute information, then, according to the determined abnormal account information, the structured data and the relational data, equipment to be identified is determined through calculation of a preset detection algorithm, and if the attribute information of the equipment to be identified meets a preset rule, the equipment to be identified is determined to be abnormal equipment, so that efficient identification of the abnormal equipment is realized.
Drawings
Fig. 1 is a flowchart of an abnormal data identification method according to an embodiment of the present invention;
Fig. 2 is a flowchart of another abnormal data identification method according to an embodiment of the present invention;
Fig. 3 is a flowchart of another abnormal data identification method according to an embodiment of the present invention;
Fig. 4 is a flowchart of another abnormal data identification method according to an embodiment of the present invention;
Fig. 5 is a flowchart of another abnormal data identification method according to an embodiment of the present invention;
Fig. 6 is a block diagram of an abnormal data identification apparatus according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an abnormal data identification device according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
The terms "first", "second" and the like in the description and claims of the present application are used to distinguish between similar objects and not necessarily to describe a particular sequence or chronological order. It is to be understood that terms so used are interchangeable under appropriate circumstances, so that embodiments of the application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second" and the like are generally of one class, and their number is not limited; for example, the first object may be one or more than one. In addition, "and/or" in the specification and claims means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the preceding and following objects.
Fig. 1 is a flowchart of an abnormal data identification method provided in an embodiment of the present invention, which can be applied to abnormal data identification, and the method can be executed by a computing device such as a desktop, a notebook, a server, and the like, and specifically includes the following steps:
Step S101, obtaining structured data and relationship data stored in a graph database, wherein the structured data and the relationship data comprise account information, equipment information and associated attribute information.
The graph database stores and queries data in a graph structure mode, and embodies a relation model in a node and edge mode. The graph structure is composed of nodes and edges, the nodes and the edges can contain corresponding attribute information, the edges connected between the nodes have directions, and a plurality of edges of different types or edges of the same type but different attributes can exist between the same nodes. In one embodiment, the storage of structured data and relational data is performed in the form of a graph database, which facilitates the operation of nodes and edges in the network.
In one embodiment, the structured data and relationship data stored in the graph database are converted from acquired account data of multiple dimensions. For the structured data, nodes are generated with account information and device information as the primary data. The account information and the device information may be an account ID and a device ID, where the account ID may be data generated by the platform to uniquely identify an account, and the device ID may be data generated by the platform to uniquely identify a device used by a user. A user is a person who uses the app or platform services; one user may register multiple accounts and may log in through multiple devices. Besides account information and device information, the structured data includes the attribute information of each node, which is associated with the account and/or the device. For the relationship data, account information and device information are likewise the primary data, and edge attribute information is also included, so that accounts and devices are connected and a relationship network is constructed.
Step S102, calculating and determining the equipment to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data.
The abnormal account information is information about an account that has been determined to be abnormal, such as an abnormal account ID. The abnormal account may be determined by manually handling an account complaint, or may be discovered through other channels. The abnormal account may be an account that has been illegally stolen by another person.
The structured data and the relationship data are data stored in the graph database. In one embodiment, query search is performed in a certain range in a graph database according to abnormal account information to obtain a relationship network subgraph associated with the abnormal account information, and nodes in the relationship network subgraph are calculated through a preset detection algorithm to determine suspicious nodes, wherein devices corresponding to device information recorded by the suspicious nodes are devices to be identified.
Step S103, if the attribute information of the equipment to be identified meets a preset rule, determining that the equipment to be identified is abnormal equipment.
After the device to be identified is determined, whether the device to be identified is abnormal or not is judged according to a preset rule. Specifically, whether the equipment to be identified meets a preset rule or not is judged according to the attribute information of the equipment to be identified, and if the equipment to be identified meets the preset rule, the equipment to be identified is determined to be abnormal equipment.
In one embodiment, the device to be identified comprises a central device and a connecting device obtained by calculation. The central device is the device with the highest importance in the relationship network subgraph, for example the central device of a large community group, on which a large number of accounts have logged in. The connecting device is a device that acts as a bridge in the network, for example one that connects several community groups.
Optionally, the attribute information of the device to be identified meets the preset rule when any of the following holds: the number of login accounts corresponding to the central device is greater than a first preset number; among the login accounts corresponding to the central device, the ratio of registered accounts to unregistered accounts is greater than a first preset ratio; among the login accounts corresponding to the central device, the ratio of anchor-type accounts to guest-shooting-type accounts is greater than a second preset ratio; among the login accounts corresponding to the central device, the number of historical abnormal records is greater than a second preset number; the connecting device lies on the shortest path between at least two central devices. The specific values of the first preset number, the first preset ratio, the second preset ratio and the second preset number are not limited and can be configured flexibly. In one embodiment, when the attribute information of the device to be identified satisfies one or more of the preset rules, the device to be identified is determined to be an abnormal device, that is, the data recorded in the graph database for the abnormal device is identified as abnormal data.
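The count and ratio rules for a central device can be sketched as follows. This is a minimal illustration only: the field names (`login_accounts`, `registered`, `guest_shooting` and so on), the threshold keys and every numeric value are assumptions made for the example, since the source leaves the preset values open. The shortest-path rule for connecting devices is not covered here.

```python
def ratio(numerator, denominator):
    """Ratio that tolerates a zero denominator."""
    if denominator == 0:
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def satisfied_rules(device, thresholds):
    """Return the names of the preset rules that a central device satisfies."""
    rules = []
    # Rule 1: number of login accounts exceeds the first preset number.
    if device["login_accounts"] > thresholds["first_number"]:
        rules.append("login_account_count")
    # Rule 2: registered / unregistered ratio exceeds the first preset ratio.
    if ratio(device["registered"], device["unregistered"]) > thresholds["first_ratio"]:
        rules.append("registered_ratio")
    # Rule 3: anchor / guest-shooting ratio exceeds the second preset ratio.
    if ratio(device["anchor"], device["guest_shooting"]) > thresholds["second_ratio"]:
        rules.append("anchor_ratio")
    # Rule 4: historical abnormal records exceed the second preset number.
    if device["abnormal_records"] > thresholds["second_number"]:
        rules.append("historical_abnormal_records")
    return rules


# Hypothetical thresholds and device attributes, chosen only for illustration.
thresholds = {"first_number": 20, "first_ratio": 3.0, "second_ratio": 1.0, "second_number": 2}
device = {
    "login_accounts": 50,
    "registered": 40,
    "unregistered": 10,
    "anchor": 5,
    "guest_shooting": 45,
    "abnormal_records": 3,
}

hits = satisfied_rules(device, thresholds)
is_abnormal = len(hits) > 0  # one or more satisfied rules marks the device abnormal
```

Returning the list of satisfied rules, rather than a single boolean, keeps the number of matched rules available for later grading.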
With this scheme, abnormal data can be identified without relying on staff experience, which addresses the low efficiency and high cost of mining abnormal data; abnormal devices can be determined in advance, providing constructive guidance for early prevention.
Fig. 2 is a flowchart of another abnormal data identification method according to an embodiment of the present invention, which shows a specific process for constructing and generating structured data and relationship data, as shown in fig. 2, including:
Step S201, obtaining login data stored in an original database, where the login data is relational table data and includes account information, device information, and associated attribute information.
The login data in the original database is stored in the form of relational data tables. It records the logins of different accounts on different devices, together with attribute information related to the accounts and the devices, each stored in its own field. The account information may be an account ID, the device information may be a device ID, and the associated attribute information may be, for example: account login time, login form (such as password login, verification code login and the like), account registration time, registration form (such as mobile phone number registration, mailbox registration and the like), registration type (such as an anchor type, a shooting type and the like) and some historical abnormal record information.
Step S202, taking the account information and the equipment information as nodes, selecting node attribute information from the attribute information as node attributes, generating structured data and storing the structured data into a graph database; and taking the account information and the equipment information as nodes, selecting edge attribute information from the attribute information as edge attributes, generating relationship data and storing the relationship data into the graph database.
From the login data stored in the original database, the contents of the corresponding fields are selected as node attribute information and edge attribute information respectively. Structured data is generated by taking the account information and the device information as nodes and the node attribute information as node attributes; relationship data is generated by taking the account information and the device information as nodes and the edge attribute information as edge attributes. The field contents selected as attribute information may be, for example, the login time, login count, login duration and the like associated with a login. For example, if node A represents account ID01 and node B represents device ID01, the node attributes of node A may be the registration time, registration mode and login count of account ID01; the node attributes of node B may be the number of login accounts, the login count, the number of password changes and the like. If account ID01 (node A) has logged in on device ID01 (node B), an edge is generated between node A and node B, whose attributes may be the login count, login time, login duration and the like of account ID01 on device ID01.
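The conversion from relational login rows to nodes and edges might look like the following sketch. It is plain Python with in-memory dictionaries standing in for the graph database; the row fields, the attribute names and the sample values are assumptions made for illustration and do not come from the source.

```python
from collections import defaultdict

# Hypothetical login rows as they might appear in a relational table.
login_rows = [
    {"account_id": "acct01", "device_id": "dev01", "login_time": "2021-11-01T10:00:00", "duration_s": 120},
    {"account_id": "acct01", "device_id": "dev01", "login_time": "2021-11-02T09:30:00", "duration_s": 300},
    {"account_id": "acct02", "device_id": "dev01", "login_time": "2021-11-02T11:00:00", "duration_s": 60},
]


def build_graph(rows):
    """Build node and edge dictionaries from login rows.

    nodes: {(kind, id): {attribute: value}}
    edges: {(account_node, device_node): {attribute: value}}
    """
    nodes = {}
    edges = {}
    accounts_per_device = defaultdict(set)
    for row in rows:
        account = ("account", row["account_id"])
        device = ("device", row["device_id"])
        # Node attributes: per-account and per-device login counts.
        nodes.setdefault(account, {"login_count": 0})["login_count"] += 1
        nodes.setdefault(device, {"login_count": 0, "account_count": 0})["login_count"] += 1
        accounts_per_device[device].add(account)
        # Edge attributes: how often, how long and when this account used this device.
        edge = edges.setdefault(
            (account, device),
            {"login_count": 0, "total_duration_s": 0, "last_login": ""},
        )
        edge["login_count"] += 1
        edge["total_duration_s"] += row["duration_s"]
        # ISO-8601 timestamps of equal format sort correctly as strings.
        edge["last_login"] = max(edge["last_login"], row["login_time"])
    for device, accounts in accounts_per_device.items():
        nodes[device]["account_count"] = len(accounts)
    return nodes, edges


nodes, edges = build_graph(login_rows)
```

In a real deployment the two dictionaries would be written to the graph database as nodes and relationships; the dictionary layout here only mirrors the node-attribute and edge-attribute split described above.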
Step S203, structured data and relationship data stored in the graph database are obtained, wherein the structured data and the relationship data comprise account information, equipment information and associated attribute information.
Step S204, calculating and determining the equipment to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data.
Step S205, if the attribute information of the device to be identified meets a preset rule, determining that the device to be identified is an abnormal device.
In this scheme, the login data stored in the original database is converted into structured data and relationship data for storage. Converting the multi-dimensional user information in this way facilitates analysis of the relationship network and construction of the graph data model, optimizes the query and retrieval mechanism, and improves data processing efficiency.
Fig. 3 is a flowchart of another abnormal data identification method according to an embodiment of the present invention, which provides a specific method for determining a device to be identified by calculating through a preset detection algorithm, and as shown in fig. 3, the method includes:
Step S301, obtaining login data stored in an original database, wherein the login data is relational table data and comprises account information, equipment information and associated attribute information.
Step S302, taking the account information and the equipment information as nodes, selecting node attribute information from the attribute information as node attributes, generating structured data and storing the structured data into a graph database; and taking the account information and the equipment information as nodes, selecting edge attribute information from the attribute information as edge attributes, generating relationship data and storing the relationship data into the graph database.
Step S303, obtaining structured data and relationship data stored in the graph database, wherein the structured data and the relationship data comprise account information, equipment information and associated attribute information.
Step S304, determining data to be identified in a preset level according to the determined abnormal account information, the structured data and the relationship data.
The preset level limits the range of the search. For example, if the preset level is 3, the node corresponding to the abnormal account is taken as the center and the association graph is expanded outward by 3 levels; the resulting subgraph serves as the data to be identified for subsequent algorithm calculation and screening.
Step S305, calculating the data to be identified through a degree centrality algorithm to determine central equipment, calculating the data to be identified through a betweenness centrality (medium centrality) algorithm to determine connecting equipment, and determining the central equipment and the connecting equipment as the equipment to be identified.
In one embodiment, when the device to be identified is determined, the central device is determined by a degree centrality algorithm and the connecting device is determined by a betweenness centrality (also called medium centrality) algorithm. Degree centrality represents how many other nodes a node is connected to and is the most direct measure of node centrality in a relationship network: the larger a node's degree, the higher its degree centrality and the greater its importance in the network. The degree centrality algorithm is used to discover the central devices of large community groups. Betweenness centrality describes the importance of a node by the number of shortest paths that pass through it, and reflects how important the node is as a bridge. The betweenness centrality algorithm is used to discover devices that act as "bridges" between community groups.
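Expanding the graph to a preset level around the abnormal account amounts to a depth-limited breadth-first search. The sketch below assumes an undirected adjacency-list representation and hypothetical node names; a graph database would normally perform this traversal itself.

```python
from collections import deque


def expand_subgraph(adjacency, start, max_level):
    """Return {node: level} for every node within max_level hops of start."""
    levels = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if levels[node] == max_level:
            continue  # do not expand beyond the preset level
        for neighbor in adjacency.get(node, ()):
            if neighbor not in levels:
                levels[neighbor] = levels[node] + 1
                queue.append(neighbor)
    return levels


# Hypothetical account/device chain starting at a known abnormal account.
adjacency = {
    "acct_bad": ["dev1"],
    "dev1": ["acct_bad", "acct2"],
    "acct2": ["dev1", "dev2"],
    "dev2": ["acct2", "acct3"],
    "acct3": ["dev2", "dev3"],
    "dev3": ["acct3"],
}

within_3 = expand_subgraph(adjacency, "acct_bad", 3)
```

With a preset level of 3, `acct3` and `dev3` lie four and five hops away and are therefore left out of the data to be identified.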
Specifically, the calculation formula of the degree centrality algorithm is as follows:
$$C_D(N_i) = \sum_{j=1}^{g} x_{ij} \quad (i \neq j)$$
where $C_D(N_i)$ represents the degree centrality of node $i$, $x_{ij}$ is 1 if node $i$ and node $j$ are directly connected and 0 otherwise, and the sum $\sum_{j=1}^{g} x_{ij}$ counts the direct connections between node $i$ and the other $g-1$ nodes $j$.
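As a small worked illustration of degree centrality, the sketch below counts, for each node, the distinct other nodes it is directly connected to. The adjacency list and node names are hypothetical.

```python
def degree_centrality(adjacency):
    """Number of distinct other nodes directly connected to each node."""
    return {
        node: len(set(neighbors) - {node})  # ignore self-loops, count each neighbor once
        for node, neighbors in adjacency.items()
    }


# Hypothetical subgraph: three accounts logged in on dev1, one of them also on dev2.
adjacency = {
    "dev1": ["a1", "a2", "a3"],
    "dev2": ["a3"],
    "a1": ["dev1"],
    "a2": ["dev1"],
    "a3": ["dev1", "dev2"],
}

centrality = degree_centrality(adjacency)
central_node = max(centrality, key=centrality.get)
```

Here `dev1` has the largest degree and would be the candidate central device.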
The calculation formula of the betweenness centrality (medium centrality) algorithm is as follows:
$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}$$
where $\sigma_{st}(v)$ represents the number of shortest paths from node $s$ to node $t$ that pass through node $v$, and $\sigma_{st}$ represents the number of shortest paths from node $s$ to node $t$.
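One common way to compute this quantity on an unweighted graph is Brandes' algorithm; the source does not say which implementation is used, so the following is only a sketch under that assumption. It treats the graph as undirected and counts each unordered pair of endpoints once, which changes the scale but not the ranking of nodes. The graph and node names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    Returns, for each node v, the sum over unordered pairs {s, t}
    of sigma_st(v) / sigma_st.
    """
    centrality = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        predecessors = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        distance = {v: -1 for v in adjacency}
        sigma[s] = 1
        distance[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adjacency}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                centrality[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve the totals.
    return {v: c / 2.0 for v, c in centrality.items()}


# Hypothetical graph: two device-centered groups joined through dev_bridge.
adjacency = {
    "a1": ["devA"],
    "a2": ["devA"],
    "devA": ["a1", "a2", "acct_x"],
    "acct_x": ["devA", "dev_bridge"],
    "dev_bridge": ["acct_x", "acct_y"],
    "acct_y": ["dev_bridge", "devB"],
    "devB": ["acct_y", "b1", "b2"],
    "b1": ["devB"],
    "b2": ["devB"],
}

scores = betweenness_centrality(adjacency)
bridge = max(scores, key=scores.get)
```

In this example every shortest path between the two groups runs through `dev_bridge`, so it receives the highest score and would be the candidate connecting device.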
Step S306, if the attribute information of the equipment to be identified meets a preset rule, determining that the equipment to be identified is abnormal equipment.
In this scheme, a reasonable range of data to be identified is determined in the graph database based on the abnormal account information and the preset level. The central device and the connecting device are obtained through the degree centrality algorithm and the betweenness centrality (medium centrality) algorithm, and whether their attribute information meets the preset rule is then judged to finally determine the abnormal devices. Abnormal devices are thus mined reasonably and efficiently, with high accuracy and considerable savings in human resources.
Fig. 4 is a flowchart of another abnormal data identification method according to an embodiment of the present invention, which shows a process including information query feedback. As shown in fig. 4, includes:
Step S401, obtaining structured data and relationship data stored in a graph database, wherein the structured data and the relationship data comprise account information, equipment information and associated attribute information.
Step S402, receiving an account to be queried, performing a relationship graph path query in the graph database with the account to be queried as the graph center, determining the account information and equipment information associated with the account to be queried, and displaying the account information and the equipment information.
In one embodiment, for a user's account complaint, a fast query that can be expanded over multiple levels is performed on the converted structured data and relationship data. When a user complains about a certain account, the system receives the account to be queried, performs a relationship graph path query in the graph database with the account to be queried as the central node of the graph, and determines and displays the account information and device information associated with it, so that the associations between accounts and devices are shown visually. Staff can quickly confirm abnormal accounts from the displayed content. During the relationship graph path query, the query proceeds from the account to be queried, as the graph center, along the edges of the nodes connected to the center, thereby determining the account information and device information associated with the account to be queried. In addition, because a graph database has been built, queries on any node or attribute information are supported, which improves the efficiency of querying and judging. For example, a query can return the login devices, login counts and login times of the account to be queried, as well as the number of accounts that have logged in on each of those devices and other related information.
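The one-level form of such a query can be sketched as follows: starting from the queried account, follow its edges to the login devices, then follow each device's edges to the other accounts seen on it. The edge dictionary, its attribute names and the sample data are assumptions made for illustration; a graph database would express the same lookup in its own query language.

```python
def query_account(edges, account_id):
    """Return, per login device of account_id, its login attributes and co-located accounts.

    edges: {(account_id, device_id): {"login_count": int, "last_login": str}}
    """
    result = {}
    # First hop: devices on which the queried account has logged in.
    for (account, device), attrs in edges.items():
        if account == account_id:
            result[device] = {
                "login_count": attrs["login_count"],
                "last_login": attrs["last_login"],
                "other_accounts": [],
            }
    # Second hop: other accounts that logged in on those same devices.
    for (account, device) in edges:
        if device in result and account != account_id:
            result[device]["other_accounts"].append(account)
    for info in result.values():
        info["other_accounts"].sort()
    return result


# Hypothetical edge data.
edges = {
    ("acct01", "dev01"): {"login_count": 12, "last_login": "2021-11-02T09:30:00"},
    ("acct01", "dev07"): {"login_count": 1, "last_login": "2021-11-03T02:10:00"},
    ("acct02", "dev07"): {"login_count": 4, "last_login": "2021-11-03T02:15:00"},
    ("acct03", "dev07"): {"login_count": 6, "last_login": "2021-11-03T02:20:00"},
    ("acct04", "dev02"): {"login_count": 3, "last_login": "2021-10-30T18:00:00"},
}

view = query_account(edges, "acct01")
```

The returned structure shows at a glance that `acct01` logged in once on `dev07`, a device shared with two other accounts, which is the kind of association the displayed graph is meant to surface.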
Step S403, calculating and determining the equipment to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data.
Step S404, if the attribute information of the equipment to be identified meets a preset rule, determining that the equipment to be identified is abnormal equipment.
In this scheme, an association graph path query is performed, based on the account to be queried, over the structured data and relationship data stored in the graph database, so that the account information and device information associated with the account to be queried are obtained and displayed. Data query efficiency is markedly improved and the display is clearer, which helps staff judge whether the account has been stolen.
Fig. 5 is a flowchart of another abnormal data identification method according to an embodiment of the present invention, which shows a process of adjusting a verification policy after identifying abnormal data. As shown in fig. 5, includes:
Step S501, obtaining login data stored in an original database, wherein the login data is relational table data and comprises account information, equipment information and associated attribute information.
Step S502, taking the account information and the equipment information as nodes, selecting node attribute information from the attribute information as node attributes, generating structured data and storing the structured data into a graph database; and taking the account information and the equipment information as nodes, selecting edge attribute information from the attribute information as edge attributes, generating relationship data and storing the relationship data into the graph database.
Step S503, obtaining structured data and relationship data stored in the graph database, wherein the structured data and the relationship data comprise account information, equipment information and associated attribute information.
Step S504, determining data to be identified in a preset level according to the determined abnormal account information, the structured data and the relationship data.
Step S505, calculating the data to be identified through a degree centrality algorithm to determine central equipment, calculating the data to be identified through a betweenness centrality (medium centrality) algorithm to determine connecting equipment, and determining the central equipment and the connecting equipment as the equipment to be identified.
Step S506, if the attribute information of the equipment to be identified meets a preset rule, determining that the equipment to be identified is abnormal equipment.
Step S507, adding the device identification and the abnormal grade of the abnormal device into a cache, when the device identification of the login device is detected to be consistent with the device identification of the abnormal device, determining a corresponding verification strategy according to the abnormal grade, and verifying the login device based on the verification strategy.
In one embodiment, after the abnormal device is determined, an abnormality level of the abnormal device is further determined. Optionally, the abnormality level is determined by the number of preset rules that the abnormal device satisfies: the more rules satisfied, the higher the abnormality level. The device identifier and abnormality level of the abnormal device are added to a cache, so that when the device identifier of a login device is detected to be consistent with that of an abnormal device, a higher-level verification policy is applied to the login device; for example, more numerous and more complex verification items must be provided before verification passes. Illustratively, these include face recognition verification, question-and-answer verification and the like. The higher the abnormality level, the more complex the corresponding verification policy.
According to this method, after an abnormal device is determined, its identifier is stored in the cache so that login devices that hit the cache undergo complex verification, and the verification policy is determined by the abnormality level of the abnormal device. This safeguards account login security while sparing ordinary login devices from excessive verification that would harm the user experience.
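The cache lookup and level-dependent verification can be sketched as follows. This is a hedged illustration only: the plain dictionary used as the cache, the `POLICIES` table, the inclusion of a password step, and the function names `flag_device` and `verification_policy` are assumptions; the text itself only names face recognition and question-and-answer verification as examples.

```python
# Hypothetical mapping from abnormality level to the verification steps required.
# Level 0 means the login device is not in the cache of abnormal devices.
POLICIES = {
    0: ["password"],
    1: ["password", "question_answering"],
    2: ["password", "question_answering", "face_recognition"],
}


def flag_device(cache, device_id, rules_satisfied):
    """Cache an abnormal device with a level equal to the number of rules it satisfies."""
    if rules_satisfied > 0:
        cache[device_id] = rules_satisfied


def verification_policy(cache, device_id):
    """Return the verification steps for a login device; higher levels get stricter policies."""
    level = cache.get(device_id, 0)
    # Levels above the highest configured policy fall back to the strictest one.
    return POLICIES[min(level, max(POLICIES))]


cache = {}
flag_device(cache, "dev-42", rules_satisfied=3)
print(verification_policy(cache, "dev-42"))  # strictest policy
print(verification_policy(cache, "dev-7"))   # device not flagged
```

Keeping only flagged devices in the cache means an ordinary login costs a single lookup and receives the lightest policy, which matches the stated goal of not burdening normal devices.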
Fig. 6 is a block diagram of an abnormal data identification apparatus according to an embodiment of the present invention. The apparatus is configured to execute the abnormal data identification method provided by the foregoing embodiments and has the functional modules and beneficial effects corresponding to that method. As shown in Fig. 6, the apparatus specifically includes a data acquisition module 101, a to-be-identified device determining module 102 and an abnormal device determining module 103, wherein:
the data acquisition module 101 is configured to acquire structured data and relationship data stored in a graph database, where the structured data and the relationship data include account information, device information, and associated attribute information;
the device to be identified determining module 102 is configured to calculate and determine the device to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data, and the relationship data;
the abnormal device determining module 103 is configured to determine that the device to be identified is an abnormal device if the attribute information of the device to be identified meets a preset rule.
As can be seen from the above scheme, a plurality of data synchronization tasks are created according to the information of the registration centers, where each data synchronization task corresponds to at least two different registration centers; when an information change in a registration center is detected, the changed content is converted by the created data synchronization task into change information for the different types of registration centers and sent to the corresponding registration centers, so that the other registration centers are synchronized with the first registration center to provide information calls, thereby significantly improving the abnormal data identification efficiency and reducing the overall cost.
In one possible embodiment, the apparatus further comprises a data conversion storage module 104 for:
before the structured data and relationship data stored in the graph database are acquired, acquiring login data stored in an original database, wherein the login data is relational table data and includes account information, device information and associated attribute information;
taking the account information and the device information as nodes, selecting node attribute information from the attribute information as node attributes, and generating structured data to be stored in the graph database;
and taking the account information and the device information as nodes, selecting edge attribute information from the attribute information as edge attributes, and generating relationship data to be stored in the graph database.
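The conversion from relational login rows into node and edge records can be sketched as below. The column names (`account`, `device`, `account_type`, `login_time`), the choice of which attributes go on nodes versus edges, and the function `build_graph` are assumptions for illustration; the text does not fix a schema, and plain dictionaries stand in for the graph database.

```python
# Hypothetical login rows as they might be read from the relational table.
LOGIN_ROWS = [
    {"account": "a1", "device": "d1", "account_type": "anchor", "login_time": "2021-11-01T10:00:00"},
    {"account": "a1", "device": "d1", "account_type": "anchor", "login_time": "2021-11-02T09:30:00"},
    {"account": "a2", "device": "d1", "account_type": "shooter", "login_time": "2021-11-02T11:00:00"},
]


def build_graph(rows):
    """Turn relational login rows into node records and edge records."""
    nodes = {}  # node id -> node attributes (the "structured data")
    edges = {}  # (account id, device id) -> edge attributes (the "relationship data")
    for row in rows:
        acct = "acct:" + row["account"]
        dev = "dev:" + row["device"]
        nodes.setdefault(acct, {"kind": "account", "account_type": row["account_type"]})
        nodes.setdefault(dev, {"kind": "device"})
        edge = edges.setdefault((acct, dev), {"login_count": 0, "last_login": row["login_time"]})
        edge["login_count"] += 1
        # ISO-8601 timestamps of the same format compare correctly as strings.
        edge["last_login"] = max(edge["last_login"], row["login_time"])
    return nodes, edges


nodes, edges = build_graph(LOGIN_ROWS)
print(len(nodes), "nodes;", len(edges), "edges")
```

Aggregating repeated logins onto a single account-device edge keeps the graph small while still exposing the login count and latest login time as edge attributes.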
In a possible embodiment, the to-be-identified device determining module 102 is specifically configured to:
determining data to be identified in a preset level according to the determined abnormal account information, the structured data and the relationship data;
and calculating the data to be identified through a preset detection algorithm to determine the equipment to be identified.
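Selecting the data to be identified within a preset level can be read as extracting the neighbourhood of the known abnormal accounts up to a fixed number of hops. The sketch below follows that reading; the adjacency-dictionary representation, the sample graph and the function name `subgraph_within_level` are assumptions, and the text does not state how the level is chosen.

```python
# Hypothetical undirected account-device graph: node -> set of neighbours.
GRAPH = {
    "acct:a1": {"dev:d1"},
    "dev:d1": {"acct:a1", "acct:a2"},
    "acct:a2": {"dev:d1", "dev:d2"},
    "dev:d2": {"acct:a2", "acct:a3"},
    "acct:a3": {"dev:d2"},
}


def subgraph_within_level(graph, abnormal_accounts, level):
    """Keep only the nodes reachable from the abnormal accounts in at most `level` hops."""
    frontier = {a for a in abnormal_accounts if a in graph}
    selected = set(frontier)
    for _ in range(level):
        # Expand one hop outward, skipping nodes that were already selected.
        frontier = {nb for node in frontier for nb in graph.get(node, ())} - selected
        selected |= frontier
    # Restrict every adjacency set to the selected nodes.
    return {node: graph.get(node, set()) & selected for node in selected}


data_to_identify = subgraph_within_level(GRAPH, ["acct:a1"], level=2)
print(sorted(data_to_identify))
```

Limiting the level bounds the amount of data passed to the detection algorithm, so the later centrality calculations run on a neighbourhood rather than on the whole graph.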
In a possible embodiment, the to-be-identified device determining module 102 is specifically configured to:
calculating the data to be identified through a degree centrality algorithm to determine the central device;
and calculating the data to be identified through a betweenness (medium) centrality algorithm to determine the connecting device.
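The two centrality measures can be sketched in plain Python. The degree function simply counts neighbours; the betweenness function uses Brandes' algorithm for unweighted, undirected graphs, which is a standard technique and not necessarily what the patent's system runs. The sample graph, in which device `d3` links the clusters around `d1` and `d2`, is invented for illustration.

```python
from collections import deque


def degree_centrality(graph):
    """Unnormalised degree centrality: the number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def betweenness_centrality(graph):
    """Unnormalised betweenness via Brandes' algorithm (unweighted, undirected graph)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        order = []                             # nodes in non-decreasing distance from source
        preds = {node: [] for node in graph}   # predecessors on shortest paths
        paths = dict.fromkeys(graph, 0)        # number of shortest paths from source
        dist = dict.fromkeys(graph, -1)
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in graph[node]:
                if dist[nb] < 0:               # first time this node is reached
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:  # edge lies on a shortest path
                    paths[nb] += paths[node]
                    preds[nb].append(node)
        dependency = dict.fromkeys(graph, 0.0)
        for node in reversed(order):           # accumulate from the farthest nodes back
            for pred in preds[node]:
                dependency[pred] += paths[pred] / paths[node] * (1 + dependency[node])
            if node != source:
                score[node] += dependency[node]
    # Every unordered pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical graph: d1 and d2 are hubs, d3 is shared by accounts a3 and a4.
GRAPH = {
    "d1": {"a1", "a2", "a3"}, "d2": {"a4", "a5", "a6"}, "d3": {"a3", "a4"},
    "a1": {"d1"}, "a2": {"d1"}, "a3": {"d1", "d3"},
    "a4": {"d2", "d3"}, "a5": {"d2"}, "a6": {"d2"},
}
print(degree_centrality(GRAPH)["d1"], betweenness_centrality(GRAPH)["d3"])
```

In this sample, `d1` and `d2` stand out by degree, while `d3` has low degree but high betweenness because every shortest path between the two clusters passes through it, which is the role the text assigns to a connecting device.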
In one possible embodiment, the attribute information of the device to be identified satisfying a preset rule includes at least one of the following:
the number of login accounts corresponding to the central device is greater than a first preset number; or,
among the login accounts corresponding to the central device, the ratio of registered accounts to unregistered accounts is greater than a first preset ratio value; or,
among the login accounts corresponding to the central device, the ratio of anchor-type accounts to shooter-type accounts is greater than a second preset ratio value; or,
among the login accounts corresponding to the central device, the number of historical abnormal records is greater than a second preset number; or,
the connecting device is on the shortest path between at least two central devices.
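The rule check above can be sketched as a function that reports which rules a device satisfies. The threshold values, the `CentralDeviceStats` fields, the rule names and the zero-denominator handling are all assumptions chosen for illustration; the text only says the thresholds are preset.

```python
from dataclasses import dataclass


@dataclass
class CentralDeviceStats:
    login_accounts: int
    registered_accounts: int
    unregistered_accounts: int
    anchor_accounts: int
    shooter_accounts: int
    historical_abnormal_records: int


# Hypothetical preset thresholds; real values would be tuned by the operator.
THRESHOLDS = {"first_number": 10, "first_ratio": 5.0, "second_ratio": 3.0, "second_number": 2}


def ratio(numerator, denominator):
    """Ratio that is infinite for a zero denominator, or 0.0 when both counts are zero."""
    if denominator == 0:
        return float("inf") if numerator else 0.0
    return numerator / denominator


def satisfied_rules(stats, on_shortest_path=False, t=THRESHOLDS):
    """Return the names of the preset rules that the device satisfies."""
    checks = {
        "many_login_accounts": stats.login_accounts > t["first_number"],
        "registered_ratio": ratio(stats.registered_accounts, stats.unregistered_accounts) > t["first_ratio"],
        "anchor_ratio": ratio(stats.anchor_accounts, stats.shooter_accounts) > t["second_ratio"],
        "abnormal_history": stats.historical_abnormal_records > t["second_number"],
        "connects_central_devices": on_shortest_path,
    }
    return [name for name, hit in checks.items() if hit]


hits = satisfied_rules(CentralDeviceStats(12, 9, 1, 1, 4, 0))
print(hits, "abnormal" if hits else "normal")
```

A device is treated as abnormal when the returned list is non-empty, and the length of the list can serve as the count of satisfied rules.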
In one possible embodiment, the apparatus further comprises a query display module 105 for:
receiving an account to be queried, performing a relation graph path query in the graph database with the account to be queried as the graph center, determining the account information and device information associated with the account to be queried, and displaying the account information and the device information.
In one possible embodiment, the apparatus further comprises a security verification module 106 for:
after the device to be identified is determined to be an abnormal device, adding the device identifier and the abnormality level of the abnormal device to a cache;
when the device identifier of a login device is detected to be consistent with the device identifier of the abnormal device, determining a corresponding verification policy according to the abnormality level;
and verifying the login device based on the verification policy.
Fig. 7 is a schematic structural diagram of an abnormal data identification apparatus according to an embodiment of the present invention, as shown in fig. 7, the apparatus includes a processor 201, a memory 202, an input device 203, and an output device 204; the number of the processors 201 in the device may be one or more, and one processor 201 is taken as an example in fig. 7; the processor 201, the memory 202, the input device 203 and the output device 204 in the apparatus may be connected by a bus or other means, and fig. 7 illustrates the example of connection by a bus. The memory 202 is a computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the abnormal data identification method in the embodiment of the present invention. The processor 201 executes various functional applications of the device and data processing by executing software programs, instructions, and modules stored in the memory 202, that is, implements the above-described abnormal data identification method. The input device 203 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function controls of the apparatus. The output device 204 may include a display device such as a display screen.
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, where the computer-executable instructions are executed by a computer processor to perform the method for identifying abnormal data described in the foregoing embodiment, and the method specifically includes:
acquiring structured data and relation data stored in a graph database, wherein the structured data and the relation data comprise account information, equipment information and associated attribute information;
calculating and determining the equipment to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data;
and if the attribute information of the device to be identified meets a preset rule, determining that the device to be identified is an abnormal device.
It should be noted that, in the embodiment of the abnormal data identification apparatus, the included units and modules are divided only according to functional logic; the division is not limited to the above as long as the corresponding functions can be implemented. In addition, the specific names of the functional units are only for convenience of distinguishing them from each other and are not intended to limit the protection scope of the embodiments of the invention.
It should be noted that the foregoing is only a preferred embodiment of the present invention and the technical principles applied. Those skilled in the art will appreciate that the embodiments of the present invention are not limited to the specific embodiments described herein, and that various obvious changes, adaptations, and substitutions are possible, without departing from the scope of the embodiments of the present invention. Therefore, although the embodiments of the present invention have been described in more detail through the above embodiments, the embodiments of the present invention are not limited to the above embodiments, and many other equivalent embodiments may be included without departing from the concept of the embodiments of the present invention, and the scope of the embodiments of the present invention is determined by the scope of the appended claims.

Claims (10)

1. An abnormal data identification method, comprising:
acquiring structured data and relationship data stored in a graph database, wherein the structured data and the relationship data comprise account information, device information and associated attribute information;
calculating and determining a device to be identified through a preset detection algorithm according to determined abnormal account information, the structured data and the relationship data; and
if attribute information of the device to be identified satisfies a preset rule, determining that the device to be identified is an abnormal device.

2. The abnormal data identification method according to claim 1, wherein before acquiring the structured data and the relationship data stored in the graph database, the method further comprises:
acquiring login data stored in an original database, wherein the login data is relational table data and comprises account information, device information and associated attribute information;
taking the account information and the device information as nodes, selecting node attribute information from the attribute information as node attributes, and generating structured data to be stored in the graph database; and
taking the account information and the device information as nodes, selecting edge attribute information from the attribute information as edge attributes, and generating relationship data to be stored in the graph database.

3. The abnormal data identification method according to claim 1, wherein calculating and determining the device to be identified through a preset detection algorithm according to the determined abnormal account information, the structured data and the relationship data comprises:
determining data to be identified at a preset level according to the determined abnormal account information, the structured data and the relationship data; and
calculating the data to be identified through the preset detection algorithm to determine the device to be identified.

4. The abnormal data identification method according to claim 3, wherein calculating the data to be identified through the preset detection algorithm to determine the device to be identified comprises:
calculating the data to be identified through a degree centrality algorithm to determine a central device; and
calculating the data to be identified through a betweenness (medium) centrality algorithm to determine a connecting device.

5. The abnormal data identification method according to claim 4, wherein the attribute information of the device to be identified satisfying a preset rule comprises at least one of the following:
the number of login accounts corresponding to the central device is greater than a first preset number; or,
among the login accounts corresponding to the central device, the ratio of registered accounts to unregistered accounts is greater than a first preset ratio value; or,
among the login accounts corresponding to the central device, the ratio of anchor-type accounts to shooter-type accounts is greater than a second preset ratio value; or,
among the login accounts corresponding to the central device, the number of historical abnormal records is greater than a second preset number; or,
the connecting device is on the shortest path between at least two central devices.

6. The abnormal data identification method according to any one of claims 1-5, further comprising:
receiving an account to be queried, performing a relation graph path query in the graph database with the account to be queried as the graph center, and determining and displaying account information and device information associated with the account to be queried.

7. The abnormal data identification method according to any one of claims 1-5, wherein after determining that the device to be identified is an abnormal device, the method further comprises:
adding a device identifier and an abnormality level of the abnormal device to a cache;
when a device identifier of a login device is detected to be consistent with the device identifier of the abnormal device, determining a corresponding verification policy according to the abnormality level; and
verifying the login device based on the verification policy.

8. An abnormal data identification apparatus, comprising:
a data acquisition module, configured to acquire structured data and relationship data stored in a graph database, wherein the structured data and the relationship data comprise account information, device information and associated attribute information;
a to-be-identified device determining module, configured to calculate and determine a device to be identified through a preset detection algorithm according to determined abnormal account information, the structured data and the relationship data; and
an abnormal device determining module, configured to determine that the device to be identified is an abnormal device if attribute information of the device to be identified satisfies a preset rule.

9. An abnormal data identification device, comprising: one or more processors; and a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the abnormal data identification method according to any one of claims 1-7.

10. A storage medium storing computer-executable instructions, wherein the computer-executable instructions, when executed by a computer processor, are used to perform the abnormal data identification method according to any one of claims 1-7.
CN202111403923.8A 2021-11-24 2021-11-24 Abnormal data identification method, device, equipment and storage medium Pending CN114154166A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111403923.8A CN114154166A (en) 2021-11-24 2021-11-24 Abnormal data identification method, device, equipment and storage medium
PCT/CN2022/132864 WO2023093638A1 (en) 2021-11-24 2022-11-18 Abnormal data identification method and apparatus, and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111403923.8A CN114154166A (en) 2021-11-24 2021-11-24 Abnormal data identification method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114154166A true CN114154166A (en) 2022-03-08

Family

ID=80457612

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111403923.8A Pending CN114154166A (en) 2021-11-24 2021-11-24 Abnormal data identification method, device, equipment and storage medium

Country Status (2)

Country Link
CN (1) CN114154166A (en)
WO (1) WO2023093638A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023093638A1 (en) * 2021-11-24 2023-06-01 百果园技术(新加坡)有限公司 Abnormal data identification method and apparatus, and device and storage medium
CN120285454A (en) * 2025-06-12 2025-07-11 脉景(杭州)健康管理有限公司 A human brain transcranial stimulation circuit and a human brain transcranial stimulation method

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116823511B (en) * 2023-08-30 2024-01-09 北京中科心研科技有限公司 Method and device for identifying social isolation state of user and wearable device
CN118885951A (en) * 2024-10-08 2024-11-01 北京芯盾时代科技有限公司 Abnormal account detection method and detection system
CN119829389B (en) * 2024-12-05 2025-10-31 中信银行股份有限公司 Operation and maintenance change influence assessment method and system based on knowledge graph

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110784470A (en) * 2019-10-30 2020-02-11 上海观安信息技术股份有限公司 Method and device for determining abnormal login of user
CN112487210A (en) * 2020-12-14 2021-03-12 每日互动股份有限公司 Abnormal device identification method, electronic device, and medium
CN113378899A (en) * 2021-05-28 2021-09-10 百果园技术(新加坡)有限公司 Abnormal account identification method, device, equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110839003A (en) * 2018-08-16 2020-02-25 北京嘀嘀无限科技发展有限公司 Method and device for identifying number stealing behavior, computer equipment and storage medium
US10778706B1 (en) * 2020-01-10 2020-09-15 Capital One Services, Llc Fraud detection using graph databases
CN114154166A (en) * 2021-11-24 2022-03-08 百果园技术(新加坡)有限公司 Abnormal data identification method, device, equipment and storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023093638A1 (en) * 2021-11-24 2023-06-01 百果园技术(新加坡)有限公司 Abnormal data identification method and apparatus, and device and storage medium
CN120285454A (en) * 2025-06-12 2025-07-11 脉景(杭州)健康管理有限公司 A human brain transcranial stimulation circuit and a human brain transcranial stimulation method
CN120285454B (en) * 2025-06-12 2025-08-22 脉景(杭州)健康管理有限公司 A human brain transcranial stimulation circuit

Also Published As

Publication number Publication date
WO2023093638A1 (en) 2023-06-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination