US20220269784A1 - N-dimensional model techniques and architectures for data protection - Google Patents
- Publication number
- US20220269784A1 (application Ser. No. 17/185,884)
- Authority
- US
- United States
- Prior art keywords
- data
- points
- bits
- control circuitry
- dimensional model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/44—Program or device authentication
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/09—Supervised learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/03—Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
- G06F2221/033—Test or assess software
Definitions
- Anti-malware tools are implemented to prevent, detect, and remove malware that threatens computing devices. These tools use pattern matching, heuristic analysis, behavioral analysis, or hash matching to identify malware. Although these techniques provide some level of security, the anti-malware tools are slow to adapt to changing malware, reliant on humans to flag or verify malware, slow to process data, and require exact matches between data and pre-flagged malware. This often leaves computing devices exposed to malware for relatively long periods of time, causing various undesirable issues.
- FIG. 1 illustrates an example architecture in which the techniques described herein may be implemented.
- FIG. 2 illustrates an example process of converting data to an n-dimensional point representation in accordance with one or more embodiments.
- FIG. 3 illustrates an example process to train an analysis model in accordance with one or more embodiments.
- FIG. 4 illustrates an example process to produce one or more n-dimensional representations in accordance with one or more embodiments.
- FIG. 5 illustrates an example process to process an n-dimensional representation using an analysis model in accordance with one or more embodiments.
- FIGS. 6A-6B illustrate an example process to generate one or more n-dimensional representations and analyze the one or more n-dimensional representations in accordance with one or more embodiments.
- FIG. 7 illustrates an example process to process analysis data regarding a target property and determine one or more characteristics about the target property in accordance with one or more embodiments.
- FIG. 8 illustrates an example process to generate one or more n-dimensional representations for data associated with one or more target properties in accordance with one or more embodiments.
- the techniques and architectures may receive data of any type and process the data at a bit or byte level to generate one or more n-dimensional representations for the data.
- the techniques and architectures may represent groups of bits within the data as points within a coordinate system, with a set of bits within a group of bits representing a coordinate for a point.
- the techniques and architectures may use the points as the n-dimensional representation and/or generate a model or another representation based on the points (e.g., a mesh, wireframe, etc.).
- the n-dimensional representation may be generated to include one or more of the points and/or a model or other representation for one or more of the points.
- the n-dimensional representation may represent a data signature for the data.
- the points within the coordinate space are analyzed to generate multiple n-dimensional representations (e.g., identify multiple sets of points and generate a model for each set of points).
- the techniques and architectures may evaluate an n-dimensional representation based on one or more analysis representations that have been tagged as being associated with a target property (e.g., threat, interruption, nuisance, etc.), such as malware, vulnerability, or another security-related issue.
- a two-dimensional (2D) or three-dimensional (3D) model representing a portion of the data may be compared to 2D or 3D models that have been previously tagged as being associated with malicious data. If the data model is substantially similar to one or more of the malicious models, a threat or potential threat may be detected.
- the data model and/or data model within the coordinate system may be analyzed to determine an actual threat, a type of a threat, a source of a threat (e.g., an entity that generated the threat/data), and so on. Further, various operations may be performed to address a target property, such as removing a threat, ensuring that the threat is not associated with the data, providing a notification/message regarding the threat, or another operation.
- the techniques and architectures discussed herein may provide various security measures to efficiently and/or accurately detect target properties for data (e.g., threats).
- the techniques and architectures may represent data in an n-dimensional representation and process the n-dimensional representation with a model that efficiently and/or accurately detects various types of threats to the data, such as malware or other malicious data.
- any type of data may be processed (e.g., the techniques and architectures are agnostic to data type, environment type, etc.).
- the techniques and architectures may be implemented for various types of data, such as file system data, network traffic data, runtime data, non-image-based data, data stored in volatile memory, data stored in non-volatile memory, behavioral data, and so on, and/or implemented for various environments, such as different operating systems, platforms, and so on.
- the techniques and architectures may detect target properties by processing just a portion of data (e.g., a portion of a file, etc.), which may further increase the efficiency of the techniques and architectures.
- the techniques and architectures may detect target properties without human involvement.
- the techniques and architectures may efficiently utilize computing resources, such as by comparing a data model to target models to identify potential threats, interruptions, nuisances, etc., which may be relatively faster and/or require fewer computational resources in comparison to other solutions.
- a target property can refer to/include malicious behavior (e.g., malicious data intended to damage an environment/system/device), benign behavior (e.g., data/behavior that is not malicious), a vulnerability (e.g., vulnerability data that may make an environment/system/device vulnerable to an attack), or any other security-related characteristic that may potentially pose a threat, interruption, nuisance, vulnerability, and so on.
- an n-dimensional representation may comprise a one-dimensional representation, a two-dimensional representation, a three-dimensional representation, a four-dimensional representation, and so on.
- each dimension of a representation can refer to a characteristic of data.
- a four-dimensional representation for data can include three dimensions that correspond to spatial values (e.g., to form a 3D surface model) for the data and one dimension that represents another characteristic of the data, such as any type of value, metadata, etc. that is associated with the data and/or generated from the data.
- the techniques and architectures can be implemented within a wide variety of contexts, such as industrial control systems, network traffic, physical security, system memory, isolated environments, and so on.
- FIG. 1 illustrates an example architecture 100 in which the techniques described herein may be implemented.
- the architecture 100 includes one or more service providers 110 (also referred to as “the service provider 110 ,” for ease of discussion) configured to communicate with one or more interface/client devices 130 (also referred to as “the client device 130 ,” for ease of discussion) over one or more networks 140 (also referred to as “the network 140 ,” for ease of discussion).
- the service provider 110 can perform processing remotely/separately from the client device 130 and communicate with the client device 130 to facilitate such processing for the client device 130 and/or another device.
- the service provider 110 and/or the client device 130 can be configured to facilitate various functionality.
- the network 140 can include one or more network devices 145 (also referred to as “the network device 145 ,” for ease of discussion) to facilitate communication over the network 140 .
- the service provider 110 , the client device 130 , and/or the network device 145 may be configured to perform any of the techniques/functionality discussed herein, which may generally process data to detect a threat or potential threat. Although example devices are illustrated in the architecture 100 , any of such devices may be eliminated/not implemented.
- the service provider 110 may implement the techniques discussed herein without communicating with the client device 130 and/or without using the network 140 .
- the client device 130 may implement the techniques without communicating with the service provider 110 and/or without using the network 140 .
- the service provider 110 may be implemented as one or more computing devices, such as one or more servers, one or more desktop computers, one or more laptop computers, or any other type of device configured to process data.
- the one or more computing devices are configured in a cluster, data center, cloud computing environment, or a combination thereof.
- the one or more computing devices of the service provider 110 are implemented as a remote computing resource that is located remotely to the client device 130 .
- the one or more computing devices of the service provider 110 are implemented as local resources that are located locally at the client device 130 .
- the client device 130 may be implemented as one or more computing devices, such as one or more desktop computers, laptop computers, servers, smartphones, electronic reader devices, mobile handsets, personal digital assistants, portable navigation devices, portable gaming devices, tablet computers, wearable devices (e.g., a watch), portable media players, televisions, set-top boxes, computer systems in a vehicle, appliances, cameras, security systems, home-based computer systems, projectors, and so on.
- the client device 130 includes one or more input/output (I/O) components, such as one or more displays, microphones, speakers, keyboards, mice, cameras, and so on.
- the one or more displays may be configured to display data associated with certain aspects of the present disclosure.
- the one or more displays may be configured to present a graphical user interface (GUI) to facilitate operation of the client device 130 , present information associated with an evaluation of data (e.g., information indicating if a threat is detected, a type of threat detected, etc.), provide input to cause an operation to be performed to address a threat (e.g., an operation to have a threat removed, prevent a threat from being associated with and/or further corrupting data, prevent a threat from being stored with data, etc.), and so on.
- the one or more displays may include a liquid-crystal display (LCD), a light-emitting diode (LED) display, an organic LED display, a plasma display, an electronic paper display, or any other type of technology.
- the one or more displays include one or more touchscreens and/or other user input/output (I/O) devices.
- the network device 145 may include one or more routers, bridges, switches, repeaters, modems, gateways, hubs, wireless access points, servers, network interface controllers, or any other device/hardware configured to facilitate reception/transmission of data from/to another component.
- the service provider 110 , client device 130 , and/or network device 145 may include control circuitry 111 , memory 112 , and/or one or more network interfaces 113 configured to perform functionality described herein.
- the control circuitry 111 , memory 112 , and one or more network interfaces 113 are shown in blocks above the service provider 110 , client device 130 , and network device 145 . It should be understood that, in many embodiments, the service provider 110 , client device 130 , and/or network device 145 can each include separate instances of the control circuitry 111 , memory 112 , and network interface 113 .
- the service provider 110 can include its own control circuitry, data storage/memory, and/or network interface (e.g., to implement processing on the service provider 110 ), the network device 145 can include its own control circuitry, data storage/memory, and/or network interface (e.g., to implement processing on the network device 145 ), and/or the client device 130 can include its own control circuitry, data storage/memory, and/or network interface (e.g., to implement processing on the client device 130 ).
- control circuitry/memory may refer to circuitry/memory embodied in the service provider 110 , client device 130 , and/or network device 145 .
- control circuitry 111 is illustrated as a separate component from the memory 112 and network interface 113 , it should be understood that the memory 112 and/or the network interface 113 can be embodied at least in part in the control circuitry 111 .
- the control circuitry 111 can include various devices (active and/or passive), semiconductor materials and/or areas, layers, regions, and/or portions thereof, conductors, leads, vias, connections, and/or the like, wherein one or more of the memory 112 and the network interface 113 and/or portion(s) thereof can be formed and/or embodied at least in part in/by such circuitry components/devices.
- the control circuitry 111 may include one or more processors, processing circuitry, processing modules/units, chips, dies (e.g., semiconductor dies including one or more active and/or passive devices and/or connectivity circuitry), microprocessors, micro-controllers, digital signal processors (DSPs), microcomputers, central processing units (CPUs), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), programmable logic devices, state machines (e.g., hardware state machines), logic circuitry, analog circuitry, digital circuitry, application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), complex programmable logic devices (CPLDs), and/or any device that manipulates signals (analog and/or digital) based on hard coding of the circuitry and/or operational instructions.
- Control circuitry can further comprise one or more storage devices, which can be embodied in a single memory device, a plurality of memory devices, and/or embedded circuitry of a device.
- data storage can comprise read-only memory, random access memory, volatile memory, non-volatile memory, static memory, dynamic memory, flash memory, cache memory, data storage registers, and/or any device that stores digital information.
- control circuitry comprises a hardware state machine (and/or implements a software state machine), analog circuitry, digital circuitry, and/or logic circuitry, data storage device(s)/register(s) storing any associated operational instructions can be embedded within, or external to, the circuitry comprising the state machine, analog circuitry, digital circuitry, and/or logic circuitry.
- the memory 112 may include any suitable or desirable type of computer-readable media.
- one or more computer-readable media may include one or more volatile data storage devices, non-volatile data storage devices, removable data storage devices, and/or nonremovable data storage devices implemented using any technology, layout, and/or data structure(s)/protocol, including any suitable or desirable computer-readable instructions, data structures, program modules, or other data types.
- One or more computer-readable media may include, but are not limited to, phase change memory, static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store information for access by a computing device.
- the control circuitry 111 , memory 112 , and/or network interface 113 can be electrically and/or communicatively coupled using certain connectivity circuitry/devices/features, which may or may not be part of the control circuitry 111 .
- the connectivity feature(s) can include one or more printed circuit boards configured to facilitate mounting and/or interconnectivity of at least some of the various components/circuitry. In some embodiments, two or more of the components may be electrically and/or communicatively coupled to each other.
- the memory 112 may store a data selection component 114 , a representation generation component 115 , and a representation analysis component 116 , which can include executable instructions that, when executed by the control circuitry 111 , cause the control circuitry 111 to perform various operations discussed herein.
- one or more of the components 114 - 116 may include software/firmware modules.
- one or more of the components 114 - 116 may be implemented as one or more hardware logic components, such as one or more application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), application-specific standard products (ASSPs), complex programmable logic devices (CPLDs), and/or the like.
- the components 114 - 116 are illustrated as separate components. However, it should be understood that one or more of the components 114 - 116 may be implemented as any number of components.
- the data selection component 114 can be configured to select a portion of data for the representation generation component 115 and/or representation analysis component 116 to process. For example, the data selection component 114 can select a number of bits/bytes of data and/or a particular portion of the data, such as a predetermined number of bits/bytes (e.g., 1500 bits/bytes, 15,000 bits/bytes, 500 bits/bytes, and so on), header/footer/body data, metadata, a particular number of bits/bytes within a particular portion of the data, and so on.
- the data selection component 114 can determine a type of the data (e.g., file system data, network traffic data, runtime data, non-image-based data, data stored in volatile memory, data stored in non-volatile memory, behavioral data, and so on) and select a particular portion of the data and/or a number of bits/bytes based on the type of data. For instance, it may be determined through machine learning or other techniques that evaluating a particular section of data (e.g., a header, a footer, a section of a payload, etc.) for a particular type of data accurately detects any threats associated with the type of data by more than a threshold (e.g., 99% of the time).
- the data selection component 114 may select the particular section within each piece of data (e.g., file) and refrain from selecting other sections of the piece of data. Further, in examples, the data selection component 114 can analyze the data to generate entropy data indicating a randomness of one or more portions of data and select a particular portion of the data and/or a number of bits/bytes based on the entropy data.
- the entropy data may indicate a randomness of a portion of the data relative to other portions of the data and/or a threshold. In some instances, a portion of data that is selected is a most/least random portion and/or has a randomness value above/below a threshold. In some instances, a Shannon entropy algorithm is implemented.
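The entropy-based selection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function names and the fixed portion size are hypothetical, and only the use of a Shannon entropy algorithm is taken from the text.

```python
import math
from collections import Counter

def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    # Sum -p * log2(p) over the observed byte frequencies.
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(chunk).values())

def select_most_random_portion(data: bytes, portion_size: int = 1500) -> bytes:
    """Split data into fixed-size portions and return the one with the
    highest entropy (the 'most random' portion discussed above)."""
    portions = [data[i:i + portion_size]
                for i in range(0, len(data), portion_size)]
    return max(portions, key=shannon_entropy)
```

A low-entropy portion (e.g., all zero bytes) scores 0.0, while a portion with a near-uniform byte distribution scores close to 8.0; a threshold comparison against such values is one way to realize the randomness-based selection.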
- the representation generation component 115 may generally be configured to process/analyze data to generate an n-dimensional representation of the data. For example, the representation generation component 115 may retrieve/receive data 150 from a component/device/system and process (e.g., parse) the data 150 in groups of bits to determine points for a coordinate system. Each group of bits may include one or more sets of bits that represent one or more coordinates, respectively. For example, the representation generation component 115 may extract three bytes of data (e.g., a group of bits) and represent each byte (e.g., set of bits) with a coordinate for a point.
- the representation generation component 115 can convert each byte into a coordinate value for a coordinate system (e.g., a value from 0 to 255).
- a first byte in a group of bits may represent an x-coordinate (e.g., x-value from 0 to 255 on a coordinate system)
- a second byte in the group of bits may represent a y-coordinate for the point (e.g., y-value from 0 to 255 on the coordinate system)
- a third byte in the group of bits may represent a z-coordinate for the point (e.g., z-value from 0 to 255 on the coordinate system).
- the representation generation component 115 may process any number of bits in the data 150 to determine any number of points for the data 150 . Although some examples are discussed herein in the context of three bytes representing a group of bits and a byte representing a set of bits, a group of bits and/or a set of bits may include any number of bits or bytes.
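The byte-to-point conversion described above can be sketched as follows, assuming the three-bytes-per-point arrangement used in the examples; the function name and the choice to drop a trailing partial group are assumptions introduced for illustration.

```python
def bytes_to_points(data: bytes, group_size: int = 3) -> list[tuple[int, ...]]:
    """Parse data in groups of bytes; each byte (set of bits) in a group
    becomes one coordinate value from 0 to 255, so with group_size=3 every
    three consecutive bytes yield an (x, y, z) point. A trailing partial
    group is dropped."""
    return [
        tuple(data[i:i + group_size])  # iterating bytes yields ints 0-255
        for i in range(0, len(data) - group_size + 1, group_size)
    ]
```

For example, the six bytes `01 02 03 FF 00 10` become the two points (1, 2, 3) and (255, 0, 16), which can then be positioned within a 3D coordinate system.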
- the representation generation component 115 may generate an n-dimensional representation based on coordinates of points. For example, the representation generation component 115 can position each point within a coordinate system using one or more coordinates for the point (e.g., position a point based on an x-coordinate value, y-coordinate value, and z-coordinate value). In some embodiments, the points produced by such process form an n-dimensional representation (e.g., a point cloud), such as a 3D point representation 151 illustrated in FIG. 1 . Further, in some embodiments, the points produced by such process may be used to form an n-dimensional representation. For instance, the representation generation component 115 may use a pattern recognition algorithm 117 to identify a set of points that are associated with particular characteristic(s).
- Such pattern recognition algorithm 117 can generally seek to identify points that are within a particular distance from each other, positioned on a virtual surface/plane, and/or otherwise include characteristics that may indicate that the set of points may form a surface.
- the representation generation component 115 can generate an n-dimensional representation based on the set of points, such as a 3D model 152 illustrated in FIG. 1 .
- a model is a polygon mesh that includes one or more vertices, edges, faces, polygons, surfaces, and so on.
- a model is a wire-frame model that includes one or more vertices, edges, and so on.
- the representation generation component 115 can generate other types of n-dimensional representations, such as an n-dimensional map. An example process of generating an n-dimensional point representation is illustrated and discussed in reference to FIG. 2 .
- the representation generation component 115 generates multiple models/representations for different sets of points within data.
- the pattern recognition algorithm 117 can identify different sets of points, and the representation generation component 115 can generate a model for each set of points, resulting in multiple models within a coordinate space/system.
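The pattern recognition step can be illustrated with a simple proximity grouping. The patent does not specify the algorithm 117 ; distance-based grouping is one plausible reading of identifying "points that are within a particular distance from each other," and the function name and default distance here are hypothetical.

```python
from collections import deque

def group_nearby_points(points, max_dist=16.0):
    """Group points whose Euclidean distance is under max_dist, a minimal
    stand-in for the pattern recognition step: each returned set of points
    is a candidate for its own model (e.g., mesh, wireframe)."""
    def close(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5 < max_dist

    unvisited = list(points)
    groups = []
    while unvisited:
        # Breadth-first expansion from a seed point to all reachable neighbors.
        queue = deque([unvisited.pop()])
        group = []
        while queue:
            p = queue.popleft()
            group.append(p)
            neighbors = [q for q in unvisited if close(p, q)]
            for q in neighbors:
                unvisited.remove(q)
            queue.extend(neighbors)
        groups.append(group)
    return groups
```

Each resulting group could then be passed to a surface/mesh generation step, so that multiple models are produced within one coordinate space.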
- the representation generation component 115 processes data that is selected by the data selection component 114 .
- the representation generation component 115 can generate an n-dimensional representation for a particular portion of data that is selected by the data selection component 114 .
- the data 150 includes a plurality of units of data, such as a plurality of files, and the representation generation component 115 generates an n-dimensional representation for each of the units of data.
- An n-dimensional representation such as the n-dimensional representation 151 or the n-dimensional representation 152 , may include a variety of representations, such as an n-dimensional point cloud or other plurality of points, an n-dimensional map, an n-dimensional model (e.g., mesh model, wireframe model, etc.), and so on.
- the term “n” may represent any integer.
- an n-dimensional representation may include surfaces.
- an n-dimensional representation may be visualized by a human, while in other embodiments an n-dimensional representation may not be able to be visualized by a human.
- data representing an n-dimensional representation may be stored in an array, matrix, list, or any other data structure.
- an n-dimensional representation is stored as a data signature 118 in a data signature(s) data store.
- a data signature for a piece of data can be points or one or more models for the piece of data generated by the representation generation component 115 .
- An n-dimensional representation may be represented within a coordinate system.
- a coordinate system may include a number line, a cartesian coordinate system, a polar coordinate system, a homogeneous coordinate system, a cylindrical or spherical coordinate system, etc.
- the techniques and architectures may generate a representation of any number of dimensions and/or a representation may be represented in any type of coordinate system.
- the representation generation component 115 generates multiple representations for the same data (e.g., a unit of data, such as a file). In some examples, the representation generation component 115 may generate a two-dimensional representation for data and generate a three-dimensional representation for the same data. Further, in some examples, the representation generation component 115 may generate a three-dimensional representation for data using a process that represents three bytes of continuous bits as an x-coordinate, a y-coordinate, and a z-coordinate, in that order.
- the representation generation component 115 may also generate a three-dimensional representation for the same data using a process that represents three bytes of continuous bits as a y-coordinate, a z-coordinate, and an x-coordinate, in that order.
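The multiple-ordering approach above can be sketched as follows. The `order` string convention is an assumption introduced for illustration: the first byte of each three-byte group is assigned to the first axis named, the second byte to the second axis, and so on.

```python
def bytes_to_points_ordered(data: bytes, order: str = "xyz"):
    """Map each group of three consecutive bytes to a 3D point, assigning
    bytes to axes in the given order ('xyz', 'yzx', etc.). Different
    orderings yield different signatures for the same data."""
    axis = {a: i for i, a in enumerate(order)}  # e.g. 'yzx' -> {'y':0,'z':1,'x':2}
    points = []
    for i in range(0, len(data) - 2, 3):
        group = data[i:i + 3]
        # group[axis[a]] is the byte assigned to axis a; emit (x, y, z).
        points.append((group[axis["x"]], group[axis["y"]], group[axis["z"]]))
    return points
```

For the bytes `01 02 03`, the "xyz" ordering yields the point (1, 2, 3), while the "yzx" ordering yields (3, 1, 2), giving two distinct representations of the same data for layered evaluation.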
- representing data with multiple representations may be useful to provide multiple layers of evaluation of the data (e.g., when evaluating the data with the representation analysis component 116 to detect any threats).
- the representation generation component 115 may generate multiple representations for data using different coordinate systems and/or different manners of processing the data.
- the representation generation component 115 and/or the representation analysis component 116 processes a portion of data while refraining from processing another portion of the data (or at least initially refraining from processing the other portion).
- the representation generation component 115 may process a predetermined number of bytes of each file, such as a first 1500 bytes of each file, a second 1500 bytes of each file, or a last 1500 bytes of each file, to generate an n-dimensional representation for the file.
- an initial portion of data (e.g., a file) may include a header that designates execution points within the data.
- the representation generation component 115 may efficiently process data by generating an n-dimensional representation based on just the data within the header. In some instances, the representation generation component 115 processes just a portion of data that is selected by the data selection component 114 . However, any portions of data may be processed.
- Data, such as the data 150 , may be a variety of types of data, such as audio data, video data, text data (e.g., text files, email, etc.), binary data (e.g., binary files), image data, network traffic data (e.g., data protocol units exchanged over a network, such as segments, packets, frames, etc.), file system data (e.g., files), runtime data (e.g., data generated during runtime of an application, which may be stored in volatile memory), data stored in volatile memory, data stored in non-volatile memory, application data (e.g., executable data for one or more applications), data associated with an isolated environment (e.g., data generated or otherwise associated with a virtual machine, data generated or otherwise associated with a trusted execution environment, data generated or otherwise associated with an isolated cloud service, etc.), metadata, behavioral data (e.g., data describing behaviors taken by a program during runtime), location data (e.g., geographical/physical location data of a device, user, etc.), quality
- Data may be formatted in a variety of manners and/or according to a variety of standards.
- data includes a header, payload, and/or footer section.
- Data may include multiple pieces of data (e.g., multiple files or other units of data) or a single piece of data (e.g., a single file or another unit of data).
- data includes non-image-based data, such as data that is not initially intended/formatted to be represented within a coordinate system (e.g., not stored in a format that is intended for display).
- image-based data may generally be intended/formatted for display, such as images, 2D models, 3D models, point cloud data, and so on.
- a type of data may be defined by or based on a format of the data, a use of the data, an environment in which the data is stored or used (e.g., an operating system, device platform, etc.), a device that generated the data, a size of the data, an age of the data (e.g., when the data was created), and so on.
- the representation analysis component 116 may be configured to analyze an n-dimensional representation, such as the n-dimensional point representation 151 or the n-dimensional model representation 152 .
- the representation analysis component 116 may generally use an analysis model(s) 119 stored in an analysis model data store.
- the analysis model(s) 119 can include one or more machine/human-trained models and/or other types of models, which can implement techniques/algorithms for detecting a threat(s).
- the one or more analysis models 119 may include models configured for different types of data, different coordinate systems, different types of n-dimensional representations, and so on.
- the representation analysis component 116 can use the one or more analysis models 119 to process an n-dimensional representation (generated by the representation generation component 115 ) to generate a confidence value/data indicating a likelihood that an n-dimensional representation includes malicious data. In examples, the representation analysis component 116 can determine if an n-dimensional representation includes malicious data (e.g., if a confidence value is above a threshold).
- the representation analysis component 116 is configured to compare an n-dimensional representation to one or more n-dimensional representations that have been tagged as malicious. For example, a 2D or 3D model for data can be compared to 2D or 3D models for malicious code to determine a similarity of the 2D or 3D data model to the 2D or 3D malicious code models.
- the representation analysis component 116 can be configured to compare a similarity between surfaces, edges, volume, area, and/or any other characteristic of a model.
- the representation analysis component 116 can generate a confidence/similarity value indicating a similarity of the 2D or 3D data model to the 2D or 3D malicious data models.
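As a toy stand-in for such model comparison (the specific characteristics compared are not limited here; function names and the centroid-based measure are hypothetical), a confidence/similarity value could be derived from simple geometric summaries of two point models:

```python
import math

def centroid(points):
    """Mean position of a 3-D point set."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def similarity(points_a, points_b):
    """Toy similarity in (0, 1]: 1.0 for identical centroids, decaying
    as the centroids move apart. A real comparison might instead weigh
    surfaces, edges, volume, area, or other model characteristics."""
    dist = math.dist(centroid(points_a), centroid(points_b))
    return 1.0 / (1.0 + dist)
```

A data model compared this way against a library of malicious-code models would yield one similarity value per comparison, which could then be thresholded.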
- the representation analysis component 116 includes an Artificial Intelligence (AI) component 120 configured to train a model to create a machine-trained model that is configured to analyze an n-dimensional representation to detect a threat.
- the AI component 120 may analyze training data 121 from a training data store that includes one or more n-dimensional representations that are tagged as being associated with a threat (e.g., malicious code) and/or one or more n-dimensional representations that are tagged as being threat free (e.g., not associated with a threat).
- An n-dimensional representation may be tagged (e.g., categorized) by a user and/or a system.
- the AI component 120 may analyze the training data 121 to generate one or more machine-trained models, such as one or more artificial neural networks or another Artificial Intelligence model.
- the AI component 120 may store the one or more machine-trained models within the data store for the analysis model(s) 119 .
- the AI component 120 may learn one or more characteristics that are associated with an n-dimensional representation(s) of malicious data and train a machine-trained model to detect such one or more characteristics. For example, the AI component 120 may use pattern recognition, feature detection, shape/surface detection, and/or a spatial analysis to identify one or more characteristics and/or patterns of one or more characteristics.
- a characteristic may include: a spatial feature (e.g., a computer vision/image processing feature, such as edges, corners (interest points), blobs (regions of interest points), ridges, etc.), a feature of an n-dimensional representation, a marker of an n-dimensional representation, a number of models that may generally be associated with malicious data (e.g., an average/greatest/smallest number of models within a coordinate system for malicious data), a relationship between models (within a coordinate system) that are associated with malicious data (e.g., an average/longest/shortest distance between malicious data models), a shape of a model(s) that is associated with malicious data (e.g., a type of shape), a size of a model(s) that is associated with malicious data (e.g., an average/largest/smallest size of an average model), a volume of a model(s) that is associated with malicious data (e.g., an average/largest/smallest size of an average model),
- the AI component 120 may train one or more models for different types of threats. For example, a model may be trained to detect/identify malware, a particular type of malware (e.g., a virus, spyware, ransomware, polymorphic malware, a particular type of virus, a particular type of spyware, a particular type of ransomware, a particular type of polymorphic malware, etc.), and so on.
- the AI component 120 may learn that a particular characteristic (e.g., feature) in an n-dimensional representation is associated with a virus or a particular type of virus and train a model to detect the particular characteristic and/or to identify the particular characteristic as being associated with the virus or the particular type of virus.
- the AI component 120 may train a first model to detect/identify a first type of threat and train a second model to detect/identify a second type of threat.
- the AI component 120 may be configured to process an n-dimensional representation with a machine-trained model(s) or any other model.
- the AI component 120 may receive the n-dimensional representation 151 / 152 from the representation generation component 115 and process the n-dimensional representation 151 / 152 with a machine-trained model to identify any threats associated with the n-dimensional representation 151 / 152 .
- the AI component 120 may identify a type of threat associated with an n-dimensional representation, such as malware, a particular type of malware (e.g., a virus, spyware, ransomware, polymorphic malware, a particular type of virus, a particular type of spyware, a particular type of ransomware, a particular type of polymorphic malware, etc.), and so on.
- the processing includes pattern recognition, feature detection, and/or a spatial analysis, which may include identifying one or more characteristics (e.g., features) within an n-dimensional representation.
- the representation analysis component 116 may be configured to use different models to analyze one or more n-dimensional representations.
- the representation analysis component 116 may process an n-dimensional representation with a first model and process the n-dimensional representation with a second model.
- the representation analysis component 116 may detect a threat if either analysis detects a threat (e.g., either one of the confidence values is above a threshold). Further, in another example, the representation analysis component 116 can process an n-dimensional representation a first time with a first model.
- the representation analysis component 116 can process the n-dimensional representation (or a portion thereof) a second time with a second model.
- the representation analysis component 116 may detect a threat if a confidence value from the second model satisfies one or more criteria (e.g., is above a threshold).
- the second model may require more (or less) computational resources, time, etc.
- the representation analysis component 116 can use a multiple layered approach to process an n-dimensional representation(s), wherein each layer can be associated with a different model.
- the representation analysis component 116 may provide more accurate results regarding any potential threats. However, processing an n-dimensional representation once may be sufficient or just as accurate in many instances.
- a threat may include malware, phishing, a rootkit, a bootkit, a logic bomb, a backdoor, a screen scraper, a physical threat (e.g., an access point without security measures, such as leaving a door open, etc.), and so on.
- Malware may include a virus, spyware, adware, a worm, a Trojan horse, scareware, ransomware, polymorphic malware, and so on.
- a threat may result from any data, software, or other component that has malicious intent.
- the representation analysis component 116 may detect a physical threat associated with data.
- the representation generation component 115 may process data representing a physical environment, such as images of the interior or exterior of a building and generate an n-dimensional representation for the data.
- the representation analysis component 116 may process the n-dimensional representation to identify a potential threat, such as an access point that may potentially be at risk of a break-in due to reduced security features at the access point.
- the representation analysis component 116 may be configured to detect a variety of other types of threats.
- the representation analysis component 116 may be configured to provide a variety of types of output regarding processing of an n-dimensional representation. For example, based on processing an n-dimensional representation with the one or more analysis models 119 , the representation analysis component 116 may determine if the n-dimensional representation is associated with any threats, the types of threats (if any), where a threat is located in the data, and/or a source of the threat (e.g., a content creator that generated the threat, an entity involved in distributing the threat, etc.).
- the representation analysis component 116 may generate information (e.g., a report, notification, a threat rating, signal, etc.) indicating if a threat was detected, a type of threat that was detected, a confidence value of a detected threat (e.g., a rating on a scale of 1 to 10 of a confidence that data includes a threat, with 10 (or 1) being the highest confidence that the data includes a threat), where a threat is located in data, a source of a threat, and so on.
- the representation analysis component 116 may provide the information to the client device 130 (e.g., in a message), which may display the information via a user interface and/or another manner.
- a user may view information provided via the user interface and/or cause an operation to be performed, such as having a threat removed from the data, replacing the malicious data with other data, preventing a threat from further corrupting the data, preventing a threat from being stored with the data, and so on. Further, in some examples, the representation analysis component 116 may provide the information to another device/system and/or cause an operation to be performed automatically to address any threats.
- the data selection component 114 , representation generation component 115 , and/or representation analysis component 116 can be implemented in a variety of contexts across a variety of devices/systems.
- one or more of the data selection component 114 , representation generation component 115 , and representation analysis component 116 may be implemented at the service provider 110 , network device 145 , and/or client device 130 .
- one or more instances of the data selection component 114 , representation generation component 115 , and/or representation analysis component 116 are implemented at one or more of the service provider 110 , network device 145 , and the client device 130 .
- the service provider 110 can include one or more service providers implemented as one or more computing devices, which may collectively or individually implement the data selection component 114 , representation generation component 115 , and/or representation analysis component 116 .
- the functionality of the data selection component 114 , representation generation component 115 , and the representation analysis component 116 may be divided in a variety of manners across a variety of different devices/systems/components, which may or may not operate in cooperation to evaluate data.
- the data selection component 114 , representation generation component 115 , and/or representation analysis component 116 may be configured to evaluate data at any time.
- an evaluation of data is performed in response to a request by the client device 130 , such as a user providing input through the client device 130 to analyze data.
- a user may employ the client device 130 to initiate an evaluation of data and the service provider 110 may provide a message back to the client device 130 regarding the evaluation, such as information indicating whether or not a threat was detected, a type of threat detected, and so on.
- a user can include an end-user, an administrator (e.g., an Information Technology (IT) individual), or any other individual.
- an evaluation of data is performed periodically and/or in response to a non-user-based request received by the client device 130 , service provider 110 , network device 145 , and/or another device.
- an evaluation of data is performed when data is received/sent/downloaded.
- the one or more network interfaces 113 may be configured to communicate with one or more devices over a communication network.
- the one or more network interfaces 113 may send/receive data in a wireless or wired manner over one or more networks 140 , which can include one or more personal area networks (PAN), local area networks (LANs), wide area networks (WANs), Internet area networks (IANs), cellular networks, the Internet, etc.
- the one or more network interfaces 113 may implement a wireless technology such as Bluetooth, Wi-Fi, near field communication (NFC), or the like.
- the data store for the training data 121 , analysis model(s) 119 , and/or data signature(s) 118 may be associated with any entity and/or located at any location.
- a data store is associated with a first entity (e.g., company, environment, etc.) and the service provider 110 /network device(s) 145 /client device 130 is associated with a second entity that provides a service to evaluate data.
- a data store may be implemented in a cloud environment or locally at a facility to store a variety of forms of data and the service provider 110 may evaluate the data to provide information regarding security of the data, such as whether or not the data includes malicious data.
- a data store and the service provider 110 /network device(s) 145 /client device 130 are associated with a same entity and/or located at a same location.
- although various data stores are illustrated in the example of FIG. 1 as being located within the memory 112 , in some examples a data store may be included within another device/system.
- FIG. 2 illustrates an example process of converting data to an n-dimensional point representation in accordance with one or more embodiments.
- control circuitry, such as the control circuitry 111 from FIG. 1 , may process data 202 at a bit/byte level to generate an n-dimensional point representation for the data 202 .
- the control circuitry processes the data 202 in groups of bits, with each group of bits being converted to coordinates for a point.
- the control circuitry may identify a first group of bits 206 that includes three bytes of data, with each byte of data corresponding to a set of bits.
- the group of bits 206 includes a set of bits 210 (i.e., a first byte), a set of bits 212 (i.e., a second byte), and a set of bits 214 (i.e., a third byte).
- the set of bits 210 is directly adjacent to the set of bits 212 and the set of bits 212 is directly adjacent to the set of bits 214 .
- the control circuitry converts the set of bits 210 to an x-coordinate value (illustrated as “X 1 ”), the set of bits 212 to a y-coordinate value (illustrated as “Y 1 ”), and the set of bits 214 to a z-coordinate value (illustrated as “Z 1 ”).
- the control circuitry may use the coordinate values to produce a point 222 within a coordinate system 204 (e.g., position the point 222 ), as shown.
- control circuitry may identify a second group of bits 208 that includes three bytes of data, with each byte of data corresponding to a set of bits.
- the group of bits 208 includes a set of bits 216 (i.e., a first byte), a set of bits 218 (i.e., a second byte), and a set of bits 220 (i.e., a third byte).
- the set of bits 216 is directly adjacent to the set of bits 218 and the set of bits 218 is directly adjacent to the set of bits 220 .
- control circuitry may convert the set of bits 216 to an x-coordinate value (illustrated as “X 2 ”), the set of bits 218 to a y-coordinate value (illustrated as “Y 2 ”), and the set of bits 220 to a z-coordinate value (illustrated as “Z 2 ”).
- the control circuitry may use the coordinate values to create a point 224 within the coordinate system 204 , as shown.
- the control circuitry can proceed to process any number of bits (e.g., groups of bits) in the data 202 in a similar fashion to produce any number of points within the coordinate system 204 .
- the n-dimensional representation of FIG. 2 is illustrated with two points; however, the n-dimensional representation may include any number of points, such as hundreds or thousands of points. Further, although the n-dimensional representation of FIG. 2 is illustrated with points, as noted above an n-dimensional representation may include other representations.
- the data 202 represents a unit of data, such as a file, network traffic unit, etc.
- the control circuitry may perform a similar process for any number of units of data (e.g., any number of files) to generate any number of n-dimensional representations.
- although FIG. 2 is illustrated in the context of three bytes representing a group of bits and one byte representing a set of bits, a group of bits and/or a set of bits may include any number of bits or bytes.
- a group of bits may include two bytes of data (or an arbitrary number of bits, such as ten bits), with each byte (or set of five bits) being converted to a coordinate for a two-dimensional coordinate system.
- control circuitry may process data in other manners, such as by converting the first 200 bytes of data to x-coordinates, the second 200 bytes of data to y-coordinates, and the third 200 bytes of data to z-coordinates. Furthermore, control circuitry may process data in a variety of other manners.
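The byte-to-coordinate scheme of FIG. 2 can be sketched as follows (a minimal illustration, assuming one byte per coordinate and non-overlapping groups; the function name is hypothetical):

```python
def bytes_to_points(data, dims=3):
    """Convert raw bytes to n-dimensional points: each consecutive,
    non-overlapping group of `dims` bytes becomes one point, with each
    byte value (0-255) used directly as a coordinate. Trailing bytes
    that do not fill a complete group are ignored in this sketch."""
    return [tuple(data[i:i + dims])
            for i in range(0, len(data) - dims + 1, dims)]
```

For example, `bytes_to_points(bytes([1, 2, 3, 4, 5, 6]))` yields the two points `(1, 2, 3)` and `(4, 5, 6)`, analogous to points 222 and 224 ; passing `dims=2` produces coordinates for a two-dimensional coordinate system.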
- FIGS. 3, 4, 5, 6A-6B, 7, and 8 illustrate example processes 300 , 400 , 500 , 600 , 700 , and 800 respectively, in accordance with one or more embodiments.
- processes 300 , 400 , 500 , 600 , 700 , and 800 may be described as being performed in the example architecture 100 of FIG. 1 .
- one or more of the individual operations of the processes 300 , 400 , 500 , 600 , 700 , and 800 may be performed by the control circuitry 111 .
- the processes 300 , 400 , 500 , 600 , 700 , and/or 800 may be performed in other architectures.
- the architecture 100 may be used to perform other processes.
- the processes 300 , 400 , 500 , 600 , 700 , and 800 are each illustrated as a logical flow graph, each graph of which represents a sequence of operations that can be implemented in hardware, software, or a combination thereof.
- the operations represent executable instructions stored on one or more computer-readable media that, when executed by control circuitry, perform the recited operations.
- executable instructions include routines, programs, objects, components, data structures, and the like that perform particular functions or implement particular abstract data types.
- the order in which the operations are described is not intended to be construed as a limitation, and any number of the described operations can be combined in any order and/or in parallel to implement a process. Further, any number of the described operations may be omitted.
- FIG. 3 illustrates the example process 300 to train an analysis model in accordance with one or more embodiments.
- one or more first n-dimensional representations that are tagged as being associated with one or more target properties may be obtained.
- the control circuitry 111 can receive training data (e.g., one or more n-dimensional representations) that has been tagged as being associated with malware.
- the training data may have been tagged by a user, a system, or another entity.
- the training data may include one or more n-dimensional representations
- the one or more n-dimensional representations may have been generated by the control circuitry 111 by processing data at a bit or byte level, similar to various processes described herein.
- one or more second n-dimensional representations that are tagged as being free of certain target properties may be obtained.
- the control circuitry 111 may retrieve training data (e.g., one or more n-dimensional representations) that has been tagged as being free of malicious data (e.g., not associated with malware).
- the training data may have been tagged by a user, a system, or another entity.
- the training data may include one or more n-dimensional representations
- the one or more n-dimensional representations may have been generated by the control circuitry 111 by processing data at a bit or byte level, similar to various processes described herein.
- machine learning can be used to train a model based at least in part on the one or more first n-dimensional representations and/or the one or more second n-dimensional representations.
- the control circuitry 111 may analyze training data that is tagged as being associated with malware and/or training data that is tagged as being malware free and learn what information (e.g., features) is associated with malware/malicious data.
- control circuitry 111 may create a machine-trained model that is configured to detect threats/malicious data, identify types of threats/malicious data, identify sources of threats/malicious data, identify a portion(s) of data that is associated with threats/malicious data (e.g., a portion of data to analyze), and so on.
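A minimal sketch of such training, using a nearest-centroid classifier as a stand-in for the machine-trained model (the feature choice and all names here are hypothetical, not the embodiments' actual features):

```python
def features(points):
    """Crude feature vector for a 3-D point set: point count plus the
    mean of each coordinate. Real systems might use spatial features,
    shapes, surfaces, or learned representations instead."""
    n = len(points)
    means = [sum(p[i] for p in points) / n for i in range(3)]
    return [float(n)] + means

def train(malicious_reps, benign_reps):
    """'Train' by averaging the feature vectors of each tagged class."""
    def mean_vec(reps):
        vecs = [features(r) for r in reps]
        return [sum(col) / len(vecs) for col in zip(*vecs)]
    return {"malicious": mean_vec(malicious_reps),
            "benign": mean_vec(benign_reps)}

def classify(model, points):
    """Label a representation by its nearer class centroid."""
    f = features(points)
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(f, v)) ** 0.5
    return min(model, key=lambda label: dist(model[label]))
```

An artificial neural network, as described above, would replace the averaging and nearest-centroid steps with learned weights.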
- FIG. 4 illustrates the example process 400 to produce one or more n-dimensional representations in accordance with one or more embodiments.
- data can be obtained from a data source.
- the control circuitry 111 can receive or retrieve data from another device/system/component.
- the data can comprise a variety of types of data, such as file system data, non-image-based data, network traffic data, runtime data, data associated with an isolated environment, or any other data.
- a portion of the data can be selected.
- the control circuitry 111 can select a particular portion of the data, such as a particular number of bits/bytes.
- the selection is based on a type of the data (e.g., a format of the data, a use of the data, an environment in which the data is stored or used, a device that generated the data, a size of the data, an age of the data, and so on), entropy data for the data indicating a randomness of one or more portions of the data, and so on.
- the selected portion of the data may be extracted.
- the control circuitry 111 may extract a first portion of the data, such as a first predetermined number of bytes of the data, and/or refrain from extracting a second portion of the data.
- the control circuitry 111 may determine to represent the data with a particular portion of the data.
- a group of bits in the data may be identified.
- the control circuitry 111 may identify three bytes in the data as representing a group of bits.
- the control circuitry 111 may initially identify a group of bits at a start of the portion of the data.
- one or more coordinates for a point may be determined based at least in part on one or more sets of bits in the data.
- the control circuitry 111 may determine a first coordinate for a point based at least in part on a first set of bits in a group of bits, a second coordinate for the point based at least in part on a second set of bits in the group of bits, a third coordinate for the point based at least in part on a third set of bits in the group of bits, and so on.
- the first set of bits comprises a first byte
- the second set of bits comprises a second byte that is directly adjacent to the first byte
- the third set of bits comprises a third byte that is directly adjacent to the second byte.
- the sets of bits may not be directly adjacent to each other.
- the control circuitry 111 may represent a set of bits as a coordinate for a point.
- control circuitry 111 may determine if the groups of bits in the data (e.g., all groups) have been processed or a limit has been met. For example, if the control circuitry 111 has extracted a portion of the data for processing, such as a header of a file, the control circuitry 111 may determine if another group of bits exists in the portion of the data (e.g., if there exists another group of bits that has not yet been converted to a point). That is, the control circuitry 111 may determine if it has reached an end of the data (or portion of the data).
- control circuitry 111 may determine if the limit is reached (e.g., the control circuitry 111 has processed the first 1500 bytes of data).
- if so, the process 400 may proceed to operation 416 (i.e., the "YES" branch). Alternatively, if the groups of bits designated for processing have not yet all been processed and/or the limit is not reached, the process 400 may proceed to operation 414 (i.e., the "NO" branch).
- a next group of bits in the data may be designated for processing.
- the control circuitry 111 may increment to a next group of bits in the data, and then proceed to the operation 408 to identify the next group of bits in the data and to the operation 410 to determine one or more coordinates for the next group of bits.
- the process 400 may loop through operations 414 , 408 , 410 , and 412 any number of times, if needed, to process the data.
- an n-dimensional representation for the data may be generated based at least in part on one or more points.
- the control circuitry 111 may use one or more coordinates for each point to generate an n-dimensional representation for the data (or selected portion of the data).
- the n-dimensional representation may include an n-dimensional point representation (e.g., a plurality of points), an n-dimensional model representation (e.g., mesh, wireframe), an n-dimensional map, and so on.
- the n-dimensional representation may be provided for processing.
- a component of a device/system may provide the n-dimensional representation to another component of the device/system for processing with an analysis model.
- the control circuitry 111 may cause the n-dimensional representation to be processed with an analysis model that is configured to detect a threat.
- data includes multiple pieces of data (e.g., multiple files) and the process 400 is performed for each piece of data. Further, in some embodiments, the process 400 is performed multiple times for the same data to generate different types of n-dimensional representations for the data.
- FIG. 5 illustrates the example process 500 to process an n-dimensional representation using an analysis model in accordance with one or more embodiments.
- an n-dimensional representation may be processed using an analysis model.
- the control circuitry 111 may process an n-dimensional representation using a machine-trained model (e.g., an artificial neural network), a shape comparison model, and/or another model (e.g., a human-trained model).
- the control circuitry 111 may seek to identify information or features within the n-dimensional representation that are associated with one or more threats. Further, in some examples, the control circuitry 111 may compare the n-dimensional representation to n-dimensional representations associated with threats. In many instances, the control circuitry 111 may determine a confidence value/data indicating a likelihood that the n-dimensional representation is associated with a threat. In some embodiments, the control circuitry 111 may process the n-dimensional representation multiple times using different models and/or process various n-dimensional representations within a coordinate system.
- the control circuitry 111 may determine if a confidence value/data regarding a threat is greater than a threshold.
- the process 500 may proceed to operation 506 (i.e., the “YES” branch). Alternatively, if the n-dimensional representation is not associated with any target properties, the process 500 may proceed to operation 508 (i.e., the “NO” branch).
- an operation may be performed to address the target property.
- the control circuitry 111 may perform or cause to be performed a threat operation that includes removing a threat, replacing a threat (e.g., malicious data), preventing a threat from associating with data, providing information (e.g., a notification, a report, a malware rating indicating a likelihood that the data is associated with malware, etc.) to a client device regarding the threat, and so on.
- information may be provided regarding the processing.
- the control circuitry can provide information indicating that no threats are associated with the n-dimensional representation, information indicating a confidence value for the processing, and so on.
- the information can be provided in a report, notification, message, signal, etc. to a client device and/or another system/component.
- the operation 508 is additionally, or alternatively, performed in the branch for operation 506 (e.g., to provide information regarding a detected threat).
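The confidence-threshold branch of process 500 might be sketched as follows (the threshold value and action names are hypothetical, chosen only to mirror operations 504 , 506 , and 508 ):

```python
def handle_result(confidence, threshold=0.8):
    """Branch on a threat-confidence value: above the threshold, signal
    that a threat operation (e.g., remove/replace/notify) should run;
    otherwise, report information about the processing outcome."""
    if confidence > threshold:
        return {"threat": True,
                "action": "perform_threat_operation",
                "confidence": confidence}
    return {"threat": False,
            "action": "provide_information",
            "confidence": confidence}
```

Information about the outcome could be provided on either branch, consistent with the note above that operation 508 may also be performed alongside operation 506 .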
- FIGS. 6A-6B illustrate the example process 600 to generate one or more n-dimensional representations and analyze the one or more n-dimensional representations in accordance with one or more embodiments.
- data can be received.
- the control circuitry 111 can receive data from another device, system, component, and so on.
- the data is retrieved from a data store, such as a data store associated with the control circuitry 111 and/or another system.
- one or more portions of the data can be selected for processing.
- the control circuitry 111 can select a particular number of bits/bytes of the data at a particular location within the data.
- the selection is based on a type of the data (e.g., a format of the data, a use of the data, an environment in which the data is stored or used, a device that generated the data, a size of the data, an age of the data, and so on), entropy data for the data indicating a randomness of one or more portions of the data, and so on.
- control circuitry 111 can analyze the data using an entropy algorithm to generate entropy data indicating a randomness of one or more portions of the data. Based on the entropy data, the control circuitry 111 can select a particular portion of the data for processing, such as a portion that is associated with a most/least amount of randomness, a portion that is associated with a randomness value that is above/below a threshold, and so on.
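A minimal sketch of such entropy-based selection, assuming Shannon entropy over fixed non-overlapping windows (the window size and names are hypothetical):

```python
import math
from collections import Counter

def shannon_entropy(chunk):
    """Shannon entropy of a byte string, in bits per byte (0.0-8.0)."""
    counts = Counter(chunk)
    n = len(chunk)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def most_random_window(data, window=256):
    """Return (offset, entropy) of the non-overlapping window with the
    highest entropy, i.e., the most random-looking portion of the data."""
    best = (0, -1.0)
    for off in range(0, max(len(data) - window + 1, 1), window):
        h = shannon_entropy(data[off:off + window])
        if h > best[1]:
            best = (off, h)
    return best
```

Selecting the least random portion, or thresholding the entropy value, follows the same pattern with the comparison inverted or replaced.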
- At 606, at least a portion of the data can be represented as a plurality of points.
- the control circuitry 111 can represent a portion of the data that is selected at 604 as a plurality of points within a coordinate system/space.
- the control circuitry 111 can represent a first set of bits in the data as a first coordinate for a first point and a second set of bits in the data as a second coordinate for the first point.
- the second set of bits can be adjacent to the first set of bits (e.g., directly adjacent or within a particular number of bits).
- control circuitry 111 can represent a third set of bits in the data as a first coordinate for a second point and a fourth set of bits in the data as a second coordinate for the second point.
- the fourth set of bits can be adjacent to the third set of bits.
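The adjacent-set mapping above can be sketched as pairing consecutive groups of bits into coordinates. The group size (one byte) and the dimension count (two) below are hypothetical parameter choices; the description allows any group size and any number of dimensions.

```python
# Illustrative sketch of the adjacent-group mapping: each run of `dims`
# consecutive bytes becomes one point (first byte -> first coordinate,
# the adjacent byte -> second coordinate, and so on). The 8-bit group
# size and 2D default are assumptions for illustration.
def bytes_to_points(data: bytes, dims: int = 2):
    """Split data into groups of `dims` adjacent bytes; each group is a point."""
    return [
        tuple(data[i + d] for d in range(dims))
        for i in range(0, len(data) - dims + 1, dims)
    ]
```

For example, the four bytes `01 02 03 04` yield the two 2D points (1, 2) and (3, 4), so structure that repeats in the raw bytes shows up as spatial structure in the point cloud.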
- one or more points can be associated with an indicator indicating a location of bits associated with the one or more points within the data.
- the control circuitry 111 can determine a location of bits (associated with a point) within the data and associate a location indicator with the point.
- the location indicator can be visually represented within a coordinate space and/or used to generate a visual representation for a surface of a model, such as a color, contrast, brightness, and so on. This can allow a user to view a location of a point and/or a location of surfaces/points for a model associated with the point.
- one or more of the plurality of points within the coordinate system can be associated with a color, wherein each color can be associated with a different section within the data (e.g., a header, a body, a footer, and so on).
- a user may be able to view the coloring of the plurality of points to identify where the points are located within the data.
- the location indicator may not be visually represented and/or may be used by a system/component in another manner.
- a set of points in the plurality of points can be identified.
- the control circuitry 111 can analyze the plurality of points using a pattern recognition algorithm to identify points that are within a particular distance from each other, positioned on a virtual surface/plane (e.g., aligned to form a substantially planar surface), and/or otherwise include characteristics that may indicate that the set of points are positioned within some type of pattern that may be used to form a surface/edge.
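One assumed form of the pattern recognition step is grouping points that lie within a particular distance of one another. The flood-fill single-linkage grouping below is a minimal sketch of that idea; plane fitting, DBSCAN, or another pattern-recognition algorithm would be a drop-in replacement, and the `eps` and `min_size` values are illustrative.

```python
# Illustrative sketch (assumed algorithm): identify sets of points that
# are within `eps` of a neighbor via single-linkage flood fill. Sets
# smaller than `min_size` are discarded as unlikely surface candidates.
import math

def find_point_sets(points, eps=2.0, min_size=3):
    """Return lists of mutually nearby points (candidate surfaces/edges)."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        cluster = []
        while stack:
            i = stack.pop()
            cluster.append(points[i])
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.discard(j)
            stack.extend(near)
        if len(cluster) >= min_size:
            clusters.append(cluster)
    return clusters
```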
- an n-dimensional representation can be generated.
- the control circuitry 111 can generate an n-dimensional representation for the set of points that are identified at 610 .
- the control circuitry 111 can generate an n-dimensional representation for any number of points within the plurality of points, such as all points within the plurality of points, a predetermined number of points within the plurality of points, and so on.
- An n-dimensional representation can include an n-dimensional point representation (e.g., the plurality of points), an n-dimensional model representation (e.g., a mesh model, a wireframe model, etc.), an n-dimensional map, and so on.
- An n-dimensional representation can have any number of dimensions, such as two, three, four, five, etc.
- each dimension of a representation can refer to a characteristic/input of data.
- a five-dimensional representation for data can include three dimensions that represent a 3D model within a 3D space and one dimension that represents a type of the data (e.g., network traffic, file system data, etc.) and another dimension that represents metadata associated with the data.
- control circuitry 111 can perform an additional analysis on the plurality of points (points other than the first set of points) using a pattern recognition algorithm to determine if there is an additional set of points associated with particular characteristics (e.g., a particular pattern).
- the process 600 may proceed to operation 616 (i.e., the “NO” branch). Alternatively, if an additional set of points is included within the data, the process 600 may return to operation 612 (i.e., the “YES” branch) to generate an n-dimensional representation for the additional set of points.
- the process 600 may perform operations 614 and 612 any number of times to generate any number of n-dimensional representations for the plurality of points in the coordinate space (e.g., generate any number of models for the plurality of points).
- the one or more n-dimensional representations can be associated with the data as a data signature.
- the control circuitry 111 can generate a data signature for the data and associate any number of n-dimensional representations that have been generated for the data (e.g., any number of models that are generated for the plurality of points within the coordinate space).
- a first analysis model can be selected.
- the control circuitry 111 can select a first analysis model based on a type of the data, a location within the data of the portion selected at 604, entropy data indicating a randomness of at least a portion of the data (e.g., the data as a whole, the portion of the data selected at 604 for processing, etc.), and so on.
- one or more n-dimensional representations for the data can be analyzed based on the first analysis model.
- the control circuitry 111 can use the first analysis model to analyze one or more n-dimensional representations associated with the plurality of points for the coordinate space.
- the control circuitry 111 can compare a 2D or 3D model representing one or more portions of the data to one or more 2D or 3D models that are tagged/classified as being associated with a certain target property (e.g., malicious data).
- the control circuitry 111 can determine a similarity between the 2D/3D data model and the 2D/3D malicious data model.
- control circuitry 111 can use a machine-trained model to (i) analyze/determine one or more characteristics/features of an n-dimensional representation (e.g., a point representation, a model representation, a map representation etc.) and/or (ii) determine if those one or more characteristics/features are associated with a target property.
- a likelihood that the one or more n-dimensional representations include a target property can be determined and/or analysis data can be generated indicating the likelihood.
- the control circuitry 111 can determine a likelihood that an n-dimensional representation includes malicious data based on the analysis at 620 .
- the control circuitry 111 can generate analysis data indicating the likelihood, such as a confidence value/data.
- a target property can refer to/include malicious behavior (e.g., malicious data intended to damage an environment/system/device), benign behavior (e.g., data/behavior that is not malicious), a vulnerability (e.g., vulnerability data that may make an environment/system/device vulnerable to an attack), or any other security-related characteristic.
- the operation 620 and/or the operation 622 can be based on one or more characteristics of one or more n-dimensional representations within the coordinate system.
- the control circuitry 111 can determine a shape of a model(s) within the coordinate system, a size of a model(s), a volume of a model(s), an area of a model(s), a number of surfaces of a model(s), a location of a model(s) within the coordinate system, a position of a model relative to another model within the coordinate system, a number of models generated for the coordinate system, a location indicator for a point, an amount or location of empty space within a coordinate system, and so on.
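The kinds of characteristics listed above can be sketched as a feature vector computed from a model's point set, which can then be compared against the feature vector of a tagged (e.g., malicious) model. The specific features (point count, bounding-box extents, centroid) and the cosine-similarity score are assumptions for illustration, not the claimed comparison.

```python
# Illustrative sketch: extract simple shape characteristics from a
# model's point set and score similarity between two models' feature
# vectors. Feature choice and cosine similarity are assumptions.
import math

def model_features(points):
    """Point count, bounding-box extents, and centroid, as one flat vector."""
    dims = len(points[0])
    mins = [min(p[d] for p in points) for d in range(dims)]
    maxs = [max(p[d] for p in points) for d in range(dims)]
    extents = [maxs[d] - mins[d] for d in range(dims)]
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
    return [float(len(points)), *extents, *centroid]

def similarity(f1, f2):
    """Cosine similarity in [-1, 1] between two feature vectors."""
    dot = sum(a * b for a, b in zip(f1, f2))
    n1 = math.sqrt(sum(a * a for a in f1))
    n2 = math.sqrt(sum(b * b for b in f2))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

A high similarity between a data model's features and a tagged malicious model's features would contribute to the confidence value discussed above.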
- the control circuitry 111 can determine if a confidence value that the one or more n-dimensional representations include a target property (e.g., associated with/indicating a threat, disruption, nuisance, etc.) is within a range of values, is greater than a threshold, is less than a threshold, or otherwise satisfies one or more criteria. Further, in some examples, the control circuitry 111 can process the data a predetermined number of times, and the control circuitry 111 can determine if the predetermined number of times has been reached.
- the process 600 may proceed to operation 626 (i.e., the “YES” branch). Alternatively, if it is determined to not perform an additional analysis, the process 600 may proceed to operation 632 (i.e., the “NO” branch).
- a second analysis model and/or an additional portion of the data can be selected.
- the control circuitry 111 can select a second analysis model based on a type of the data, a location within the data of the portion selected at 604, entropy data indicating a randomness of at least a portion of the data (e.g., the data as a whole, the portion of the data selected at 604 for processing, etc.), a confidence value generated from a previous analysis, a type of the first analysis model, and so on.
- the second analysis model is different from the first analysis model. In other instances, however, the second analysis model can be the same as the first analysis model.
- the control circuitry 111 can select a different portion of the data (also referred to as “the second portion of the data”) for analysis at operation 628 .
- one or more n-dimensional representations for the data can be analyzed.
- the control circuitry 111 can use the second analysis model to analyze one or more previously generated n-dimensional representations (e.g., a second time). Further, in some examples, the control circuitry 111 can generate one or more n-dimensional representations for the second portion of the data (in a similar fashion as that discussed for one or more of operations 604 - 616 ) and use the second analysis model to analyze the one or more n-dimensional representations.
- a likelihood that the one or more n-dimensional representations include a target property can be determined and/or analysis data can be generated indicating the likelihood.
- the control circuitry 111 can determine, based on the analysis at 628 , a likelihood that an n-dimensional representation includes malicious data (e.g., malware, such as a virus, spyware, ransomware, polymorphic malware, etc.).
- the control circuitry 111 can generate analysis data indicating the likelihood, such as a confidence value/data.
- the control circuitry 111 can perform the operations 628 and 630 in a similar fashion as that discussed above in reference to operations 620 and 622 , respectively (e.g., based on one or more characteristics of one or more n-dimensional representations within the coordinate system).
- the control circuitry 111 can perform the operations 626 , 628 , and/or 630 any number of times to analyze one or more portions of the data.
- the process 600 can implement a third analysis model and/or a third portion of the data (when the operations 626 , 628 , and/or 630 are performed a second time), a fourth analysis model and/or a fourth portion of the data (when the operations 626 , 628 , and/or 630 are performed a third time), and so on.
- control circuitry 111 implements a multilayered approach, wherein each iteration (also referred to as a “layer”) through the operations 626 , 628 , and 630 processes the data with a different analysis model that includes different characteristics, such as a model that uses different computational resources, requires different amounts of computational time, provides different levels of effectiveness, provides different types of confidence values (e.g., a first model that is configured to minimize false positives, a second model that is configured to minimize false negatives, etc.), and so on.
- each iteration through the operation 628 can process data with an analysis model that requires more computational resources (in comparison to a previously implemented analysis model), requires more computational time, provides more accurate results, and so on.
- each iteration through the operation 628 can process data with any type of analysis model.
- a multilayered approach can include selecting a larger portion (or smaller, in some cases) of the data and/or a different portion of the data with each iteration through the operations 626 , 628 , and 630 .
- a first portion of the data can be analyzed using an analysis model, wherein the first portion includes less than a threshold amount of bits/bytes and/or is associated with a particular portion/section of the data.
- the operations 626 , 628 , and 630 can be performed for a second portion of the data using the same analysis model and/or an additional analysis model.
- This second portion of the data can include more bits/bytes than the first portion of the data and/or include a different section of the data than the first portion.
- the operations 626 , 628 , and/or 630 can be performed for a shifted set of bits/bytes.
- a first set of bits of the data can be processed.
- a second set of bits can be selected, wherein the second set of bits can include a group of bits that overlap with the first set of bits, such as a predetermined number of bits/bytes.
- Operations 628 and 630 can be performed for the second set of bits. In some instances, processing a larger (or different) portion of the data can require more computational resources, require more computational time, provide more accurate results, and so on.
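The shifted, overlapping selection described above can be sketched as a sliding window whose stride is smaller than its width, so that each portion shares a predetermined number of bytes with the previous one. The window and stride sizes below are illustrative assumptions.

```python
# Illustrative sketch of the shifted-bits selection: successive portions
# overlap by (window - stride) bytes. The default sizes are assumptions.
def shifted_portions(data: bytes, window: int = 1024, stride: int = 512):
    """Yield overlapping fixed-size windows over the data."""
    for start in range(0, max(len(data) - window, 0) + 1, stride):
        yield data[start:start + window]
```

With a 4-byte window and a 2-byte stride, for example, consecutive portions share their middle two bytes, so a pattern that straddles a window boundary in one pass is captured whole in the next.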
- composite analysis data can be generated based on multiple analyses of the data. For instance, a first confidence value can be generated based on an analysis of the data a first time (e.g., at operations 620 and 622 ) and a second confidence value can be generated based on an analysis of the data a second time (e.g., at operations 628 and 630 ).
- the first/second confidence value can indicate a likelihood that the data includes malicious data.
- a composite confidence value can then be generated based on the first confidence value and the second confidence value, such as by using an equation/algorithm (which can include applying a weighting to a confidence value(s), such as by weighting the second (or first) confidence value higher), and so on.
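The composite confidence value above can be sketched as a weighted average of the per-layer confidence values. The default weighting (later, typically more thorough layers weighted more heavily) is an assumption for illustration; the description allows weighting either value higher.

```python
# Illustrative sketch of the composite confidence value: a weighted
# mean of per-analysis confidence values. The increasing default
# weights (favoring later layers) are an assumption.
def composite_confidence(values, weights=None):
    """Weighted mean of confidence values, each assumed to lie in [0, 1]."""
    if weights is None:
        weights = list(range(1, len(values) + 1))  # later layers count more
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total
```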
- the analysis data can be provided.
- the control circuitry 111 can provide the analysis data generated at 622 and/or 630 to a device, system, and/or component.
- the analysis data is provided as interface data, which can be output (e.g., displayed) via a user interface.
- the analysis data is provided as a message or signal, which can cause additional processing to be performed, such as removing/replacing a threat (e.g., malicious data), preventing a threat from associating with data, providing information, and so on.
- FIG. 7 illustrates the example process 700 to process analysis data regarding a target property and determine one or more characteristics of the threat in accordance with one or more embodiments.
- analysis data indicating a likelihood that data includes a target property can be received.
- the control circuitry 111 can obtain/retrieve analysis data regarding an analysis of data using one or more of the techniques discussed herein and/or other techniques.
- the control circuitry 111 can determine if a confidence value/data included in or otherwise associated with the analysis data is greater than a threshold or otherwise satisfies one or more criteria.
- the confidence value can indicate a likelihood that one or more n-dimensional representations include a target property.
- the confidence value can be generated by an analysis model when processing the one or more n-dimensional representations.
- the process 700 may proceed to operation 706 (i.e., the “YES” branch). Alternatively, if it is determined that the data does not include a target property, the process 700 may proceed to operation 708 (i.e., the “NO” branch).
- an indication can be provided that the data is free of the target property.
- the control circuitry 111 can generate information/signal/message indicating that the data is free of the target property.
- the control circuitry 111 can provide the information/signal/message to a system/device/component, which can use the information/signal/message in a variety of manners (including continuing with normal processing).
- a type of the target property and/or a source of the target property can be determined.
- the control circuitry 111 or other control circuitry can determine through machine learning or other techniques one or more characteristics of an n-dimensional representation and/or coordinate system that are generally associated with particular types of target properties (e.g., threats) and/or sources of target properties (e.g., particular entities that create threats, particular entities that distribute threats, and so on).
- the control circuitry 111 can determine a similarity between one or more malicious data characteristics and one or more characteristics of an n-dimensional representation to determine a type of the target property and/or a source of the target property.
- control circuitry 111 can determine a type of malicious data and/or a source of the malicious data based on a shape of an n-dimensional representation(s) (e.g., 2D/3D model, point cloud, n-dimensional map, etc.), a size of the n-dimensional representation(s), a volume of the n-dimensional representation, an area of the n-dimensional representation(s), a number of surfaces of the n-dimensional representation(s), a location of the n-dimensional representation(s) within a coordinate system, a position of the n-dimensional representation(s) within the coordinate system relative to another n-dimensional representation(s) (e.g., how close n-dimensional representations are to each other, a cluster of n-dimensional representations, etc.), a number of n-dimensional representations within the coordinate system that are associated with malicious data, a number of n-dimensional representations within the coordinate system (whether or not they are associated with malicious data), where data that is associated with a threat is located within the data, and so on.
- information regarding the target property can be generated and/or provided.
- the control circuitry 111 can generate information indicating the type of threat and/or the source of the threat.
- the control circuitry 111 can provide the information to a system/device/component, such as by providing user interface data, a message/signal, and so on.
- a portion of the data associated with the target property can be updated and/or the updated data can be sent to a component/system/device.
- the control circuitry 111 can determine that a threat is associated with a particular location in data (e.g., the threat is located within a header/footer/body, the threat is located within the first 1500 bytes/bits, the threat is located at bytes/bits 2500-3500, the threat is associated with a macro associated with a file, and so on).
- the control circuitry 111 can update a portion of the data that includes the threat, such as by removing the malicious data from the data, replacing the malicious data (e.g., with different data), and so on.
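The update step above can be sketched as either removing the flagged byte range outright or overwriting it with benign filler of the same length (the latter preserves any byte offsets later in the data). Both behaviors, and the padding scheme, are assumptions for illustration.

```python
# Illustrative sketch: remove the byte range flagged as a threat, or
# replace it with same-length filler built by repeating `replacement`.
# The replace-with-filler scheme is an assumption.
def update_data(data: bytes, start: int, end: int, replacement=None) -> bytes:
    """Remove data[start:end], or overwrite it when a replacement is given."""
    if replacement is None:
        return data[:start] + data[end:]
    filler = (replacement * ((end - start) // len(replacement) + 1))[:end - start]
    return data[:start] + filler + data[end:]
```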
- the data can be associated with a notification/signal indicating that the data is associated with a target property, wherein such notification can be displayed or otherwise provided when the data (or a specific portion of the data that is associated with a threat) is presented, such as when the data is presented to a user.
- the control circuitry 111 can receive network data and process the network data to determine if the data is associated with a threat. If the data is associated with a threat, the control circuitry 111 can update the portion of the data in substantially real-time as the data is received, so that the data can be transmitted without a threat. This can allow a network transmission of the data to continue without interruption.
- FIG. 8 illustrates the example process 800 to generate one or more n-dimensional representations for data associated with one or more target properties in accordance with one or more embodiments.
- data that is associated with one or more target properties can be represented as a plurality of points.
- the data may have been previously tagged or otherwise categorized as including the one or more target properties (e.g., malicious data).
- the control circuitry 111 can represent malicious data as a plurality of points within a coordinate system/space.
- the control circuitry 111 can represent a first set of bits in the malicious data as a first coordinate for a first point and a second set of bits in the malicious data as a second coordinate for the first point.
- the second set of bits can be adjacent to the first set of bits (e.g., directly adjacent or within a particular number of bits).
- control circuitry 111 can represent a third set of bits in the malicious data as a first coordinate for a second point and a fourth set of bits in the malicious data as a second coordinate for the second point.
- the fourth set of bits can be adjacent to the third set of bits.
- a set of points in the plurality of points can be identified.
- the control circuitry 111 can analyze the plurality of points using a pattern recognition algorithm to identify points that are within a particular distance from each other, positioned on a virtual surface/plane (e.g., aligned to form a substantially planar surface), and/or otherwise include characteristics that may indicate that the set of points are positioned within some type of pattern that may be used to form a surface/edge.
- an n-dimensional representation can be generated.
- the control circuitry 111 can generate an n-dimensional representation for the set of points that are identified at 804 .
- the control circuitry 111 can generate an n-dimensional representation for any number of points within the plurality of points, such as all points within the plurality of points, a predetermined number of points within the plurality of points, and so on.
- An n-dimensional representation can include an n-dimensional point representation (e.g., the plurality of points), an n-dimensional model representation (e.g., a mesh model, a wireframe model, etc.), an n-dimensional map, and so on.
- An n-dimensional representation can have any number of dimensions, such as two, three, four, five, etc.
- control circuitry 111 can perform an additional analysis on the plurality of points (points other than a first set of points) using a pattern recognition algorithm to determine if there is an additional set of points associated with one or more characteristics (e.g., a pattern).
- the process 800 may proceed to operation 810 (i.e., the “NO” branch). Alternatively, if an additional set of points is included within the data, the process 800 may return to operation 806 (i.e., the “YES” branch) to generate an n-dimensional representation for the additional set of points.
- the process 800 may perform operations 806 and 808 any number of times to generate any number of n-dimensional representations for the plurality of points in the coordinate space (e.g., generate any number of models for the plurality of points).
- the one or more n-dimensional representations can be associated with the one or more target properties as a data signature for the one or more target properties.
- the control circuitry 111 can generate a data signature for a threat and associate any number of n-dimensional representations that have been generated for the data (e.g., any number of models that are generated for the plurality of points within the coordinate space).
- the data signature can be stored in a data store.
- a data signature for a target property can be associated with a target property, which can be used to analyze other data to determine if the other data includes a target property.
- machine learning can be implemented to identify characteristics of n-dimensional representations that are associated with target properties (e.g., threats). For instance, upon generating a first model for a type of malicious data and a second model for the same type of malicious data, a machine learning technique can be implemented to learn a characteristic(s) that is associated with the first model and the second model. Such a characteristic can be stored/associated with a threat (e.g., a model characteristic can be associated with a threat, within a data signature, when the model is generated/identified for the particular type of threat a predetermined number of times).
- data for various types of malware or other types of target properties can be processed to create a data store for multiple types of threats.
- the process 800 can be performed any number of times to create a taxonomy of data signatures for various types of malware.
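The taxonomy of data signatures described above can be sketched as a store mapping each threat type to the model signatures generated from known samples of that type. The class shape and field names below are assumptions for illustration.

```python
# Illustrative sketch of a data-signature taxonomy: threat type ->
# list of model signatures generated for known samples of that type.
# The structure and method names are assumptions.
class SignatureStore:
    def __init__(self):
        self._taxonomy = {}  # threat type -> list of model signatures

    def add(self, threat_type: str, signature):
        """Record one model signature under a threat type."""
        self._taxonomy.setdefault(threat_type, []).append(signature)

    def signatures_for(self, threat_type: str):
        """Signatures to compare new data against for one threat type."""
        return list(self._taxonomy.get(threat_type, []))

    def threat_types(self):
        """All threat types currently in the taxonomy."""
        return sorted(self._taxonomy)
```

New data would then be compared against `signatures_for(...)` entries across types to both detect a target property and classify its type.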
- an ordinal term used to modify an element, such as a structure, a component, an operation, etc., does not necessarily indicate priority or order of the element with respect to any other element, but rather may generally distinguish the element from another element having a similar or identical name (but for use of the ordinal term).
- articles (“a” and “an”) may indicate “one or more” rather than “one.”
- an operation performed “based on” a condition or event may also be performed based on one or more other conditions or events not explicitly recited.
- description of an operation or event as occurring or being performed “based on,” or “based at least in part on,” a stated event or condition can be interpreted as being triggered by or performed in response to the stated event or condition.
- the one or more embodiments are used herein to illustrate one or more aspects, one or more features, one or more concepts, and/or one or more examples.
- a physical embodiment of an apparatus, an article of manufacture, a machine, and/or of a process may include one or more of the aspects, features, concepts, examples, etc. described with reference to one or more of the embodiments discussed herein.
- the embodiments may incorporate the same or similarly named functions, steps, modules, etc. that may use the same, related, or unrelated reference numbers.
- the relevant features, elements, functions, operations, modules, etc. may be the same or similar functions or may be unrelated.
- module is used in the description of one or more of the embodiments.
- a module implements one or more functions via a device, such as a processor or other processing device or other hardware that may include or operate in association with a memory that stores operational instructions.
- a module may operate independently and/or in conjunction with software and/or firmware.
- a module may contain one or more sub-modules, each of which may be one or more modules.
- a computer readable memory includes one or more memory elements.
- a memory element may be a separate memory device, multiple memory devices, or a set of memory locations within a memory device.
- Such a memory device may be a read-only memory, random access memory, volatile memory, non-volatile memory, static memory, dynamic memory, flash memory, cache memory, and/or any device that stores digital information.
- the memory device may be in the form of a solid-state memory, a hard drive memory, cloud memory, a thumb drive, server memory, computing device memory, and/or other physical medium for storing digital information.
- Example A a system comprising: control circuitry; and memory communicatively coupled to the control circuitry and storing executable instructions that, when executed by the control circuitry, cause the control circuitry to perform operations comprising: receiving data; representing at least a portion of the data as a plurality of points in a coordinate system; using a pattern recognition algorithm to identify a set of points in the plurality of points; generating an n-dimensional model for the set of points; comparing the n-dimensional model to a plurality of n-dimensional models that are tagged as including a target property associated with at least one of malicious behavior, benign behavior, or a vulnerability; and based at least in part on the comparison, determining a likelihood that the data includes the target property.
- Example B the system of Example A, wherein the representing includes: representing a first set of bits in the data as a first coordinate for a first point of the plurality of points and a second set of bits as a second coordinate for the first point, the second set of bits being adjacent to the first set of bits; and representing a third set of bits in the data as a first coordinate for a second point of the plurality of points and a fourth set of bits in the data as a second coordinate for the second point, the fourth set of bits being adjacent to the third set of bits.
- Example C the system of Example A or B, wherein the n-dimensional model includes at least one of a 3D mesh or 3D wireframe.
- Example D the system of any of Examples A-C, wherein the determining the likelihood is based on at least one of a shape of the n-dimensional model, a size of the n-dimensional model, a volume of the n-dimensional model, an area of the n-dimensional model, a number of surfaces of the n-dimensional model, a location of the n-dimensional model within the coordinate system, a position of the n-dimensional model relative to another n-dimensional model within the coordinate system, or a number of n-dimensional models within the coordinate system.
- Example E the system of any of Examples A-D, wherein the operations further comprise: determining that the data includes the target property; and determining at least one of a type of the target property or a source of the target property based on at least one of a shape of the n-dimensional model, a size of the n-dimensional model, a volume of the n-dimensional model, an area of the n-dimensional model, a number of surfaces of the n-dimensional model, a location of the n-dimensional model within the coordinate system, a position of the n-dimensional model to another n-dimensional model within the coordinate system, or a number of n-dimensional models within the coordinate system that are associated with the target property or another target property.
- Example F the system of any of Examples A-E, wherein the operations further comprise: determining that the data includes the target property; updating a portion of the data that includes the target property to generate updated data, the updating including at least one of removing the target property or replacing the target property; and sending the updated data to a component.
- Example G the system of any of Examples A-F, wherein the operations further comprise: representing predetermined data associated with the target property as multiple points; processing the multiple points to generate one or more of the plurality of n-dimensional models that are tagged as associated with the target property; and storing the plurality of n-dimensional models as signatures for the predetermined data.
- Example H a method comprising: receiving, by control circuitry, data; representing, by the control circuitry, at least a portion of the data as a first plurality of points in a coordinate system; analyzing, by the control circuitry, the first plurality of points to identify a set of points; generating, by the control circuitry, a first n-dimensional model for the set of points; and determining, by the control circuitry, a first likelihood that the data includes a target property based at least in part on an analysis of (i) the first n-dimensional model and (ii) a plurality of n-dimensional models that are tagged as being associated with the target property, the target property including at least one of malicious data, benign data, or vulnerability data.
- Example I the method of Example H, wherein the determining the first likelihood includes determining a likelihood that the data includes polymorphic malware.
- Example J the method of Example H or I, further comprising: generating a signature for the data that includes the first n-dimensional model.
- Example K the method of any of Examples H-J, wherein the representing includes: determining a first coordinate for a first point of the plurality of points based at least in part on a first group of bits in the data; and determining a second coordinate for the first point based at least in part on a second group of bits in the data that is adjacent to the first group of bits.
- Example L the method of any of Examples H-K, further comprising: associating the first point with an indicator indicating a location of at least one of the first group of bits or the second group of bits within the data; wherein the determining the first likelihood is based at least in part on the indicator.
- Example M the method of any of Examples H-L, wherein the portion of the data includes first bits, and the method further comprises: representing second bits in the data as a second plurality of points, the second bits including a group of bits that overlap with the first bits; generating a second n-dimensional model for the second plurality of points; and determining a second likelihood that the data includes the target property based at least in part on an analysis of (i) the second n-dimensional model and (ii) the plurality of n-dimensional models that are tagged as being associated with the target property.
- Example N one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by control circuitry, cause the control circuitry to perform operations comprising: receiving data; representing at least a first portion of the data as a plurality of points in a coordinate system; identifying a set of points in the plurality of points; generating an n-dimensional model for the set of points; comparing the n-dimensional model to an n-dimensional model that is tagged as being associated with a target property; and generating a first confidence value indicating a first likelihood that the data includes the target property.
- Example O the one or more non-transitory computer-readable media of Example N, wherein the operations further comprise: processing the plurality of points using a machine-trained model; generating a second confidence value indicating a second likelihood that the data includes the target property; and determining a composite confidence value for the data based at least in part on the first confidence value and the second confidence value.
- Example P the one or more non-transitory computer-readable media of Example N or O, wherein the first likelihood indicates a likelihood that the data includes malware.
- Example Q the one or more non-transitory computer-readable media of any of Examples N-P, wherein the operations further comprise: analyzing the data to generate entropy data indicating a randomness of the first portion of the data and a randomness of a second portion of the data; and selecting the first portion of the data for processing based at least in part on the randomness of the first portion of the data; wherein the representing the first portion of the data is based at least in part on selecting the first portion of the data.
- Example R the one or more non-transitory computer-readable media of any of Examples N-Q, wherein at least one of the representing, the identifying, the generating, or the comparing is part of implementing a first analysis model, and the operations further comprise: selecting a second analysis model based on at least one of a type of the data, where the first portion of the data is located within the data, or entropy data indicating a randomness of at least the first portion of the data, the second analysis model being different than the first analysis model; and analyzing the data using the second analysis model.
- Example S the one or more non-transitory computer-readable media of any of Examples N-R, wherein the n-dimensional model includes at least one of a mesh or wireframe.
- Example T the one or more non-transitory computer-readable media of any of Examples N-S, wherein the generating is based on at least one of a shape of the n-dimensional model, a size of the n-dimensional model, a volume of the n-dimensional model, an area of the n-dimensional model, a number of surfaces of the n-dimensional model, a location of the n-dimensional model within the coordinate system, a position of the n-dimensional model relative to another n-dimensional model within the coordinate system, or a number of n-dimensional models within the coordinate system.
- Example AA a method of detecting malware, the method comprising: receiving, by a computing device, data from a data store; identifying, by the computing device, at least a first group of bits in the data and a second group of bits in the data; representing, by the computing device, a first set of bits in the first group of bits as a first coordinate for a first point and a second set of bits in the first group of bits as a second coordinate for the first point; representing, by the computing device, a first set of bits in the second group of bits as a first coordinate for a second point and a second set of bits in the second group of bits as a second coordinate for the second point; generating, by the computing device, an n-dimensional representation for the data based at least in part on the first point and the second point; processing the n-dimensional representation using a model that has been trained using machine learning; and determining a malware rating for the data based at least in part on the processing, the malware rating indicating a likelihood that the data is associated with malware.
- Example BB the method of Example AA, further comprising: representing, by the computing device, a third set of bits in the first group of bits as a third coordinate for the first point, wherein the n-dimensional representation comprises a three-dimensional representation.
- Example CC the method of Example AA or BB, wherein the first set of bits in the first group of bits comprises a first byte, the second set of bits in the first group of bits comprises a second byte that is directly adjacent to the first byte, and the third set of bits in the first group of bits comprises a third byte that is directly adjacent to the second byte.
- Example DD the method of any of Examples AA-CC, wherein the data comprises file system data.
- Example EE the method of any of Examples AA-DD, wherein the data comprises non-image-based data.
- Example FF a system comprising: control circuitry; and memory communicatively coupled to the control circuitry and storing executable instructions that, when executed by the control circuitry, cause the control circuitry to perform operations comprising: obtaining data; determining a first coordinate for a first point based at least in part on a first set of bits in the data and determining a second coordinate for the first point based at least in part on a second set of bits in the data that is adjacent to the first set of bits; determining a first coordinate for a second point based at least in part on a third set of bits in the data and determining a second coordinate for the second point based at least in part on a fourth set of bits in the data that is adjacent to the third set of bits; generating an n-dimensional representation for the data based at least in part on the first point and the second point; and causing the n-dimensional representation to be processed with a machine-trained model that is configured to detect malware.
- Example GG the system of Example FF, wherein the first set of bits comprises a first byte and the second set of bits comprises a second byte that is directly adjacent to the first byte.
- Example HH the system of Example FF or GG, wherein obtaining the data comprises retrieving data from a data store, the data comprising file system data.
- Example II the system of any of Examples FF-HH, wherein the operations further comprise: extracting a first portion of the data and refraining from extracting a second portion of the data, the first portion of the data including the first set of bits and the second set of bits.
- Example JJ the system of any of Examples FF-II, wherein the operations further comprise: determining a type of the data; and determining to represent the data with a first portion of the data based at least in part on the type of the data, the first portion of the data including the first set of bits and the second set of bits.
- Example KK the system of any of Examples FF-JJ, wherein the first portion of the data includes at least one of a header, a body, or a footer.
- Example LL the system of any of Examples FF-KK, wherein the operations further comprise: determining a type of the data; and determining to represent the data with a first portion of the data and a second portion of the data based at least in part on the type of the data, the first portion of the data including the first set of bits and the second set of bits.
- Example MM the system of any of Examples FF-LL, wherein the operations further comprise: training a model to create the machine-trained model, the training being based at least in part on one or more n-dimensional representations that are tagged as being associated with malware and one or more n-dimensional representations that are tagged as being malware free.
- Example NN one or more non-transitory computer-readable media storing computer-executable instructions that, when executed, instruct one or more processors to perform operations comprising: obtaining data; determining a first coordinate for a first point based at least in part on a first set of bits in the data and determining a second coordinate for the first point based at least in part on a second set of bits in the data that is adjacent to the first set of bits; determining a first coordinate for a second point based at least in part on a third set of bits in the data and determining a second coordinate for the second point based at least in part on a fourth set of bits in the data that is adjacent to the third set of bits; generating an n-dimensional representation for the data based at least in part on the first and second coordinates for the first point and the first and second coordinates for the second point; and causing the n-dimensional representation to be processed with a machine-trained model that is configured to detect a threat.
- Example OO the one or more non-transitory computer-readable media of Example NN, wherein the data comprises at least one of file system data, network traffic data, runtime data, or data associated with an isolated environment.
- Example PP the one or more non-transitory computer-readable media of Example NN or OO, wherein the operations further comprise: processing the n-dimensional representation with the machine-trained model; detecting the threat based at least in part on the processing; and performing a threat operation to address the threat, the threat operation comprising at least one of removing the threat, preventing the threat from associating with the data, or providing a notification to a computing device regarding the threat.
- Example QQ the one or more non-transitory computer-readable media of any of Examples NN-PP, wherein the first set of bits is directly adjacent to the second set of bits.
- Example RR the one or more non-transitory computer-readable media of any of Examples NN-QQ, wherein the operations further comprise: determining a type of the data; and determining to represent the data with a first portion of the data based at least in part on the type of the data, the first portion of the data including the first set of bits, the second set of bits, the third set of bits, and the fourth set of bits.
- Example SS the one or more non-transitory computer-readable media of any of Examples NN-RR, wherein the operations further comprise: training a model to create the machine-trained model, the training being based at least in part on one or more n-dimensional representations that are tagged as being associated with one or more threats and one or more n-dimensional representations that are tagged as being threat free.
- Example TT the one or more non-transitory computer-readable media of any of Examples NN-SS, wherein the machine-trained model includes an artificial neural network and the training includes using machine learning.
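Example Q above selects which portion of the data to represent based on its randomness. That selection step can be sketched as follows; this is a minimal illustration in which the function names, the 256-byte portion size, and the 4.0 bits/byte threshold are assumptions for demonstration, not values taken from the disclosure:

```python
import math
from collections import Counter

def shannon_entropy(portion: bytes) -> float:
    """Return the Shannon entropy of a byte string, in bits per byte (0.0-8.0)."""
    if not portion:
        return 0.0
    counts = Counter(portion)
    total = len(portion)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def select_portions(data: bytes, portion_size: int = 256, threshold: float = 4.0):
    """Split data into fixed-size portions and keep those whose randomness
    (entropy) meets the threshold, mirroring Example Q's selection step."""
    portions = [data[i:i + portion_size] for i in range(0, len(data), portion_size)]
    return [p for p in portions if shannon_entropy(p) >= threshold]
```

A run of identical bytes scores 0.0 bits/byte and is skipped, while a portion containing all 256 byte values once scores the maximum 8.0 and is kept, so the representation step operates only on the higher-entropy portions.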
Abstract
Description
- Anti-malware tools are implemented to prevent, detect, and remove malware that threatens computing devices. These tools use pattern matching, heuristic analysis, behavioral analysis, or hash matching to identify malware. Although these techniques provide some level of security, the anti-malware tools are slow to adapt to changing malware, rely on humans to flag or verify malware, are slow to process data, and require exact matches between data and pre-flagged malware. This often leaves computing devices exposed to malware for relatively long periods of time, causing various undesirable issues.
- Various embodiments are depicted in the accompanying drawings for illustrative purposes and should in no way be interpreted as limiting the scope of the disclosure. In addition, various features of different disclosed embodiments can be combined to form additional embodiments, which are part of this disclosure. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Throughout the drawings, reference numbers may be reused to indicate correspondence between reference elements.
- FIG. 1 illustrates an example architecture in which the techniques described herein may be implemented.
- FIG. 2 illustrates an example process of converting data to an n-dimensional point representation in accordance with one or more embodiments.
- FIG. 3 illustrates an example process to train an analysis model in accordance with one or more embodiments.
- FIG. 4 illustrates an example process to produce one or more n-dimensional representations in accordance with one or more embodiments.
- FIG. 5 illustrates an example process to process an n-dimensional representation using an analysis model in accordance with one or more embodiments.
- FIGS. 6A-6B illustrate an example process to generate one or more n-dimensional representations and analyze the one or more n-dimensional representations in accordance with one or more embodiments.
- FIG. 7 illustrates an example process to process analysis data regarding a target property and determine one or more characteristics about the target property in accordance with one or more embodiments.
- FIG. 8 illustrates an example process to generate one or more n-dimensional representations for data associated with one or more target properties in accordance with one or more embodiments.
- This disclosure describes techniques and architectures for representing data with one or more n-dimensional representations and using one or more analysis models to identify target properties associated with the one or more n-dimensional representations. For example, the techniques and architectures may receive data of any type and process the data at a bit or byte level to generate one or more n-dimensional representations for the data. To generate a representation, the techniques and architectures may represent groups of bits within the data as points within a coordinate system, with a set of bits within a group of bits representing a coordinate for a point. The techniques and architectures may use the points as the n-dimensional representation and/or generate a model or another representation based on the points (e.g., a mesh, wireframe, etc.). As such, the n-dimensional representation may be generated to include one or more of the points and/or a model or other representation for one or more of the points. The n-dimensional representation may represent a data signature for the data. In some instances, the points within the coordinate space are analyzed to generate multiple n-dimensional representations (e.g., identify multiple sets of points and generate a model for each set of points).
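The bit-level conversion described above can be illustrated with a short sketch. Here each group of three directly adjacent bytes becomes one 3D point, with each byte supplying one coordinate; the one-byte-per-coordinate choice, the sliding-window step, and the function name are illustrative assumptions, since the disclosure allows any set-of-bits size and any number of dimensions:

```python
def data_to_points(data: bytes, dims: int = 3, step: int = 1):
    """Represent data as n-dimensional points: each group of `dims` adjacent
    bytes yields one point, with one byte (0-255) per coordinate."""
    points = []
    for i in range(0, len(data) - dims + 1, step):
        # Each set of bits (here, one byte) in the group becomes a coordinate.
        points.append(tuple(data[i:i + dims]))
    return points
```

For example, the four bytes `01 02 03 04` yield the two overlapping 3D points (1, 2, 3) and (2, 3, 4); the resulting point cloud can then serve directly as the representation or be meshed into a surface model.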
- The techniques and architectures may evaluate an n-dimensional representation based on one or more analysis representations that have been tagged as being associated with a target property (e.g., threat, interruption, nuisance, etc.), such as malware, vulnerability, or another security-related issue. For example, a two-dimensional (2D) or three-dimensional (3D) model representing a portion of the data may be compared to 2D or 3D models that have been previously tagged as being associated with malicious data. If the data model is substantially similar to one or more of the malicious models, a threat or potential threat may be detected. In some instances, if a threat or potential threat is detected, the data model and/or data model within the coordinate system may be analyzed to determine an actual threat, a type of a threat, a source of a threat (e.g., an entity that generated the threat/data), and so on. Further, various operations may be performed to address a target property, such as removing a threat, ensuring that the threat is not associated with the data, providing a notification/message regarding the threat, or another operation.
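The comparison of a data model against tagged models can be sketched as a point-set similarity score. This minimal version uses a symmetric average nearest-neighbor (Chamfer-style) distance with a threshold; both the metric and the threshold value are assumptions for illustration, not the disclosure's specific matching method:

```python
import math

def chamfer_distance(points_a, points_b):
    """Symmetric average nearest-neighbor distance between two point sets."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return (one_way(points_a, points_b) + one_way(points_b, points_a)) / 2.0

def is_substantially_similar(model, tagged_models, threshold=1.0):
    """True if the data model is within `threshold` of any tagged malicious
    model, signaling a threat or potential threat."""
    return any(chamfer_distance(model, tagged) <= threshold for tagged in tagged_models)
```

In practice the per-model distances could also be converted into a confidence value, and the closest tagged model could be inspected to characterize the type or source of the threat.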
- The techniques and architectures discussed herein may provide various security measures to efficiently and/or accurately detect target properties for data (e.g., threats). For example, the techniques and architectures may represent data in an n-dimensional representation and process the n-dimensional representation with a model that efficiently and/or accurately detects various types of threats to the data, such as malware or other malicious data. In some embodiments, since the techniques and architectures operate at a bit or byte level to generate a representation of the data, any type of data may be processed (e.g., the techniques and architectures are agnostic to data type, environment type, etc.). For example, the techniques and architectures may be implemented for various types of data, such as file system data, network traffic data, runtime data, non-image-based data, data stored in volatile memory, data stored in non-volatile memory, behavioral data, and so on, and/or implemented for various environments, such as different operating systems, platforms, and so on. Moreover, in some embodiments, the techniques and architectures may detect target properties by processing just a portion of data (e.g., a portion of a file, etc.), which may further increase the efficiency of the techniques and architectures. Furthermore, in some embodiments, the techniques and architectures may detect target properties without human involvement. Additionally, the techniques and architectures may efficiently utilize computing resources, such as by comparing a data model to target models to identify a potential threats, interruptions, nuisances, etc., which may be relatively faster and/or require less computational resources in comparison to other solutions.
- The techniques and architectures discussed herein can be applied to detect a variety of types of target properties. A target property can refer to/include malicious behavior (e.g., malicious data intended to damage an environment/system/device), benign behavior (e.g., data/behavior that is not malicious), a vulnerability (e.g., vulnerability data that may make an environment/system/device vulnerable to an attack), or any other security-related characteristic that may potentially pose a threat, interruption, nuisance, vulnerability, and so on. Although various examples refer to malicious threats, the techniques and architectures are applicable to any type of target property.
- Although many embodiments and examples are discussed herein in the context of two- or three-dimensional representations for ease of discussion and illustration, the techniques and architectures may be implemented for a representation of any number of dimensions. That is, an n-dimensional representation may comprise a one-dimensional representation, a two-dimensional representation, a three-dimensional representation, a four-dimensional representation, and so on. In examples, each dimension of a representation can refer to a characteristic of data. For instance, a four-dimensional representation for data can include three dimensions that correspond to spatial values (e.g., to form a 3D surface model) for the data and one dimension that represents another characteristic of the data, such as any type of value, metadata, etc. that is associated with the data and/or generated from the data. Further, although some embodiments and examples are discussed herein in the context of cybersecurity, the techniques and architectures can be implemented within a wide variety of contexts, such as industrial control systems, network traffic, physical security, system memory, isolated environments, and so on.
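A four-dimensional point as described above can be formed by appending a per-group characteristic to the three spatial coordinates. In this sketch the fourth value is the group's arithmetic mean, which is purely an assumed stand-in for the "any type of value, metadata, etc." mentioned in the text:

```python
def to_4d_point(group: bytes):
    """Three spatial coordinates from three adjacent bytes, plus a fourth
    dimension carrying another characteristic of the data (here: the mean)."""
    x, y, z = group[0], group[1], group[2]
    characteristic = sum(group) / len(group)  # assumed stand-in characteristic
    return (x, y, z, characteristic)
```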
- Moreover, although certain embodiments and examples are disclosed herein, the disclosure extends beyond the specifically disclosed embodiments to other alternative embodiments and/or uses, and to modifications and equivalents thereof. Thus, the scope of the claims that may arise here from is not limited by any of the particular embodiments described below. For example, in any method or process disclosed herein, the acts or operations of the method or process may be performed in any suitable sequence and are not necessarily limited to any particular disclosed sequence. Various operations may be described as multiple discrete operations in turn, in a manner that may be helpful in understanding certain embodiments; however, the order of description should not be construed to imply that these operations are order dependent. Additionally, the structures, systems, and/or devices described herein may be embodied as integrated components or as separate components. For purposes of comparing various embodiments, certain aspects and advantages of these embodiments are described. Not necessarily all such aspects or advantages are achieved by any particular embodiment. Thus, for example, various embodiments may be carried out in a manner that achieves or optimizes one advantage or group of advantages as described herein without necessarily achieving other aspects or advantages as may also be described or suggested herein.
- FIG. 1 illustrates an example architecture 100 in which the techniques described herein may be implemented. The architecture 100 includes one or more service providers 110 (also referred to as “the service provider 110,” for ease of discussion) configured to communicate with one or more interface/client devices 130 (also referred to as “the client device 130,” for ease of discussion) over one or more networks 140 (also referred to as “the network 140,” for ease of discussion). For example, the service provider 110 can perform processing remotely/separately from the client device 130 and communicate with the client device 130 to facilitate such processing for the client device 130 and/or another device. The service provider 110 and/or the client device 130 can be configured to facilitate various functionality. As shown, the network 140 can include one or more network devices 145 (also referred to as “the network device 145,” for ease of discussion) to facilitate communication over the network 140. The service provider 110, the client device 130, and/or the network device 145 may be configured to perform any of the techniques/functionality discussed herein, which may generally process data to detect a threat or potential threat. Although example devices are illustrated in the architecture 100, any of such devices may be eliminated/not implemented. In one example, the service provider 110 may implement the techniques discussed herein without communicating with the client device 130 and/or without using the network 140. In another example, the client device 130 may implement the techniques without communicating with the service provider 110 and/or without using the network 140. - The
service provider 110 may be implemented as one or more computing devices, such as one or more servers, one or more desktop computers, one or more laptop computers, or any other type of device configured to process data. In some embodiments, the one or more computing devices are configured in a cluster, data center, cloud computing environment, or a combination thereof. In some embodiments, the one or more computing devices of the service provider 110 are implemented as a remote computing resource that is located remotely to the client device 130. In other embodiments, the one or more computing devices of the service provider 110 are implemented as local resources that are located locally at the client device 130. - The
client device 130 may be implemented as one or more computing devices, such as one or more desktop computers, laptop computers, servers, smartphones, electronic reader devices, mobile handsets, personal digital assistants, portable navigation devices, portable gaming devices, tablet computers, wearable devices (e.g., a watch), portable media players, televisions, set-top boxes, computer systems in a vehicle, appliances, cameras, security systems, home-based computer systems, projectors, and so on. - In some examples, the
client device 130 includes one or more input/output (I/O) components, such as one or more displays, microphones, speakers, keyboards, mice, cameras, and so on. The one or more displays may be configured to display data associated with certain aspects of the present disclosure. For example, the one or more displays may be configured to present a graphical user interface (GUI) to facilitate operation of the client device 130, present information associated with an evaluation of data (e.g., information indicating if a threat is detected, a type of threat detected, etc.), provide input to cause an operation to be performed to address a threat (e.g., an operation to have a threat removed, prevent a threat from associating with and/or further corrupting data, prevent a threat from being stored with data, etc.), and so on. The one or more displays may include a liquid-crystal display (LCD), a light-emitting diode (LED) display, an organic LED display, a plasma display, an electronic paper display, or any other type of technology. In some embodiments, the one or more displays include one or more touchscreens and/or other user input/output (I/O) devices. - The
network device 145 may include one or more routers, bridges, switches, repeaters, modems, gateways, hubs, wireless access points, servers, network interface controllers, or any other device/hardware configured to facilitate reception/transmission of data from/to another component. - As shown, the
service provider 110, client device 130, and/or network device 145 may include control circuitry 111, memory 112, and/or one or more network interfaces 113 configured to perform functionality described herein. For ease of discussion and illustration, the control circuitry 111, memory 112, and one or more network interfaces 113 are shown in blocks above the service provider 110, client device 130, and network device 145. It should be understood that, in many embodiments, the service provider 110, client device 130, and/or network device 145 can each include separate instances of the control circuitry 111, memory 112, and network interface 113. For example, the service provider 110 can include its own control circuitry, data storage/memory, and/or network interface (e.g., to implement processing on the service provider 110), the network device 145 can include its own control circuitry, data storage/memory, and/or network interface (e.g., to implement processing on the network device 145), and/or the client device 130 can include its own control circuitry, data storage/memory, and/or network interface (e.g., to implement processing on the client device 130). As such, reference herein to control circuitry/memory may refer to circuitry/memory embodied in the service provider 110, client device 130, and/or network device 145. - Although the
control circuitry 111 is illustrated as a separate component from the memory 112 and network interface 113, it should be understood that the memory 112 and/or the network interface 113 can be embodied at least in part in the control circuitry 111. For instance, the control circuitry 111 can include various devices (active and/or passive), semiconductor materials and/or areas, layers, regions, and/or portions thereof, conductors, leads, vias, connections, and/or the like, wherein one or more of the memory 112 and the network interface 113 and/or portion(s) thereof can be formed and/or embodied at least in part in/by such circuitry components/devices. - The
control circuitry 111 may include one or more processors, processing circuitry, processing modules/units, chips, dies (e.g., semiconductor dies including one or more active and/or passive devices and/or connectivity circuitry), microprocessors, micro-controllers, digital signal processors (DSPs), microcomputers, central processing units (CPUs), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), programmable logic devices, state machines (e.g., hardware state machines), logic circuitry, analog circuitry, digital circuitry, application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), complex programmable logic devices (CPLDs), and/or any device that manipulates signals (analog and/or digital) based on hard coding of the circuitry and/or operational instructions. Control circuitry can further comprise one or more storage devices, which can be embodied in a single memory device, a plurality of memory devices, and/or embedded circuitry of a device. Such data storage can comprise read-only memory, random access memory, volatile memory, non-volatile memory, static memory, dynamic memory, flash memory, cache memory, data storage registers, and/or any device that stores digital information. It should be noted that in embodiments in which control circuitry comprises a hardware state machine (and/or implements a software state machine), analog circuitry, digital circuitry, and/or logic circuitry, data storage device(s)/register(s) storing any associated operational instructions can be embedded within, or external to, the circuitry comprising the state machine, analog circuitry, digital circuitry, and/or logic circuitry. - The memory 112 (as well as any other memory discussed herein) may include any suitable or desirable type of computer-readable media.
For example, one or more computer-readable media may include one or more volatile data storage devices, non-volatile data storage devices, removable data storage devices, and/or non-removable data storage devices implemented using any technology, layout, and/or data structure(s)/protocol, including any suitable or desirable computer-readable instructions, data structures, program modules, or other data types. Computer-readable media may include, but are not limited to, phase change memory, static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disk read-only memory (CD-ROM), digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store information for access by a computing device. As used in certain contexts herein, computer-readable media may not generally refer to communication media, such as modulated data signals and carrier waves. As such, computer-readable media should generally be understood to refer to non-transitory media.
- The
control circuitry 111, memory 112, and/or network interface 113 can be electrically and/or communicatively coupled using certain connectivity circuitry/devices/features, which may or may not be part of the control circuitry 111. For example, the connectivity feature(s) can include one or more printed circuit boards configured to facilitate mounting and/or interconnectivity of at least some of the various components/circuitry. In some embodiments, two or more of the components may be electrically and/or communicatively coupled to each other. - The
memory 112 may store a data selection component 114, a representation generation component 115, and a representation analysis component 116, which can include executable instructions that, when executed by the control circuitry 111, cause the control circuitry 111 to perform various operations discussed herein. For example, one or more of the components 114-116 may include software/firmware modules. However, one or more of the components 114-116 may be implemented as one or more hardware logic components, such as one or more application specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), application-specific standard products (ASSPs), complex programmable logic devices (CPLDs), and/or the like. For ease of discussion, the components 114-116 are illustrated as separate components. However, it should be understood that one or more of the components 114-116 may be implemented as any number of components to implement the functionality discussed herein (e.g., combined or separated into additional components). - The
data selection component 114 can be configured to select a portion of data for the representation generation component 115 and/or representation analysis component 116 to process. For example, the data selection component 114 can select a number of bits/bytes of data and/or a particular portion of the data, such as a predetermined number of bits/bytes (e.g., 1500 bits/bytes, 15,000 bits/bytes, 500 bits/bytes, and so on), header/footer/body data, metadata, a particular number of bits/bytes within a particular portion of the data, and so on. In examples, the data selection component 114 can determine a type of the data (e.g., file system data, network traffic data, runtime data, non-image-based data, data stored in volatile memory, data stored in non-volatile memory, behavioral data, and so on) and select a particular portion of the data and/or a number of bits/bytes based on the type of data. For instance, it may be determined through machine learning or other techniques that evaluating a particular section of data (e.g., a header, a footer, a section of a payload, etc.) for a particular type of data accurately detects any threats associated with the type of data more than a threshold percentage of the time (e.g., 99% of the time). As such, the data selection component 114 may select the particular section within each piece of data (e.g., file) and refrain from selecting other sections of the piece of data. Further, in examples, the data selection component 114 can analyze the data to generate entropy data indicating a randomness of one or more portions of the data and select a particular portion of the data and/or a number of bits/bytes based on the entropy data. The entropy data may indicate a randomness of a portion of the data relative to other portions of the data and/or a threshold. In some instances, a portion of data that is selected is a most/least random portion and/or has a randomness value above/below a threshold. In some instances, a Shannon entropy algorithm is implemented. - The
representation generation component 115 may generally be configured to process/analyze data to generate an n-dimensional representation of the data. For example, the representation generation component 115 may retrieve/receive data 150 from a component/device/system and process (e.g., parse) the data 150 in groups of bits to determine points for a coordinate system. Each group of bits may include one or more sets of bits that represent one or more coordinates, respectively. For example, the representation generation component 115 may extract three bytes of data (e.g., a group of bits) and represent each byte (e.g., set of bits) with a coordinate for a point. In particular, the representation generation component 115 can convert each byte into a coordinate value for a coordinate system (e.g., a value from 0 to 255). For instance, a first byte in a group of bits may represent an x-coordinate (e.g., x-value from 0 to 255 on a coordinate system), a second byte in the group of bits may represent a y-coordinate for the point (e.g., y-value from 0 to 255 on the coordinate system), and a third byte in the group of bits may represent a z-coordinate for the point (e.g., z-value from 0 to 255 on the coordinate system). The representation generation component 115 may process any number of bits in the data 150 to determine any number of points for the data 150. Although some examples are discussed herein in the context of three bytes representing a group of bits and a byte representing a set of bits, a group of bits and/or a set of bits may include any number of bits or bytes. - The
representation generation component 115 may generate an n-dimensional representation based on coordinates of points. For example, the representation generation component 115 can position each point within a coordinate system using one or more coordinates for the point (e.g., position a point based on an x-coordinate value, y-coordinate value, and z-coordinate value). In some embodiments, the points produced by such a process form an n-dimensional representation (e.g., a point cloud), such as a 3D point representation 151 illustrated in FIG. 1. Further, in some embodiments, the points produced by such a process may be used to form an n-dimensional representation. For instance, the representation generation component 115 may use a pattern recognition algorithm 117 to identify a set of points that are associated with particular characteristic(s). Such a pattern recognition algorithm 117 can generally seek to identify points that are within a particular distance from each other, positioned on a virtual surface/plane, and/or otherwise include characteristics that may indicate that the set of points may form a surface. The representation generation component 115 can generate an n-dimensional representation based on the set of points, such as a 3D model 152 illustrated in FIG. 1. In examples, a model is a polygon mesh that includes one or more vertices, edges, faces, polygons, surfaces, and so on. Further, in examples, a model is a wire-frame model that includes one or more vertices, edges, and so on. However, other types of models can be implemented. Further, the representation generation component 115 can generate other types of n-dimensional representations, such as an n-dimensional map. An example process of generating an n-dimensional point representation is illustrated and discussed in reference to FIG. 2. - In some instances, the
representation generation component 115 generates multiple models/representations for different sets of points within data. For example, the pattern recognition algorithm 117 can identify different sets of points, and the representation generation component 115 can generate a model for each set of points, resulting in multiple models within a coordinate space/system. Further, as noted above, in some instances, the representation generation component 115 processes data that is selected by the data selection component 114. For example, the representation generation component 115 can generate an n-dimensional representation for a particular portion of data that is selected by the data selection component 114. Moreover, in some instances, the data 150 includes a plurality of units of data, such as a plurality of files, and the representation generation component 115 generates an n-dimensional representation for each of the units of data. - An n-dimensional representation, such as the n-
dimensional representation 151 or the n-dimensional representation 152, may include a variety of representations, such as an n-dimensional point cloud or other plurality of points, an n-dimensional map, an n-dimensional model (e.g., mesh model, wireframe model, etc.), and so on. The term "n" may represent any integer. In some embodiments, an n-dimensional representation may include surfaces. In some embodiments, an n-dimensional representation may be visualized by a human, while in other embodiments an n-dimensional representation may not be able to be visualized by a human. In some embodiments, data representing an n-dimensional representation (e.g., coordinates of points, surfaces, etc.) may be stored in an array, matrix, list, or any other data structure. In some instances, an n-dimensional representation is stored as a data signature 118 in a data signature(s) data store. For example, a data signature for a piece of data can be points or one or more models for the piece of data generated by the representation generation component 115. - An n-dimensional representation may be represented within a coordinate system. A coordinate system may include a number line, a cartesian coordinate system, a polar coordinate system, a homogeneous coordinate system, a cylindrical or spherical coordinate system, etc. As noted above, although many examples are discussed herein in the context of two- or three-dimensional representations represented in two- or three-dimensional coordinate systems, the techniques and architectures may generate a representation of any number of dimensions and/or a representation may be represented in any type of coordinate system.
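The byte-to-coordinate conversion and the distance-based grouping of points described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names, the three-byte group size, and the 16-unit grouping distance are assumptions chosen for the example.

```python
import math


def bytes_to_points(data: bytes, dims: int = 3) -> list[tuple[int, ...]]:
    """Parse data in groups of `dims` bytes; each byte in a group becomes
    one coordinate value from 0 to 255, yielding one point per group.
    Trailing bytes that do not fill a complete group are ignored."""
    return [tuple(data[i:i + dims])
            for i in range(0, len(data) - dims + 1, dims)]


def group_nearby_points(points, max_dist: float = 16.0):
    """Greedy grouping sketch: a point joins the first existing group that
    has a member within `max_dist`; otherwise it starts a new group. A
    fuller pattern-recognition step might also merge groups and test
    whether a group's points lie on a common surface/plane."""
    groups: list[list[tuple[int, ...]]] = []
    for p in points:
        for g in groups:
            if any(math.dist(p, q) <= max_dist for q in g):
                g.append(p)
                break
        else:
            groups.append([p])
    return groups
```

For instance, `bytes_to_points(b"\x01\x02\x03\x04\x05\x06")` yields the two points (1, 2, 3) and (4, 5, 6), matching the three-bytes-per-point scheme of FIG. 2.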
- In some embodiments, the
representation generation component 115 generates multiple representations for the same data (e.g., a unit of data, such as a file). In some examples, the representation generation component 115 may generate a two-dimensional representation for data and generate a three-dimensional representation for the same data. Further, in some examples, the representation generation component 115 may generate a three-dimensional representation for data using a process that represents three bytes of continuous bits as an x-coordinate, a y-coordinate, and a z-coordinate, in that order. The representation generation component 115 may also generate a three-dimensional representation for the same data using a process that represents three bytes of continuous bits as a y-coordinate, a z-coordinate, and an x-coordinate, in that order. In any event, representing data with multiple representations may be useful to provide multiple layers of evaluation of the data (e.g., when evaluating the data with the representation analysis component 116 to detect any threats). As such, the representation generation component 115 may generate multiple representations for data using different coordinate systems and/or different manners of processing the data. - In some embodiments, the
representation generation component 115 and/or the representation analysis component 116 processes a portion of data while refraining from processing another portion of the data (or at least initially refraining from processing the other portion). For example, the representation generation component 115 may process a predetermined number of bytes of each file, such as a first 1500 bytes of each file, a second 1500 bytes of each file, or a last 1500 bytes of each file, to generate an n-dimensional representation for the file. In some embodiments, an initial portion of data (e.g., a file) may include a header that designates execution points within the data. In cases where malware or other threats are associated with a header and/or execution points, which may frequently be the case, the representation generation component 115 may efficiently process data by generating an n-dimensional representation based on just the data within the header. In some instances, the representation generation component 115 processes just a portion of data that is selected by the data selection component 114. However, any portions of data may be processed. - Data, such as the
data 150, may be a variety of types of data, such as audio data, video data, text data (e.g., text files, email, etc.), binary data (e.g., binary files), image data, network traffic data (e.g., data protocol units exchanged over a network, such as segments, packets, frames, etc.), file system data (e.g., files), runtime data (e.g., data generated during runtime of an application, which may be stored in volatile memory), data stored in volatile memory, data stored in non-volatile memory, application data (e.g., executable data for one or more applications), data associated with an isolated environment (e.g., data generated or otherwise associated with a virtual machine, data generated or otherwise associated with a trusted execution environment, data generated or otherwise associated with an isolated cloud service, etc.), metadata, behavioral data (e.g., data describing behaviors taken by a program during runtime), location data (e.g., geographical/physical location data of a device, user, etc.), quality assurance data, financial data, financial analytics data, healthcare analytics data, and so on. Data may be formatted in a variety of manners and/or according to a variety of standards. In some examples, data includes a header, payload, and/or footer section. Data may include multiple pieces of data (e.g., multiple files or other units of data) or a single piece of data (e.g., a single file or another unit of data). In some embodiments, data includes non-image-based data, such as data that is not initially intended/formatted to be represented within a coordinate system (e.g., not stored in a format that is intended for display). In contrast, image-based data may generally be intended/formatted for display, such as images, 2D models, 3D models, point cloud data, and so on.
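The portion-selection strategies described earlier (a fixed leading window, such as the first 1500 bytes of a file, or an entropy-guided choice of the most random region) might be sketched as follows. This is a hypothetical illustration: the function names, the non-overlapping window scan, and the use of Shannon entropy measured in bits per byte are assumptions, not details prescribed by the embodiments.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte sequence, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(chunk).values())


def select_portion(data: bytes, window: int = 1500,
                   by_entropy: bool = False) -> bytes:
    """Select a fixed-size portion of `data`: either the leading `window`
    bytes (e.g., a header region) or, when `by_entropy` is set, the
    non-overlapping window with the highest Shannon entropy."""
    if len(data) <= window:
        return data
    if not by_entropy:
        return data[:window]
    best = max(range(0, len(data) - window + 1, window),
               key=lambda off: shannon_entropy(data[off:off + window]))
    return data[best:best + window]
```

A uniform byte distribution scores 8.0 bits per byte, while a run of a single repeated byte scores 0.0, so the entropy-guided mode tends to pick out compressed or encrypted regions.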
In some embodiments, a type of data may be defined by or based on a format of the data, a use of the data, an environment in which the data is stored or used (e.g., an operating system, device platform, etc.), a device that generated the data, a size of the data, an age of the data (e.g., when the data was created), and so on. - The representation analysis component 116 may be configured to analyze an n-dimensional representation, such as the n-
dimensional point representation 151 or the n-dimensional model representation 152. The representation analysis component 116 may generally use an analysis model(s) 119 stored in an analysis model data store. The analysis model(s) 119 can include one or more machine/human-trained models and/or other types of models, which can implement techniques/algorithms for detecting a threat(s). The one or more analysis models 119 may include models configured for different types of data, different coordinate systems, different types of n-dimensional representations, and so on. The representation analysis component 116 can use the one or more analysis models 119 to process an n-dimensional representation (generated by the representation generation component 115) to generate a confidence value/data indicating a likelihood that an n-dimensional representation includes malicious data. In examples, the representation analysis component 116 can determine if an n-dimensional representation includes malicious data (e.g., if a confidence value is above a threshold). - In some instances, the representation analysis component 116 is configured to compare an n-dimensional representation to one or more n-dimensional representations that have been tagged as malicious. For example, a 2D or 3D model for data can be compared to 2D or 3D models for malicious code to determine a similarity of the 2D or 3D data model to the 2D or 3D malicious code models. Here, the representation analysis component 116 can be configured to compare a similarity between surfaces, edges, volume, area, and/or any other characteristic of a model. The representation analysis component 116 can generate a confidence/similarity value indicating a similarity of the 2D or 3D data model to the 2D or 3D malicious data models.
- In some instances, the representation analysis component 116 includes an Artificial Intelligence (AI) component 120 configured to train a model to create a machine-trained model that is configured to analyze an n-dimensional representation to detect a threat. For example, the AI component 120 may analyze
training data 121 from a training data store that includes one or more n-dimensional representations that are tagged as being associated with a threat (e.g., malicious code) and/or one or more n-dimensional representations that are tagged as being threat free (e.g., not associated with a threat). An n-dimensional representation may be tagged (e.g., categorized) by a user and/or a system. The AI component 120 may analyze the training data 121 to generate one or more machine-trained models, such as one or more artificial neural networks or another Artificial Intelligence model. The AI component 120 may store the one or more machine-trained models within the data store for the analysis model(s) 119. - In some embodiments of training a model, the AI component 120 may learn one or more characteristics that are associated with an n-dimensional representation(s) of malicious data and train a machine-trained model to detect such one or more characteristics. For example, the AI component 120 may use pattern recognition, feature detection, shape/surface detection, and/or a spatial analysis to identify one or more characteristics and/or patterns of one or more characteristics.
In some embodiments, a characteristic may include: a spatial feature (e.g., a computer vision/image processing feature, such as edges, corners (interest points), blobs (regions of interest points), ridges, etc.), a feature of an n-dimensional representation, a marker of an n-dimensional representation, a number of models that may generally be associated with malicious data (e.g., an average/greatest/smallest number of models within a coordinate system for malicious data), a relationship between models (within a coordinate system) that are associated with malicious data (e.g., an average/longest/shortest distance between malicious data models), a shape of a model(s) that is associated with malicious data (e.g., a type of shape), a size of a model(s) that is associated with malicious data (e.g., an average/largest/smallest size of a model), a volume of a model(s) that is associated with malicious data (e.g., an average/largest/smallest volume of a model), an area of a model(s) that is associated with malicious data, a number of surfaces of a model(s) that is associated with malicious data (e.g., an average/largest/smallest number of surfaces for a model), a location of a model(s) that is associated with malicious data within a coordinate system (e.g., an average position), a position of a malicious model(s) relative to another malicious model within a coordinate system, a number of models within the coordinate space that are associated with malicious data (e.g., an average/largest/smallest number of models), a characteristic(s) of a particular type of malicious data, and so on. However, a characteristic may include any characteristic of an n-dimensional representation and/or coordinate system, whether visualizable/spatial or non-visualizable/non-spatial. Training a model may include machine learning or other AI techniques.
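As one deliberately simplified illustration of such characteristics, a feature vector over a set of grouped points might record the number of groups, the mean group size, and the bounding-box volume of all points. The specific features chosen here, and the three-dimensional assumption, are choices made for the sketch rather than characteristics prescribed by the embodiments.

```python
def representation_features(groups: list[list[tuple[int, int, int]]]) -> list[float]:
    """Toy feature vector for a 3D representation: [number of groups,
    mean points per group, bounding-box volume of all points]. Features
    of this kind could feed the model training described above."""
    points = [p for g in groups for p in g]
    if not points:
        return [0.0, 0.0, 0.0]
    volume = 1.0
    for axis in range(3):
        values = [p[axis] for p in points]
        volume *= max(values) - min(values)
    mean_size = sum(len(g) for g in groups) / len(groups)
    return [float(len(groups)), mean_size, volume]
```

A real system would likely use many more characteristics (surface counts, inter-model distances, shape descriptors, and so on), but the shape of the computation is the same: reduce a representation to numbers a trainable model can consume.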
- In some embodiments, the AI component 120 may train one or more models for different types of threats. For example, a model may be trained to detect/identify malware, a particular type of malware (e.g., a virus, spyware, ransomware, polymorphic malware, a particular type of virus, a particular type of spyware, a particular type of ransomware, a particular type of polymorphic malware, etc.), and so on. To illustrate, the AI component 120 may learn that a particular characteristic (e.g., feature) in an n-dimensional representation is associated with a virus or a particular type of virus and train a model to detect the particular characteristic and/or to identify the particular characteristic as being associated with the virus or the particular type of virus. In some embodiments, the AI component 120 may train a first model to detect/identify a first type of threat and train a second model to detect/identify a second type of threat.
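Dispatching an n-dimensional representation to several per-threat-type models, as described above, could be sketched as follows. The dictionary registry, the callable model interface, and the 0.5 default threshold are assumptions made for illustration.

```python
from typing import Callable, Mapping, Sequence

Point = tuple[int, int, int]
# A model is any callable scoring a representation with a confidence 0.0-1.0.
Model = Callable[[Sequence[Point]], float]


def detect_threat_types(points: Sequence[Point],
                        models: Mapping[str, Model],
                        threshold: float = 0.5) -> dict[str, float]:
    """Score a representation with one model per threat type and report
    every type whose confidence exceeds the threshold."""
    detected = {}
    for threat_type, model in models.items():
        confidence = model(points)
        if confidence > threshold:
            detected[threat_type] = confidence
    return detected
```

For example, registering one model for ransomware and one for spyware and scanning a representation would report only the types whose models are sufficiently confident.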
- The AI component 120 may be configured to process an n-dimensional representation with a machine-trained model(s) or any other model. For example, the AI component 120 may receive the n-
dimensional representation 151/152 from the representation generation component 115 and process the n-dimensional representation 151/152 with a machine-trained model to identify any threats associated with the n-dimensional representation 151/152. In some embodiments, the AI component 120 may identify a type of threat associated with an n-dimensional representation, such as malware, a particular type of malware (e.g., a virus, spyware, ransomware, polymorphic malware, a particular type of virus, a particular type of spyware, a particular type of ransomware, a particular type of polymorphic malware, etc.), and so on. In some embodiments, the processing includes pattern recognition, feature detection, and/or a spatial analysis, which may include identifying one or more characteristics (e.g., features) within an n-dimensional representation. - In some embodiments, the representation analysis component 116 may be configured to use different models to analyze one or more n-dimensional representations. In one example, the representation analysis component 116 may process an n-dimensional representation with a first model and process the n-dimensional representation with a second model. The representation analysis component 116 may detect a threat if either analysis detects a threat (e.g., either one of the confidence values is above a threshold). Further, in another example, the representation analysis component 116 can process an n-dimensional representation a first time with a first model. If a confidence value is within a range or otherwise satisfies one or more criteria of potentially being associated with malicious data, the representation analysis component 116 can process the n-dimensional representation (or a portion thereof) a second time with a second model. The representation analysis component 116 may detect a threat if a confidence value from the second model satisfies one or more criteria (e.g., is above a threshold).
In some instances, the second model may require more (or less) computational resources, time, etc. As such, in some cases, the representation analysis component 116 can use a multiple layered approach to process an n-dimensional representation(s), wherein each layer can be associated with a different model. In some instances of processing a same n-dimensional representation multiple times, the representation analysis component 116 may provide more accurate results regarding any potential threats. However, processing an n-dimensional representation once may be sufficient or just as accurate in many instances.
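The layered evaluation just described, in which a cheap first-pass model's ambiguous scores are escalated to a more expensive second model, might look like the following sketch; the ambiguity band and decision threshold are illustrative assumptions.

```python
def layered_scan(points, fast_model, deep_model,
                 review_band=(0.3, 0.7), threshold=0.5) -> bool:
    """Two-layer analysis: score with the fast model first; only when
    the score falls inside the ambiguous band is the slower, more
    thorough model consulted, and its score then becomes decisive."""
    score = fast_model(points)
    low, high = review_band
    if low <= score <= high:
        score = deep_model(points)
    return score > threshold
```

This keeps the expensive model off the common path: clearly benign or clearly malicious representations are decided by the first layer alone.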
- A threat (sometimes referred to as “malicious data”) may include malware, phishing, a rootkit, a bootkit, a logic bomb, a backdoor, a screen scraper, a physical threat (e.g., an access point without security measures, such as leaving a door open, etc.), and so on. Malware may include a virus, spyware, adware, a worm, a Trojan horse, scareware, ransomware, polymorphic malware, and so on. In some embodiments, a threat may result from any data, software, or other component that has malicious intent.
- In some embodiments, the representation analysis component 116 may detect a physical threat associated with data. For example, the
representation generation component 115 may process data representing a physical environment, such as images of the interior or exterior of a building, and generate an n-dimensional representation for the data. The representation analysis component 116 may process the n-dimensional representation to identify a potential threat, such as an access point that may potentially be at risk of a break-in due to reduced security features at the access point. Furthermore, the representation analysis component 116 may be configured to detect a variety of other types of threats. - The representation analysis component 116 may be configured to provide a variety of types of output regarding processing of an n-dimensional representation. For example, based on processing an n-dimensional representation with the one or
more analysis models 119, the representation analysis component 116 may determine whether the n-dimensional representation is associated with any threats, the types of threats (if any), where a threat is located in the data, and a source of the threat (e.g., a content creator that generated the threat, an entity involved in distributing the threat, etc.). In some embodiments, the representation analysis component 116 may generate information (e.g., a report, notification, a threat rating, signal, etc.) indicating if a threat was detected, a type of threat that was detected, a confidence value of a detected threat (e.g., a rating on a scale of 1 to 10 of a confidence that data includes a threat, with 10 (or 1) being the highest confidence that the data includes a threat), where a threat is located in data, a source of a threat, and so on. In some examples, the representation analysis component 116 may provide the information to the client device 130 (e.g., in a message), which may display the information via a user interface and/or in another manner. A user may view information provided via the user interface and/or cause an operation to be performed, such as having a threat removed from the data, replacing the malicious data with other data, preventing a threat from further corrupting the data, preventing a threat from being stored with the data, and so on. Further, in some examples, the representation analysis component 116 may provide the information to another device/system and/or cause an operation to be performed automatically to address any threats. - As noted above, the
data selection component 114, representation generation component 115, and/or representation analysis component 116 can be implemented in a variety of contexts across a variety of devices/systems. For example, one or more of the data selection component 114, representation generation component 115, and representation analysis component 116 may be implemented at the service provider 110, network device 145, and/or client device 130. In some illustrations, one or more instances of the data selection component 114, representation generation component 115, and/or representation analysis component 116 are implemented at one or more of the service provider 110, network device 145, and the client device 130. Further, as also noted above, the service provider 110 can include one or more service providers implemented as one or more computing devices, which may collectively or individually implement the data selection component 114, representation generation component 115, and/or representation analysis component 116. As such, the functionality of the data selection component 114, representation generation component 115, and the representation analysis component 116 may be divided in a variety of manners across a variety of different devices/systems/components, which may or may not operate in cooperation to evaluate data. - The
data selection component 114, representation generation component 115, and/or representation analysis component 116 may be configured to evaluate data at any time. In one example, an evaluation of data is performed in response to a request by the client device 130, such as a user providing input through the client device 130 to analyze data. For instance, a user (not illustrated) may employ the client device 130 to initiate an evaluation of data and the service provider 110 may provide a message back to the client device 130 regarding the evaluation, such as information indicating whether or not a threat was detected, a type of threat detected, and so on. A user can include an end-user, an administrator (e.g., an Information Technology (IT) individual), or any other individual. In another example, an evaluation of data is performed periodically and/or in response to a non-user-based request received by the client device 130, service provider 110, network device 145, and/or another device. In yet another example, an evaluation of data is performed when data is received/sent/downloaded. - The one or
more network interfaces 113 may be configured to communicate with one or more devices over a communication network. For example, the one or more network interfaces 113 may send/receive data in a wireless or wired manner over one or more networks 140, which can include one or more personal area networks (PANs), local area networks (LANs), wide area networks (WANs), Internet area networks (IANs), cellular networks, the Internet, etc. In some embodiments, the one or more network interfaces 113 may implement a wireless technology such as Bluetooth, Wi-Fi, near field communication (NFC), or the like. - The data store for the
training data 121, analysis model(s) 119, and/or data signature(s) 118 may be associated with any entity and/or located at any location. In some examples, a data store is associated with a first entity (e.g., company, environment, etc.) and the service provider 110/network device(s) 145/client device 130 is associated with a second entity that provides a service to evaluate data. For instance, a data store may be implemented in a cloud environment or locally at a facility to store a variety of forms of data and the service provider 110 may evaluate the data to provide information regarding security of the data, such as whether or not the data includes malicious data. In some examples, a data store and the service provider 110/network device(s) 145/client device 130 are associated with a same entity and/or located at a same location. As such, although various data stores are illustrated in the example of FIG. 1 as being located within the memory 112, in some examples a data store may be included within another device/system. -
FIG. 2 illustrates an example process of converting data to an n-dimensional point representation in accordance with one or more embodiments. In this example, control circuitry, such as the control circuitry 111 from FIG. 1, may process data 202 at a bit/byte level to generate an n-dimensional point representation for the data 202. In particular, the control circuitry processes the data 202 in groups of bits, with each group of bits being converted to coordinates for a point. For example, the control circuitry may identify a first group of bits 206 that includes three bytes of data, with each byte of data corresponding to a set of bits. As shown, the group of bits 206 includes a set of bits 210 (i.e., a first byte), a set of bits 212 (i.e., a second byte), and a set of bits 214 (i.e., a third byte). The set of bits 210 is directly adjacent to the set of bits 212 and the set of bits 212 is directly adjacent to the set of bits 214. In this example, the control circuitry converts the set of bits 210 to an x-coordinate value (illustrated as "X1"), the set of bits 212 to a y-coordinate value (illustrated as "Y1"), and the set of bits 214 to a z-coordinate value (illustrated as "Z1"). The control circuitry may use the coordinate values to produce a point 222 within a coordinate system 204 (e.g., position the point 222), as shown. - Similarly, the control circuitry may identify a second group of
bits 208 that includes three bytes of data, with each byte of data corresponding to a set of bits. As shown, the group of bits 208 includes a set of bits 216 (i.e., a first byte), a set of bits 218 (i.e., a second byte), and a set of bits 220 (i.e., a third byte). The set of bits 216 is directly adjacent to the set of bits 218, and the set of bits 218 is directly adjacent to the set of bits 220. In this example, the control circuitry may convert the set of bits 216 to an x-coordinate value (illustrated as “X2”), the set of bits 218 to a y-coordinate value (illustrated as “Y2”), and the set of bits 220 to a z-coordinate value (illustrated as “Z2”). The control circuitry may use the coordinate values to create a point 224 within the coordinate system 204, as shown. The control circuitry can proceed to process any number of bits (e.g., groups of bits) in the data 202 in a similar fashion to produce any number of points within the coordinate system 204. - For ease of illustration, the n-dimensional representation of
FIG. 2 is illustrated with two points; however, the n-dimensional representation may include any number of points, such as hundreds or thousands of points. Further, although the n-dimensional representation of FIG. 2 is illustrated with points, as noted above an n-dimensional representation may include other representations. - In the example of
FIG. 2, the data 202 represents a unit of data, such as a file, network traffic unit, etc. In many examples, the control circuitry may perform a similar process for any number of units of data (e.g., any number of files) to generate any number of n-dimensional representations. Although the example of FIG. 2 is illustrated in the context of three bytes representing a group of bits and one byte representing a set of bits, a group of bits and/or a set of bits may include any number of bits or bytes. To illustrate, a group of bits may include two bytes of data (or an arbitrary number of bits, such as ten bits), with each byte (or set of five bits) being converted to a coordinate for a two-dimensional coordinate system. Moreover, in some examples control circuitry may process data in other manners, such as by converting the first 200 bytes of data to x-coordinates, the second 200 bytes of data to y-coordinates, and the third 200 bytes of data to z-coordinates. Furthermore, control circuitry may process data in a variety of other manners. -
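The grouping scheme of FIG. 2 can be sketched in a few lines of Python. This is an illustrative reading only (one byte per coordinate, non-overlapping groups, any trailing partial group discarded); the function name is not from the specification.

```python
def bytes_to_points(data: bytes, dims: int = 3):
    """Convert raw bytes to n-dimensional points: each group of `dims`
    consecutive bytes becomes one point, with each byte supplying one
    coordinate value in the range 0-255."""
    points = []
    for i in range(0, len(data) - dims + 1, dims):
        points.append(tuple(data[i:i + dims]))  # e.g., (X1, Y1, Z1)
    return points

# Mirroring FIG. 2: the first three bytes give point 222 (X1, Y1, Z1)
# and the next three bytes give point 224 (X2, Y2, Z2).
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03, 0x00])
print(bytes_to_points(sample))  # [(77, 90, 144), (0, 3, 0)]
```

The two-dimensional variant mentioned above follows from the same function with dims=2.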
FIGS. 3, 4, 5, 6A-6B, 7, and 8 illustrate example processes 300, 400, 500, 600, 700, and 800, respectively, in accordance with one or more embodiments. For ease of illustration, the processes 300, 400, 500, 600, 700, and 800 may be described as being performed in the example architecture 100 of FIG. 1. For example, one or more of the individual operations of the processes 300, 400, 500, 600, 700, and 800 may be performed by the control circuitry 111. However, the processes 300, 400, 500, 600, 700, and 800 may be performed in other architectures, and the architecture 100 may be used to perform other processes. - The
processes 300, 400, 500, 600, 700, and 800 -
FIG. 3 illustrates the example process 300 to train an analysis model in accordance with one or more embodiments. - At 302, one or more first n-dimensional representations that are tagged as being associated with one or more target properties (e.g., threats) may be obtained. For example, the
control circuitry 111 can receive training data (e.g., one or more n-dimensional representations) that has been tagged as being associated with malware. The training data may have been tagged by a user, a system, or another entity. In some embodiments where the training data includes one or more n-dimensional representations, the one or more n-dimensional representations may have been generated by the control circuitry 111 by processing data at a bit or byte level, similar to various processes described herein. - At 304, one or more second n-dimensional representations that are tagged as being free of certain target properties may be obtained. For example, the
control circuitry 111 may retrieve training data (e.g., one or more n-dimensional representations) that has been tagged as being free of malicious data (e.g., not associated with malware). The training data may have been tagged by a user, a system, or another entity. In some embodiments where the training data includes one or more n-dimensional representations, the one or more n-dimensional representations may have been generated by the control circuitry 111 by processing data at a bit or byte level, similar to various processes described herein. - At 306, machine learning can be used to train a model based at least in part on the one or more first n-dimensional representations and/or the one or more second n-dimensional representations. For example, the
control circuitry 111 may analyze the training data that is tagged as being associated with malware and/or the training data that is tagged as being malware free and learn which information (e.g., features) is associated with malware/malicious data. By performing the training, the control circuitry 111 may create a machine-trained model that is configured to detect threats/malicious data, identify types of threats/malicious data, identify sources of threats/malicious data, identify a portion(s) of data that is associated with threats/malicious data (e.g., a portion of data to analyze), and so on. -
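The specification does not fix a particular learning algorithm for operation 306. As a hedged sketch only, a nearest-centroid classifier can stand in for the machine-trained model, with per-axis coordinate means as illustrative features; all names here are hypothetical and not from the specification.

```python
from statistics import mean

def features(points):
    """Illustrative feature vector for a point representation:
    the per-axis mean of the coordinates."""
    return tuple(mean(p[axis] for p in points) for axis in range(len(points[0])))

def train(malicious_reps, benign_reps):
    """Operations 302-306 in miniature: compute one centroid per
    tagged training class and return the centroids as the 'model'."""
    def centroid(reps):
        vectors = [features(r) for r in reps]
        return tuple(mean(column) for column in zip(*vectors))
    return {"malicious": centroid(malicious_reps), "benign": centroid(benign_reps)}

def classify(model, points):
    """Label new data by its nearest class centroid (squared distance)."""
    f = features(points)
    return min(model, key=lambda label: sum((a - b) ** 2 for a, b in zip(f, model[label])))

model = train([[(200, 210, 220), (190, 205, 230)]],  # tagged as malware (302)
              [[(10, 20, 30), (15, 25, 35)]])        # tagged as malware free (304)
print(classify(model, [(195, 200, 225)]))  # malicious
```

A production model would of course use richer features and a stronger learner (e.g., a neural network, as the specification suggests elsewhere); the point here is only the train-on-tagged-representations flow.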
FIG. 4 illustrates the example process 400 to produce one or more n-dimensional representations in accordance with one or more embodiments. - At 402, data can be obtained from a data source. For example, the
control circuitry 111 can receive or retrieve data from another device/system/component. The data can comprise a variety of types of data, such as file system data, non-image-based data, network traffic data, runtime data, data associated with an isolated environment, or any other data. - At 404, a portion of the data can be selected. For example, the
control circuitry 111 can select a particular portion of the data, such as a particular number of bits/bytes. In some instances, the selection is based on a type of the data (e.g., a format of the data, a use of the data, an environment in which the data is stored or used, a device that generated the data, a size of the data, an age of the data, and so on), entropy data for the data indicating a randomness of one or more portions of the data, and so on. - At 406, the selected portion of the data may be extracted. For example, the
control circuitry 111 may extract a first portion of the data, such as a first predetermined number of bytes of the data, and/or refrain from extracting a second portion of the data. As such, the control circuitry 111 may determine to represent the data with a particular portion of the data. - Although the
operations 404 and 406 are illustrated as part of the example process 400, in some embodiments the operations 404 and/or 406 (as well as any other operation) may not be performed. - At 408, a group of bits in the data may be identified. For example, the
control circuitry 111 may identify three bytes in the data as representing a group of bits. In some embodiments, when a portion of the data has been extracted at the operation 406, the control circuitry 111 may initially identify a group of bits at a start of the portion of the data. - At 410, one or more coordinates for a point may be determined based at least in part on one or more sets of bits in the data. For example, the
control circuitry 111 may determine a first coordinate for a point based at least in part on a first set of bits in a group of bits, a second coordinate for the point based at least in part on a second set of bits in the group of bits, a third coordinate for the point based at least in part on a third set of bits in the group of bits, and so on. In some embodiments, the first set of bits comprises a first byte, the second set of bits comprises a second byte that is directly adjacent to the first byte, and/or the third set of bits comprises a third byte that is directly adjacent to the second byte. However, the sets of bits may not be directly adjacent to each other. As such, the control circuitry 111 may represent a set of bits as a coordinate for a point. - At 412, it may be determined if groups of bits in the data (e.g., all groups) are processed or a limit is met. For example, if the
control circuitry 111 has extracted a portion of the data for processing, such as a header of a file, the control circuitry 111 may determine if another group of bits exists in the portion of the data (e.g., if there exists another group of bits that has not yet been converted to a point). That is, the control circuitry 111 may determine if it has reached an end of the data (or portion of the data). Additionally, or alternatively, if a limit is set so that the control circuitry 111 is configured to process a particular/selected number of bits or bytes (e.g., the first 1500 bytes of data), the control circuitry 111 may determine if the limit is reached (e.g., the control circuitry 111 has processed the first 1500 bytes of data). - If the groups of bits in the data that are designated to be processed are processed and/or the limit is reached, the
process 400 may proceed to operation 416 (i.e., the “YES” branch). Alternatively, if the groups of bits in the data that are designated to be processed are not yet processed and/or the limit is not reached, the process 400 may proceed to operation 414 (i.e., the “NO” branch). - At 414, a next group of bits in the data may be designated for processing. For example, the
control circuitry 111 may increment to a next group of bits in the data, and then proceed to the operation 408 to identify the next group of bits in the data and to the operation 410 to determine one or more coordinates for the next group of bits. The process 400 may loop through operations 408, 410, 412, and 414 until the designated groups of bits are processed or the limit is met. - At 416, an n-dimensional representation for the data may be generated based at least in part on one or more points. For example, the
control circuitry 111 may use one or more coordinates for each point to generate an n-dimensional representation for the data (or selected portion of the data). The n-dimensional representation may include an n-dimensional point representation (e.g., a plurality of points), an n-dimensional model representation (e.g., mesh, wireframe), an n-dimensional map, and so on. - At 418, the n-dimensional representation may be provided for processing. For example, a component of a device/system may provide the n-dimensional representation to another component of the device/system for processing with an analysis model. The
control circuitry 111 may cause the n-dimensional representation to be processed with an analysis model that is configured to detect a threat. - In some embodiments, data includes multiple pieces of data (e.g., multiple files) and the
process 400 is performed for each piece of data. Further, in some embodiments, the process 400 is performed multiple times for the same data to generate different types of n-dimensional representations for the data. -
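Operations 404-416 above can be sketched as a single loop. The 1500-byte limit is taken from the example at operation 412, and the grouping assumptions follow FIG. 2; the function name is illustrative only.

```python
def representation(data: bytes, limit: int = 1500, group_size: int = 3):
    """Sketch of process 400: extract a portion (406), then identify
    groups (408), convert each group to coordinates (410), and repeat
    (412/414) until the portion is exhausted or the limit is met (416)."""
    portion = data[:limit]                    # operations 404/406: select and extract
    points = []
    i = 0
    while i + group_size <= len(portion):     # operation 412: more groups / limit check
        group = portion[i:i + group_size]     # operation 408: identify a group of bits
        points.append(tuple(group))           # operation 410: bytes -> coordinates
        i += group_size                       # operation 414: next group
    return points                             # operation 416: n-dimensional representation

print(len(representation(bytes(2560))))  # 500 points from the first 1500 bytes
```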
FIG. 5 illustrates the example process 500 to process an n-dimensional representation using an analysis model in accordance with one or more embodiments. - At 502, an n-dimensional representation may be processed using an analysis model. For example, the
control circuitry 111 may process an n-dimensional representation using a machine-trained model (e.g., an artificial neural network), a shape comparison model, and/or another model (e.g., a human-trained model). In some examples, the control circuitry 111 may seek to identify information or features within the n-dimensional representation that are associated with one or more threats. Further, in some examples, the control circuitry 111 may compare the n-dimensional representation to n-dimensional representations associated with threats. In many instances, the control circuitry 111 may determine a confidence value/data indicating a likelihood that the n-dimensional representation is associated with a threat. In some embodiments, the control circuitry 111 may process the n-dimensional representation multiple times using different models and/or process various n-dimensional representations within a coordinate system. - At 504, it may be determined if the n-dimensional representation is associated with a target property. For example, the
control circuitry 111 may determine if a confidence value/data regarding a threat is greater than a threshold. - If the n-dimensional representation is associated with a target property, the
process 500 may proceed to operation 506 (i.e., the “YES” branch). Alternatively, if the n-dimensional representation is not associated with any target properties, the process 500 may proceed to operation 508 (i.e., the “NO” branch). - At 506, an operation may be performed to address the target property. For example, the
control circuitry 111 may perform or cause to be performed a threat operation that includes removing a threat, replacing a threat (e.g., malicious data), preventing a threat from associating with data, providing information (e.g., a notification, a report, a malware rating indicating a likelihood that the data is associated with malware, etc.) to a client device regarding the threat, and so on. - At 508, information may be provided regarding the processing. For example, the control circuitry can provide information indicating that no threats are associated with the n-dimensional representation, information indicating a confidence value for the processing, and so on. The information can be provided in a report, notification, message, signal, etc. to a client device and/or another system/component. In some instances, the
operation 508 is additionally, or alternatively, performed in the branch for operation 506 (e.g., to provide information regarding a detected threat). -
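The branch at operations 504-508 reduces to a threshold comparison on the confidence value. A minimal sketch follows; the threshold value and the returned labels are illustrative assumptions, not from the specification.

```python
THRESHOLD = 0.8  # illustrative cutoff; the actual value is implementation-specific

def route(confidence: float) -> str:
    """Operation 504: compare the analysis model's confidence value
    against a threshold, then branch to the threat operation (506)
    or to reporting (508)."""
    if confidence > THRESHOLD:
        return "handle_threat"  # operation 506: remove/replace/report the threat
    return "report_clean"       # operation 508: provide processing information

print(route(0.95), route(0.20))  # handle_threat report_clean
```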
FIGS. 6A-6B illustrate the example process 600 to generate one or more n-dimensional representations and analyze the one or more n-dimensional representations in accordance with one or more embodiments. - In
FIG. 6A, at 602, data can be received. For example, the control circuitry 111 can receive data from another device, system, component, and so on. In some instances, the data is retrieved from a data store, such as a data store associated with the control circuitry 111 and/or another system. - At 604, one or more portions of the data can be selected for processing. For example, the
control circuitry 111 can select a particular number of bits/bytes of the data at a particular location within the data. In some instances, the selection is based on a type of the data (e.g., a format of the data, a use of the data, an environment in which the data is stored or used, a device that generated the data, a size of the data, an age of the data, and so on), entropy data for the data indicating a randomness of one or more portions of the data, and so on. To illustrate, the control circuitry 111 can analyze the data using an entropy algorithm to generate entropy data indicating a randomness of one or more portions of the data. Based on the entropy data, the control circuitry 111 can select a particular portion of the data for processing, such as a portion that is associated with a most/least amount of randomness, a portion that is associated with a randomness value that is above/below a threshold, and so on. - At 606, at least a portion of the data can be represented as a plurality of points. For example, the
control circuitry 111 can represent a portion of the data that is selected at 604 as a plurality of points within a coordinate system/space. To illustrate, the control circuitry 111 can represent a first set of bits in the data as a first coordinate for a first point and a second set of bits in the data as a second coordinate for the first point. The second set of bits can be adjacent to the first set of bits (e.g., directly adjacent or within a particular number of bits). Similarly, the control circuitry 111 can represent a third set of bits in the data as a first coordinate for a second point and a fourth set of bits in the data as a second coordinate for the second point. The fourth set of bits can be adjacent to the third set of bits. - At 608, one or more points can be associated with an indicator indicating a location of bits associated with the one or more points within the data. For example, the
control circuitry 111 can determine a location of bits (associated with a point) within the data and associate a location indicator with the point. In some instances, the location indicator can be visually represented within a coordinate space and/or used to generate a visual representation for a surface of a model, such as a color, contrast, brightness, and so on. This can allow a user to view a location of a point and/or a location of surfaces/points for a model associated with the point. To illustrate, one or more of the plurality of points within the coordinate system can be associated with a color, wherein each color can be associated with a different section within the data (e.g., a header, a body, a footer, and so on). A user may be able to view the coloring of the plurality of points to identify where the points are located within the data. In other instances, the location indicator may not be visually represented and/or may be used by a system/component in another manner. - At 610, a set of points in the plurality of points can be identified. For example, the
control circuitry 111 can analyze the plurality of points using a pattern recognition algorithm to identify points that are within a particular distance from each other, positioned on a virtual surface/plane (e.g., aligned to form a substantially planar surface), and/or otherwise include characteristics that may indicate that the set of points are positioned within some type of pattern that may be used to form a surface/edge. - At 612, an n-dimensional representation can be generated. In some examples, the
control circuitry 111 can generate an n-dimensional representation for the set of points that are identified at 610. Alternatively, or additionally, the control circuitry 111 can generate an n-dimensional representation for any number of points within the plurality of points, such as all points within the plurality of points, a predetermined number of points within the plurality of points, and so on. An n-dimensional representation can include an n-dimensional point representation (e.g., the plurality of points), an n-dimensional model representation (e.g., a mesh model, a wireframe model, etc.), an n-dimensional map, and so on. An n-dimensional representation can have any number of dimensions, such as two, three, four, five, etc. As noted above, each dimension of a representation can refer to a characteristic/input of data. For instance, a five-dimensional representation for data can include three dimensions that represent a 3D model within a 3D space, one dimension that represents a type of the data (e.g., network traffic, file system data, etc.), and another dimension that represents metadata associated with the data. - At 614, it may be determined if an additional set of points are included within the data. For example, the
control circuitry 111 can perform an additional analysis on the plurality of points (points other than the first set of points) using a pattern recognition algorithm to determine if there is an additional set of points associated with particular characteristics (e.g., a particular pattern). - If an additional set of points is not included within the data, the
process 600 may proceed to operation 616 (i.e., the “NO” branch). Alternatively, if an additional set of points is included within the data, the process 600 may return to operation 612 (i.e., the “YES” branch) to generate an n-dimensional representation for the additional set of points. The process 600 may perform the operations 612 and 614 any number of times. - At 616, the one or more n-dimensional representations can be associated with the data as a data signature. For example, the
control circuitry 111 can generate a data signature for the data and associate, with the data signature, any number of n-dimensional representations that have been generated for the data (e.g., any number of models that are generated for the plurality of points within the coordinate space). - In
FIG. 6B, at 618, a first analysis model can be selected. For example, the control circuitry 111 can select a first analysis model based on a type of the data, where the portion of the data selected at 604 is located within the data, entropy data indicating a randomness of at least a portion of the data (e.g., the data as a whole, the portion of the data selected at 604 for processing, etc.), and so on. - At 620, one or more n-dimensional representations for the data can be analyzed based on the first analysis model. For example, the
control circuitry 111 can use the first analysis model to analyze one or more n-dimensional representations associated with the plurality of points for the coordinate space. In one illustration, the control circuitry 111 can compare a 2D or 3D model representing one or more portions of the data to one or more 2D or 3D models that are tagged/classified as being associated with a certain target property (e.g., malicious data). The control circuitry 111 can determine a similarity between the 2D/3D data model and the 2D/3D malicious data model. In another illustration, the control circuitry 111 can use a machine-trained model to (i) analyze/determine one or more characteristics/features of an n-dimensional representation (e.g., a point representation, a model representation, a map representation, etc.) and/or (ii) determine if those one or more characteristics/features are associated with a target property. - At 622, a likelihood that the one or more n-dimensional representations include a target property can be determined and/or analysis data can be generated indicating the likelihood. For example, the
control circuitry 111 can determine a likelihood that an n-dimensional representation includes malicious data based on the analysis at 620. The control circuitry 111 can generate analysis data indicating the likelihood, such as a confidence value/data. As noted above, a target property can refer to/include malicious behavior (e.g., malicious data intended to damage an environment/system/device), benign behavior (e.g., data/behavior that is not malicious), a vulnerability (e.g., vulnerability data that may make an environment/system/device vulnerable to an attack), or any other security-related characteristic. - The
operation 620 and/or the operation 622 can be based on one or more characteristics of one or more n-dimensional representations within the coordinate system. For example, the control circuitry 111 can determine a shape of a model(s) within the coordinate system, a size of a model(s), a volume of a model(s), an area of a model(s), a number of surfaces of a model(s), a location of a model(s) within the coordinate system, a position of a model relative to another model within the coordinate system, a number of models generated for the coordinate system, a location indicator for a point, an amount or location of empty space within a coordinate system, and so on. - At 624, it may be determined if an additional analysis should be performed. In some examples, the
control circuitry 111 can determine if a confidence value that the one or more n-dimensional representations include a target property (e.g., associated with/indicating a threat, disruption, nuisance, etc.) is within a range of values, is greater than a threshold, is less than a threshold, or otherwise satisfies one or more criteria. Further, in some examples, the control circuitry 111 can process the data a predetermined number of times, and the control circuitry 111 can determine if the predetermined number of times has been reached. - If it is determined to perform an additional analysis, the
process 600 may proceed to operation 626 (i.e., the “YES” branch). Alternatively, if it is determined to not perform an additional analysis, the process 600 may proceed to operation 632 (i.e., the “NO” branch). - At 626, a second analysis model and/or an additional portion of the data can be selected. For example, the
control circuitry 111 can select a second analysis model based on a type of the data, where the portion of the data selected at 604 is located within the data, entropy data indicating a randomness of at least a portion of the data (e.g., the data as a whole, the portion of the data selected at 604 for processing, etc.), a confidence value generated from a previous analysis, a type of the first analysis model, and so on. In some instances, the second analysis model is different from the first analysis model. However, the second analysis model can be the same as the first analysis model. Further, the control circuitry 111 can select a different portion of the data (also referred to as “the second portion of the data”) for analysis at operation 628. - At 628, one or more n-dimensional representations for the data can be analyzed. In some examples, the
control circuitry 111 can use the second analysis model to analyze one or more previously generated n-dimensional representations (e.g., a second time). Further, in some examples, the control circuitry 111 can generate one or more n-dimensional representations for the second portion of the data (in a similar fashion as that discussed for one or more of operations 604-616) and use the second analysis model to analyze the one or more n-dimensional representations. - At 630, a likelihood that the one or more n-dimensional representations include a target property can be determined and/or analysis data can be generated indicating the likelihood. For example, the
control circuitry 111 can determine, based on the analysis at 628, a likelihood that an n-dimensional representation includes malicious data (e.g., malware, such as a virus, spyware, ransomware, polymorphic malware, etc.). The control circuitry 111 can generate analysis data indicating the likelihood, such as a confidence value/data. - The
control circuitry 111 can perform the operations 628 and 630 in a similar fashion as the operations 620 and 622. - The
control circuitry 111 can perform the operations 626, 628, and 630 any number of times. Although the process 600 refers to a second analysis model at operations 626 and 628, the process 600 can implement a third analysis model and/or a third portion of the data (when the operations 626, 628, and 630 are performed again), and so on. - In some instances, the
control circuitry 111 implements a multilayered approach, wherein each iteration (also referred to as a “layer”) through the operations 626, 628, and 630 uses a different analysis model. For example, each iteration through the operation 628 can process data with an analysis model that requires more computational resources (in comparison to a previously implemented analysis model), requires more computational time, provides more accurate results, and so on. In one example, an additional analysis model (e.g., second, third, fourth, etc. model) includes a machine-trained model, such as a neural network. However, each iteration through the operation 628 can process data with any type of analysis model. - Further, in some instances, a multilayered approach can include selecting a larger portion (or smaller, in some cases) of the data and/or a different portion of the data with each iteration through the
operations operation 620, a first portion of the data can be analyzed using an analysis model, wherein the first portion includes less than a threshold amount of bits/bytes and/or is associated with a particular portion/section of the data. If it is determined (at 624, for example) that the confidence value for this initial analysis is above a first threshold, and/or the confidence value is below a second threshold, theoperations operations operations operation 626, a second set of bits can be selected, wherein the second set of bits can include a group of bits that overlap with the first set of bits, such as a predetermined number of bits/bytes.Operations - In some examples, composite analysis data can be generated based on multiple analyses of the data. For instance, a first confidence value can be generated based on an analysis of the data a first time (e.g., at
operations 620 and 622) and a second confidence value can be generated based on an analysis of the data a second time (e.g., at operations 628 and 630). The first/second confidence value can indicate a likelihood that the data includes malicious data. A composite confidence value can then be generated based on the first confidence value and the second confidence value, such as by using an equation/algorithm (which can include applying a weighting to a confidence value(s), such as by weighting the second (or first) confidence value higher), and so on. - At 632, the analysis data can be provided. For example, the
control circuitry 111 can provide the analysis data generated at 622 and/or 630 to a device, system, and/or component. In some instances, the analysis data is provided as interface data, which can be output (e.g., displayed) via a user interface. Further, in some instances, the analysis data is provided as a message or signal, which can cause additional processing to be performed, such as removing/replacing a threat (e.g., malicious data), preventing a threat from associating with data, providing information, and so on. -
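The composite confidence value described above can be sketched as a weighted average. The specific weighting (favoring the second analysis) is an assumption here, since the specification only says a weighting can be applied to either confidence value.

```python
def composite_confidence(first: float, second: float, second_weight: float = 0.7) -> float:
    """Combine two per-analysis confidence values into one composite
    value, weighting the second (typically more thorough) analysis higher."""
    return (1 - second_weight) * first + second_weight * second

print(round(composite_confidence(0.4, 0.9), 2))  # 0.75
```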
FIG. 7 illustrates the example process 700 to process analysis data regarding a target property and determine one or more characteristics of the target property (e.g., a threat) in accordance with one or more embodiments. - At 702, analysis data indicating a likelihood that data includes a target property can be received. For example, the
control circuitry 111 can obtain/retrieve analysis data regarding an analysis of data using one or more of the techniques discussed herein and/or other techniques. - At 704, it may be determined if the data includes a target property. For example, the
control circuitry 111 can determine if a confidence value/data included in or otherwise associated with the analysis data is greater than a threshold or otherwise satisfies one or more criteria. The confidence value can indicate a likelihood that one or more n-dimensional representations include a target property. The confidence value can be generated by an analysis model when processing the one or more n-dimensional representations. - If it is determined that the data includes a target property, the
process 700 may proceed to operation 706 (i.e., the “YES” branch). Alternatively, if it is determined that the data does not include a target property, the process 700 may proceed to operation 708 (i.e., the “NO” branch). - At 708, an indication can be provided that the data is free of the target property. For example, the
control circuitry 111 can generate information/signal/message indicating that the data is free of the target property. The control circuitry 111 can provide the information/signal/message to a system/device/component, which can use the information/signal/message in a variety of manners (including continuing with normal processing). - At 706, a type of the target property and/or a source of the target property can be determined. For example, the
control circuitry 111 or other control circuitry can determine through machine learning or other techniques one or more characteristics of an n-dimensional representation and/or coordinate system that are generally associated with particular types of target properties (e.g., threats) and/or sources of target properties (e.g., particular entities that create threats, particular entities that distribute threats, and so on). The control circuitry 111 can determine a similarity between one or more malicious data characteristics and one or more characteristics of an n-dimensional representation to determine a type of the target property and/or a source of the target property. For instance, the control circuitry 111 can determine a type of malicious data and/or a source of the malicious data based on a shape of an n-dimensional representation(s) (e.g., 2D/3D model, point cloud, n-dimensional map, etc.), a size of the n-dimensional representation(s), a volume of the n-dimensional representation(s), an area of the n-dimensional representation(s), a number of surfaces of the n-dimensional representation(s), a location of the n-dimensional representation(s) within a coordinate system, a position of the n-dimensional representation(s) within the coordinate system relative to another n-dimensional representation(s) (e.g., how close n-dimensional representations are to each other, a cluster of n-dimensional representations, etc.), a number of n-dimensional representations within the coordinate system that are associated with malicious data, a number of n-dimensional representations within the coordinate system (whether or not they are associated with malicious data), where data that is associated with a threat is located within a piece of data (e.g., where data used to generate a malicious model is located within a file or other data unit), an amount or location of empty space within a coordinate system, and so on. 
- At 710, information regarding the target property can be generated and/or provided. For example, the
control circuitry 111 can generate information indicating the type of threat and/or the source of the threat. The control circuitry 111 can provide the information to a system/device/component, such as by providing user interface data, a message/signal, and so on. - At 712, a portion of the data associated with the target property can be updated and/or the updated data can be sent to a component/system/device. For example, the
control circuitry 111 can determine that a threat is associated with a particular location in data (e.g., the threat is located within a header/footer/body, the threat is located within the first 1500 bytes/bits, the threat is located at bytes/bits 2500-3500, the threat is associated with a macro associated with a file, and so on). The control circuitry 111 can update a portion of the data that includes the threat, such as by removing the malicious data from the data, replacing the malicious data (e.g., with different data), and so on. Further, in some instances, the data can be associated with a notification/signal indicating that the data is associated with a target property, wherein such notification can be displayed or otherwise provided when the data (or a specific portion of the data that is associated with a threat) is presented, such as when the data is presented to a user. In one illustration, the control circuitry 111 can receive network data and process the network data to determine if the data is associated with a threat. If the data is associated with a threat, the control circuitry 111 can update the portion of the data in substantially real-time as the data is received, so that the data can be transmitted without a threat. This can allow a network transmission of the data to continue without interruption. -
FIG. 8 illustrates the example process 800 to generate one or more n-dimensional representations for data associated with one or more target properties in accordance with one or more embodiments. - At 802, data that is associated with one or more target properties can be represented as a plurality of points. The data may have been previously tagged or otherwise categorized as including the one or more target properties (e.g., malicious data). For example, the
control circuitry 111 can represent malicious data as a plurality of points within a coordinate system/space. To illustrate, the control circuitry 111 can represent a first set of bits in the malicious data as a first coordinate for a first point and a second set of bits in the malicious data as a second coordinate for the first point. The second set of bits can be adjacent to the first set of bits (e.g., directly adjacent or within a particular number of bits). Similarly, the control circuitry 111 can represent a third set of bits in the malicious data as a first coordinate for a second point and a fourth set of bits in the malicious data as a second coordinate for the second point. The fourth set of bits can be adjacent to the third set of bits. - At 804, a set of points in the plurality of points can be identified. For example, the
control circuitry 111 can analyze the plurality of points using a pattern recognition algorithm to identify points that are within a particular distance from each other, positioned on a virtual surface/plane (e.g., aligned to form a substantially planar surface), and/or otherwise include characteristics that may indicate that the set of points are positioned within some type of pattern that may be used to form a surface/edge. - At 806, an n-dimensional representation can be generated. In some examples, the
control circuitry 111 can generate an n-dimensional representation for the set of points that are identified at 804. Alternatively, or additionally, the control circuitry 111 can generate an n-dimensional representation for any number of points within the plurality of points, such as all points within the plurality of points, a predetermined number of points within the plurality of points, and so on. An n-dimensional representation can include an n-dimensional point representation (e.g., the plurality of points), an n-dimensional model representation (e.g., a mesh model, a wireframe model, etc.), an n-dimensional map, and so on. An n-dimensional representation can have any number of dimensions, such as two, three, four, five, etc. - At 808, it may be determined if an additional set of points is included within the data. For example, the
control circuitry 111 can perform an additional analysis on the plurality of points (points other than a first set of points) using a pattern recognition algorithm to determine if there is an additional set of points associated with one or more characteristics (e.g., a pattern). - If an additional set of points is not included within the data, the
process 800 may proceed to operation 810 (i.e., the “NO” branch). Alternatively, if an additional set of points is included within the data, the process 800 may return to operation 806 (i.e., the “YES” branch) to generate an n-dimensional representation for the additional set of points. The process 800 may perform operations 806 and 808 any number of times until no additional set of points is identified.
control circuitry 111 can generate a data signature for a threat and associate any number of n-dimensional representations that have been generated for the data (e.g., any number of models that are generated for the plurality of points within the coordinate space). The data signature can be stored in a data store. As such, a data signature can be associated with a target property and used to analyze other data to determine if the other data includes the target property. - In some examples, machine learning can be implemented to identify characteristics of n-dimensional representations that are associated with target properties (e.g., threats). For instance, upon generating a first model for a type of malicious data and a second model for the same type of malicious data, a machine learning technique can be implemented to learn a characteristic(s) that is associated with the first model and the second model. Such a characteristic can be stored/associated with a threat (e.g., associate a model characteristic with a threat (within a data signature), when the model is generated/identified for the particular type of threat a predetermined number of times).
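The representation at 802 and the signature association at 810 can be illustrated end to end with a minimal sketch; the non-overlapping byte-pair mapping and the in-memory store below are assumptions chosen for brevity, not the only arrangements the embodiments contemplate:

```python
def bytes_to_points(data: bytes):
    """Map adjacent byte pairs to 2D points, per the representation at
    802: a first set of bits (one byte) becomes the first coordinate and
    the directly adjacent byte becomes the second coordinate."""
    return [(data[i], data[i + 1]) for i in range(0, len(data) - 1, 2)]

class SignatureStore:
    """Minimal in-memory stand-in for the data store at 810: any number
    of n-dimensional representations can be associated with one target
    property as its data signature."""
    def __init__(self):
        self._signatures = {}

    def add(self, target_property, representation):
        self._signatures.setdefault(target_property, []).append(representation)

    def signatures_for(self, target_property):
        return self._signatures.get(target_property, [])

# Represent tagged malicious data as points and record them as a signature.
store = SignatureStore()
points = bytes_to_points(b"\x4d\x5a\x90\x00")  # bytes of a tagged sample
store.add("example-threat", points)
```

Other data can later be represented the same way and checked against the stored representations, as described for the comparison operations above.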
- Further, in some examples, data for various types of malware or other types of target properties can be processed to create a data store for multiple types of threats. For instance, the
process 800 can be performed any number of times to create a taxonomy of data signatures for various types of malware. - The above description of embodiments of the disclosure is not intended to be exhaustive or to limit the disclosure to the precise form disclosed above. While specific embodiments and examples are described above for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative embodiments may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel and/or at different times.
- It should be understood that certain ordinal terms (e.g., “first” or “second”) may be provided for ease of reference and do not necessarily imply physical characteristics or ordering. Therefore, as used herein, an ordinal term (e.g., “first,” “second,” “third,” etc.) used to modify an element, such as a structure, a component, an operation, etc., does not necessarily indicate priority or order of the element with respect to any other element, but rather may generally distinguish the element from another element having a similar or identical name (but for use of the ordinal term). In addition, as used herein, articles (“a” and “an”) may indicate “one or more” rather than “one.” Further, an operation performed “based on” a condition or event may also be performed based on one or more other conditions or events not explicitly recited. In some contexts, description of an operation or event as occurring or being performed “based on,” or “based at least in part on,” a stated event or condition can be interpreted as being triggered by or performed in response to the stated event or condition.
- With respect to the various methods and processes disclosed herein, although certain orders of operations or steps are illustrated and/or described, it should be understood that the various steps and operations shown and described may be performed in any suitable or desirable temporal order. Furthermore, any of the illustrated and/or described operations or steps may be omitted from any given method or process, and the illustrated/described methods and processes may include additional operations or steps not explicitly illustrated or described.
- It should be appreciated that in the above description of embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various aspects of the disclosure. This method of disclosure, however, is not to be interpreted as reflecting an intention that any claim require more features than are expressly recited in that claim. Moreover, any components, features, or steps illustrated and/or described in a particular embodiment herein can be applied to or used with any other embodiment(s). Further, no component, feature, step, or group of components, features, or steps are necessary or indispensable for each embodiment. Thus, it is intended that the scope of the disclosure should not be limited by the particular embodiments described above.
- One or more embodiments have been described above with the aid of method steps illustrating the performance of specified functions and relationships thereof. The boundaries and sequence of these functional building blocks and method steps have been arbitrarily defined herein for convenience of description. Alternate boundaries and sequences can be defined so long as the specified functions and relationships are appropriately performed. Any such alternate boundaries or sequences are thus within the scope and spirit of the claims. Further, the boundaries of these functional building blocks have been arbitrarily defined for convenience of description. Alternate boundaries could be defined as long as certain significant functions are appropriately performed. Similarly, flow diagram blocks may also have been arbitrarily defined herein to illustrate certain significant functionality.
- To the extent used, the flow diagram block boundaries and sequence could have been defined otherwise and still perform the certain significant functionality. Such alternate definitions of both functional building blocks and flow diagram blocks and sequences are thus within the scope and spirit of the claims. One of average skill in the art will also recognize that the functional building blocks, and other illustrative blocks, modules and components herein, can be implemented as illustrated or by discrete components, application specific integrated circuits, processors executing appropriate software and the like or any combination thereof.
- The one or more embodiments are used herein to illustrate one or more aspects, one or more features, one or more concepts, and/or one or more examples. A physical embodiment of an apparatus, an article of manufacture, a machine, and/or of a process may include one or more of the aspects, features, concepts, examples, etc. described with reference to one or more of the embodiments discussed herein. Further, from figure to figure, the embodiments may incorporate the same or similarly named functions, steps, modules, etc. that may use the same, related, or unrelated reference numbers. The relevant features, elements, functions, operations, modules, etc. may be the same or similar functions or may be unrelated.
- The term “module” is used in the description of one or more of the embodiments. A module implements one or more functions via a device, such as a processor or other processing device or other hardware that may include or operate in association with a memory that stores operational instructions. A module may operate independently and/or in conjunction with software and/or firmware. As also used herein, a module may contain one or more sub-modules, each of which may be one or more modules.
- As may further be used herein, a computer readable memory includes one or more memory elements. A memory element may be a separate memory device, multiple memory devices, or a set of memory locations within a memory device. Such a memory device may be a read-only memory, random access memory, volatile memory, non-volatile memory, static memory, dynamic memory, flash memory, cache memory, and/or any device that stores digital information. The memory device may be in the form of a solid-state memory, a hard drive memory, cloud memory, thumb drive, server memory, computing device memory, and/or other physical medium for storing digital information.
- Example A, a system comprising: control circuitry; and memory communicatively coupled to the control circuitry and storing executable instructions that, when executed by the control circuitry, cause the control circuitry to perform operations comprising: receiving data; representing at least a portion of the data as a plurality of points in a coordinate system; using a pattern recognition algorithm to identify a set of points in the plurality of points; generating an n-dimensional model for the set of points; comparing the n-dimensional model to a plurality of n-dimensional models that are tagged as including a target property associated with at least one of malicious behavior, benign behavior, or a vulnerability; and based at least in part on the comparison, determining a likelihood that the data includes the target property.
- Example B, the system of Example A, wherein the representing includes: representing a first set of bits in the data as a first coordinate for a first point of the plurality of points and a second set of bits as a second coordinate for the first point, the second set of bits being adjacent to the first set of bits; and representing a third set of bits in the data as a first coordinate for a second point of the plurality of points and a fourth set of bits in the data as a second coordinate for the second point, the fourth set of bits being adjacent to the third set of bits.
- Example C, the system of Example A or B, wherein the n-dimensional model includes at least one of a 3D mesh or 3D wireframe.
- Example D, the system of any of Examples A-C, wherein the determining the likelihood is based on at least one of a shape of the n-dimensional model, a size of the n-dimensional model, a volume of the n-dimensional model, an area of the n-dimensional model, a number of surfaces of the n-dimensional model, a location of the n-dimensional model within the coordinate system, a position of the n-dimensional model relative to another n-dimensional model within the coordinate system, or a number of n-dimensional models within the coordinate system.
- Example E, the system of any of Examples A-D, wherein the operations further comprise: determining that the data includes the target property; and determining at least one of a type of the target property or a source of the target property based on at least one of a shape of the n-dimensional model, a size of the n-dimensional model, a volume of the n-dimensional model, an area of the n-dimensional model, a number of surfaces of the n-dimensional model, a location of the n-dimensional model within the coordinate system, a position of the n-dimensional model to another n-dimensional model within the coordinate system, or a number of n-dimensional models within the coordinate system that are associated with the target property or another target property.
- Example F, the system of any of Examples A-E, wherein the operations further comprise: determining that the data includes the target property; updating a portion of the data that includes the target property to generate updated data, the updating including at least one of removing the target property or replacing the target property; and sending the updated data to a component.
- Example G, the system of any of Examples A-F, wherein the operations further comprise: representing predetermined data associated with the target property as multiple points; processing the multiple points to generate one or more of the plurality of n-dimensional models that are tagged as associated with the target property; and storing the plurality of n-dimensional models as signatures for the predetermined data.
- Example H, a method comprising: receiving, by control circuitry, data; representing, by the control circuitry, at least a portion of the data as a first plurality of points in a coordinate system; analyzing, by the control circuitry, the first plurality of points to identify a set of points; generating, by the control circuitry, a first n-dimensional model for the set of points; and determining, by the control circuitry, a first likelihood that the data includes a target property based at least in part on an analysis of (i) the first n-dimensional model and (ii) a plurality of n-dimensional models that are tagged as being associated with the target property, the target property including at least one of malicious data, benign data, or vulnerability data.
- Example I, the method of Example H, wherein the determining the first likelihood includes determining a likelihood that the data includes polymorphic malware.
- Example J, the method of Example H or I, further comprising: generating a signature for the data that includes the first n-dimensional model.
- Example K, the method of any of Examples H-J, wherein the representing includes: determining a first coordinate for a first point of the plurality of points based at least in part on a first group of bits in the data; and determining a second coordinate for the first point based at least in part on a second group of bits in the data that is adjacent to the first group of bits.
- Example L, the method of any of Examples H-K, further comprising: associating the first point with an indicator indicating a location of at least one of the first group of bits or the second group of bits within the data; wherein the determining the first likelihood is based at least in part on the indicator.
- Example M, the method of any of Examples H-L, wherein the portion of the data includes first bits, and the method further comprises: representing second bits in the data as a second plurality of points, the second bits including a group of bits that overlap with the first bits; generating a second n-dimensional model for the second plurality of points; and determining a second likelihood that the data includes the target property based at least in part on an analysis of (i) the second n-dimensional model and (ii) the plurality of n-dimensional models that are tagged as being associated with the target property.
- Example N, one or more non-transitory computer-readable media storing computer-executable instructions that, when executed by control circuitry, cause the control circuitry to perform operations comprising: receiving data; representing at least a first portion of the data as a plurality of points in a coordinate system; identifying a set of points in the plurality of points; generating an n-dimensional model for the set of points; comparing the n-dimensional model to an n-dimensional model that is tagged as being associated with a target property; and generating a first confidence value indicating a first likelihood that the data includes the target property.
- Example O, the one or more non-transitory computer-readable media of Example N, wherein the operations further comprise: processing the plurality of points using a machine-trained model; generating a second confidence value indicating a second likelihood that the data includes the target property; and determining a composite confidence value for the data based at least in part on the first confidence value and the second confidence value.
- Example P, the one or more non-transitory computer-readable media of Example N or O, wherein the first likelihood indicates a likelihood that the data includes malware.
- Example Q, the one or more non-transitory computer-readable media of any of Examples N-P, wherein the operations further comprise: analyzing the data to generate entropy data indicating a randomness of the first portion of the data and a randomness of a second portion of the data; and selecting the first portion of the data for processing based at least in part on the randomness of the first portion of the data; wherein the representing the first portion of the data is based at least in part on selecting the first portion of the data.
- Example R, the one or more non-transitory computer-readable media of any of Examples N-Q, wherein at least one of the representing, the identifying, the generating, or the comparing is part of implementing a first analysis model, and the operations further comprise: selecting a second analysis model based on at least one of a type of the data, where the first portion of the data is located within the data, or entropy data indicating a randomness of at least the first portion of the data, the second analysis model being different than the first analysis model; and analyzing the data using the second analysis model.
- Example S, the one or more non-transitory computer-readable media of any of Examples N-R, wherein the n-dimensional model includes at least one of a mesh or wireframe.
- Example T, the one or more non-transitory computer-readable media of any of Examples N-S, wherein the generating is based on at least one of a shape of the n-dimensional model, a size of the n-dimensional model, a volume of the n-dimensional model, an area of the n-dimensional model, a number of surfaces of the n-dimensional model, a location of the n-dimensional model within the coordinate system, a position of the n-dimensional model relative to another n-dimensional model within the coordinate system, or a number of n-dimensional models within the coordinate system.
- Example AA, a method of detecting malware, the method comprising: receiving, by a computing device, data from a data store; identifying, by the computing device, at least a first group of bits in the data and a second group of bits in the data; representing, by the computing device, a first set of bits in the first group of bits as a first coordinate for a first point and a second set of bits in the first group of bits as a second coordinate for the first point; representing, by the computing device, a first set of bits in the second group of bits as a first coordinate for a second point and a second set of bits in the second group of bits as a second coordinate for the second point; generating, by the computing device, an n-dimensional representation for the data based at least in part on the first point and the second point; processing the n-dimensional representation using a model that has been trained using machine learning; and determining a malware rating for the data based at least in part on the processing, the malware rating indicating a likelihood that the data is associated with malware.
- Example BB, the method of Example AA, further comprising: representing, by the computing device, a third set of bits in the first group of bits as a third coordinate for the first point, wherein the n-dimensional representation comprises a three-dimensional representation.
- Example CC, the method of Example AA or BB, wherein the first set of bits in the first group of bits comprises a first byte, the second set of bits in the first group of bits comprises a second byte that is directly adjacent to the first byte, and the third set of bits in the first group of bits comprises a third byte that is directly adjacent to the second byte.
- Example DD, the method of any of Examples AA-CC, wherein the data comprises file system data.
- Example EE, the method of any of Examples AA-DD, wherein the data comprises non-image-based data.
- Example FF, a system comprising: control circuitry; and memory communicatively coupled to the control circuitry and storing executable instructions that, when executed by the control circuitry, cause the control circuitry to perform operations comprising: obtaining data; determining a first coordinate for a first point based at least in part on a first set of bits in the data and determining a second coordinate for the first point based at least in part on a second set of bits in the data that is adjacent to the first set of bits; determining a first coordinate for a second point based at least in part on a third set of bits in the data and determining a second coordinate for the second point based at least in part on a fourth set of bits in the data that is adjacent to the third set of bits; generating an n-dimensional representation for the data based at least in part on the first point and the second point; and causing the n-dimensional representation to be processed with a machine-trained model that is configured to detect malware.
- Example GG, the system of Example FF, wherein the first set of bits comprises a first byte and the second set of bits comprises a second byte that is directly adjacent to the first byte.
- Example HH, the system of Example FF or GG, wherein obtaining the data comprises retrieving data from a data store, the data comprising file system data.
- Example II, the system of any of Examples FF-HH, wherein the operations further comprise: extracting a first portion of the data and refraining from extracting a second portion of the data, the first portion of the data including the first set of bits and the second set of bits.
- Example JJ, the system of any of Examples FF-II, wherein the operations further comprise: determining a type of the data; and determining to represent the data with a first portion of the data based at least in part on the type of the data, the first portion of the data including the first set of bits and the second set of bits.
- Example KK, the system of any of Examples FF-JJ, wherein the first portion of the data includes at least one of a header, a body, or a footer.
- Example LL, the system of any of Examples FF-KK, wherein the operations further comprise: determining a type of the data; and determining to represent the data with a first portion of the data and a second portion of the data based at least in part on the type of the data, the first portion of the data including the first set of bits and the second set of bits.
- Example MM, the system of any of Examples FF-LL, wherein the operations further comprise: training a model to create the machine-trained model, the training being based at least in part on one or more n-dimensional representations that are tagged as being associated with malware and one or more n-dimensional representations that are tagged as being malware free.
- Example NN, one or more non-transitory computer-readable media storing computer-executable instructions that, when executed, instruct one or more processors to perform operations comprising: obtaining data; determining a first coordinate for a first point based at least in part on a first set of bits in the data and determining a second coordinate for the first point based at least in part on a second set of bits in the data that is adjacent to the first set of bits; determining a first coordinate for a second point based at least in part on a third set of bits in the data and determining a second coordinate for the second point based at least in part on a fourth set of bits in the data that is adjacent to the third set of bits; generating an n-dimensional representation for the data based at least in part on the first and second coordinates for the first point and the first and second coordinates for the second point; and causing the n-dimensional representation to be processed with a machine-trained model that is configured to detect a threat.
- Example OO, the one or more non-transitory computer-readable media of Example NN, wherein the data comprises at least one of file system data, network traffic data, runtime data, or data associated with an isolated environment.
- Example PP, the one or more non-transitory computer-readable media of Example NN or OO, wherein the operations further comprise: processing the n-dimensional representation with the machine-trained model; detecting the threat based at least in part on the processing; and performing a threat operation to address the threat, the threat operation comprising at least one of removing the threat, preventing the threat from associating with the data, or providing a notification to a computing device regarding the threat.
- Example QQ, the one or more non-transitory computer-readable media of any of Examples NN-PP, wherein the first set of bits is directly adjacent to the second set of bits.
- Example RR, the one or more non-transitory computer-readable media of any of Examples NN-QQ, wherein the operations further comprise: determining a type of the data; and determining to represent the data with a first portion of the data based at least in part on the type of the data, the first portion of the data including the first set of bits, the second set of bits, the third set of bits, and the fourth set of bits.
- Example SS, the one or more non-transitory computer-readable media of any of Examples NN-RR, wherein the operations further comprise: training a model to create the machine-trained model, the training being based at least in part on one or more n-dimensional representations that are tagged as being associated with one or more threats and one or more n-dimensional representations that are tagged as being threat free.
- Example TT, the one or more non-transitory computer-readable media of any of Examples NN-SS, wherein the machine-trained model includes an artificial neural network and the training includes using machine learning.
Claims (24)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/185,884 US20220269784A1 (en) | 2021-02-25 | 2021-02-25 | N-dimensional model techniques and architectures for data protection |
PCT/US2022/017510 WO2022182751A1 (en) | 2021-02-25 | 2022-02-23 | N-dimensional model techniques and architectures for data protection |
US18/600,516 US12367282B2 (en) | 2021-02-25 | 2024-03-08 | Bit-level data extraction and threat detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/185,884 US20220269784A1 (en) | 2021-02-25 | 2021-02-25 | N-dimensional model techniques and architectures for data protection |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/600,516 Continuation US12367282B2 (en) | 2021-02-25 | 2024-03-08 | Bit-level data extraction and threat detection |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220269784A1 | 2022-08-25 |
Family
ID=82900751
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/185,884 Abandoned US20220269784A1 (en) | 2021-02-25 | 2021-02-25 | N-dimensional model techniques and architectures for data protection |
US18/600,516 Active US12367282B2 (en) | 2021-02-25 | 2024-03-08 | Bit-level data extraction and threat detection |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/600,516 Active US12367282B2 (en) | 2021-02-25 | 2024-03-08 | Bit-level data extraction and threat detection |
Country Status (2)
Country | Link |
---|---|
US (2) | US20220269784A1 (en) |
WO (1) | WO2022182751A1 (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030023864A1 (en) * | 2001-07-25 | 2003-01-30 | Igor Muttik | On-access malware scanning |
US20050210056A1 (en) * | 2004-01-31 | 2005-09-22 | Itzhak Pomerantz | Workstation information-flow capture and characterization for auditing and data mining |
US20060123244A1 (en) * | 2004-12-06 | 2006-06-08 | Microsoft Corporation | Proactive computer malware protection through dynamic translation |
US20110063403A1 (en) * | 2009-09-16 | 2011-03-17 | Microsoft Corporation | Multi-camera head pose tracking |
US20150101053A1 (en) * | 2013-10-04 | 2015-04-09 | Personam, Inc. | System and method for detecting insider threats |
US20160021174A1 (en) * | 2014-07-17 | 2016-01-21 | Telefonica Digital Espana, S.L.U. | Computer implemented method for classifying mobile applications and computer programs thereof |
US9922190B2 (en) * | 2012-01-25 | 2018-03-20 | Damballa, Inc. | Method and system for detecting DGA-based malware |
US20180096230A1 (en) * | 2016-09-30 | 2018-04-05 | Cylance Inc. | Centroid for Improving Machine Learning Classification and Info Retrieval |
US20190014153A1 (en) * | 2014-01-22 | 2019-01-10 | Ulrich Lang | Automated and adaptive model-driven security system and method for operating the same |
US20220057519A1 (en) * | 2020-08-18 | 2022-02-24 | IntelliShot Holdings, Inc. | Automated threat detection and deterrence apparatus |
US20220147629A1 (en) * | 2020-11-06 | 2022-05-12 | Vmware Inc. | Systems and methods for classifying malware based on feature reuse |
Family Cites Families (168)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7006881B1 (en) | 1991-12-23 | 2006-02-28 | Steven Hoffberg | Media recording device with remote graphic user interface |
US7209221B2 (en) | 1994-05-23 | 2007-04-24 | Automotive Technologies International, Inc. | Method for obtaining and displaying information about objects in a vehicular blind spot |
US8364136B2 (en) | 1999-02-01 | 2013-01-29 | Steven M Hoffberg | Mobile system, a method of operating mobile system and a non-transitory computer readable medium for a programmable control of a mobile system |
US8010469B2 (en) | 2000-09-25 | 2011-08-30 | Crossbeam Systems, Inc. | Systems and methods for processing data flows |
US9525696B2 (en) | 2000-09-25 | 2016-12-20 | Blue Coat Systems, Inc. | Systems and methods for processing data flows |
JP3610455B2 (en) | 2001-01-30 | 2005-01-12 | 独立行政法人産業技術総合研究所 | EMG pattern identification device |
JP2003162728A (en) | 2001-11-26 | 2003-06-06 | Ricoh Co Ltd | Image processing device and image output device |
US7475427B2 (en) | 2003-12-12 | 2009-01-06 | International Business Machines Corporation | Apparatus, methods and computer programs for identifying or managing vulnerabilities within a data processing network |
US20060095732A1 (en) | 2004-08-30 | 2006-05-04 | Tran Thang M | Processes, circuits, devices, and systems for scoreboard and other processor improvements |
US7836504B2 (en) | 2005-03-01 | 2010-11-16 | Microsoft Corporation | On-access scan of memory for malware |
US7593013B2 (en) | 2005-03-11 | 2009-09-22 | University Of Utah Research Foundation | Systems and methods for displaying and querying heterogeneous sets of data |
US8151352B1 (en) | 2006-07-14 | 2012-04-03 | Bitdefender IPR Managament Ltd. | Anti-malware emulation systems and methods |
US8392996B2 (en) | 2006-08-08 | 2013-03-05 | Symantec Corporation | Malicious software detection |
US8042184B1 (en) * | 2006-10-18 | 2011-10-18 | Kaspersky Lab, Zao | Rapid analysis of data stream for malware presence |
US8126728B2 (en) | 2006-10-24 | 2012-02-28 | Medapps, Inc. | Systems and methods for processing and transmittal of medical data through an intermediary device |
US8131566B2 (en) | 2006-10-24 | 2012-03-06 | Medapps, Inc. | System for facility management of medical data and patient interface |
US8468244B2 (en) | 2007-01-05 | 2013-06-18 | Digital Doors, Inc. | Digital information infrastructure and method for security designated data and with granular data stores |
US9952673B2 (en) | 2009-04-02 | 2018-04-24 | Oblong Industries, Inc. | Operating environment comprising multiple client devices, multiple displays, multiple users, and gestural control |
TW201003421A (en) | 2008-04-28 | 2010-01-16 | Alexandria Invest Res And Technology Llc | Adaptive knowledge platform |
US9609015B2 (en) | 2008-05-28 | 2017-03-28 | Zscaler, Inc. | Systems and methods for dynamic cloud-based malware behavior analysis |
US20130275384A1 (en) | 2008-08-20 | 2013-10-17 | Arun Kumar Sivasubramanian | System, method, and computer program product for determining whether an electronic mail message is unwanted based on processing images associated with a link in the electronic mail message |
US8701192B1 (en) | 2009-06-30 | 2014-04-15 | Symantec Corporation | Behavior based signatures |
US20110041179A1 (en) | 2009-08-11 | 2011-02-17 | F-Secure Oyj | Malware detection |
US8533581B2 (en) | 2010-05-13 | 2013-09-10 | Symantec Corporation | Optimizing security seals on web pages |
US8584241B1 (en) | 2010-08-11 | 2013-11-12 | Lockheed Martin Corporation | Computer forensic system |
JP5641058B2 (en) * | 2010-12-28 | 2014-12-17 | 富士通株式会社 | Program, information processing apparatus and method |
US9323769B2 (en) | 2011-03-23 | 2016-04-26 | Novell, Inc. | Positional relationships between groups of files |
US9262246B2 (en) | 2011-03-31 | 2016-02-16 | Mcafee, Inc. | System and method for securing memory and storage of an electronic device with a below-operating system security agent |
US9164679B2 (en) | 2011-04-06 | 2015-10-20 | Patents1, Llc | System, method and computer program product for multi-thread operation involving first memory of a first memory class and second memory of a second memory class |
US20180107591A1 (en) | 2011-04-06 | 2018-04-19 | P4tents1, LLC | System, method and computer program product for fetching data between an execution of a plurality of threads |
US8930647B1 (en) | 2011-04-06 | 2015-01-06 | P4tents1, LLC | Multiple class memory systems |
US9432298B1 (en) | 2011-12-09 | 2016-08-30 | P4tents1, LLC | System, method, and computer program product for improving memory systems |
GB2495265A (en) | 2011-07-07 | 2013-04-10 | Toyota Motor Europe Nv Sa | Artificial memory system for predicting behaviours in order to assist in the control of a system, e.g. stability control in a vehicle |
US10789526B2 (en) | 2012-03-09 | 2020-09-29 | Nara Logics, Inc. | Method, system, and non-transitory computer-readable medium for constructing and applying synaptic networks |
US20130275709A1 (en) | 2012-04-12 | 2013-10-17 | Micron Technology, Inc. | Methods for reading data from a storage buffer including delaying activation of a column select |
US9411955B2 (en) * | 2012-08-09 | 2016-08-09 | Qualcomm Incorporated | Server-side malware detection and classification |
US20150309813A1 (en) * | 2012-08-31 | 2015-10-29 | iAppSecure Solutions Pvt. Ltd | A System for analyzing applications in order to find security and quality issues |
US9104870B1 (en) | 2012-09-28 | 2015-08-11 | Palo Alto Networks, Inc. | Detecting malware |
US9332028B2 (en) | 2013-01-25 | 2016-05-03 | REMTCS Inc. | System, method, and apparatus for providing network security |
WO2014149080A1 (en) | 2013-03-18 | 2014-09-25 | The Trustees Of Columbia University In The City Of New York | Detection of anomalous program execution using hardware-based micro-architectural data |
US10437658B2 (en) | 2013-06-06 | 2019-10-08 | Zebra Technologies Corporation | Method, apparatus, and computer program product for collecting and displaying sporting event data based on real time data for proximity and movement of objects |
US10089330B2 (en) | 2013-12-20 | 2018-10-02 | Qualcomm Incorporated | Systems, methods, and apparatus for image retrieval |
CN104751112B (en) | 2013-12-31 | 2018-05-04 | 石丰 | A kind of fingerprint template and fingerprint identification method based on fuzzy characteristics point information |
US10216828B2 (en) | 2014-03-05 | 2019-02-26 | Ayasdi, Inc. | Scalable topological summary construction using landmark point selection |
US9912690B2 (en) | 2014-04-08 | 2018-03-06 | Capital One Financial Corporation | System and method for malware detection using hashing techniques |
US10395032B2 (en) | 2014-10-03 | 2019-08-27 | Nokomis, Inc. | Detection of malicious software, firmware, IP cores and circuitry via unintended emissions |
EP2955899A1 (en) | 2014-06-13 | 2015-12-16 | Orange | Method and apparatus to regulate a digital security system that controls access to a resource |
US10296843B2 (en) | 2014-09-24 | 2019-05-21 | C3 Iot, Inc. | Systems and methods for utilizing machine learning to identify non-technical loss |
US9684775B2 (en) | 2014-10-15 | 2017-06-20 | Qualcomm Incorporated | Methods and systems for using behavioral analysis towards efficient continuous authentication |
EP3210153A4 (en) | 2014-10-25 | 2018-05-30 | McAfee, Inc. | Computing platform security methods and apparatus |
US10073972B2 (en) | 2014-10-25 | 2018-09-11 | Mcafee, Llc | Computing platform security methods and apparatus |
US9690928B2 (en) | 2014-10-25 | 2017-06-27 | Mcafee, Inc. | Computing platform security methods and apparatus |
WO2016094840A2 (en) | 2014-12-11 | 2016-06-16 | Ghosh Sudeep | System, method & computer readable medium for software protection via composable process-level virtual machines |
US9197663B1 (en) | 2015-01-29 | 2015-11-24 | Bit9, Inc. | Methods and systems for identifying potential enterprise software threats based on visual and non-visual data |
US9588872B2 (en) | 2015-02-20 | 2017-03-07 | Vmware, Inc. | Discovery of code paths |
US9495633B2 (en) | 2015-04-16 | 2016-11-15 | Cylance, Inc. | Recurrent neural networks for malware analysis |
WO2016179438A1 (en) | 2015-05-05 | 2016-11-10 | Ayasdi, Inc. | Scalable topological summary construction using landmark point selection |
US10536357B2 (en) | 2015-06-05 | 2020-01-14 | Cisco Technology, Inc. | Late data detection in data center |
US10721502B2 (en) | 2015-07-06 | 2020-07-21 | Lg Electronics Inc. | Broadcasting signal transmission device, broadcasting signal reception device, broadcasting signal transmission method, and broadcasting signal reception method |
US10339593B2 (en) | 2015-07-07 | 2019-07-02 | Lutzy Inc. | System and network for outfit planning and wardrobe management |
US9690938B1 (en) | 2015-08-05 | 2017-06-27 | Invincea, Inc. | Methods and apparatus for machine learning based malware detection |
US20190054347A1 (en) | 2015-08-18 | 2019-02-21 | Michael Saigh | Wearable sports guidance communication system and developers tool kit |
WO2017073000A1 (en) | 2015-10-29 | 2017-05-04 | 株式会社Preferred Networks | Information processing device and information processing method |
US10193902B1 (en) | 2015-11-02 | 2019-01-29 | Deep Instinct Ltd. | Methods and systems for malware detection |
US11049004B1 (en) | 2015-11-15 | 2021-06-29 | ThetaRay Ltd. | System and method for anomaly detection in dynamically evolving data using random neural network decomposition |
US10552727B2 (en) | 2015-12-15 | 2020-02-04 | Deep Instinct Ltd. | Methods and systems for data traffic analysis |
US10073965B2 (en) | 2015-12-15 | 2018-09-11 | Nagravision S.A. | Methods and systems for validating an autonomous system that includes a dynamic-code module and a static-code module |
US9998483B2 (en) | 2015-12-22 | 2018-06-12 | Mcafee, Llc | Service assurance and security of computing systems using fingerprinting |
US10788836B2 (en) | 2016-02-29 | 2020-09-29 | AI Incorporated | Obstacle recognition method for autonomous robots |
US11019101B2 (en) | 2016-03-11 | 2021-05-25 | Netskope, Inc. | Middle ware security layer for cloud computing services |
US9928366B2 (en) | 2016-04-15 | 2018-03-27 | Sophos Limited | Endpoint malware detection using an event graph |
US20190339688A1 (en) | 2016-05-09 | 2019-11-07 | Strong Force Iot Portfolio 2016, Llc | Methods and systems for data collection, learning, and streaming of machine signals for analytics and maintenance using the industrial internet of things |
US11327475B2 (en) | 2016-05-09 | 2022-05-10 | Strong Force Iot Portfolio 2016, Llc | Methods and systems for intelligent collection and analysis of vehicle data |
US20230196231A1 (en) | 2016-05-09 | 2023-06-22 | Strong Force Iot Portfolio 2016, Llc | Industrial digital twin systems using state value to adjust industrial production processes and determine relevance with role taxonomy |
US20180284758A1 (en) | 2016-05-09 | 2018-10-04 | StrongForce IoT Portfolio 2016, LLC | Methods and systems for industrial internet of things data collection for equipment analysis in an upstream oil and gas environment |
US20210157312A1 (en) | 2016-05-09 | 2021-05-27 | Strong Force Iot Portfolio 2016, Llc | Intelligent vibration digital twin systems and methods for industrial environments |
US10601783B2 (en) | 2016-05-13 | 2020-03-24 | MyCroft Secure Computing Corp. | System and method for digital payload inspection |
US20200004938A1 (en) | 2016-06-10 | 2020-01-02 | OneTrust, LLC | Data processing and scanning systems for assessing vendor risk |
US10652257B1 (en) * | 2016-07-11 | 2020-05-12 | State Farm Mutual Automobile Insurance Company | Detection of anomalous computer behavior |
US10891538B2 (en) | 2016-08-11 | 2021-01-12 | Nvidia Corporation | Sparse convolutional neural network accelerator |
WO2018031940A1 (en) | 2016-08-12 | 2018-02-15 | ALTR Solutions, Inc. | Fragmenting data for the purposes of persistent storage across multiple immutable data structures |
US10552292B2 (en) | 2016-08-18 | 2020-02-04 | Proov Systems Ltd. | System, method and computer product for management of proof-of-concept software pilots, including neural network-based KPI prediction |
US10218718B2 (en) | 2016-08-23 | 2019-02-26 | Cisco Technology, Inc. | Rapid, targeted network threat detection |
US10154051B2 (en) | 2016-08-31 | 2018-12-11 | Cisco Technology, Inc. | Automatic detection of network threats based on modeling sequential behavior in network traffic |
US10333965B2 (en) | 2016-09-12 | 2019-06-25 | Qualcomm Incorporated | Methods and systems for on-device real-time adaptive security based on external threat intelligence inputs |
US12039413B2 (en) | 2016-09-21 | 2024-07-16 | Blue Voyant | Cognitive modeling apparatus including multiple knowledge node and supervisory node devices |
US9858424B1 (en) | 2017-01-05 | 2018-01-02 | Votiro Cybersec Ltd. | System and method for protecting systems from active content |
EP3552112A1 (en) | 2016-12-09 | 2019-10-16 | Beijing Horizon Information Technology Co., Ltd. | Systems and methods for data management |
US10832168B2 (en) * | 2017-01-10 | 2020-11-10 | Crowdstrike, Inc. | Computational modeling and classification of data streams |
US10530747B2 (en) | 2017-01-13 | 2020-01-07 | Citrix Systems, Inc. | Systems and methods to run user space network stack inside docker container while bypassing container Linux network stack |
EP3570804A4 (en) | 2017-01-17 | 2021-05-26 | Blind Insites, LLC | Devices, systems, and methods for navigation and usage guidance in a navigable space using wireless communication |
US10404735B2 (en) * | 2017-02-02 | 2019-09-03 | Aetna Inc. | Individualized cybersecurity risk detection using multiple attributes |
JP6770237B2 (en) * | 2017-03-09 | 2020-10-14 | 富士通株式会社 | Biometric device, biometric method, and biometric program |
US10819718B2 (en) | 2017-07-05 | 2020-10-27 | Deep Instinct Ltd. | Methods and systems for detecting malicious webpages |
CN109284130B (en) | 2017-07-20 | 2021-03-23 | 上海寒武纪信息科技有限公司 | Neural network operation device and method |
US11348269B1 (en) | 2017-07-27 | 2022-05-31 | AI Incorporated | Method and apparatus for combining data to construct a floor plan |
US20230196230A1 (en) | 2017-08-02 | 2023-06-22 | Strong Force Iot Portfolio 2016, Llc | User interface for industrial digital twin system analyzing data to determine structures with visualization of those structures with reduced dimensionality |
US10921801B2 (en) | 2017-08-02 | 2021-02-16 | Strong Force loT Portfolio 2016, LLC | Data collection systems and methods for updating sensed parameter groups based on pattern recognition |
US10388042B2 (en) | 2017-08-25 | 2019-08-20 | Microsoft Technology Licensing, Llc | Efficient display of data points in a user interface |
US20190102670A1 (en) | 2017-10-02 | 2019-04-04 | Imec Vzw | Secure Broker-Mediated Data Analysis and Prediction |
US10848519B2 (en) | 2017-10-12 | 2020-11-24 | Charles River Analytics, Inc. | Cyber vaccine and predictive-malware-defense methods and systems |
WO2019075399A1 (en) | 2017-10-13 | 2019-04-18 | Huawei Technologies Co., Ltd. | System and method for cloud-device collaborative real-time user usage and performance abnormality detection |
EP3667569B1 (en) | 2017-10-20 | 2025-04-23 | Shanghai Cambricon Information Technology Co., Ltd | Processing method and device, operation method and device |
US11024424B2 (en) | 2017-10-27 | 2021-06-01 | Nuance Communications, Inc. | Computer assisted coding systems and methods |
US11475351B2 (en) | 2017-11-15 | 2022-10-18 | Uatc, Llc | Systems and methods for object detection, tracking, and motion prediction |
US20190042743A1 (en) | 2017-12-15 | 2019-02-07 | Intel Corporation | Malware detection and classification using artificial neural network |
US20190206564A1 (en) | 2017-12-28 | 2019-07-04 | Ethicon Llc | Method for facility data collection and interpretation |
WO2019144039A1 (en) | 2018-01-18 | 2019-07-25 | Risksense, Inc. | Complex application attack quantification, testing, detection and prevention |
JP7168591B2 (en) | 2018-01-26 | 2022-11-09 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
EP3751520A4 (en) | 2018-02-08 | 2021-03-31 | Panasonic Intellectual Property Corporation of America | METHOD FOR CODING THREE-DIMENSIONAL DATA, METHOD FOR DECODING THREE-DIMENSIONAL DATA, DEVICE FOR CODING THREE-DIMENSIONAL DATA AND DEVICE FOR DECODING THREE-DIMENSIONAL DATA |
US11630666B2 (en) | 2018-02-13 | 2023-04-18 | Shanghai Cambricon Information Technology Co., Ltd | Computing device and method |
US11106598B2 (en) | 2018-02-13 | 2021-08-31 | Shanghai Cambricon Information Technology Co., Ltd. | Computing device and method |
US11720357B2 (en) | 2018-02-13 | 2023-08-08 | Shanghai Cambricon Information Technology Co., Ltd | Computing device and method |
DK3800856T3 (en) | 2018-02-20 | 2023-08-28 | Darktrace Holdings Ltd | Cyber security appliance for a cloud infrastructure |
US11463457B2 (en) | 2018-02-20 | 2022-10-04 | Darktrace Holdings Limited | Artificial intelligence (AI) based cyber threat analyst to support a cyber security appliance |
US20190273510A1 (en) | 2018-03-01 | 2019-09-05 | Crowdstrike, Inc. | Classification of source data by neural network processing |
US20190273509A1 (en) | 2018-03-01 | 2019-09-05 | Crowdstrike, Inc. | Classification of source data by neural network processing |
CA3096184A1 (en) | 2018-04-10 | 2019-10-17 | Panasonic Intellectual Property Corporation Of America | Three-dimensional data encoding method, three-dimensional data decoding method, tree-dimensional data encoding device, and three-dimensional data decoding device |
US11055417B2 (en) | 2018-04-17 | 2021-07-06 | Oracle International Corporation | High granularity application and data security in cloud environments |
KR20250114575A (en) | 2018-04-19 | 2025-07-29 | 파나소닉 인텔렉츄얼 프로퍼티 코포레이션 오브 아메리카 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
JP7330710B2 (en) | 2018-04-26 | 2023-08-22 | キヤノン株式会社 | Information processing device, information processing method and program |
US20190342324A1 (en) | 2018-05-02 | 2019-11-07 | IPKeys Technologies, LLC | Computer vulnerability assessment and remediation |
US20200133254A1 (en) | 2018-05-07 | 2020-04-30 | Strong Force Iot Portfolio 2016, Llc | Methods and systems for data collection, learning, and streaming of machine signals for part identification and operating characteristics determination using the industrial internet of things |
EP3791236A4 (en) | 2018-05-07 | 2022-06-08 | Strong Force Iot Portfolio 2016, LLC | METHODS AND SYSTEMS FOR DATA COLLECTION, LEARNING AND STREAMING MACHINE SIGNALS FOR ANALYSIS AND MAINTENANCE USING THE INDUSTRIAL INTERNET OF THINGS |
US11528287B2 (en) | 2018-06-06 | 2022-12-13 | Reliaquest Holdings, Llc | Threat mitigation system and method |
CN112262412B (en) | 2018-06-13 | 2025-02-18 | 松下电器(美国)知识产权公司 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
CN112424833A (en) | 2018-07-13 | 2021-02-26 | 松下电器(美国)知识产权公司 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
US11429712B2 (en) * | 2018-07-24 | 2022-08-30 | Royal Bank Of Canada | Systems and methods for dynamic passphrases |
US11444957B2 (en) | 2018-07-31 | 2022-09-13 | Fortinet, Inc. | Automated feature extraction and artificial intelligence (AI) based detection and classification of malware |
CN109165249B (en) | 2018-08-07 | 2020-08-04 | 阿里巴巴集团控股有限公司 | Data processing model construction method and device, server and user side |
EP3620983B1 (en) | 2018-09-05 | 2023-10-25 | Sartorius Stedim Data Analytics AB | Computer-implemented method, computer program product and system for data analysis |
US10803174B2 (en) | 2018-09-15 | 2020-10-13 | Quantum Star Technologies LLC | Bit-level data generation and artificial intelligence techniques and architectures for data protection |
US10877752B2 (en) | 2018-09-28 | 2020-12-29 | Intel Corporation | Techniques for current-sensing circuit design for compute-in-memory |
US20200162280A1 (en) | 2018-11-19 | 2020-05-21 | Johnson Controls Technology Company | Building system with performance identification through equipment exercising and entity relationships |
US10703381B2 (en) | 2018-11-28 | 2020-07-07 | International Business Machines Corporation | Intelligent vehicle action decisions |
US11200318B2 (en) | 2018-12-28 | 2021-12-14 | Mcafee, Llc | Methods and apparatus to detect adversarial malware |
CN120525973A (en) | 2019-02-05 | 2025-08-22 | 松下电器(美国)知识产权公司 | Three-dimensional data encoding method, three-dimensional data decoding method, three-dimensional data encoding device, and three-dimensional data decoding device |
US11301569B2 (en) | 2019-03-07 | 2022-04-12 | Lookout, Inc. | Quarantine of software based on analysis of updated device data |
CN113615207A (en) | 2019-03-21 | 2021-11-05 | Lg电子株式会社 | Point cloud data transmitting device, point cloud data transmitting method, point cloud data receiving device, and point cloud data receiving method |
US20190272375A1 (en) | 2019-03-28 | 2019-09-05 | Intel Corporation | Trust model for malware classification |
EP3956906A4 (en) | 2019-04-16 | 2023-01-18 | International Medical Solutions, Inc. | METHODS AND SYSTEMS FOR SYNCHRONIZING MEDICAL IMAGES ON ONE OR MORE NETWORKS AND DEVICES |
US12405823B2 (en) | 2019-05-16 | 2025-09-02 | Nvidia Corporation | Resource sharing by two or more heterogeneous processing cores |
WO2020236981A1 (en) | 2019-05-20 | 2020-11-26 | Sentinel Labs Israel Ltd. | Systems and methods for executable code detection, automatic feature extraction and position independent code detection |
US11176692B2 (en) | 2019-07-01 | 2021-11-16 | Sas Institute Inc. | Real-time concealed object tracking |
US20210034813A1 (en) | 2019-07-31 | 2021-02-04 | 3M Innovative Properties Company | Neural network model with evidence extraction |
EP4024280A4 (en) | 2019-08-27 | 2022-11-16 | Anhui Cambricon Information Technology Co., Ltd. | Data processing method and apparatus, computer equipment, and storage medium |
US11693657B2 (en) | 2019-09-05 | 2023-07-04 | Micron Technology, Inc. | Methods for performing fused-multiply-add operations on serially allocated data within a processing-in-memory capable memory device, and related memory devices and systems |
EP4036810A4 (en) | 2019-09-24 | 2023-10-18 | Anhui Cambricon Information Technology Co., Ltd. | Neural network processing method and apparatus, computer device and storage medium |
US10824722B1 (en) | 2019-10-04 | 2020-11-03 | Intezer Labs, Ltd. | Methods and systems for genetic malware analysis and classification using code reuse patterns |
US11526655B2 (en) | 2019-11-19 | 2022-12-13 | Salesforce.Com, Inc. | Machine learning systems and methods for translating captured input images into an interactive demonstration presentation for an envisioned software product |
US11651075B2 (en) | 2019-11-22 | 2023-05-16 | Pure Storage, Inc. | Extensible attack monitoring by a storage system |
US11720714B2 (en) | 2019-11-22 | 2023-08-08 | Pure Storage, Inc. | Inter-I/O relationship based detection of a security threat to a storage system |
US11500788B2 (en) | 2019-11-22 | 2022-11-15 | Pure Storage, Inc. | Logical address based authorization of operations with respect to a storage system |
US11615185B2 (en) | 2019-11-22 | 2023-03-28 | Pure Storage, Inc. | Multi-layer security threat detection for a storage system |
US11755751B2 (en) | 2019-11-22 | 2023-09-12 | Pure Storage, Inc. | Modify access restrictions in response to a possible attack against data stored by a storage system |
US11941116B2 (en) | 2019-11-22 | 2024-03-26 | Pure Storage, Inc. | Ransomware-based data protection parameter modification |
US20200167258A1 (en) | 2020-01-28 | 2020-05-28 | Intel Corporation | Resource allocation based on applicable service level agreement |
US11544527B2 (en) | 2020-02-06 | 2023-01-03 | International Business Machines Corporation | Fuzzy cyber detection pattern matching |
US20220221966A1 (en) | 2021-01-14 | 2022-07-14 | Monday.com Ltd. | Digital processing systems and methods for dual mode editing in collaborative documents enabling private changes in collaborative work systems |
US20220269784A1 (en) | 2021-02-25 | 2022-08-25 | Quantum Star Technologies Inc. | N-dimensional model techniques and architectures for data protection |
US20230012220A1 (en) | 2021-07-07 | 2023-01-12 | Darktrace Holdings Limited | Method for determining likely malicious behavior based on abnormal behavior pattern comparison |
US11436330B1 (en) | 2021-07-14 | 2022-09-06 | Soos Llc | System for automated malicious software detection |
US20230084574A1 (en) | 2021-09-16 | 2023-03-16 | UncommonX Inc. | Bit sequence storage method and system |
GB2626472A (en) | 2021-10-11 | 2024-07-24 | Sophos Ltd | Augmented threat investigation |
WO2023091496A1 (en) | 2021-11-18 | 2023-05-25 | Rom Technologies, Inc. | System, method and apparatus for rehabilitation and exercise |
KR102424014B1 (en) | 2022-02-09 | 2022-07-25 | 주식회사 샌즈랩 | Apparatus for processing cyber threat information, method for processing cyber threat information, and medium for storing a program processing cyber threat information |
KR102437376B1 (en) | 2022-02-09 | 2022-08-30 | 주식회사 샌즈랩 | Apparatus for processing cyber threat information, method for processing cyber threat information, and medium for storing a program processing cyber threat information |
KR102420884B1 (en) | 2022-02-09 | 2022-07-15 | 주식회사 샌즈랩 | Apparatus for processing cyber threat information, method for processing cyber threat information, and medium for storing a program processing cyber threat information |
- 2021
  - 2021-02-25 US US17/185,884 patent/US20220269784A1/en not_active Abandoned
- 2022
  - 2022-02-23 WO PCT/US2022/017510 patent/WO2022182751A1/en active Application Filing
- 2024
  - 2024-03-08 US US18/600,516 patent/US12367282B2/en active Active
Non-Patent Citations (3)
Title |
---|
Han, KyoungSoo, BooJoong Kang, and Eul Gyu Im. "Malware analysis using visualized image matrices." The Scientific World Journal 2014 (2014). (Year: 2014) * |
NPL Search Terms (Year: 2021) * |
NPL Search Terms (Year: 2022) * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230385417A1 (en) * | 2018-09-15 | 2023-11-30 | Quantum Star Technologies Inc. | Coordinate-system-based data protection techniques |
US12367282B2 (en) | 2021-02-25 | 2025-07-22 | Quantum Star Technologies Inc. | Bit-level data extraction and threat detection |
US20220207133A1 (en) * | 2022-03-16 | 2022-06-30 | Intel Corporation | Cryptographic enforcement of borrow checking across groups of pointers |
US12039033B2 (en) * | 2022-03-16 | 2024-07-16 | Intel Corporation | Cryptographic enforcement of borrow checking across groups of pointers |
Also Published As
Publication number | Publication date |
---|---|
WO2022182751A1 (en) | 2022-09-01 |
US12367282B2 (en) | 2025-07-22 |
US20240330453A1 (en) | 2024-10-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11714908B2 (en) | Bit-level data generation and artificial intelligence techniques and architectures for data protection | |
US12367282B2 (en) | Bit-level data extraction and threat detection | |
JP7441582B2 (en) | Methods, devices, computer-readable storage media and programs for detecting data breaches | |
US10216934B2 (en) | Inferential exploit attempt detection | |
JP2023550974A (en) | Image-based malicious code detection method and device and artificial intelligence-based endpoint threat detection and response system using the same | |
US10354173B2 (en) | Icon based malware detection | |
AU2015270950A1 (en) | Real-time model of states of monitored devices | |
US11762991B2 (en) | Attack kill chain generation and utilization for threat analysis | |
US10623426B1 (en) | Building a ground truth dataset for a machine learning-based security application | |
US10255436B2 (en) | Creating rules describing malicious files based on file properties | |
US10839268B1 (en) | Artificial intelligence adversarial vulnerability audit tool | |
US10198576B2 (en) | Identification of mislabeled samples via phantom nodes in label propagation | |
Darus et al. | Android malware classification using xgboost on data image pattern | |
US20120151036A1 (en) | Identifying stray assets in a computing environment and responsively taking resolution actions | |
US12316661B2 (en) | Auto-detection of observables and auto-disposition of alerts in an endpoint detection and response (EDR) system using machine learning | |
CN116488909B (en) | A power Internet of Things network security protection method based on hierarchical expansion of data dimensions | |
CN118509192A (en) | Situation awareness processing method, electronic equipment, medium and program product | |
CN110413871B (en) | Application recommendation method and device and electronic equipment | |
US20250131087A1 (en) | Metadata processing techniques and architectures for data protection | |
US11966477B2 (en) | Methods and apparatus for generic process chain entity mapping | |
TWI849974B (en) | Method, system and computer program product for identifying threat relevancy | |
EP4206964A1 (en) | Methods and apparatus to implement a deterministic indicator and confidence scoring model | |
US20240427876A1 (en) | Exploitability prevention guidance engine | |
TW202409907A (en) | Neural network system and operation method for neural network system | |
CN117749418A (en) | Method, device, equipment and medium for judging and analyzing capability of network attack group |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: QUANTUM STAR TECHNOLOGIES INC., IDAHO Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:OETKEN, GARRETT THOMAS;STOLTENBERG, HENRY;REEL/FRAME:058943/0670 Effective date: 20220128 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |