
WO2025080884A1 - Methods and systems of scrutinizing questions, answers, and answer hints based on customizable criteria - Google Patents


Info

Publication number
WO2025080884A1
WO2025080884A1 (PCT/US2024/050838)
Authority
WO
WIPO (PCT)
Prior art keywords
security
user
user inputs
data
scores
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/050838
Other languages
French (fr)
Inventor
Matthew Vogel
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Reynold Vogel Inc
Original Assignee
Reynold Vogel Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Reynold Vogel Inc filed Critical Reynold Vogel Inc
Publication of WO2025080884A1 publication Critical patent/WO2025080884A1/en
Pending legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3236 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions
    • H04L9/3239 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using cryptographic hash functions involving non-keyed hash functions, e.g. modification detection codes [MDCs], MD5, SHA or RIPEMD
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q2220/00 Business processing using cryptography

Definitions

  • the present disclosure pertains to the field of cybersecurity, specifically within user authentication systems, and more particularly to methods and systems employing artificial intelligence for evaluating and improving the security of account recovery mechanisms.
  • the disclosure encompasses the analysis and improvement of security questions, corresponding answers, and answer hints used during account recovery processes. It introduces customizable evaluation criteria aimed at assessing the quality, robustness, and security of these elements, with the objective of strengthening authentication mechanisms across diverse online platforms, services, and institutions.
  • FIG. 1 illustrates an example of a deep neural network in which embodiments of the present technology may be implemented
  • FIG. 2 illustrates an example of training and deployment of a deep neural network
  • FIG. 3 illustrates an embodiment of a system architecture for evaluating and strengthening security of authentication systems
  • FIG. 4 illustrates an embodiment of a method for evaluating and strengthening security of authentication systems
  • FIG. 5 illustrates a flowchart of a method for evaluating and enhancing security entries using which embodiments of the present technology may be implemented
  • FIG. 6 illustrates an embodiment of an AI evaluation process to assess user-provided security entries, using which embodiments of the present technology may be implemented;
  • FIG. 7 illustrates a flowchart delineating the security scoring mechanism using which embodiments of the present technology may be implemented;
  • FIG. 8 illustrates an example data flow in which embodiments of the systems and methods for evaluating and strengthening security systems of authentication may be implemented;
  • FIG. 9 illustrates a diagram of AI models used in embodiments of the systems and methods for evaluating and strengthening security systems
  • FIG. 10 illustrates a feedback mechanism to provide users with immediate assessments and recommendations as may be used in embodiments of the systems and methods for evaluating and strengthening security systems
  • FIG. 11 illustrates an example of integration of embodiments of the systems and methods for evaluating and strengthening security systems with existing authentication infrastructure
  • FIG. 12 is a schematic diagram of various components of an illustrative data processing system.
  • FIG. 13 is a schematic representation of an illustrative distributed data processing system.
  • the present disclosure relates to methods and systems for enhancing the security of account recovery processes in digital systems.
  • the methods and systems utilize artificial intelligence (AI) to evaluate and scrutinize user-generated security questions, answers, and answer hints based on customizable criteria set by administrators or authorized users.
  • the methods and systems provide adaptive security evaluation, enhanced computational efficiency, and improved detection of vulnerabilities.
  • the AI system, in embodiments, continuously learns from new data and adjusts its evaluation criteria, offering up-to-date security assessments that adapt to emerging threats.
  • optimized algorithms, such as parallel processing techniques and efficient data structures, and hardware acceleration through devices like GPUs or TPUs may be employed to improve computational efficiency.
  • Advanced machine learning models, including deep neural networks and transformer architectures, identify complex patterns and subtle vulnerabilities, thereby enhancing the technical robustness of account recovery processes.
  • the methods and systems allow administrators to configure personalized assessment criteria through a secure interface. These criteria define standards for evaluating the complexity, uniqueness, and predictability of security questions, answers, and hints.
  • the AI system adapts to these criteria in real-time, ensuring that evaluations align with organizational security policies.
  • users create their security questions, answers, and hints via a user-friendly interface.
  • the AI system assesses these inputs in real-time against the customizable criteria, providing immediate feedback and suggestions to enhance security.
  • the assessment may consider factors such as entropy, guessability, and complexity, employing advanced Al models, including transformer-based architectures and natural language processing techniques.
  • the methods and systems may employ various AI algorithms and machine learning models, such as neural networks, decision trees, and clustering algorithms, to analyze user inputs. These models evaluate the security strength of the authentication elements and identify potential vulnerabilities.
  • the AI system may also dynamically adjust its assessment criteria based on emerging security threats and historical data.
  • the methods and systems may incorporate secure data handling practices, including encryption techniques and secure communication protocols, to protect user data during storage and transmission.
  • the system may also utilize blockchain technology to ensure the immutability of security configurations and audit trails.
  • the methods and systems provide real-time feedback to users, enabling them to refine their security questions, answers, and hints to meet the recommended security criteria. This approach enhances the overall security of account recovery processes by reducing the risk of unauthorized access and adapting to evolving security threats.
  • the methods and systems may leverage quantum computing technologies for complex cryptographic analysis, providing enhanced security and scalability.
  • edge computing could be incorporated to perform localized real-time analysis, reducing latency and improving the responsiveness of the security assessments. Integration with decentralized identity systems (such as DID) and self-sovereign identity frameworks could also ensure secure, user-controlled data management.
  • aspects of the systems and methods for evaluating and strengthening security of authentication systems may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, and the like), or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a “circuit,” “module,” and/or “system.”
  • aspects of the systems and methods for evaluating and strengthening security of authentication systems may take the form of a computer program product embodied in a computer-readable medium (or media) having computer-readable program code/instructions embodied thereon.
  • embodiments may include the use of secure multiparty computation (SMPC) for distributed processing of sensitive data without exposing it to any single party, enhancing privacy and security.
  • the methods and systems could also leverage federated learning to train Artificial Intelligence (AI) models across decentralized data sources, preserving privacy while improving the security of recovery methods.
  • Blockchain technology may be employed to create immutable records of security questions, answers, and hints, ensuring transparency and auditability.
  • Computer-readable media can be a computer-readable signal medium and/or a computer-readable storage medium.
  • a computer-readable storage medium may include an electronic, magnetic, optical, electromagnetic, infra-red (IR), and/or semiconductor system, apparatus, or device, or any suitable combination of these.
  • a computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, a solid-state drive, a non-volatile memory express (NVMe) drive, and/or any suitable combination of these and/or the like.
  • Quantum storage mediums may also be leveraged in future embodiments to further enhance the capacity and speed of storage systems.
  • a computer-readable storage medium may include any suitable tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, audible, and/or any suitable combination thereof.
  • a computer-readable signal medium may include any computer-readable medium that is not a computer-readable storage medium and that is capable of communicating, propagating, or transporting a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Signal mediums may utilize 5G, 6G, or other advanced wireless communication technologies to enable faster and more secure transmission of data.
  • Quantum communication channels, leveraging quantum entanglement, may also be used for secure, high-speed data transmission in future embodiments.
  • Terahertz wave communications may be utilized as part of next-generation wireless data propagation technologies, allowing greater data bandwidth and transmission speeds.
  • Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, radio frequency (RF), and/or the like, and/or any suitable combination of these.
  • Program code may be transmitted using advanced communication methods such as millimeter-wave technology, enabling high-speed data transfer over short distances, or via satellite communication for global coverage in remote or inaccessible areas.
  • the methods and systems may implement quantum key distribution (QKD) to secure data transmissions, ensuring program code integrity and confidentiality during transmission.
  • Computer program code for carrying out operations for aspects of the systems and methods for evaluating and strengthening security of authentication systems may be written in one or any combination of programming languages, including an object-oriented programming language such as Python, Java, Smalltalk, C++, C#, Swift, Rust, Kotlin, and/or the like, and conventional procedural programming languages, such as the C programming language.
  • the program code may execute entirely on a user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN), and/or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • an Internet Service Provider may include, for example, AT&T, MCI, Sprint, etc.
  • the systems and methods may also employ serverless architectures, such as Function as a Service (FaaS) platforms such as AWS Lambda, to execute program code dynamically in a distributed cloud environment, optimizing resource usage.
  • Aspects of the systems and methods for evaluating and strengthening security of authentication systems are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatuses, systems, and/or computer program products.
  • Each block and/or combination of blocks in a flowchart and/or block diagram may be implemented by computer program instructions.
  • the computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • the methods and systems may utilize neuromorphic processors for efficient and adaptive processing of security-related tasks.
  • Quantum processors may also be integrated for executing complex cryptographic operations at exponentially faster speeds. Moreover, the methods and systems may benefit from the deployment of AI accelerators or tensor processing units (TPUs), which are specialized hardware designed to optimize AI-related computations, further enhancing the speed and accuracy of AI-driven security assessments.
  • These computer program instructions also can be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, and/or other device to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions also can be loaded onto a computer, other programmable data processing apparatus, and/or other device to cause a series of operational steps to be performed on the device to produce a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • these instructions may be executed in a distributed computing environment using edge computing, where processing occurs closer to the data source, reducing latency. Execution of these instructions may be optimized using AI accelerators, such as TPUs, enhancing the speed and efficiency of AI-driven processes.
  • Quantum computing architectures may also be employed for handling complex computations faster than traditional systems, further enhancing the system’s capabilities.
  • each block may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the drawings.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block and/or combination of blocks may be implemented by special purpose hardware-based systems (or combinations of special purpose hardware and computer instructions) that perform the specified functions or acts.
  • the blocks may be executed in parallel across distributed cloud infrastructures, leveraging serverless architectures to dynamically allocate computational resources based on real-time demand.
  • Quantum processors may also be utilized to perform concurrent executions of logical functions, optimizing the system’s computational throughput for high-complexity tasks.
  • the technical solution provided by the disclosed methods and systems addresses vulnerabilities in traditional account recovery processes by utilizing, among other techniques, advanced AI algorithms to dynamically evaluate and enhance the security of user-generated authentication elements.
  • the AI-driven system implements natural language processing (NLP) and machine learning techniques to analyze security questions, answers, and hints in real-time, ensuring they meet customizable security criteria established by administrators.
  • the methods and systems described herein bypass traditional password analysis entirely; instead, they rely on evaluating security questions, answers, and hints through customizable criteria, eliminating the need for passwords at any stage of the user authentication process.
  • the systems and methods process user inputs by first performing data preprocessing steps such as tokenization, normalization, and vectorization.
  • Tokenization involves breaking down text into individual words or phrases (tokens), while normalization standardizes the text by converting it to lowercase and removing punctuation.
  • Vectorization then transforms the tokens into numerical representations that the AI algorithms can process.
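The tokenization, normalization, and vectorization steps above can be sketched in a few lines of Python. The function names, stopword list, and bag-of-words vocabulary here are illustrative assumptions, not details from the disclosure:

```python
import re
from collections import Counter

STOPWORDS = {"the", "is", "of", "a", "your"}  # assumed toy stopword list

def tokenize(text):
    # Break text into lowercase word tokens (a simplified stand-in for a real tokenizer).
    return re.findall(r"[a-z0-9']+", text.lower())

def normalize(tokens):
    # Drop stopwords so only the security-relevant terms remain.
    return [t for t in tokens if t not in STOPWORDS]

def vectorize(tokens, vocab):
    # Bag-of-words counts over a fixed vocabulary: the numeric form a model can process.
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

question = "What is the name of your favorite childhood teacher?"
tokens = normalize(tokenize(question))
vocab = sorted(set(tokens))
vector = vectorize(tokens, vocab)
# tokens -> ['what', 'name', 'favorite', 'childhood', 'teacher']
```

A production system would replace the regex tokenizer with a subword tokenizer and the raw counts with learned embeddings, but the pipeline shape is the same.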
  • the system employs NLP to understand and interpret the semantic meaning of security questions, answers, and hints. In some embodiments, this process includes tokenization, where textual input is broken down into individual words or subwords (tokens) using subword-level tokenizers such as Byte Pair Encoding (BPE) or SentencePiece.
  • tokenizers enable the model to more effectively handle rare or out-of-vocabulary terms and phrases.
  • the question “What is the name of your favorite childhood teacher?” may be tokenized into [“What”, “is”, “the”, “name”, “of”, “your”, “favorite”, “childhood”, “teacher”, “?”].
  • common stopwords are filtered out to ensure the model focuses on the core semantic meaning of security questions.
  • in some embodiments, this is done using term frequency-inverse document frequency (TF-IDF) analysis, which identifies and removes low-information words such as “the,” “is,” and “and,” allowing the focus to remain on security-relevant content.
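As a rough illustration of how TF-IDF can surface low-information words, the toy corpus and threshold below are invented for the example; only the inverse-document-frequency idea comes from the text:

```python
import math

# Toy corpus of tokenized security questions (invented for illustration).
corpus = [
    ["what", "is", "the", "name", "of", "your", "first", "pet"],
    ["what", "is", "the", "model", "of", "your", "first", "car"],
    ["where", "did", "you", "meet", "your", "spouse"],
]

def idf(term):
    # Inverse document frequency: words in (almost) every document score near zero.
    df = sum(term in doc for doc in corpus)
    return math.log(len(corpus) / df) if df else 0.0

def low_information(term, threshold=0.2):
    # Candidates for stopword removal: terms whose idf falls below the threshold.
    return idf(term) < threshold

# "your" appears in every question, so it carries almost no signal;
# "pet" appears once, so it is security-relevant and kept.
```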
  • tokens undergo normalization, which includes converting text to lowercase, removing punctuation, and eliminating stopwords. Stemming and lemmatization may also be utilized to reduce words to their root forms, enabling the system to recognize variations of the same word — e.g., “running,” “runs,” and “ran” are reduced to “run.”
  • PII: personally identifiable information; NER: Named Entity Recognition
  • Additional preprocessing steps include creating context-sensitive feature vectors that encode the susceptibility of a QA pair to social engineering attacks or violation of best practices in security design. This is accomplished by embedding contextual clues and risk indicators, such as identifying whether a question could be easily answered via public data sources or if it lacks sufficient complexity, thus enabling the model to flag weak security questions.
  • the inputs are analyzed using machine learning models such as Transformer-based architectures, including Bidirectional Encoder Representations from Transformers (BERT) or Generative Pretrained Transformer (GPT) models.
  • These models leverage self-attention mechanisms to understand the context and semantic relationships within the text.
  • the AI can detect if a security question is too generic or if an answer is easily guessable based on common knowledge.
  • the model recognizes the significance of “Paris” as a location and its association with “springtime” to assess uniqueness and complexity.
  • FIG. 1 illustrates an example deep neural network in which embodiments of the present technology may be implemented.
  • the deep neural network (DNN) 100 includes an input layer 101, a plurality of hidden layers 102, and an output layer 103.
  • the DNN 100 is a deep auto-encoder neural network (deep ANN) or a convolutional neural network (CNN).
  • the DNN 100 has two hidden layers 102, although it is understood that alternative embodiments may have any number of two or more hidden layers.
  • Each layer 101 to 103 may have one or more nodes (represented by circles in the diagrammatic network).
  • each node in a current layer is connected to every other node in a previous layer and a next layer. This is referred to as a fully-connected neural network.
  • Other neural network structures are also possible in alternative embodiments of the DNN 100, in which not every node in each layer is connected to every node in the previous and next layers.
  • Each node in the input layer 101 can be assigned a value and output that value to every node in the next layer (e.g., hidden layer).
  • the nodes in the input layer 101 can represent features about a particular environment or setting.
  • a DNN 100 used for classifying whether an object is a rectangle may have an input node representing whether the object has flat edges.
  • assigning a value of 1 to the node may represent that the object does have flat edges and assigning a value of 0 to the node may represent that the object does not have flat edges.
  • a DNN 100 takes an image as input.
  • the input nodes may each represent a pixel of the image, such as a pixel of a training image, where the assigned value may represent the intensity of the pixel.
  • an assigned value of 1 may indicate that the pixel is completely black and an assigned value of 0 may indicate that the pixel is completely white.
  • Each node in the hidden layers 102 can receive an outputted value from nodes in a previous layer (e.g., input layer) and associate each of the nodes in the previous layer with a weight. Each hidden node can then multiply each of the received values from the nodes in the previous layer with the weight associated with the nodes in the previous layer and output the sum of the products to each node in the next layer.
  • Nodes in the output layer 103 handle input values received from the nodes in the hidden layer 102 in a similar fashion.
  • each output node in the output layer 103 may multiply each input value received from each node in the previous layer (e.g., hidden layer) with a weight and sum the products to generate an output value.
  • the output value of each output node can output information in a predefined format, where the information has some relationship to the corresponding information from the previous layer.
  • Example outputs may include, but are not limited to, classifications, relationships, measurements, instructions, and recommendations.
  • a DNN 100 that classifies whether the object is an ellipse, where an outputted value of 1 from the output node represents that the object is an ellipse and an outputted value of 0 represents that the object is not an ellipse. While the examples provided relate to classifying geometric shapes, this is only for illustrative purposes.
  • the output nodes can also be used to classify any of a wide variety of objects and other features and otherwise output any of a wide variety of desired information in desired formats.
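The per-node computation described for the hidden and output layers (multiply each incoming value by a weight, then sum the products) can be written directly. The layer sizes and weights below are arbitrary illustrations; a real network would also add biases and nonlinear activations:

```python
def layer_forward(inputs, weights):
    # Each node multiplies every value from the previous layer by its weight
    # and outputs the sum of the products, as described above.
    return [sum(w * x for w, x in zip(node_weights, inputs))
            for node_weights in weights]

# Illustrative 3-2-1 fully-connected network with made-up weights.
hidden_weights = [[0.5, -0.2, 0.1],
                  [0.3, 0.8, -0.5]]
output_weights = [[1.0, 0.5]]

x = [1.0, 0.0, 1.0]                        # e.g. binary feature indicators
hidden = layer_forward(x, hidden_weights)  # [0.6, -0.2]
out = layer_forward(hidden, output_weights)
```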
  • the systems and methods also employ Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units to capture sequential dependencies in user inputs.
  • RNNs Recurrent Neural Networks
  • LSTM Long Short-Term Memory
  • the LSTM units help retain information over longer input sequences, which is beneficial when evaluating lengthy security questions, answers, or hints.
  • the LSTM networks help identify patterns that could compromise security by revealing too much information. For example, the system can detect if an answer hint such as “My first car’s make and model” reveals too much information about the answer “1967 Ford Mustang.”
  • the AI modules assign security scores to each user input based on factors such as uniqueness, complexity, entropy, and predictability, as discussed in further detail below. Entropy calculations measure the randomness and unpredictability of the input, with higher entropy indicating stronger security.
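Entropy scoring of this kind can be approximated with a character-level Shannon entropy. The formula is standard, though the disclosure does not specify this exact calculation:

```python
import math
from collections import Counter

def shannon_entropy_bits(text):
    # Per-character Shannon entropy times length: total bits of unpredictability.
    counts = Counter(text)
    n = len(text)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n

weak = shannon_entropy_bits("aaaa")               # 0.0: completely predictable
strong = shannon_entropy_bits("1967 Ford Mustang")
# strong > weak: the longer, more varied answer scores as less guessable
```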
  • the system compares these scores against the customizable criteria set by administrators, which may include thresholds (e.g., minimum complexity requirements), prohibited words, required entropy levels, and more.
  • the system provides real-time feedback and actionable suggestions to the user through the interface. For instance, it may recommend adding specific details to a security question or using a combination of words and numbers in an answer to increase complexity. This interactive process ensures that the final authentication elements are both secure and user-friendly.
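A minimal sketch of such a criteria check follows; the field names (min_length, prohibited, require_digit) are invented stand-ins for administrator-configured criteria:

```python
def evaluate_answer(answer, criteria):
    # Compare one user input against the configured criteria and return
    # actionable suggestions; an empty list means the input passes.
    suggestions = []
    if len(answer) < criteria["min_length"]:
        suggestions.append(f"Use at least {criteria['min_length']} characters.")
    if answer.lower() in criteria["prohibited"]:
        suggestions.append("This answer is too common; choose something less guessable.")
    if criteria.get("require_digit") and not any(ch.isdigit() for ch in answer):
        suggestions.append("Add a number to increase complexity.")
    return suggestions

criteria = {"min_length": 8, "prohibited": {"password", "fluffy"}, "require_digit": True}
weak_feedback = evaluate_answer("fluffy", criteria)           # three suggestions
ok_feedback = evaluate_answer("1967 Ford Mustang", criteria)  # passes: []
```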
  • the disclosed methods and systems offer a technical solution that enhances the security of account recovery processes.
  • the dynamic assessment and continuous adaptation capabilities of the AI models address the shortcomings of static security measures, providing robust protection against unauthorized access and evolving cyber threats.
  • FIG. 2 illustrates an example of training and deployment of a deep neural network.
  • the neural network is trained using a training dataset 202.
  • initial weights may be chosen randomly or by pre-training using a deep belief network.
  • the training cycle can then be performed in either a supervised or unsupervised manner.
  • Supervised learning uses a training set to teach models to yield the desired output.
  • the training dataset 202 includes inputs and desired outputs, which allow the model to learn over time, or when the training dataset 202 includes input having known output and the output of the neural network is manually graded.
  • the network processes the inputs and compares the resulting outputs against a set of expected or desired outputs. Errors are then propagated back through the system.
  • the training framework 204 can adjust to change the weights that control the untrained neural network 206.
  • the training framework 204 can provide tools to monitor how well the untrained neural network 206 is converging towards a model suitable for generating correct answers based on known input data.
  • the training process repeatedly occurs as the network weights are adjusted to refine the output generated by the neural network.
  • the training process can continue until the neural network reaches a statistically desired accuracy associated with a trained neural network 208.
  • the trained neural network 208 can then be deployed to implement any number of machine learning operations to output a result 214.
  • Supervised learning is typically separated into two types of problems — classification and regression.
  • Classification uses an algorithm to assign test data accurately into specific categories.
  • Regression is used to understand the relationship between dependent and independent variables.
  • Numerous different algorithms and computation techniques can be used in supervised machine learning, including but not limited to, neural networks, naive bayes, linear regression, logistic regression, support vector machines (SVM), k-nearest neighbor, and random forest.
  • Unsupervised learning is a learning method in which the network uses algorithms to analyze and cluster unlabeled data. These algorithms discover hidden patterns or data groupings. Therefore, the training dataset 202 includes input data without any associated output data.
  • the untrained neural network 206 can learn groupings within the unlabeled input and determine how individual inputs relate to the overall dataset.
  • Unsupervised training can be used for three main tasks — clustering, association, and dimensionality reduction.
  • Clustering is a data mining technique that groups unlabeled data based on similarities and differences. This technique is often used to process raw, unclassified data objects into groups represented by structures or patterns in the information. Association is a rule-based method for finding relationships between variables in a given dataset. This method is often used for market basket analysis. Dimensionality reduction is used when a given dataset's number of features (dimensions) is too high. This technique is commonly used in the preprocessing of data.
  • Variations of supervised and unsupervised training may also be employed.
  • Semi-supervised learning is a technique in which the training dataset 202 includes a mix of labeled and unlabeled data of the same distribution.
  • Incremental learning is a variant of supervised learning in which input data is continuously used to train the model further. Incremental learning enables the trained neural network 208 to adapt to the new data 212 without forgetting the knowledge instilled within the network during initial training.
  • LLMs (Language Learning Models)
  • the LLMs are trained on a preprocessed (described below), highly specialized, proprietary dataset that focuses on security-related question-and-answer (QA) pairs.
  • the dataset includes publicly available security questions from password recovery and authentication systems, as well as proprietary data sourced from real-world security environments such as penetration testing scenario logs, real-world phishing attack attempts, and anonymized user data.
  • Unique constraints may be applied during training to improve the LLMs' ability to distinguish between effective and ineffective security questions, answers, and hints.
  • the model receives higher rewards for identifying and promoting questions that demonstrate strong security principles, such as being non- replicable or incorporating contextual, time-sensitive knowledge.
  • This reward system encourages the LLM to favor high-entropy, hard-to-guess answers.
  • This combination of adversarial training and reward shaping ensures the LLM can generate and optimize security questions, answers, and hints that resist social engineering and maintain high levels of entropy, providing robust protection against common security threats.
  • the LLMs are trained on high-entropy, complex questions, answers, and hints that require deeper personal knowledge.
  • the training process filters out biased questions, such as those that disproportionately affect certain demographics or rely on cultural or regional knowledge.
  • the AI system employs unsupervised learning techniques, such as K-Means clustering, to identify common patterns and group similar security answers.
  • By clustering inputs, the system detects when users select popular or easily guessable answers, such as common pet names or birthplaces. If a user's answer falls into a cluster of high-risk responses, the system flags it and prompts the user to choose a more unique answer.
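The clustering step described above can be sketched with a minimal pure-Python K-Means. The feature choice (answer length plus character-level entropy), the sample answer lists, and the two-cluster setup are illustrative assumptions, not details taken from the disclosure:

```python
import math
import random

def features(answer):
    # illustrative features: length and per-character Shannon entropy
    counts = {}
    for ch in answer.lower():
        counts[ch] = counts.get(ch, 0) + 1
    n = len(answer)
    entropy = -sum(c / n * math.log2(c / n) for c in counts.values()) if n else 0.0
    return (float(n), entropy)

def _dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def _mean(points):
    return tuple(sum(xs) / len(xs) for xs in zip(*points))

def kmeans(points, k, iters=20, seed=0):
    # Lloyd's algorithm: assign each point to its nearest centroid, then recompute
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: _dist2(p, centroids[i]))].append(p)
        centroids = [_mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def assign(point, centroids):
    return min(range(len(centroids)), key=lambda i: _dist2(point, centroids[i]))

# hypothetical training data: known-common answers plus stronger examples
COMMON = ["fluffy", "rex", "buddy", "max", "shadow"]
STRONG = ["purple giraffe eats 42 pancakes", "my first chemistry teacher's parrot"]

def build_risk_model(k=2):
    centroids = kmeans([features(a) for a in COMMON + STRONG], k)
    # clusters containing any known-common answer are marked high-risk
    risky = {assign(features(a), centroids) for a in COMMON}
    return centroids, risky

def is_high_risk(answer, centroids, risky):
    return assign(features(answer), centroids) in risky
```

A production system would cluster far richer representations than two hand-picked features; the sketch only shows how assignment to a "risky" cluster can trigger the prompt for a more unique answer.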
  • the AI models are trained on extensive datasets that include examples of strong and weak security elements. In other embodiments, the AI models are trained exclusively on proprietary weak security elements. These datasets, in embodiments, comprise publicly available linguistic corpora, lists of common passwords, breached data samples, and synthetically generated inputs to cover a wide range of possible user responses. Supervised learning techniques are used, where the models learn to associate certain input patterns with security risk levels based on labeled training data.
  • the outputs from the various AI models are integrated using ensemble methods.
  • the system may use weighted averaging to combine the scores from the Transformer models, RNN-LSTM networks, and entropy calculations.
  • Other embodiments utilize different factors to determine the security score as described in further detail below.
  • Yet other embodiments employ a majority voting system, where an input is flagged if a majority of models identify it as high risk. This integration facilitates a comprehensive assessment by leveraging the strengths of each model.
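A minimal sketch of the two integration strategies described above, weighted averaging and majority voting. The model names, scores, and weights are invented for illustration:

```python
def weighted_average(scores, weights):
    # combine per-model risk scores; weights need not sum to 1
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total

def majority_flag(flags):
    # flag the input as high risk if a strict majority of models flag it
    return sum(1 for f in flags if f) > len(flags) / 2

# hypothetical model outputs and administrator-set weights
scores = {"transformer": 0.82, "rnn_lstm": 0.74, "entropy": 0.60}
weights = {"transformer": 0.5, "rnn_lstm": 0.3, "entropy": 0.2}
combined = weighted_average(scores, weights)  # 0.5*0.82 + 0.3*0.74 + 0.2*0.60 = 0.752
```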
  • Administrators have the ability to adjust the parameters and thresholds within the methods and systems through customizable criteria settings. For instance, they can set a minimum security score, minimum entropy levels, minimum complexity levels, minimum predictability or guessability levels, specify prohibited words or phrases, adjust the sensitivity of pattern detection in clustering algorithms, adjust the weight of each factor, and more. This customization allows the system to align with organizational security policies and adapt to emerging threats.
  • FIG. 3 illustrates an embodiment of a system architecture 300 for evaluating and strengthening security of authentication systems.
  • the system 300 comprises several interconnected components that work together to enhance the security of account recovery processes using AI.
  • User devices 302 as depicted represent the devices operated by end-users. These devices can be personal computers, smartphones, or tablets through which users create security questions, answers, and hints.
  • the user interface module 304 is a software interface that facilitates user interaction with the system. It is connected to user devices 302 via bidirectional communication links, indicating data exchange between users and the system. The user interface module 304 sends user inputs to the AI evaluation module 310.
  • Administrator devices 306 are utilized by administrators to configure and manage customizable assessment criteria. They communicate bidirectionally with the administrator interface module 308. This secure interface 308 allows administrators to set and adjust security parameters. It sends configuration data to the AI evaluation module 310.
  • the AI evaluation module 310 processes and evaluates user inputs using AI algorithms and machine learning models as disclosed herein. It receives user inputs from the user interface module 304 and assessment criteria from the administrator interface module 308. It communicates with the data preprocessing module 312. The data preprocessing module 312 performs initial processing of user inputs, such as tokenization, normalization, and vectorization. It receives data from the AI evaluation module 310 and returns processed data. It also interacts with the database 314.
  • the feedback module 316 generates real-time feedback and recommendations for users based on AI evaluation results. It receives evaluation results from the AI evaluation module 310 and sends feedback to users via the user interface module 304.
  • the encrypted database 314 securely stores user inputs, assessment criteria, and historical data. It communicates bidirectionally with both the AI evaluation module 310 and the data preprocessing module 312. All components are interconnected via the communications network 318, which facilitates secure communication encrypted using protocols as described herein.
  • the disclosed methods and systems offer several technical advantages.
  • the system adapts to new security threats in real time. For instance, when new patterns of attacks are detected, the AI models update their evaluation criteria without the need for manual reconfiguration.
  • the architecture supports scalable processing capabilities, handling large volumes of user data efficiently.
  • the use of distributed computing and parallel processing techniques allows the system to maintain high performance even under heavy load.
  • the integration of multiple AI models, such as Transformer models, RNN-LSTM networks, and clustering algorithms, enables a comprehensive analysis of authentication elements, reducing false positives and negatives in risk assessment. Real-time feedback and suggestions help users create stronger security questions and answers without significant delays, improving the usability of the system.
  • implementation of the methods and systems disclosed herein includes optimized algorithms for text processing and analysis.
  • the use of byte-pair encoding (BPE) in tokenization reduces the computational load and improves processing speed.
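The disclosure does not give its BPE implementation; the following is a textbook merge-learning loop on a toy corpus, shown only to illustrate how BPE builds subword tokens:

```python
from collections import Counter

def merge_word(word, pair):
    # replace every adjacent occurrence of `pair` with the fused symbol
    out, i = [], 0
    while i < len(word):
        if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
            out.append(word[i] + word[i + 1])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return tuple(out)

def learn_bpe(words, num_merges):
    # repeatedly fuse the most frequent adjacent symbol pair in the corpus
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = Counter({merge_word(w, best): f for w, f in vocab.items()})
    return merges

def tokenize(word, merges):
    # apply the learned merges, in order, to a new word
    tokens = tuple(word)
    for pair in merges:
        tokens = merge_word(tokens, pair)
    return tokens

merges = learn_bpe(["low", "low", "lower", "newest", "newest"], num_merges=2)
```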
  • the system, in embodiments, utilizes hardware accelerators such as GPUs or TPUs to enhance computational efficiency, enabling faster analysis of user inputs.
  • Advanced encryption techniques and secure communication protocols ensure that user data is processed and stored securely, addressing technical concerns related to data privacy and security.
  • This present disclosure enables users to create and manage account recovery questions, answers, and hints through a user-friendly interface. As users generate recovery information, the system immediately submits it for real-time AI evaluation. AI modules and models, as discussed herein, assess the security of the questions, answers, and hints against customizable criteria set by the user, strengthening them against potential attacks like guessing or brute-force attempts. This flexible yet rigorous approach to account recovery significantly reduces the risk of unauthorized access to sensitive information.
  • FIG. 4 illustrates an embodiment of a system and method for evaluating and strengthening security of authentication systems 400. The process begins with step 420, where administrators configure the AI system with personalized assessment criteria.
  • they define standards for evaluating aspects such as question complexity, specificity, answer length, and hint obviousness. These criteria are securely stored using advanced encryption techniques and can be adjusted in real time to meet the organization's unique security needs.
  • authenticated users access a secure platform to create their own security questions, answers, and hints.
  • the user interface, accessible via web or mobile applications, provides prompts or guidelines to help users generate memorable yet secure authentication elements, which can include text, images, sounds, or other media.
  • the system leverages natural language processing algorithms to assist users in crafting content that is both personally significant and resilient against unauthorized access attempts.
  • the AI system retrieves the user-generated questions, answers, and hints from the secure, encrypted database and evaluates them against the predefined security criteria established in step 420.
  • Advanced algorithms analyze factors such as complexity, uniqueness, predictability, and adherence to administrator-defined standards. Techniques like graph theory, semantic analysis, and machine learning models — including transformer architectures such as GPT — are employed to assess the security and effectiveness of the user inputs.
  • at step 480, the system delivers immediate feedback to users based on the AI's assessment.
  • This feedback is integrated into the user interface and may include pass/fail results, numerical scores, or actionable suggestions for improvement. Users can then refine their inputs to better align with the security standards, actively participating in enhancing their account recovery security. This dynamic interaction ensures that the authentication elements are both memorable to the user and meet stringent security criteria.
  • the system maintains secure data handling and storage by utilizing advanced encryption techniques and may employ blockchain technology for immutable records of changes, ensuring data integrity and confidentiality.
  • the AI system can adapt assessment criteria dynamically based on real-time threat intelligence and historical security data, using advanced machine learning algorithms like deep learning neural networks. This continuous learning approach ensures that security standards evolve to counter emerging threats effectively.
  • FIG. 5 illustrates a flowchart 500 that outlines a method for evaluating and enhancing security entries.
  • the process commences at step 502, initiating the method, and subsequently proceeds to configuring the AI with assessment criteria at step 504. Administrators define and set customizable assessment criteria through the administrative interface, after which these criteria are securely stored within the database at step 506.
  • Following the configuration and storage of criteria, users undergo authentication at step 508 to verify their identities prior to submitting their security entries at step 510. The submitted security questions, answers, and hints are then preprocessed at step 512, involving tokenization, normalization, and vectorization of the input data.
  • the AI module evaluates the preprocessed inputs based on the established assessment criteria at step 514, subsequently computing security scores at step 516. These scores assess factors such as entropy, guessability, and complexity, and are then compared against predefined thresholds at step 518.
  • the security entries are securely stored in the database at step 528, and the process concludes at step 530. Conversely, if the scores do not meet the thresholds at step 522, the system generates real-time feedback at step 524, providing users with suggestions and recommendations for improving their security entries. Users are then prompted to revise their entries at step 526, and the revised entries are resubmitted to the input stage at step 510. This creates an iterative loop, allowing users to continuously enhance their security entries until they satisfy the required security standards, thereby ensuring robust protection of user information.
  • FIG. 6 illustrates an embodiment of the AI evaluation process 600 used to assess user-provided security entries, detailing the internal components and data flow.
  • the process begins with user inputs 602, where users submit their security questions, answers, and hints via the user interface. These inputs are then processed by the data preprocessing module 604, which performs essential tasks such as tokenization, normalization, vectorization, stopword filtering, stemming, and lemmatization to prepare the data for analysis.
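The preprocessing tasks named above can be sketched as follows. The stopword list and regex are illustrative, and stemming/lemmatization are omitted since they normally rely on an external NLP library:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "is", "my", "in", "to"}  # illustrative subset

def preprocess(text):
    # normalization (lowercase, strip punctuation), then tokenization
    # and stopword filtering
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def vectorize(tokens, vocabulary):
    # bag-of-words vectorization over a fixed vocabulary
    counts = Counter(tokens)
    return [counts.get(term, 0) for term in vocabulary]

tokens = preprocess("What is the name of my FIRST pet?")   # ['what', 'name', 'first', 'pet']
vector = vectorize(tokens, ["pet", "name", "school"])      # [1, 1, 0]
```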
  • the feature extraction module 606 extracts critical features relevant to security evaluation, including measures of entropy, complexity, and predictability. These extracted features are then passed to the AI models 608, which encompass transformer models 610 for semantic analysis, RNN-LSTM networks 612 for analyzing sequential dependencies, and clustering algorithms 614 for detecting patterns and grouping similar inputs to identify common or easily guessable entries.
  • the evaluation metrics computation 616 component calculates security scores based on the extracted features, focusing in part on entropy, guessability, and complexity. These computed metrics are then compared against the Assessment Criteria 618, which are customizable parameters set by administrators to define the required security standards.
  • the Decision Module 620 evaluates whether the security entries meet the established thresholds. If the scores meet or exceed the thresholds, the entries are approved and stored in the secure database 624. Conversely, if the scores do not meet the thresholds, the process directs the flow to the feedback generation module 622, where real-time feedback and suggestions for improvement are created and sent back to the user interface.
  • the AI models 608 (transformer models 610, RNN-LSTM networks 612, and clustering algorithms 614) operate collectively, ensuring a cohesive analysis of the input data.
  • the secure database 624 securely stores approved security entries, maintaining data integrity and confidentiality through advanced encryption techniques. This AI evaluation process ensures that user-generated security elements are thoroughly analyzed for their robustness and adherence to security standards.
  • FIG. 7 illustrates a flowchart 700 that delineates the security scoring mechanism employed, in embodiments, to compute the final security score (S) for user-provided security entries.
  • the mechanism integrates multiple factors, including entropy, guessability, and complexity scores, each contributing to the overall assessment of security robustness.
  • the process commences with the entropy calculation module 702, which calculates the entropy score by measuring the randomness and unpredictability of the security entries. This entropy score is then forwarded to the security score computation module 724. Concurrently, the guessability analysis module 704 determines the guessability score (G), evaluating how easily an attacker might predict the security answers. This module, in embodiments, encompasses three sub-modules: the public knowledge factor 706, which assesses the availability of the information through public records or social media; pattern detection 708, which identifies predictable sequences or repeated characters; and memory-based predictability 710, which evaluates the likelihood of an answer being guessed based on common knowledge or social engineering techniques. The resulting guessability score (G) is transmitted to the security score computation module 724.
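The entropy calculation (module 702) and pattern detection (sub-module 708) can be sketched as follows. The scoring scale, the penalty values, and the common-answer shortcut are illustrative assumptions:

```python
import math
from collections import Counter

def entropy_bits(answer):
    # total character-level Shannon entropy: per-character entropy times length
    n = len(answer)
    if n == 0:
        return 0.0
    counts = Counter(answer)
    per_char = -sum(c / n * math.log2(c / n) for c in counts.values())
    return per_char * n

def pattern_penalty(answer):
    # pattern detection: repeated characters and ascending runs raise guessability
    penalty = 0.0
    for a, b in zip(answer, answer[1:]):
        if a == b:
            penalty += 1.0        # repeats such as "aaaa"
        elif ord(b) - ord(a) == 1:
            penalty += 0.5        # sequences such as "abcd" or "1234"
    return penalty

def guessability(answer, common=("password", "fluffy", "123456")):
    # 0.0 (hard to guess) .. 1.0 (trivially guessable); scaling is illustrative
    if answer.lower() in common:
        return 1.0
    return max(0.0, min(1.0, (pattern_penalty(answer) + 1.0) / (entropy_bits(answer) + 1.0)))
```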
  • the complexity assessment module 712 computes the complexity score (C) by analyzing the uniqueness and sophistication of the security questions.
  • This module includes sub-modules such as the uniqueness function 714, which measures the distinctiveness of the question compared to a database of known security questions; the number of unique words 716, which counts the unique words used to assess complexity; the entropy of words 718, which analyzes the unpredictability of word choices; and time-based uniqueness 720, which considers elements that change over time to enhance security.
  • the complexity score (C) is then sent to the security score computation module 724.
  • the weight assignment module 722 assigns specific weights (α, β, γ — which may be managed by administrators) to each of the calculated scores — entropy (H), guessability (G), and complexity (C) — thereby adjusting their influence on the final security score (S). These weights are conveyed to the security score computation module 724.
  • the security score computation module 724 aggregates the weighted scores and calculates the final security score (S). This computed score is then passed to the threshold comparison module 726.
  • the threshold comparison module 726 evaluates whether the final security score meets or exceeds predefined security thresholds. If the score satisfies the threshold criteria, the process proceeds to approval; however, if the score falls below the threshold, the recommendation module 728 is activated.
  • the recommendation module 728 generates tailored suggestions and recommendations aimed at improving the security entries to achieve compliance with the required security standards.
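Assuming the component scores are normalized to [0, 1], the aggregation performed by modules 722 through 726 might look like the following. The exact combination rule, and in particular inverting guessability so that harder-to-guess answers score higher, are assumptions; the disclosure only states that weighted entropy (H), guessability (G), and complexity (C) scores are aggregated:

```python
def security_score(H, G, C, alpha=0.4, beta=0.3, gamma=0.3):
    # H, G, C assumed normalized to [0, 1]; guessability counts against
    # security, so it enters inverted in this sketch
    return alpha * H + beta * (1.0 - G) + gamma * C

def decide(score, threshold=0.7):
    # threshold comparison module: approve, or route to recommendations
    return "approve" if score >= threshold else "recommend_improvements"

s = security_score(H=0.9, G=0.2, C=0.8)  # 0.4*0.9 + 0.3*0.8 + 0.3*0.8 = 0.84
```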
  • the data preprocessing module 806 is responsible for preparing the data for evaluation by performing tasks such as tokenization, which breaks down the input into manageable pieces; normalization, which standardizes the data format; and vectorization, which converts the data into a format suitable for analysis by the AI evaluation module 808.
  • the AI evaluation module 808 then assesses the security entries against established criteria, determining their robustness and identifying any weaknesses.
  • the feedback module 810 generates real-time feedback based on the AI's assessment, providing users with actionable suggestions to enhance the security of their entries.
  • the encryption module 812 encrypts the processed data to ensure that it remains confidential and protected from unauthorized access.
  • the encrypted data is then securely stored in the encrypted database 814, which serves as a centralized repository for all user inputs, assessment criteria, and historical data, safeguarded against potential breaches.
  • Administrators access and manage the system through administrator devices 816, which connect to the administrator interface 818.
  • This interface provides administrative functionalities, allowing for the configuration of assessment criteria, monitoring of system performance, and management of user accounts.
  • the audit log module 820 records all system interactions, including changes and access events, thereby facilitating auditing and ensuring compliance with security protocols.
  • FIG. 9 illustrates an architecture of machine learning models 900 utilized within the system's AI evaluation module.
  • the architecture comprises several layers and components designed to process and analyze user inputs effectively. The process begins with the input layer 902, which receives preprocessed data from the data preprocessing module 906. This data is then transformed into vector representations by the embedding layer 904, facilitating semantic understanding.
  • the data progresses through multiple transformer blocks 906, each consisting of a self-attention mechanism 908, layer normalization 910, and a feed-forward neural network 912. These components work in tandem to capture contextual relationships and enhance feature extraction.
  • the data is processed by the RNN-LSTM networks 914, which handle sequential dependencies and temporal patterns within the data, thereby improving the model's ability to understand the sequence and structure of user inputs.
  • the output layer 916 receives the processed data from the RNN-LSTM networks and generates evaluation results, including security scores and assessments.
  • the model training module 918 oversees the training processes using comprehensive datasets, while the hyperparameter tuning module 920 fine-tunes parameters such as learning rates and weight decay to optimize performance.
  • the model update mechanism 922 enables real-time updates to the models based on new data or evolving assessment criteria, ensuring the system adapts to emerging security threats and maintains high evaluation standards.
  • FIG. 10 illustrates an exemplary real-time feedback loop 1000 between the user and the system, featuring the interactive process that provides immediate assessments and recommendations based on user inputs.
  • the loop begins with user inputs 1002, where users submit their security questions, answers, and hints through the user interface 1008. These inputs are then processed by the AI evaluation module 1004, which assesses the strength and robustness of the security entries against predefined criteria using advanced machine learning algorithms.
  • the feedback generation module 1006 creates actionable feedback based on the AI's assessment. This feedback is displayed to the user via the user interface 1008, allowing users to understand the necessary improvements. Users then perform revision actions 1010 by modifying their entries in response to the feedback provided. This iterative process is represented by the loop arrow 1012, which directs the revised entries back to the AI evaluation module 1004 for re-assessment. The loop continues until the security entries meet or exceed the required security thresholds, at which point an approval confirmation 1014 is generated by the system, indicating successful compliance with the security standards.
  • the connections and flow within the feedback loop indicate the progression of data and feedback through the various modules.
  • the loop arrow 1012 illustrates the continuous nature of the process, ensuring that users can repeatedly refine their security entries until they achieve the desired level of security.
  • Directional arrows throughout the diagram illustrate the flow of information from user inputs to evaluation, feedback generation, and user revisions, culminating in the approval confirmation once the security criteria are satisfied.
  • the system configures the AI with personalized assessment criteria, establishing the standards for evaluating security questions, answers, and hints.
  • this configuration is performed by retrieving predefined security policies from a central database or rules engine containing parameters such as question complexity, answer length, and hint specificity.
  • System administrators may interact with a secure user interface to modify these standards.
  • the secure user interface may incorporate multi-factor authentication (MFA) to verify the identity of system administrators.
  • Access to this configuration occurs through a secure communication channel using encryption protocols such as Transport Layer Security (TLS), ensuring the confidentiality and integrity of the transmitted data.
  • Authorized personnel, typically system administrators, access the configuration through the secure user interface, which presents configurable parameters for security assessments.
  • Customizable standards parameters may include question, answer, and hint entropy, complexity, predictability, guessability, length, specificity, the degree of personalization based on historical user data, the weight given to each, and more.
  • the system validates administrators in real time by employing both client-side (e.g., JavaScript) and server-side (e.g., API validation) mechanisms.
  • client-side validation ensures the format is correct before submission, while server-side validation checks compliance against preset boundaries stored in the backend schema. If an input falls outside the acceptable range, the system generates an alert or notification for correction, preventing non-compliant values from being saved.
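A server-side version of the boundary check described above might look like the following. The parameter names, types, and ranges are invented for illustration:

```python
# hypothetical backend schema: acceptable bounds per configurable parameter
SCHEMA = {
    "min_entropy_bits":  {"types": (int, float), "min": 0.0, "max": 256.0},
    "min_answer_length": {"types": (int,),       "min": 1,   "max": 128},
    "entropy_weight":    {"types": (int, float), "min": 0.0, "max": 1.0},
}

def validate_config(updates):
    # reject unknown keys, wrong types, and out-of-range values
    errors = {}
    for key, value in updates.items():
        rule = SCHEMA.get(key)
        if rule is None:
            errors[key] = "unknown parameter"
        elif isinstance(value, bool) or not isinstance(value, rule["types"]):
            errors[key] = "wrong type"
        elif not rule["min"] <= value <= rule["max"]:
            errors[key] = f"out of range [{rule['min']}, {rule['max']}]"
    return errors  # empty dict means the update is compliant and may be saved
```

In a deployed system a non-empty error dict would be turned into the alert or notification mentioned above, preventing non-compliant values from being persisted.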
  • the backend processes the configurations and updates the system’s database to store the new parameters.
  • Each configuration parameter, in embodiments, is securely stored in a distributed database, leveraging technologies such as NoSQL databases (e.g., MongoDB) for high availability and fault tolerance.
  • the change is first committed to a primary node in the database cluster.
  • the primary node writes the new data to its local storage and then initiates replication to secondary nodes.
  • This replication is achieved through mechanisms such as write-ahead logging or journaling, where the primary node records the changes and streams them to secondary nodes in real time.
  • the secondary nodes apply these updates to their own copies of the database, ensuring consistency across the cluster. This process achieves redundancy and consistency across the distributed architecture, maintaining scalability and fault tolerance.
  • dynamic data structures within the AI system, such as hash maps or binary trees, manage real-time updates to these criteria. These data structures allow the AI to quickly access and incorporate the new configurations into its assessment algorithms without significant latency.
  • the updated assessment criteria are securely stored using advanced encryption techniques, such as AES-256 for symmetric encryption or RSA for asymmetric encryption. This ensures that even if the data is intercepted during transmission or storage, it remains encrypted and inaccessible without the correct decryption key.
  • Role-based access control (RBAC) is also applied to protect sensitive data, limiting modification privileges to authorized personnel based on their roles and responsibilities.
  • the system replicates data across a distributed, fault-tolerant architecture. In the event of a node failure, traffic is automatically rerouted to secondary nodes via load balancers or coordination services like Apache ZooKeeper. ZooKeeper monitors the health of each node and facilitates leader election in case the primary node becomes unavailable. This ensures continuous service with no data loss, allowing the system to maintain horizontal scalability and redundancy without compromising security.
  • the system integrates blockchain technology to further enhance the security and immutability of the configuration process.
  • Each change to the assessment criteria is stored as a transaction in a decentralized, distributed ledger, ensuring that all modifications are permanently recorded and cannot be altered or deleted.
  • Each block in the blockchain is cryptographically linked to the previous one, creating a chain of records that is resistant to tampering. Multiple nodes independently verify each transaction, maintaining the integrity of the assessment criteria.
  • the system logs every configuration change in an encrypted audit trail, which records the time of the change, the identity of the administrator who made the change, and the specifics of the updated parameters.
  • the audit logs are stored using a secure logging system that ensures non-repudiation and tamper resistance, employing hashing algorithms such as SHA-256 to verify the integrity of the logs. These logs are used for compliance and accountability, enabling organizations to review changes and investigate any unauthorized modifications.
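A tamper-evident audit trail of the kind described above can be sketched with SHA-256 hash chaining, where each record commits to the previous record's hash. The record fields mirror the ones listed (time, administrator identity, updated parameters); the specific chaining scheme is an assumption:

```python
import hashlib
import json

GENESIS = "0" * 64

def _entry_hash(entry, prev_hash):
    # deterministic serialization, bound to the previous record's hash
    payload = json.dumps(entry, sort_keys=True).encode() + prev_hash.encode()
    return hashlib.sha256(payload).hexdigest()

def append_entry(log, timestamp, admin_id, change):
    # each record commits to the previous record's hash, so any later
    # modification breaks the chain and is detectable
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"timestamp": timestamp, "admin": admin_id, "change": change}
    log.append({**entry, "prev": prev, "hash": _entry_hash(entry, prev)})

def verify_log(log):
    # walk the chain and recompute every hash
    prev = GENESIS
    for rec in log:
        entry = {k: rec[k] for k in ("timestamp", "admin", "change")}
        if rec["prev"] != prev or rec["hash"] != _entry_hash(entry, prev):
            return False
        prev = rec["hash"]
    return True
```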
  • the system recalibrates the AI's assessment algorithms to align with the new parameters.
  • the configurations are translated into parameterized data models that the AI system uses to adjust its evaluation algorithms according to the new standards.
  • the AI models dynamically adjust their parameters based on the updated configurations.
  • recalibration may involve adjusting weights and biases within the network. This is achieved through backpropagation, where the network computes the gradient of the loss function with respect to each weight, allowing for fine-tuning based on the new criteria.
  • recalibration may modify split conditions or thresholds at each node.
  • the system may adjust the criteria for splitting nodes based on the updated parameters, influencing how the tree classifies input data.
  • recalibration may update hyperparameters such as the regularization parameter (C) or kernel parameters (e.g., gamma in an RBF kernel). Adjusting these parameters changes the margin and decision boundary, aligning the model with new evaluation standards.
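As a toy illustration of the gradient-based recalibration described above, a single weight can be nudged toward new target outputs by gradient descent on squared error. Real recalibration would involve full networks and the disclosed criteria; the model, data, and learning rate here are invented:

```python
def recalibrate_weight(w, inputs, targets, lr=0.1, epochs=100):
    # gradient descent on mean squared error for a one-weight model y = w * x;
    # stands in for backpropagation adjusting weights toward new criteria
    for _ in range(epochs):
        grad = sum(2 * (w * x - t) * x for x, t in zip(inputs, targets)) / len(inputs)
        w -= lr * grad
    return w

# targets generated by a hypothetical "new criterion" w = 0.8; training recovers it
w_new = recalibrate_weight(0.0, inputs=[1.0, 2.0, 3.0], targets=[0.8, 1.6, 2.4])
```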
  • the system, in embodiments, handles recalibration asynchronously by initiating a parallel process or thread. This approach prevents disruption of ongoing evaluations and maintains system performance.
  • the recalibration process fine-tunes the AI's decision-making models to align with the new thresholds and weights, ensuring optimal functionality while integrating the new rules in real time.
  • the AI system instantaneously applies the updated criteria in real-time evaluations.
  • the AI utilizes real-time data pipelines, such as Apache Kafka, to push the updated criteria into the system's inference engine.
  • Kafka acts as a distributed streaming platform that allows the system to publish and subscribe to streams of records.
  • the updated configurations are published to a Kafka topic, and the inference engine subscribes to this topic to receive updates instantaneously.
  • This mechanism allows the AI to dynamically adjust its decision-making process without requiring a full model retraining. For example, if a new threshold for question complexity is set, this change is applied immediately in subsequent security evaluations.
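The publish/subscribe flow described above can be illustrated with a minimal in-process sketch. This is not Kafka itself; the `ConfigBus` class, the topic name, and the criteria field are hypothetical stand-ins for the pattern of publishing updated criteria to a topic that the inference engine subscribes to:

```python
# Illustrative in-process analogue of the Kafka pattern: an admin publishes
# updated criteria to a topic; the subscribed inference engine applies them
# immediately, without retraining. Names and fields here are hypothetical.
from collections import defaultdict

class ConfigBus:
    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for cb in self.subscribers[topic]:
            cb(message)

class InferenceEngine:
    def __init__(self):
        self.criteria = {"complexity_threshold": 5}

    def on_update(self, message):
        self.criteria.update(message)  # applied immediately, no retraining

bus = ConfigBus()
engine = InferenceEngine()
bus.subscribe("security-criteria", engine.on_update)
bus.publish("security-criteria", {"complexity_threshold": 8})
```

With a real Kafka deployment, `publish` would be a producer write to a topic and `on_update` a consumer poll loop, but the decoupling is the same.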
  • the AI system assesses security questions, answers, and hints according to the updated standards, ensuring that evaluations are consistent with the administrators' specified requirements. By incorporating the new configurations promptly, the system remains responsive to evolving security needs and can adapt to new threat landscapes. This dynamic application of updated criteria enhances the overall resilience and effectiveness of the account recovery process, providing robust protection against unauthorized access.
  • the system incorporates advanced machine learning models, such as deep learning neural networks or decision trees. In embodiments, this is done to evaluate the effectiveness of the configured criteria.
  • the system employs various metrics, including accuracy, precision, and recall, calculated against historical data using cross-validation techniques. Weighted averaging may be used to combine individual security metrics into an overall security score.
  • the system, in embodiments, utilizes reinforcement learning to continuously learn and fine-tune standards based on historical performance, user behavior, and other factors. The reinforcement learning process assigns rewards or penalties based on the success or failure of security assessments.
  • This feedback loop allows the AI to dynamically fine-tune its parameters, ensuring the system adapts to evolving security landscapes. These dynamic adjustments ensure that the system remains resilient and adaptable, continually evolving in response to new threats and changing security requirements. For example, if a new type of attack targets security questions with specific characteristics, the system can detect this trend and adjust the weight assigned to related factors, enhancing the overall resilience of the account recovery process.
  • in embodiments, security entries are supplemented with multifactor authentication (MFA) methods, such as time-based one-time passcodes (TOTP), biometric data (e.g., fingerprint or facial recognition), or hardware tokens.
  • users access a secure user interface to create personalized security questions, answers, and hints.
  • the user interface, accessible through web, mobile applications, and more, provides secure input fields linked to the backend, where data is validated and stored.
  • the system, in embodiments, supports various data types for security entries, including text, sounds, images, and video.
  • Multimedia files are encrypted and stored using appropriate formats, such as PNG for images or AAC for audio, ensuring efficient storage without compromising quality. For instance, an uploaded image might be compressed using the PNG format and then encrypted using AES-256 before being stored in the database.
  • the user interface transmits data securely to the server using protocols such as HTTPS, ensuring confidentiality and integrity during transmission.
  • AES-256 is applied to encrypt data before transmission. For example, when a user inputs a security entry, the system encrypts the entry on the user’s device using an AES-256 encryption key. The encrypted data is then transmitted to the server and stored in an encrypted database. Other methods as disclosed herein, or similar, may be utilized.
  • User data is securely stored in an encrypted database, with data integrity verified through hash functions such as SHA-256.
  • a hash value is generated using SHA-256 to create a unique digital fingerprint of the input. This ensures that any alteration to the stored data can be detected by comparing the hash values during future retrievals.
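A minimal sketch of the SHA-256 fingerprint check described above, using Python's standard `hashlib` module (the entry text is illustrative):

```python
# Store a SHA-256 digest alongside the entry; recompute it on retrieval and
# compare, so that any alteration of the stored data is detected.
import hashlib

def fingerprint(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

stored_entry = b"My first school was in Nairobi"
stored_hash = fingerprint(stored_entry)

# On retrieval: any alteration changes the digest.
assert fingerprint(stored_entry) == stored_hash          # intact
assert fingerprint(b"tampered entry") != stored_hash     # altered -> detected
```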
  • the database system employs replication to ensure redundancy and high availability. For example, the database may replicate the stored data across multiple geographically distributed nodes, ensuring that even if one node fails, the data remains accessible from another node.
  • the systems and methods preprocess user inputs. Preprocessing steps as discussed above include, but are not limited to, tokenization, normalization, vectorization, stopwords filtering, stemming, lemmatization, PII removal, creating context-sensitive feature vectors, and more.
  • the system validates and evaluates tokenized user entries based on predetermined criteria, as described above. Operational in real time, the system, in embodiments, deploys machine learning algorithms and LLMs to perform an in-depth analysis of user-created security entries, including applying semantic analysis to assess and interpret the semantic meaning of security entries.
  • neural networks configured to analyze input complexity by assessing factors such as uniqueness, complexity, depth of knowledge demonstrated, length, diversity of characters, linguistic structure, and alignment with administrators' security criteria.
  • the neural network identifies patterns that indicate low complexity, such as repetitive sequences or predictable character combinations.
  • the system utilizes a dictionary lookup algorithm to compare user entries against a list of common phrases and dictionary words.
  • decision tree algorithms are utilized to evaluate the number of unique “words” in an entry, checking whether the answer has sufficient personal relevance based on the user’s historical data, as may be stored in the system. This ensures that the authentication elements are both secure and memorable to the user.
  • the system determines the security score of the user entries in accordance with administrator criteria using any one of the several embodiments disclosed herein.
  • the system applies predefined rules to validate user inputs and reject overly simple or guessable entries. Entries resembling common phrases or dictionary words may be flagged. For example, if the system utilizes a dictionary lookup algorithm to compare user entries against a list of common phrases and dictionary words, and a match is found, the system may flag the entry.
  • If a user entry does not meet a specified threshold, the entry is flagged. If the entry does not meet the security criteria, the system provides suggestions to increase complexity, such as using multi-word answers or incorporating unique personal experiences. The system then prompts the user to provide a more secure entry.
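The rule-based validation above can be sketched as follows; the word list, minimum-length rule, and flag messages are hypothetical placeholders, not values from the source:

```python
# Hedged sketch of dictionary-lookup validation: flag entries that match
# common words or fall below a minimum length, and return the reasons so the
# user can be prompted to provide a more secure entry.
COMMON_WORDS = {"password", "titanic", "blue", "fluffy", "1234"}
MIN_LENGTH = 8  # hypothetical minimum-length rule

def validate_entry(entry: str):
    tokens = entry.lower().split()
    flags = []
    if any(t in COMMON_WORDS for t in tokens):
        flags.append("matches a common word or phrase")
    if len(entry) < MIN_LENGTH:
        flags.append("too short; try a multi-word answer")
    return (len(flags) == 0, flags)

ok, reasons = validate_entry("Titanic")   # flagged on both rules
```

A production system would replace the in-memory set with a large dictionary and breach-corpus lookup, but the flag-and-suggest flow is the same.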
  • User-generated inputs are securely stored in the encrypted database and continuously analyzed to ensure they meet administrator-updated security standards, accounting for publicly available data, known data breaches, and more. In some embodiments, the system continuously recalibrates its AI models by updating weights and biases based on newly observed user behavior and updated security policies. Machine learning models adjust based on new inputs, recalibrating weightings and thresholds to ensure compliance with evolving criteria. This adaptive approach balances memorability with security, enabling the creation of highly personalized and secure authentication elements.
  • the system initiates an AI-driven evaluation process of user-provided security entries to assess their robustness and effectiveness in account recovery. This evaluation is based on the predefined security standards and protocols established by administrators or authorized users in the first step. By integrating multiple technical and software components, such as encryption modules and machine learning algorithms, the system performs a comprehensive security analysis to ensure that recovery mechanisms comply with required security and usability thresholds.
  • the AI assessment begins by securely retrieving security entries from an encrypted database.
  • the retrieval process employs advanced data structures, such as hash maps and Merkle trees, in conjunction with encryption techniques like AES-256, to ensure the secure handling and protection of sensitive information.
  • the assessment criteria are dynamically adapted in real-time by leveraging an indexed database that manages criteria weights, threshold values, and user-specific security preferences, allowing immediate synchronization with the updated security parameters defined in the first step.
  • the system integrates blockchain technology to ensure the immutability of security configuration changes.
  • the blockchain acts as a distributed ledger that logs every modification made to the security settings, creating immutable records that are resistant to tampering.
  • the system dynamically updates the security criteria based on historical data, detected security breaches, and evolving threat landscapes. This continuous adaptation is powered by machine learning models, including RNNs and deep reinforcement learning algorithms, which analyze patterns in security breaches and vulnerability reports to make real-time adjustments to the assessment criteria. For instance, RNNs process historical data sequences to identify trends in user behavior or attack patterns, while reinforcement learning applies real-time feedback to adjust security thresholds accordingly.
  • the system evaluates user security entries by employing advanced algorithms to assess factors such as complexity, specificity, and alignment with the security standards defined in the first step.
  • these algorithms utilize graph theory to model relationships between characters, subwords, words, or phrases within the security entries, identifying vulnerabilities by calculating edge connections and node weights. For example, a security question containing repetitive or common words may receive a lower complexity score. Network analysis methods assign complexity scores based on these relationships to ensure that security questions are both memorable to the user and sufficiently robust against attacks.
  • the AI system may be implemented using transformer-based models, such as BERT or GPT architectures, which rely on self-attention mechanisms to dynamically weigh input data based on contextual relevance.
  • the model assigns varying attention scores to different words based on their relationships to surrounding words, allowing the system to capture long-range dependencies and complex relationships within the data.
  • the system leverages NLP techniques and semantic analysis to assess the depth, uniqueness, and adherence of entries to the security criteria.
  • convolutional neural networks (CNNs) are employed to analyze image, audio, or video-based answer hints, enabling the system to process multimedia-based knowledge elements effectively.
  • Transformer architectures as utilized in embodiments employ positional encoding to compensate for the lack of inherent sequential order awareness in the model.
  • Positional encoding provides numerical values representing the position of each token in the sequence, allowing the model to process entries with an understanding of word order. This is useful for evaluating security questions, answers, and hints because context and sequence are relevant for meaningful assessment.
  • multiple transformer blocks are stacked to create increasingly abstract representations of the input data, facilitating the extraction of high-level semantic information from complex inputs.
  • Each transformer block, in embodiments, includes self-attention mechanisms followed by layer normalization and feed-forward neural networks.
  • Layer normalization ensures consistent activation outputs by normalizing neuron activations, while the feed-forward network applies mathematical transformations to further refine the data. These components enhance the model's ability to process sequential text and capture complex relationships.
  • the system employs AI-driven probability distributions and mathematical thresholds.
  • the model generates a probability distribution over potential next words or tokens in the answer. If the distribution shows a high probability for a small set of words, the answer may be deemed too predictable.
  • the system in further embodiments analyzes the entropy of the output distribution, where high entropy suggests less predictability and greater security, while low entropy may indicate that the answer is guessable.
  • the system flags security entries that exhibit low entropy or predictable patterns, guiding users to create security questions and answers that are user-friendly yet maintain high-security standards.
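The entropy-based predictability check can be illustrated with a short sketch: compute the Shannon entropy of a (hypothetical) next-token probability distribution and flag low-entropy outputs. The 1.5-bit threshold and the example distributions are assumptions for illustration:

```python
# Shannon entropy of a model's next-token probability distribution.
# A peaked distribution (one dominant continuation) has low entropy and
# indicates a predictable, guessable answer; a flat one has high entropy.
import math

def distribution_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

def is_predictable(probs, threshold=1.5):  # illustrative 1.5-bit threshold
    return distribution_entropy(probs) < threshold

peaked = [0.9, 0.05, 0.05]        # one dominant next word -> predictable
flat = [0.25, 0.25, 0.25, 0.25]   # many plausible continuations
```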
  • the security questions, answers, and hints entries are assigned a security score.
  • the security score is based in part on the entropy, a measure of randomness or unpredictability, of the assessed entry. Entries with more unique data points and higher variability, for example, will have greater entropy, making them more secure.
  • the security score is based in part on the complexity of the entry. Entries that require multiple data points and cross-reference different aspects of a user's life are assigned higher complexity scores, making them harder for unauthorized individuals to guess.
  • NLP algorithms may be utilized to further enhance complexity scoring by analyzing the syntactic and semantic structure of the question. The NLP model performs syntactic parsing to analyze grammatical structure and semantic parsing to understand meaning, ensuring that questions are specific to the user and resistant to guessing attacks.
  • the system implements a multi-faceted scoring process to evaluate the security strength of each security question-answer (QA) pair.
  • the system determines a security score for each security question entry, answer entry, hint entry, or any combination thereof, based on a combination of factors, including an entropy score, a guessability score, and a complexity score.
  • Each component contributes to the overall security assessment of the entries, wherein the entropy score (H) measures the randomness and unpredictability of the answer; the guessability score (G) assesses how easily the answer can be guessed based on known attack patterns such as brute force, social engineering, common knowledge, and more; and complexity score (C) evaluates the complexity and uniqueness of the question itself.
  • the security score, in embodiments, is calculated using a weighted average of these components, with the contribution of each factor adjustable to control their influence. These weights may be fine-tuned during model optimization to reflect specific security requirements or organizational policies. By adjusting the relative importance of each factor, the system can dynamically assess and optimize security questions for robustness, ensuring that weaker entries with higher guessability or lower complexity are flagged for improvement.
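One possible weighted combination, consistent with the description above but with hypothetical weights and an assumed inversion of guessability (so that a harder-to-guess answer raises the score):

```python
# Illustrative weighted average of the three component scores. The weights
# are hypothetical and would be tuned during model optimization; guessability
# is inverted because a high guessability should lower the overall score.
def overall_security_score(entropy, guessability, complexity,
                           w_h=0.4, w_g=0.3, w_c=0.3):
    """All inputs normalized to 0..1."""
    return w_h * entropy + w_g * (1 - guessability) + w_c * complexity

# Low entropy + high guessability + low complexity -> weak entry.
score = overall_security_score(entropy=0.18, guessability=0.75, complexity=0.20)
```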
  • adjustable security parameters refer to configurable variables or settings that control how each artificial intelligence model operates, affecting subfactors that contribute to the determination of entropy, predictability, and complexity. These parameters not only dictate the behavior of the AI models by influencing how they process inputs, weigh security metrics, and generate security scores but also control the contribution of specific subfactors to the final evaluation. For instance, adjustable security parameters can modify the influence of factors such as the public knowledge factor, pattern detection, memory-based predictability, the number of unique words, and time-based uniqueness on the output of the models.
  • adjustable influence parameters refer to configurable variables that control the relative contribution of various factors, such as entropy, predictability, and complexity, to the final security score. These parameters allow for fine-tuning of how different models influence the outcome of the security evaluation. These parameters may adjust threshold levels, weight assignments for individual metrics, and impose limits on particular model outputs, allowing for precise control over the AI models' behavior and the overall security score calculation in response to varying security requirements.
  • each artificial intelligence model produces an assessment score based on the evaluation of user entries and the specific security parameters.
  • the assessment scores include entropy, predictability, and complexity.
  • the assessment score may include the calculated entropy of the user input based on the security parameter governing randomness or unpredictability, the predictability score derived from the likelihood of an input being easily guessed, and the complexity score reflecting the uniqueness and intricacy of the user input.
  • Each of these scores is generated by the corresponding AI model, such as the entropy assessment model, predictability analysis model, or complexity evaluation model. The combination of these assessment scores is used to generate the overall security score of the user inputs, with weights applied according to the adjustable influence parameters.
  • these parameters may adjust threshold levels, weight assignments for specific metrics, and impose limits on particular model outputs, allowing for precise tuning of the AI models' performance in response to varying security requirements.
  • the entropy (H) of an entry or a pair is calculated using Shannon's entropy formula, which measures the unpredictability or randomness of the information contained in the entry or pair: H = −Σᵢ pᵢ · log₂(pᵢ), where:
  • pᵢ represents the probability of occurrence of each possible character in the entry or pair.
  • H is the total entropy score, indicating the unpredictability of the entry or pair.
  • a higher entropy score corresponds to a more secure and unpredictable entry or pair, making it harder for attackers to guess or deduce.
  • Lower entropy suggests vulnerability, especially to social engineering or brute force attacks.
  • This entropy calculation, in embodiments, is used as part of a comprehensive scoring mechanism to assess the overall security strength of the QA pair.
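Shannon's formula can be applied directly to the characters of a candidate answer, as a worked sketch of the calculation described above (the example answers are illustrative):

```python
# H = -sum(p_i * log2(p_i)) over the character frequencies of the answer.
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A short dictionary word yields far less entropy than a mixed phrase:
low = shannon_entropy("titanic")
high = shannon_entropy("MountKilimanjaro2015")
```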
  • the system generates a Guessability Index based on one or more guessability scores.
  • the Guessability Index is a proprietary algorithm specifically designed to enhance the security of user authentication mechanisms by assigning a numerical “guessability” score to entries or pairs, such as security questions and their corresponding answers, while maintaining a historical log of scoring data in an indexed format.
  • the Guessability Index, in embodiments, includes multiple components, including a public knowledge lookup module and a pattern-matching engine.
  • the system evaluates the provided entries comparing them against common knowledge datasets (e.g., publicly available information such as birth dates, family names, or other easily accessible personal details). This comparison flags responses that are vulnerable to discovery via public records or social media, assigning them a lower guessability score. Additionally, the algorithm performs pattern recognition to identify predictable or weak response structures, such as repeating character sequences, dictionary words, or commonly used phrases. Entries exhibiting these patterns are deemed to be more susceptible to guessing and are thus assigned a proportionally lower guessability score.
  • the guessability or predictability score measures how easily an answer can be guessed based on attack patterns. It may be calculated using the following formula: G = u₁·Pₖ + u₂·Fₚ + u₃·Mₚ, where:
  • Pₖ represents a public knowledge factor, which is a measure of how likely the answer can be found in public databases (e.g., family names, birth dates, publicly available records). This factor is calculated by comparing the answer against a list of publicly accessible data.
  • Fₚ represents the frequency of pattern detection, which is the number of detected predictable patterns, such as repeating characters, common dictionary words, or sequences like “1234” or “password.” This is flagged using pattern-matching algorithms.
  • Mₚ represents memory-based predictability, which is a score that indicates how predictable the answer is based on social engineering techniques (e.g., answers that could be easily inferred by knowing basic facts about the user). This may be determined using historical attack patterns and social engineering simulation.
  • u₁, u₂, u₃ are weights that balance the contribution of each factor, which may be adjusted during model optimization.
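A sketch of the guessability calculation under the formula above, with hypothetical weights and inputs; normalizing the raw pattern count before weighting is an added assumption, not from the source:

```python
# G = u1*Pk + u2*Fp + u3*Mp, with illustrative weights. Inputs are assumed to
# be normalized to 0..1; the raw pattern count is scaled before weighting.
def guessability(public_knowledge, pattern_count, memory_predictability,
                 u1=0.5, u2=0.2, u3=0.3, max_patterns=5):
    fp = min(pattern_count / max_patterns, 1.0)  # normalize pattern frequency
    return u1 * public_knowledge + u2 * fp + u3 * memory_predictability

# "Titanic": widely known (Pk high), no character patterns, easy to infer.
g = guessability(public_knowledge=0.9, pattern_count=0, memory_predictability=0.8)
```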
  • the complexity score evaluates how unique, open-ended, and context-dependent an entry or pair is, calculated as: C = f(Q) + λ₁·N_w + λ₂·E_w + λ₃·T_u, where:
  • f(Q) is a function that evaluates the uniqueness of the security entry or pair. This can be measured by comparing the question against a database of known security entries or pairs and assigning a score based on how unique or uncommon the entries or pairs are.
  • N_w represents the number of unique words in the entry or pair. The more unique words, the more complex the question becomes.
  • E_w represents the entropy of words in the security question, measuring the unpredictability of word choice.
  • T_u represents time-based uniqueness, a factor that rewards questions with elements that change over time or are tied to time-sensitive or time-dependent events.
  • λ₁, λ₂, λ₃ are weights that balance the contribution of each factor, which may be determined during model optimization.
  • the complexity score is calculated using graph theory, where relationships between words in the entries or pair are represented as nodes and edges.
  • Graph theory enables the system to quantify complexity by analyzing the connectivity between elements in the entries or pair. More intricate connections between nodes — such as relationships that cross-reference multiple data points — result in a higher complexity score.
  • the final security score is a weighted sum of one or more factors.
  • the final security score (S) is computed as: S = α·H + β·G + γ·C, where:
  • H represents the entropy score, assessing the randomness of the answer.
  • G represents the guessability score, reflecting the likelihood of an attacker correctly guessing the answer based on known attack patterns.
  • C represents the complexity score, which evaluates the sophistication of the question.
  • α, β, γ are adjustable weights that may be fine-tuned during model optimization to balance the contribution of each factor.
  • Thresholds may be dynamically determined based on the application's sensitivity. For example, in financial or healthcare applications, a score below 70/100 may trigger recommendations.
  • the system automatically generates recommendations for improving security entries or pairs when the overall security score falls below a predefined threshold.
  • the threshold may be dynamically determined based on the specific type of application for which the security questions are intended. For instance, in highly sensitive environments such as financial or healthcare applications, a score below 70/100 may trigger recommendations.
  • the system evaluates several criteria to determine when to generate a recommendation (e.g., rephrasing the question, introducing time-based elements, or suggesting a more complex answer).
  • Recommendations may be triggered when the entropy score (H) falls below a certain threshold (e.g., less than 2.5); when the guessability score (G) exceeds a specified threshold (e.g., more than 60%); when the complexity score (C) indicates that the question lacks uniqueness or is overly ambiguous; when the security score is too low (e.g., less than 70/100); and/or any combination of factors.
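The trigger conditions above can be sketched as a simple rule check, using the example thresholds from this bullet (H < 2.5, G > 60%, score < 70/100); the function name and return shape are illustrative:

```python
# Rule check that returns the list of reasons a recommendation is triggered;
# an empty list means the entry passes all thresholds.
def needs_recommendation(entropy, guessability_pct, security_score,
                         h_min=2.5, g_max=60, score_min=70):
    reasons = []
    if entropy < h_min:
        reasons.append("entropy below threshold")
    if guessability_pct > g_max:
        reasons.append("guessability above threshold")
    if security_score < score_min:
        reasons.append("overall security score too low")
    return reasons

# Weak entry from the worked example below: H=1.8, G=75%, low overall score.
reasons = needs_recommendation(entropy=1.8, guessability_pct=75, security_score=40)
```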
  • recommendations may still be generated if individual component scores fall outside acceptable ranges in subcategories of the final security score (e.g., entropy is low). This ensures that specific vulnerabilities are addressed, enhancing the robustness of the security questions by considering both holistic and component-specific weaknesses.
  • administrators may set an overall score threshold of 80/100, but also specify that entropy (H) must not fall below 2, and guessability (G) must not exceed 50%.
  • administrators may require not only a sufficient security score but also a relatively high memory-based predictability factor.
  • the system processes user input to evaluate and enhance the security of a security question and answer pair.
  • a user enters the question “What's your favorite movie?” and the answer “Titanic.”
  • the system begins by preprocessing the input, tokenizing the question and answer, and removing stopwords. It categorizes the question as a “personal preference” type, which typically has lower security due to its generality.
  • the system processes the input, identifying the question as open-ended but lacking complexity. By referencing a database of common answers, it determines that “Titanic” is a highly guessable answer due to its popularity. The entropy analysis of the answer shows a low entropy value because “Titanic” is a single, easily predictable word. Similarly, the system determines that “Titanic” is a guessable answer in light of the question based on attack patterns. The system also determines that the entries or pair are not complex as they are not unique.
  • the system calculates the security scores.
  • the entropy score (H) may be calculated as 1.8 out of 10, which is a low score reflecting that the word “Titanic” is short and simple.
  • the guessability score (G) may be calculated as 75%, indicating a high likelihood of “Titanic” being guessed based on public knowledge.
  • the complexity score (C) may be calculated as 2.0 out of 10, indicating a low score due to the open-ended nature of the question.
  • Example 1 Evaluating a Security Question: A user submits the security question, “What is my favorite color?” The system begins by receiving the user input and normalizing and tokenizing the question into lowercase tokens: [“what,” “is,” “my,” “favorite,” “color”].
  • the AI models, trained on datasets containing security breaches, undesired user inputs, and user data, are configured to flag questions based on specific security parameters, such as entropy thresholds, predictability indices, and complexity weights.
  • the administrator configures one or more of the Al models with one or more adjustable security parameters to adjust the relative contribution of the factors within each model.
  • the “public knowledge factor,” “number of predictable patterns,” or “memory-based predictability” may be assigned lower weights and therefore contribute less to the predictability index.
  • An administrator may assign the public knowledge factor a weight of 0.4 and the number of predictable patterns a weight of 0.2.
  • the administrator assigns a weight of 0.5 to the number of unique words in the security question, a weight of 0.3 to the entropy of the words, and a weight of 0.2 to the time-based uniqueness of the security question.
  • C = f(Q) + 0.5·N_w + 0.3·E_w + 0.2·T_u.
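A worked sketch of this complexity calculation with the administrator's example weights (0.5, 0.3, 0.2); the uniqueness value f(Q), the time-based term, and the example question are hypothetical inputs:

```python
# C = f(Q) + 0.5*Nw + 0.3*Ew + 0.2*Tu, where Nw is the number of unique
# words and Ew is the Shannon entropy of the word choice.
import math
from collections import Counter

def word_entropy(tokens):
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def complexity(question, uniqueness, time_based_uniqueness):
    tokens = question.lower().split()
    n_w = len(set(tokens))      # number of unique words (Nw)
    e_w = word_entropy(tokens)  # entropy of word choice (Ew)
    return uniqueness + 0.5 * n_w + 0.3 * e_w + 0.2 * time_based_uniqueness

c = complexity("What was the name of the first school I attended abroad",
               uniqueness=1.0, time_based_uniqueness=0.0)
```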
  • the overall contribution of each Al model is adjusted by the administrator to determine the contribution of the models to the final security score using one or more adjustable influence parameters.
  • the system evaluates the security using the predefined parameters, including entropy, predictability, and complexity. Due to the limited range of possible answers (common colors), the entropy calculation yields a low value.
  • the phrase “favorite color” is identified as common and generic, in part because it does not contain a sufficient number of unique words, contributing to a low complexity score.
  • the system determines the security score, calculated as the weighted combination of entropy, predictability, and complexity.
  • the system assigns a low security score, indicating a high-risk question, and flags the question for revision.
  • the AI re-evaluates the revised question, detecting increased specificity and uniqueness. The entropy calculation reflects a higher value due to a broader range of possible answers.
  • the AI assigns a low-risk score, indicating compliance with the security criteria.
  • the system confirms to the user: “Your security question meets the recommended security criteria.”
  • the system again receives this new input and normalizes and tokenizes the revised question into tokens: [“what,” “was,” “the,” “name,” “of,” “the,” “first,” “school,” “I,” “attended,” “abroad”].
  • the system reevaluates the revised question, which now reflects increased specificity, increased uniqueness, increased complexity, and increased range of possible answers.
  • the system then calculates a new security score, which now reflects a higher score, indicating a lower risk.
  • the system confirms to the user: “Your security question meets the recommended security criteria.”
  • Example 2 Evaluating an Answer and Hint: A user provides the answer “Fluffy” with the hint “My pet's nickname.” The AI analyzes the answer by checking “Fluffy” against a database of common pet names. It identifies that “Fluffy” falls into a high-risk cluster due to its popularity, resulting in a low entropy calculation. The hint directly relates to the answer, increasing the risk of guessability. According to the administrator's customizable criteria, common pet names and direct hints are flagged. As a result, the AI assigns a high-risk score to both the answer and the hint.
  • the system alerts the user: “Your answer is commonly used and may be easily guessed. Additionally, your hint directly reveals the nature of your answer. Please choose a more unique answer and a less direct hint.”
  • the user revises the answer to “MountKilimanjaro2015” and updates the hint to “First major hike and year.”
  • the AI re-evaluates the revised inputs. The answer now combines a specific location with a date, increasing uniqueness and complexity. The entropy calculation yields a high value due to the combination of letters and numbers.
  • the hint provides context but is less directly tied to the answer, reducing guessability.
  • the AI assigns a low-risk score to the revised answer and hint.
  • Example 3 Administrator Adjusting Customizable Criteria: An administrator or the system observes a trend where users frequently select security questions related to favorite sports teams, which are easily discoverable and pose security risks. In response, the administrator or the system updates the customizable criteria to include “favorite sports team” in the list of prohibited phrases. Additionally, the administrator increases the minimum complexity threshold for security questions. The AI models are updated in real-time to incorporate the new criteria. Consequently, future user inputs containing prohibited phrases are automatically flagged during evaluation. Users attempting to use now-prohibited phrases receive immediate feedback to modify their inputs, thereby enhancing overall security.
  • Example 4 AI Adapting to Emerging Threats: A recent data breach reveals that the name “Charlie” is frequently used as an answer to security questions, indicating a potential vulnerability. In response, the AI system updates its high-risk answer database to include “Charlie.” Clustering algorithms are retrained with the latest breach data to identify new patterns of common answers. When a user provides the answer “Charlie” with the hint “Best friend in high school,” the AI identifies “Charlie” as a high-risk answer due to its prevalence. A high-risk score is assigned to the answer. The hint also provides additional context that could aid in guessing the answer, further increasing the risk. The system warns the user: “Your answer is commonly used and may be compromised. Please choose a more unique answer to enhance security.” This proactive adaptation helps mitigate emerging threats by updating security assessments based on real-world data breaches.
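The adaptation step in Example 4 (folding newly breached answers into the high-risk set so later evaluations flag them) might look like the following sketch. The seed data, the `min_count` threshold, and the function name are hypothetical.

```python
from collections import Counter

high_risk_answers = {"fluffy", "password", "123456"}  # assumed starting set

def ingest_breach(breached_answers: list[str], min_count: int = 3) -> None:
    """Add answers seen at least `min_count` times in breach data to the high-risk set."""
    counts = Counter(a.lower() for a in breached_answers)
    for answer, n in counts.items():
        if n >= min_count:
            high_risk_answers.add(answer)

# Simulated breach dump in which "charlie" appears frequently.
ingest_breach(["Charlie", "charlie", "CHARLIE", "rex"])
print("charlie" in high_risk_answers)   # frequent in the dump, now flagged
print("rex" in high_risk_answers)       # seen only once, not flagged
```

A frequency threshold is one simple stand-in for the clustering-based retraining the example describes; the point is only that the evaluation data updates as breach data arrives.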
  • the AI-driven assessment process provides real-time feedback to users via the user interface, as described in the fourth step. Users receive immediate feedback on the security of their security questions, answers, and hints, along with actionable suggestions for improvement when necessary.
  • the system delivers real-time assessment results to users through a user interface, including, for example, a GUI module, backend processing servers, and communication APIs.
  • This immediate feedback includes the security scores, visual indicators such as checkmarks and crosses, pass/fail statuses, numerical ratings, and/or other indicators of the security of their questions, answers, and hints, along with actionable recommendations for improvement if the entries do not meet the predefined security thresholds.
  • the AI algorithms generate these suggestions based on the customizable criteria established in the first step, ensuring that users can enhance their authentication elements to comply with organizational security standards.
  • the real-time feedback mechanism is designed to be interactive and user-friendly, enabling users to refine their inputs until they satisfy the required security criteria. For instance, if a user's security question is deemed too generic or easily guessable, the system may prompt the user to rephrase it to increase complexity and specificity.
  • the AI models provide context-aware suggestions, taking into account factors such as entropy, guessability, complexity scores, and more.
  • the system employs Natural Language Generation (NLG) techniques to present recommendations in clear and understandable language.
  • This approach ensures that users can easily comprehend the suggestions without requiring technical expertise.
  • the NLG component translates complex security assessments into user-friendly advice, bridging the gap between advanced AI evaluations and user interactions.
  • the system, in embodiments, also considers the balance between security and memorability. While enhancing complexity and uniqueness, the recommendations aim to ensure that users can easily recall their security answers during account recovery processes. This is achieved by personalizing suggestions based on user data and historical interactions, thereby creating security elements that are both robust and user-friendly.
  • the system provides feedback on answer hints to ensure they do not inadvertently compromise the security of the answers. If an answer hint is too revealing, the AI may suggest making it more abstract or indirectly related to the answer, thereby preserving security while still aiding user recall.
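One minimal way to render pass/fail assessments as the kind of user-facing messages described above, in the spirit of the NLG component, is simple templating. The message templates, reason codes, and function name are assumptions for illustration.

```python
def feedback(kind: str, passed: bool, reason: str = "") -> str:
    """Render a pass/fail assessment as user-friendly advice."""
    if passed:
        return f"Your {kind} meets the recommended security criteria."
    suggestions = {
        "too generic": "Please rephrase it to be more specific and unique.",
        "hint too revealing": "Please choose a hint that is less directly tied to the answer.",
    }
    advice = suggestions.get(reason, "Please revise it to improve security.")
    return f"Your {kind} does not meet the recommended criteria. {advice}"

print(feedback("security question", passed=False, reason="too generic"))
print(feedback("security question", passed=True))
```

A production NLG component would generate richer, context-aware text; this sketch only shows the mapping from an assessment result to plain-language advice.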
  • FIG. 11 illustrates an embodiment for the integration 1100 of the AI-based evaluation system with existing authentication and account recovery systems 1102.
  • the integration is facilitated by an API gateway 1104, which enables seamless communication between the existing authentication system and the AI evaluation module 1106.
  • the API gateway 1104 handles the transmission of data, ensuring that user inputs and security entries are efficiently routed to the appropriate modules.
  • the AI evaluation module 1106 serves as an intermediary for evaluating security entries, assessing their strength and robustness before they are stored or utilized within the user database 1108. This evaluation process leverages advanced machine learning algorithms to ensure that security entries meet the predefined criteria for entropy, guessability, and complexity.
  • the integration layer 1110 acts as middleware, ensuring compatibility and smooth operation between the AI evaluation module 1106 and the existing authentication infrastructure.
  • Security protocols 1114, represented by padlock symbols on the arrows and modules, highlight the encryption and authentication methods employed during the integration. These protocols ensure that all data transmissions are secure, protecting sensitive user information from unauthorized access and potential breaches.
  • the existing authentication system 1102 communicates with the API gateway 1104, which then passes data to the integration layer 1110.
  • the integration layer 1110 interfaces with the AI evaluation module 1106, and results are returned to the integration layer.
  • User data is subsequently stored in the user database 1108. All communication lines are secured with encryption, as indicated by the padlock symbols 1114, ensuring that data flow between components remains protected and confidential.
  • Figure 11 demonstrates an embodiment of how the AI-based evaluation system (1100) integrates with existing authentication and account recovery systems (1102).
  • the API gateway (1104) facilitates communication between the existing authentication system and the AI evaluation module (1106), ensuring that data is accurately and securely transmitted.
  • the integration layer (1110) acts as middleware, translating and formatting data as necessary to maintain compatibility between the systems.
  • the AI evaluation module (1106) assesses the security entries using advanced algorithms before they are stored in the user database (1108).
  • Security protocols (1114) are employed throughout the data flow arrows (1112) to maintain encrypted and authenticated communication, safeguarding user credentials and security information during the integration process. This integration allows organizations to enhance their current authentication mechanisms with AI-driven security evaluations without necessitating a complete overhaul of their existing systems.
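The FIG. 11 data flow (existing authentication system to API gateway to integration layer to AI evaluation module to user database) can be sketched as in-process objects. All class and method names here are hypothetical stand-ins; real deployments would use networked services with the encrypted transport the figure describes.

```python
class AIEvaluationModule:
    """Stand-in for the AI assessment component (element 1106)."""
    def evaluate(self, entry: dict) -> dict:
        # Toy rule in place of the real models: longer questions pass.
        score = 0.9 if len(entry["question"]) > 20 else 0.3
        return {**entry, "score": score, "passed": score >= 0.5}

class IntegrationLayer:
    """Middleware (element 1110): formats data and routes results."""
    def __init__(self, evaluator, database):
        self.evaluator, self.database = evaluator, database
    def process(self, entry: dict) -> dict:
        result = self.evaluator.evaluate(entry)
        if result["passed"]:
            self.database.append(result)    # user database (element 1108)
        return result

class APIGateway:
    """Entry point (element 1104): receives requests from the auth system."""
    def __init__(self, layer):
        self.layer = layer
    def submit(self, entry: dict) -> dict:
        return self.layer.process(entry)

db: list[dict] = []
gateway = APIGateway(IntegrationLayer(AIEvaluationModule(), db))
result = gateway.submit({"question": "First school I attended abroad?"})
print(result["passed"], len(db))
```

The layering mirrors the figure: the gateway never talks to the evaluator or database directly, so an existing authentication system only needs to target the gateway, which is the point of the integration described above.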
  • FIG. 12 shows, in accordance with aspects of the present disclosure, an example describing a data processing system 1200.
  • data processing system 1200 is an illustrative data processing system suitable for implementing aspects of systems and methods for evaluating and strengthening security of authentication systems.
  • devices that are embodiments of data processing systems (e.g., smartphones, tablets, personal computers) may be used by one or more users such as retailers, customers, advertisers, consumers, patients, healthcare providers, etc.
  • devices that are embodiments of data processing systems (e.g., smartphones, tablets, personal computers) may be used as one or more server(s) in encoding, decoding, and communicating data with one or more mobile communication devices.
  • data processing system 1200 includes communications framework 1202.
  • Communications framework 1202 provides communications between processor unit 1204, memory 1206, persistent storage 1208, communications unit 1210, input/output (I/O) unit 1212, and display 1214.
  • Memory 1206, persistent storage 1208, communications unit 1210, input/output (I/O) unit 1212, and display 1214 are examples of resources accessible by processor unit 1204 via communications framework 1202.
  • Processor unit 1204 serves to run instructions that may be loaded into memory 1206.
  • Processor unit 1204 may be a number of processors, a multi-processor core, or some other type of processor, depending on the particular implementation. Further, processor unit 1204 may be implemented using a number of heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. As another illustrative example, processor unit 1204 may be a symmetric multi-processor system containing multiple processors of the same type.
  • Memory 1206 and persistent storage 1208 are examples of storage devices 1216.
  • a storage device is any piece of hardware that is capable of storing information, such as, for example, without limitation, data, program code in functional form, and other suitable information either on a temporary basis or a permanent basis.
  • Storage devices 1216 also may be referred to as computer-readable storage devices in these examples.
  • Memory 1206, in these examples, may be, for example, a random access memory or any other suitable volatile or non-volatile storage device.
  • Persistent storage 1208 may take various forms, depending on the particular implementation.
  • the logging module, as described with respect to various embodiments, may be carried out by the processor unit 1204 and stored in one or more storage devices 1216 as may be appropriate.
  • persistent storage 1208 may contain one or more components or devices.
  • persistent storage 1208 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above.
  • the media used by persistent storage 1208 also may be removable.
  • a removable hard drive may be used for persistent storage 1208.
  • Communications unit 1210 in these examples, provides for communications with other data processing systems or devices.
  • communications unit 1210 is a network interface card.
  • Communications unit 1210 may provide communications through the use of either or both physical and wireless communications links.
  • Input/output (I/O) unit 1212 allows for input and output of data with other devices that may be connected to data processing system 1200.
  • input/output (I/O) unit 1212 may provide a connection for user input through a keyboard, a mouse, and/or some other suitable input device. Further, input/output (I/O) unit 1212 may send output to a printer.
  • the input module, the output module, and the feedback module may be carried out by the input/output (I/O) unit 1212 as may be appropriate.
  • the display 1214 provides a mechanism to display information to a user.
  • the user interface module, as described with respect to various embodiments, may be carried out by the display 1214 as may be appropriate.
  • Instructions for the operating system, applications, and/or programs may be located in storage devices 1216, which are in communication with processor unit 1204 through communications framework 1202. In these illustrative examples, the instructions are in a functional form on persistent storage 1208. These instructions may be loaded into memory 1206 for execution by processor unit 1204. The processes of the different embodiments may be performed by processor unit 1204 using computer- implemented instructions, which may be located in a memory, such as memory 1206.
  • program instructions are referred to as program instructions, program code, computer usable program code, or computer-readable program code that may be read and executed by a processor in processor unit 1204.
  • the program code in the different embodiments may be embodied on different physical or computer-readable storage media, such as memory 1206 or persistent storage 1208.
  • Program code 1218 is located in a functional form on computer-readable media 1220 that is selectively removable and may be loaded onto or transferred to data processing system 1200 for execution by processor unit 1204.
  • Program code 1218 and computer-readable media 1220 form computer program product 1222 in these examples.
  • computer-readable media 1220 may be computer-readable storage media 1224 or computer-readable signal media 1226.
  • Computer-readable storage media 1224 may include, for example, an optical or magnetic disk that is inserted or placed into a drive or other device that is part of persistent storage 1208 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 1208.
  • Computer-readable storage media 1224 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory, that is connected to data processing system 1200. In some instances, computer-readable storage media 1224 may not be removable from data processing system 1200.
  • computer-readable storage media 1224 is a physical or tangible storage device used to store program code 1218 rather than a medium that propagates or transmits program code 1218.
  • Computer-readable storage media 1224 is also referred to as a computer-readable tangible storage device or a computer-readable physical storage device. In other words, computer-readable storage media 1224 is non- transitory.
  • program code 1218 may be transferred to data processing system 1200 using computer-readable signal media 1226.
  • Computer-readable signal media 1226 may be, for example, a propagated data signal containing program code 1218.
  • Computer-readable signal media 1226 may be an electromagnetic signal, an optical signal, and/or any other suitable type of signal. These signals may be transmitted over communications links, such as wireless communications links, optical fiber cable, coaxial cable, a wire, and/or any other suitable type of communications link.
  • the communications link and/or the connection may be physical or wireless in the illustrative examples.
  • program code 1218 may be downloaded over a network to persistent storage 1208 from another device or data processing system through computer-readable signal media 1226 for use within data processing system 1200.
  • program code stored in a computer-readable storage medium in a server data processing system may be downloaded over a network from the server to data processing system 1200.
  • the data processing system providing program code 1218 may be a server computer, a client computer, or some other device capable of storing and transmitting program code 1218.
  • The different components illustrated for data processing system 1200 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented.
  • the different illustrative embodiments may be implemented in a data processing system including components in addition to and/or in place of those illustrated for data processing system 1200.
  • Other components shown in FIG. 12 can be varied from the illustrative examples shown.
  • the different embodiments may be implemented using any hardware device or system capable of running program code.
  • data processing system 1200 may include organic components integrated with inorganic components and/or may be comprised entirely of organic components excluding a human being.
  • a storage device may be comprised of an organic semiconductor.
  • processor unit 1204 may take the form of a hardware unit that has circuits that are manufactured or configured for a particular use. This type of hardware may perform operations without needing program code to be loaded into a memory from a storage device to be configured to perform the operations.
  • processor unit 1204 when processor unit 1204 takes the form of a hardware unit, processor unit 1204 may be a circuit system, an application specific integrated circuit (ASIC), a programmable logic device, or some other suitable type of hardware configured to perform a number of operations.
  • With a programmable logic device, the device is configured to perform the number of operations. The device may be reconfigured at a later time or may be permanently configured to perform the number of operations.
  • Examples of programmable logic devices include, for example, a programmable logic array, a field programmable logic array, a field programmable gate array, and other suitable hardware devices.
  • program code 1218 may be omitted, because the processes for the different embodiments are implemented in a hardware unit.
  • processor unit 1204 may be implemented using a combination of processors found in computers and hardware units.
  • Processor unit 1204 may have a number of hardware units and a number of processors that are configured to run program code 1218. With this depicted example, some of the processes may be implemented in the number of hardware units, while other processes may be implemented in the number of processors.
  • the preprocessing, processing, feedback, and extraction modules, as described with respect to various embodiments, may be carried out using any configuration of processor unit 1204 as may be appropriate.
  • a bus system may be used to implement communications framework 1202 and may be comprised of one or more buses, such as a system bus or an input/output (I/O) bus.
  • the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system.
  • communications unit 1210 may include a number of devices that transmit data, receive data, or both transmit and receive data.
  • Communications unit 1210 may be, for example, a modem or a network adapter, two network adapters, or some combination thereof.
  • a memory may be, for example, memory 1206, or a cache, such as that found in an interface and memory controller hub that may be present in communications framework 1202.
  • each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function or functions.
  • the functions noted in a block may occur out of the order noted in the drawings. For example, the functions of two blocks shown in succession may be executed substantially concurrently, or the functions of the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • FIG. 13 shows an example describing a general network data processing system 1300, interchangeably termed a network, a computer network, a network system, or a distributed network, aspects of which may be included in one or more illustrative embodiments of methods and systems of scrutinizing questions, answers, and answer hints based on customizable criteria.
  • one or more mobile computing devices or data processing devices may communicate with one another or with one or more servers(s) through the network.
  • FIG. 13 is provided as an illustration of one implementation and is not intended to imply any limitation with regard to environments in which different embodiments may be implemented. Many modifications to the depicted environment may be made.
  • Network data processing system 1300 is a network of computers, each of which is an example of data processing system 1200, and other components.
  • Network data processing system 1300 may include network 1302, which is a medium configured to provide communications links between various devices and computers connected together within network data processing system 1300.
  • Network 1302 may include connections such as wired or wireless communication links, fiber optic cables, and/or any other suitable medium for transmitting and/or communicating data between network devices, or any combination thereof.
  • a first network device 1304 and a second network device 1306 connect to network 1302, as does an electronic storage device 1308.
  • Network devices 1304 and 1306 are each examples of data processing system 1200, described above.
  • devices 1304 and 1306 are shown as server computers.
  • network devices may include, without limitation, one or more personal computers, mobile computing devices such as personal digital assistants (PDAs), tablets, and smart phones, handheld gaming devices, wearable devices, tablet computers, routers, switches, voice gates, servers, electronic storage devices, imaging devices, and/or other networked-enabled tools that may perform a mechanical or other function.
  • These network devices may be interconnected through wired, wireless, optical, and other appropriate communication links.
  • client electronic devices, such as a client computer 1310, a client laptop or tablet 1312, and/or a client smart device 1314, may connect to network 1302.
  • client electronic devices 1310, 1312, and 1314 may include, for example, one or more personal computers, network computers, and/or mobile computing devices such as personal digital assistants (PDAs), smart phones, handheld gaming devices, wearable devices, and/or tablet computers, and the like.
  • server 1304 provides information, such as boot files, operating system images, and applications to one or more of client electronic devices 1310, 1312, and 1314.
  • Client electronic devices 1310, 1312, and 1314 may be referred to as "clients" with respect to a server such as server computer 1304.
  • Network data processing system 1300 may include more or fewer servers and clients or no servers or clients, as well as other devices not shown.
  • Client smart device 1314 may include any suitable portable electronic device capable of wireless communications and execution of software, such as a smartphone or a tablet.
  • the term “smartphone” may describe any suitable portable electronic device having more advanced computing ability and network connectivity than a typical mobile phone.
  • smartphones may be capable of sending and receiving emails, texts, and multimedia messages, accessing the Internet, and/or functioning as a web browser.
  • Smartdevices may be capable of connecting with other smartdevices, computers, or electronic devices wirelessly, such as through near field communications (NFC).
  • Smartdevices may also connect wirelessly via BLUETOOTH® and/or Wi-Fi.
  • Wireless connectivity may be established among smartdevices, smartphones, computers, and other devices to form a mobile network where information can be exchanged.
  • Program code located in system 1300 may be stored in or on a computer recordable storage medium, such as persistent storage 1308 in FIG. 13, and may be downloaded to a data processing system or other device for use.
  • program code may be stored on a computer recordable storage medium on server computer 1304 and downloaded for use to client 1310 over network 1302 for use on client 1310.
  • Network data processing system 1300 may be implemented as one or more of a number of different types of networks.
  • system 1300 may include an intranet, a local area network (LAN), a wide area network (WAN), or a personal area network (PAN).
  • network data processing system 1300 includes the Internet, with network 1302 representing a worldwide collection of networks and gateways that use the transmission control protocol/Internet protocol (TCP/IP) suite of protocols to communicate with one another.
  • FIG. 13 is intended as an example, and not as an architectural limitation for any illustrative embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Accounting & Taxation (AREA)
  • Biophysics (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Storage Device Security (AREA)

Abstract

Methods and systems utilize artificial intelligence to evaluate and enhance the security of account recovery processes by scrutinizing user-generated security questions, answers, and hints based on customizable criteria. Administrators set dynamic criteria, and the trained AI system provides real-time assessments and feedback to users, ensuring robust protection against unauthorized access and evolving threats.

Description

METHODS AND SYSTEMS OF SCRUTINIZING QUESTIONS, ANSWERS, AND ANSWER HINTS BASED ON CUSTOMIZABLE CRITERIA
FIELD OF THE DISCLOSURE
[0001] The present disclosure pertains to the field of cybersecurity, specifically within user authentication systems, and more particularly to methods and systems employing artificial intelligence for evaluating and improving the security of account recovery mechanisms. The disclosure encompasses the analysis and improvement of security questions, corresponding answers, and answer hints used during account recovery processes. It introduces customizable evaluation criteria aimed at assessing the quality, robustness, and security of these elements, with the objective of strengthening authentication mechanisms across diverse online platforms, services, and institutions.
BACKGROUND
[0002] Account recovery processes in digital systems often utilize methods such as security questions, answers, and answer hints to authenticate users. These methods, while widely used, may present security vulnerabilities due to factors such as the use of easily guessable or publicly available information in security questions, answers, and answer hints.
[0003] Security questions typically rely on static personal information, which might be accessible through public records or social media platforms. This accessibility can increase the risk of unauthorized access. Answer hints, while intended to assist legitimate users in recalling their security answers, may inadvertently provide clues that aid unauthorized individuals in compromising accounts.
[0004] As digital threats continue to evolve, there is an ongoing interest in enhancing the robustness of authentication mechanisms. Improving the security of account recovery processes remains a significant consideration for online platforms, services, and institutions.
[0005] What is needed, therefore, are methods and systems that enhance the security of account recovery processes by dynamically evaluating security questions, answers, and hints against customizable criteria using advanced technologies, such as Artificial Intelligence (AI).
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] For a better understanding of the present disclosure, and to show how the same may be carried into effect, reference will now be made, by way of example, to the accompanying drawings, in which:
[0007] FIG. 1 illustrates an example of a deep neural network in which embodiments of the present technology may be implemented;
[0008] FIG. 2 illustrates an example of training and deployment of a deep neural network;
[0009] FIG. 3 illustrates an embodiment of a system architecture for evaluating and strengthening security of authentication systems;
[0010] FIG. 4 illustrates an embodiment of a method for evaluating and strengthening security of authentication systems;
[0011] FIG. 5 illustrates a flowchart of a method for evaluating and enhancing security entries using which embodiments of the present technology may be implemented;
[0012] FIG. 6 illustrates an embodiment of an AI evaluation process to assess user-provided security entries, using which embodiments of the present technology may be implemented;
[0013] FIG. 7 illustrates a flowchart delineating the security scoring mechanism using which embodiments of the present technology may be implemented;
[0014] FIG. 8 illustrates an example data flow in which embodiments of the systems and methods for evaluating and strengthening security systems of authentication may be implemented;
[0015] FIG. 9 illustrates a diagram of Al models used in embodiments of the systems and methods for evaluating and strengthening security systems;
[0016] FIG. 10 illustrates a feedback mechanism to provide users with immediate assessments and recommendations as may be used in embodiments of the systems and methods for evaluating and strengthening security systems;
[0017] FIG. 11 illustrates an example of integration of embodiments of the systems and methods for evaluating and strengthening security systems with existing authentication infrastructure;
[0018] FIG. 12 is a schematic diagram of various components of an illustrative data processing system; and
[0019] FIG. 13 is a schematic representation of an illustrative distributed data processing system.
SUMMARY OF THE DISCLOSURE
[0020] The present disclosure relates to methods and systems for enhancing the security of account recovery processes in digital systems. The methods and systems utilize artificial intelligence (AI) to evaluate and scrutinize user-generated security questions, answers, and answer hints based on customizable criteria set by administrators or authorized users.
[0021] In embodiments, the methods and systems provide adaptive security evaluation, enhanced computational efficiency, and improved detection of vulnerabilities. The AI system, in embodiments, continuously learns from new data and adjusts its evaluation criteria, offering up-to-date security assessments that adapt to emerging threats. By utilizing optimized algorithms — such as parallel processing techniques and efficient data structures — and hardware acceleration through devices like GPUs or TPUs, the system processes user inputs rapidly without compromising accuracy. Advanced machine learning models, including deep neural networks and transformer architectures, identify complex patterns and subtle vulnerabilities, thereby enhancing the technical robustness of account recovery processes.
[0022] In one aspect, the methods and systems allow administrators to configure personalized assessment criteria through a secure interface. These criteria define standards for evaluating the complexity, uniqueness, and predictability of security questions, answers, and hints. The AI system adapts to these criteria in real-time, ensuring that evaluations align with organizational security policies.
[0023] In another aspect, users create their security questions, answers, and hints via a user-friendly interface. The AI system assesses these inputs in real-time against the customizable criteria, providing immediate feedback and suggestions to enhance security. The assessment may consider factors such as entropy, guessability, and complexity, employing advanced AI models, including transformer-based architectures and natural language processing techniques.
[0024] The methods and systems may employ various AI algorithms and machine learning models, such as neural networks, decision trees, and clustering algorithms, to analyze user inputs. These models evaluate the security strength of the authentication elements and identify potential vulnerabilities. The AI system may also dynamically adjust its assessment criteria based on emerging security threats and historical data.
[0025] Furthermore, the methods and systems may incorporate secure data handling practices, including encryption techniques and secure communication protocols, to protect user data during storage and transmission. The system may also utilize blockchain technology to ensure the immutability of security configurations and audit trails.
[0026] The methods and systems provide real-time feedback to users, enabling them to refine their security questions, answers, and hints to meet the recommended security criteria. This approach enhances the overall security of account recovery processes by reducing the risk of unauthorized access and adapting to evolving security threats.
DETAILED DESCRIPTION
[0027] Various embodiments of systems and methods for evaluating and strengthening security of authentication systems are described below and illustrated in the associated drawings. Unless otherwise specified, the systems and methods for evaluating and strengthening security of authentication systems and/or their various components may contain at least one of the structures, components, functionalities, and/or variations described, illustrated, and/or incorporated herein. Furthermore, the structures, components, functionalities, and/or variations described, illustrated, and/or incorporated herein in connection with the present disclosure may be included in other similar systems. The following description of various embodiments is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses. Additionally, the advantages provided by the embodiments, as described below, are illustrative in nature and not all embodiments provide the same advantages or the same degree of advantages.
[0028] In addition, the methods and systems may leverage quantum computing technologies for complex cryptographic analysis, providing enhanced security and scalability. Furthermore, edge computing could be incorporated to perform localized real-time analysis, reducing latency and improving the responsiveness of the security assessments. Integration with decentralized identity systems (such as DID) and self-sovereign identity frameworks could also ensure secure, user-controlled data management.
[0029] Aspects of systems and methods for evaluating and strengthening security of authentication systems may be embodied as a computer method, computer system, or computer program product. Accordingly, aspects of the systems and methods for evaluating and strengthening security of authentication systems may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, and the like), or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a “circuit,” “module,” and/or “system.” Furthermore, aspects of the systems and methods for evaluating and strengthening security of authentication systems may take the form of a computer program product embodied in a computer-readable medium (or media) having computer-readable program code/instructions embodied thereon.
[0030] Additionally, embodiments may include the use of secure multiparty computation (SMPC) for distributed processing of sensitive data without exposing it to any single party, enhancing privacy and security. The methods and systems could also leverage federated learning to train Artificial Intelligence (AI) models across decentralized data sources, preserving privacy while improving the security of recovery methods. Blockchain technology may be employed to create immutable records of security questions, answers, and hints, ensuring transparency and auditability.
[0031] Any combination of computer-readable media may be utilized. Computer-readable media can be a computer-readable signal medium and/or a computer-readable storage medium. A computer-readable storage medium may include an electronic, magnetic, optical, electromagnetic, infra-red (IR), and/or semiconductor system, apparatus, or device, or any suitable combination of these. More specific examples of a computer-readable storage medium may include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, a solid-state drive, a non-volatile memory express (NVMe) drive, and/or any suitable combination of these and/or the like. Quantum storage mediums may also be leveraged in future embodiments to further enhance the capacity and speed of storage systems. In the context of this disclosure, a computer-readable storage medium may include any suitable tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
[0032] A computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, audible, and/or any suitable combination thereof. A computer-readable signal medium may include any computer-readable medium that is not a computer-readable storage medium and that is capable of communicating, propagating, or transporting a program for use by or in connection with an instruction execution system, apparatus, or device.
[0033] Signal mediums may utilize 5G, 6G, or other advanced wireless communication technologies to enable faster and more secure transmission of data. Quantum communication channels, leveraging quantum entanglement, may also be used for secure, high-speed data transmission in future embodiments. Terahertz wave communications may be utilized as part of next-generation wireless data propagation technologies, allowing greater data bandwidth and transmission speeds.
[0034] Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, radio frequency (RF), and/or the like, and/or any suitable combination of these. Program code may be transmitted using advanced communication methods such as millimeter-wave technology, enabling high-speed data transfer over short distances, or via satellite communication for global coverage in remote or inaccessible areas. The methods and systems may implement quantum key distribution (QKD) to secure data transmissions, ensuring program code integrity and confidentiality during transmission.
[0035] Computer program code for carrying out operations for aspects of the systems and methods for evaluating and strengthening security of authentication systems may be written in one or any combination of programming languages, including an object-oriented programming language such as Python, Java, Smalltalk, C++, C Sharp, Swift, Rust, Kotlin, and/or the like, and conventional procedural programming languages, such as the C programming language. The program code may execute entirely on a user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN), and/or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
[0036] The systems and methods may also employ serverless architectures, such as Function as a Service (FaaS) platforms like AWS Lambda, to execute program code dynamically in a distributed cloud environment, optimizing resource usage. Execution across edge computing networks reduces latency by processing data closer to the source, while integration with blockchain-based decentralized networks enables secure, transparent operations across a distributed system.
[0037] Aspects of the systems and methods for evaluating and strengthening security of authentication systems are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatuses, systems, and/or computer program products. Each block and/or combination of blocks in a flowchart and/or block diagram may be implemented by computer program instructions. The computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
[0038] The methods and systems may utilize neuromorphic processors for efficient and adaptive processing of security-related tasks. Quantum processors may also be integrated for executing complex cryptographic operations at exponentially faster speeds. Moreover, the methods and systems may benefit from the deployment of AI accelerators or tensor processing units (TPUs), which are specialized hardware designed to optimize AI-related computations, further enhancing the speed and accuracy of AI-driven security assessments.
[0039] These computer program instructions also can be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus, and/or other device to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
[0040] The computer program instructions also can be loaded onto a computer, other programmable data processing apparatus, and/or other device to cause a series of operational steps to be performed on the device to produce a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. In embodiments, these instructions may be executed in a distributed computing environment using edge computing, where processing occurs closer to the data source, reducing latency. Execution of these instructions may be optimized using AI accelerators, such as TPUs, enhancing the speed and efficiency of AI-driven processes. Quantum computing architectures may also be employed for handling complex computations faster than traditional systems, further enhancing the system’s capabilities.
[0041] Any flowchart and/or block diagram in the drawings is intended to illustrate the architecture, functionality, and/or operation of possible implementations of systems, methods, and computer program products according to aspects of the systems and methods for evaluating and strengthening security of authentication systems. In this regard, each block may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some implementations, the functions noted in the block may occur out of the order noted in the drawings. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. Each block and/or combination of blocks may be implemented by special purpose hardware-based systems (or combinations of special purpose hardware and computer instructions) that perform the specified functions or acts. The blocks may be executed in parallel across distributed cloud infrastructures, leveraging serverless architectures to dynamically allocate computational resources based on real-time demand. Quantum processors may also be utilized to perform concurrent executions of logical functions, optimizing the system’s computational throughput for high-complexity tasks.
[0042] The technical solution provided by the disclosed methods and systems addresses vulnerabilities in traditional account recovery processes by utilizing, among others, advanced AI algorithms to dynamically evaluate and enhance the security of user-generated authentication elements. Specifically, the AI-driven system implements natural language processing (NLP) and machine learning techniques to analyze security questions, answers, and hints in real-time, ensuring they meet customizable security criteria established by administrators. In embodiments, the methods and systems described herein bypass traditional password analysis entirely; instead, they rely on evaluating security questions, answers, and hints through customizable criteria, eliminating the need for passwords at any stage of the user authentication process.
[0043] The systems and methods, in embodiments, process user inputs by first performing data preprocessing steps such as tokenization, normalization, and vectorization. Tokenization involves breaking down text into individual words or phrases (tokens), while normalization standardizes the text by converting it to lowercase and removing punctuation. Vectorization then transforms the tokens into numerical representations that the AI algorithms can process.
[0044] The system, in embodiments, employs NLP to understand and interpret the semantic meaning of security questions, answers, and hints. In some embodiments, this process includes tokenization, where textual input is broken down into individual words or subwords (tokens) using subword-level tokenizers such as Byte Pair Encoding (BPE) or SentencePiece. These tokenizers enable the model to more effectively handle rare or out-of-vocabulary terms and phrases. As an example, the question “What is the name of your favorite childhood teacher?” may be tokenized into [“What”, “is”, “the”, “name”, “of”, “your”, “favorite”, “childhood”, “teacher”, “?”].
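By way of illustration only, the tokenization step above may be sketched as follows. The regex-based word-level tokenizer here is a simplified stand-in for the subword tokenizers (BPE, SentencePiece) named in the text, which require trained vocabularies:

```python
import re

def tokenize(text: str) -> list:
    # Word-level tokenization: keep runs of word characters as tokens
    # and emit punctuation marks as separate tokens.
    return re.findall(r"\w+|[^\w\s]", text)

question = "What is the name of your favorite childhood teacher?"
tokens = tokenize(question)
# tokens == ["What", "is", "the", "name", "of", "your",
#            "favorite", "childhood", "teacher", "?"]
```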
[0045] In embodiments, common stopwords are filtered out to ensure the model focuses on the core semantic meaning of security questions. In embodiments, this is done using term frequency-inverse document frequency (TF-IDF) analysis, which identifies and removes low-information words such as “the,” “is,” and “and,” allowing the focus to remain on security-relevant content. Additionally, in embodiments, tokens undergo normalization, which includes converting text to lowercase, removing punctuation, and eliminating stopwords. Stemming and lemmatization may also be utilized to reduce words to their root forms, enabling the system to recognize variations of the same word — e.g., “running,” “runs,” and “ran” are reduced to “run.”
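A minimal sketch of the IDF-based low-information-word filtering described above follows. The three-question corpus and the 0.5 IDF threshold are illustrative assumptions, not values from the disclosure:

```python
import math
import string

def normalize(tokens):
    # Lowercase each token and drop punctuation-only tokens.
    return [t.lower() for t in tokens if t not in string.punctuation]

def low_information_words(corpus, idf_threshold=0.5):
    # Words appearing in most of the documents get a near-zero IDF
    # score; treat those below the threshold as stopword-like.
    n = len(corpus)
    docs = [set(normalize(doc.split())) for doc in corpus]
    vocab = set().union(*docs)
    idf = {w: math.log(n / sum(w in d for d in docs)) for w in vocab}
    return {w for w, score in idf.items() if score < idf_threshold}

corpus = [
    "What is the name of your favorite childhood teacher",
    "What is the model of your first car",
    "What is the street where you grew up",
]
stopwords = low_information_words(corpus)
# "what", "is", and "the" appear in every question, so their IDF is
# log(3/3) = 0 and they are filtered; "teacher" and "car" survive.
```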
[0046] Further embodiments remove all personally identifiable information (PII) to ensure compliance with privacy standards, with advanced anonymization techniques, such as Named Entity Recognition (NER) and masking algorithms, employed to strip sensitive details from the data. Additional preprocessing steps, in embodiments, include creating context-sensitive feature vectors that encode the susceptibility of a QA pair to social engineering attacks or violation of best practices in security design. This is accomplished by embedding contextual clues and risk indicators, such as identifying whether a question could be easily answered via public data sources or if it lacks sufficient complexity, thus enabling the model to flag weak security questions.
[0047] Once preprocessed, in embodiments, the inputs are analyzed using machine learning models such as Transformer-based architectures, including Bidirectional Encoder Representations from Transformers (BERT) or Generative Pretrained Transformer (GPT) models. These models leverage self-attention mechanisms to understand the context and semantic relationships within the text. The AI can detect if a security question is too generic or if an answer is easily guessable based on common knowledge. For example, in the answer “Paris in the springtime,” the model recognizes the significance of “Paris” as a location and its association with “springtime” to assess uniqueness and complexity.
[0048] FIG. 1 illustrates an example deep neural network in which embodiments of the present technology may be implemented. The deep neural network (DNN) 100 includes an input layer 101, a plurality of hidden layers 102, and an output layer 103. In one embodiment, the DNN 100 is a deep auto-encoder neural network (deep ANN) or a convolutional neural network (CNN).
[0049] As illustrated, the DNN 100 has two hidden layers 102, although it is understood that alternative embodiments may have any number of two or more hidden layers. Each layer 101 to 103 may have one or more nodes (represented by circles in the diagrammatic network). As depicted by the connecting lines, each node in a current layer is connected to every other node in a previous layer and a next layer. This is referred to as a fully-connected neural network. Other neural network structures are also possible in alternative embodiments of the DNN 100, in which not every node in each layer is connected to every node in the previous and next layers.
[0050] Each node in the input layer 101 can be assigned a value and output that value to every node in the next layer (e.g., hidden layer). The nodes in the input layer 101 can represent features about a particular environment or setting. For example, a DNN 100 used for classifying whether an object is a rectangle may have an input node representing whether the object has flat edges. In this example, assigning a value of 1 to the node may represent that the object does have flat edges and assigning a value of 0 to the node may represent that the object does not have flat edges. In another example, a DNN 100 takes an image as input. In this case, the input nodes may each represent a pixel of the image, such as a pixel of a training image, where the assigned value may represent the intensity of the pixel. Following this example, an assigned value of 1 may indicate that the pixel is completely black and an assigned value of 0 may indicate that the pixel is completely white.
[0051] Each node in the hidden layers 102 can receive an outputted value from nodes in a previous layer (e.g., input layer) and associate each of the nodes in the previous layer with a weight. Each hidden node can then multiply each of the received values from the nodes in the previous layer with the weight associated with the nodes in the previous layer and output the sum of the products to each node in the next layer.
[0052] Nodes in the output layer 103 handle input values received from the nodes in the hidden layer 102 in a similar fashion. In one example, each output node in the output layer 103 may multiply each input value received from each node in the previous layer (e.g., hidden layer) with a weight and sum the products to generate an output value. The output value of each output node can output information in a predefined format, where the information has some relationship to the corresponding information from the previous layer. Example outputs may include, but are not limited to, classifications, relationships, measurements, instructions, and recommendations. For example, a DNN 100 may classify whether an object is an ellipse, where an outputted value of 1 from the output node represents that the object is an ellipse and an outputted value of 0 represents that the object is not an ellipse. While the examples provided relate to classifying geometric shapes, this is only for illustrative purposes. The output nodes can also be used to classify any of a wide variety of objects and other features and otherwise output any of a wide variety of desired information in desired formats.
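The layer-by-layer weighted-sum propagation described for layers 101 through 103 can be sketched as follows. The weights, biases, and layer sizes are arbitrary illustrative values, not parameters from the disclosure:

```python
def layer_forward(inputs, weights, biases):
    # Each node multiplies every value from the previous layer by the
    # weight associated with that connection, sums the products (plus
    # a bias), and passes the result to the next layer.
    return [
        sum(x * w for x, w in zip(inputs, node_w)) + b
        for node_w, b in zip(weights, biases)
    ]

# Toy network: 2 input nodes -> 2 hidden nodes -> 1 output node.
x = [1.0, 0.0]  # e.g., "has flat edges" feature set, second feature off
hidden = layer_forward(x, [[0.5, -0.2], [0.3, 0.8]], [0.0, 0.1])
output = layer_forward(hidden, [[1.0, -1.0]], [0.0])
# hidden = [0.5, 0.4]; output ≈ [0.1]
```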
[0053] The systems and methods, in embodiments, also employ Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units to capture sequential dependencies in user inputs. These models are adept at processing sequences where the order of elements is relevant. The LSTM units help retain information over longer input sequences, which is beneficial when evaluating lengthy security questions, answers, or hints. The LSTM networks help identify patterns that could compromise security by revealing too much information. For example, the system can detect if an answer hint such as “My first car’s make and model” reveals too much information about the answer “1967 Ford Mustang.”
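The learned leakage detection itself cannot be reproduced here, but a much cruder token-overlap heuristic, offered purely as an assumption for illustration, conveys the idea of flagging a hint that gives away part of its answer:

```python
def hint_leaks_answer(answer: str, hint: str) -> bool:
    # Flag the hint if it shares any token with the answer; a trained
    # model would catch far subtler semantic leakage than this.
    answer_tokens = {t.lower() for t in answer.split()}
    hint_tokens = {t.lower() for t in hint.split()}
    return bool(answer_tokens & hint_tokens)

hint_leaks_answer("1967 Ford Mustang", "It was a Mustang")  # → True
hint_leaks_answer("1967 Ford Mustang", "My first car")      # → False
```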
[0054] The AI modules assign security scores to each user input based on factors such as uniqueness, complexity, entropy, and predictability, as discussed in further detail below. Entropy calculations measure the randomness and unpredictability of the input, with higher entropy indicating stronger security. The system compares these scores against the customizable criteria set by administrators, which may include thresholds (e.g., minimum complexity requirements), prohibited words, required entropy levels, and more.
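One common entropy formulation, offered here only as an illustrative sketch and not necessarily the one the system uses, is total Shannon entropy over the characters of an input:

```python
import math
from collections import Counter

def shannon_entropy_bits(text: str) -> float:
    # Per-character Shannon entropy multiplied by length: higher
    # values indicate a more random, less predictable input.
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    per_char = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return per_char * n

shannon_entropy_bits("aaaa")      # → 0.0 (fully predictable)
shannon_entropy_bits("Blue42!x")  # → 24.0 (8 distinct chars, 3 bits each)
```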
[0055] If an input does not meet the security criteria, the system provides real-time feedback and actionable suggestions to the user through the interface. For instance, it may recommend adding specific details to a security question or using a combination of words and numbers in an answer to increase complexity. This interactive process ensures that the final authentication elements are both secure and user-friendly.
[0056] By integrating these AI-driven evaluation mechanisms, the disclosed methods and systems offer a technical solution that enhances the security of account recovery processes. The dynamic assessment and continuous adaptation capabilities of the AI models address the shortcomings of static security measures, providing robust protection against unauthorized access and evolving cyber threats.
[0057] FIG. 2 illustrates an example of training and deployment of a deep neural network. Once a given network has been structured for a task, the neural network is trained using a training dataset 202. To begin training the deep neural network (DNN) 200, initial weights may be chosen randomly or by pre-training using a deep belief network. The training cycle can then be performed in either a supervised or unsupervised manner.
[0058] Supervised learning uses a training set to teach models to yield the desired output. The training dataset 202 either includes inputs paired with desired outputs, allowing the model to learn over time, or includes inputs with known outputs whose resulting network outputs are manually graded. The network processes the inputs and compares the resulting outputs against a set of expected or desired outputs. Errors are then propagated back through the system. The training framework 204 can adjust to change the weights that control the untrained neural network 206. The training framework 204 can provide tools to monitor how well the untrained neural network 206 is converging towards a model suitable for generating correct answers based on known input data. The training process repeatedly occurs as the network weights are adjusted to refine the output generated by the neural network. The training process can continue until the neural network reaches a statistically desired accuracy associated with a trained neural network 208. The trained neural network 208 can then be deployed to implement any number of machine learning operations to output a result 214.
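The predict-compare-propagate cycle described above can be sketched with a single-weight model trained by gradient descent. The learning rate, epoch count, and the y = 2x target mapping are illustrative assumptions:

```python
def train(samples, lr=0.1, epochs=50):
    # Toy supervised training cycle: predict, measure the error
    # against the desired output, and propagate it back by
    # adjusting the weight (gradient descent).
    w = 0.0                          # arbitrary initial weight
    for _ in range(epochs):
        for x, target in samples:
            pred = w * x             # forward pass
            error = pred - target    # compare against desired output
            w -= lr * error * x      # adjust weight to reduce error
    return w

# Learn the mapping y = 2x from three labeled examples.
w = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])
# w converges toward 2.0
```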
[0059] Supervised learning is typically separated into two types of problems — classification and regression. Classification uses an algorithm to assign test data accurately into specific categories. Regression is used to understand the relationship between dependent and independent variables. Numerous different algorithms and computation techniques can be used in supervised machine learning, including but not limited to, neural networks, naive Bayes, linear regression, logistic regression, support vector machines (SVM), k-nearest neighbor, and random forest.
[0060] Unsupervised learning is a learning method in which the network uses algorithms to analyze and cluster unlabeled data. These algorithms discover hidden patterns or data groupings. Therefore, the training dataset 202 includes input data without any associated output data. The untrained neural network 206 can learn groupings within the unlabeled input and determine how individual inputs relate to the overall dataset. Unsupervised training can be used for three main tasks — clustering, association, and dimensionality reduction. Clustering is a data mining technique that groups unlabeled data based on similarities and differences. This technique is often used to process raw, unclassified data objects into groups represented by structures or patterns in the information. Association is a rule-based method for finding relationships between variables in a given dataset. This method is often used for market basket analysis. Dimensionality reduction is used when a given dataset's number of features (dimensions) is too high. This technique is commonly used in the preprocessing of data.
[0061] Variations of supervised and unsupervised training may also be employed. Semi-supervised learning is a technique in which the training dataset 202 includes a mix of labeled and unlabeled data of the same distribution. Incremental learning is a variant of supervised learning in which input data is continuously used to train the model further. Incremental learning enables the trained neural network 208 to adapt to the new data 212 without forgetting the knowledge instilled within the network during initial training.
[0062] Several embodiments of the systems and methods disclosed herein include one or more Language Learning Models (LLMs). The LLMs are trained on a preprocessed (described below), highly specialized, proprietary dataset that focuses on security-related question-and-answer (QA) pairs. The dataset, in embodiments, includes publicly available security questions from password recovery and authentication systems, as well as proprietary data sourced from real-world security environments such as penetration testing scenario logs, real-world phishing attack attempts, and anonymized user data. Unique constraints may be applied during training to improve the LLMs’ ability to distinguish between effective and ineffective security questions, answers, and hints.
[0063] Several techniques are integrated into the training process to further enhance the model’s ability to distinguish between strong and weak security questions. In embodiments, this is done using adversarial training, where the model is presented with deliberately weak questions, answers, and hints, such as easily guessable, overly simplistic, or publicly available information, during the reinforcement learning phase. These adversarial examples, such as single-word or common responses like “John” or “1234,” simulate answers that are vulnerable to social engineering attacks. Reward shaping, in embodiments, is used in the reinforcement learning framework, where the model is rewarded based on the entropy and complexity of the security questions it promotes. For example, the model receives higher rewards for identifying and promoting questions that demonstrate strong security principles, such as being non-replicable or incorporating contextual, time-sensitive knowledge. This reward system encourages the LLM to favor high-entropy, hard-to-guess answers. This combination of adversarial training and reward shaping ensures the LLM can generate and optimize security questions, answers, and hints that resist social engineering and maintain high levels of entropy, providing robust protection against common security threats. In other embodiments, the LLMs are trained on high-entropy, complex questions, answers, and hints that require deeper personal knowledge. The training process, in embodiments, filters out biased questions, such as those that disproportionately affect certain demographics or rely on cultural or regional knowledge.
[0064] Moreover, the AI system, in embodiments, employs unsupervised learning techniques, such as K-Means clustering, to identify common patterns and group similar security answers. By clustering inputs, the system detects when users select popular or easily guessable answers, such as common pet names or birthplaces. If a user’s answer falls into a cluster of high-risk responses, the system flags it and prompts the user to choose a more unique answer.
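A minimal one-dimensional sketch of this clustering idea follows, using Lloyd's algorithm with k = 2 seeded at the extremes; the answer-frequency figures are fabricated purely for illustration:

```python
def kmeans_1d(points, iters=20):
    # Lloyd's algorithm with k=2 on one-dimensional features,
    # deterministically seeded at the minimum and maximum points.
    centers = [min(points), max(points)]
    for _ in range(iters):
        clusters = ([], [])
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Feature: how often each candidate answer appears in a hypothetical
# corpus of breached recovery answers (fabricated counts).
freq = {"bella": 900, "max": 950, "charlie": 870,
        "xq7-velvet-llama": 2, "perihelion42": 1}
centers, clusters = kmeans_1d(list(freq.values()))
high_risk = {a for a, f in freq.items() if f in clusters[1]}
# The popular pet names fall into the high-frequency cluster and
# would be flagged, prompting the user to pick something more unique.
```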
[0065] The AI models, in embodiments, are trained on extensive datasets that include examples of strong and weak security elements. In other embodiments, the AI models are trained exclusively on proprietary weak security elements. These datasets, in embodiments, comprise publicly available linguistic corpora, lists of common passwords, breached data samples, and synthetically generated inputs to cover a wide range of possible user responses. Supervised learning techniques are used, where the models learn to associate certain input patterns with security risk levels based on labeled training data.
[0066] The outputs from the various AI models, in embodiments, are integrated using ensemble methods. For example, the system may use weighted averaging to combine the scores from the Transformer models, RNN-LSTM networks, and entropy calculations. Other embodiments utilize different factors to determine the security score as described in further detail below. Yet other embodiments employ a majority voting system, where an input is flagged if a majority of models identify it as high risk. This integration facilitates a comprehensive assessment by leveraging the strengths of each model.
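The two ensemble strategies mentioned above, weighted averaging and majority voting, can be sketched as follows. The model names, scores, and weights are illustrative assumptions:

```python
def weighted_score(model_scores, weights):
    # Combine per-model risk scores (0 = safe, 1 = high risk)
    # by weighted averaging.
    total = sum(weights.values())
    return sum(model_scores[m] * w for m, w in weights.items()) / total

def majority_flag(model_flags):
    # Flag the input if a strict majority of models mark it high risk.
    return sum(model_flags) > len(model_flags) / 2

scores = {"transformer": 0.8, "rnn_lstm": 0.6, "entropy": 0.9}
weights = {"transformer": 0.5, "rnn_lstm": 0.3, "entropy": 0.2}
weighted_score(scores, weights)     # ≈ 0.76
majority_flag([True, False, True])  # → True
```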
[0067] Administrators have the ability to adjust the parameters and thresholds within the methods and systems through customizable criteria settings. For instance, they can set minimum security scores, minimum entropy levels, minimum complexity levels, minimum predictability or guessability levels, specify prohibited words or phrases, adjust the sensitivity of pattern detection in clustering algorithms, adjust the weight of each factor, and more. This customization allows the system to align with organizational security policies and adapt to emerging threats.
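By way of illustration, checking an input against such administrator-set criteria might look like the following sketch; the criteria names and threshold values are assumptions, not part of the disclosure:

```python
def evaluate_against_criteria(answer, score, criteria):
    # Return a list of violations of the administrator-set criteria.
    problems = []
    if score < criteria["min_security_score"]:
        problems.append("security score below threshold")
    if len(answer) < criteria["min_length"]:
        problems.append("answer too short")
    lowered = answer.lower()
    for word in criteria["prohibited_words"]:
        if word in lowered:
            problems.append("contains prohibited word: " + word)
    return problems

criteria = {"min_security_score": 0.7, "min_length": 8,
            "prohibited_words": ["password", "1234"]}
evaluate_against_criteria("password1234", 0.4, criteria)
# → ["security score below threshold",
#    "contains prohibited word: password",
#    "contains prohibited word: 1234"]
```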
FIG. 3 illustrates an embodiment of a system architecture 300 for evaluating and strengthening security of authentication systems. The system 300 comprises several interconnected components that work together to enhance the security of account recovery processes using AI.
User devices 302 as depicted represent the devices operated by end-users. These devices can be personal computers, smartphones, or tablets through which users create security questions, answers, and hints. The user interface module 304 is a software interface that facilitates user interaction with the system. It is connected to user devices 302 via bidirectional communication links, indicating data exchange between users and the system. The user interface module 304 sends user inputs to the AI evaluation module 310.
Administrator devices 306 are utilized by administrators to configure and manage customizable assessment criteria. They communicate bidirectionally with the administrator interface module 308. This secure interface 308 allows administrators to set and adjust security parameters. It sends configuration data to the Al evaluation module 310.
One central component of the system is the Al evaluation module 310, which processes and evaluates user inputs using Al algorithms and machine learning models as disclosed herein. It receives user inputs from the user interface module 304 and assessment criteria from the administrator interface module 308. It communicates with the data preprocessing module 312. The data preprocessing module 312 performs initial processing of user inputs, such as tokenization, normalization, and vectorization. It receives data from the Al evaluation module 310 and returns processed data. It also interacts with the database 314.
The feedback module 316 generates real-time feedback and recommendations for users based on Al evaluation results. It receives evaluation results from the Al evaluation module 310 and sends feedback to users via the user interface module 304.
The encrypted database 314 securely stores user inputs, assessment criteria, and historical data. It communicates bidirectionally with both the Al evaluation module 310 and the data preprocessing module 312. All components are interconnected via the communications network 318, which facilitates secure communication encrypted using protocols as described herein.
[0068] The disclosed methods and systems offer several technical advantages. By leveraging machine learning algorithms in embodiments, the system adapts to new security threats in real-time. For instance, when new patterns of attacks are detected, the Al models update their evaluation criteria without the need for manual reconfiguration. The architecture supports scalable processing capabilities, handling large volumes of user data efficiently. The use of distributed computing and parallel processing techniques allows the system to maintain high performance even under heavy load. The integration of multiple Al models, such as Transformer models, RNN-LSTM networks, and clustering algorithms, enables a comprehensive analysis of authentication elements, reducing false positives and negatives in risk assessment. Real-time feedback and suggestions help users create stronger security questions and answers without significant delays, improving the usability of the system.
[0069] In embodiments, implementation of the methods and systems disclosed herein includes optimized algorithms for text processing and analysis. For example, the use of BPE in tokenization reduces the computational load and improves processing speed. The system, in embodiments, utilizes hardware accelerators such as GPUs or TPUs to enhance computational efficiency, enabling faster analysis of user inputs. Advanced encryption techniques and secure communication protocols ensure that user data is processed and stored securely, addressing technical concerns related to data privacy and security.
[0070] This present disclosure enables users to create and manage account recovery questions, answers, and hints through a user-friendly interface. As users generate recovery information, the system immediately submits it for real-time Al evaluation. Al modules and models, as discussed herein, assess the security of the questions, answers, and hints against customizable criteria set by the user, strengthening them against potential attacks like guessing or brute-force attempts. This flexible yet rigorous approach to account recovery significantly reduces the risk of unauthorized access to sensitive information.
[0071] FIG. 4 illustrates an embodiment of a system and method for evaluating and strengthening security of authentication systems 400. The process begins with step 420, where administrators configure the Al system with personalized assessment criteria. Through a secure user interface requiring robust authentication like multifactor authentication, they define standards for evaluating aspects such as question complexity, specificity, answer length, and hint obviousness. These criteria are securely stored using advanced encryption techniques and can be adjusted in real time to meet the organization’s unique security needs.
[0072] Moving to step 440, authenticated users access a secure platform to create their own security questions, answers, and hints. The user interface, accessible via web or mobile applications, provides prompts or guidelines to help users generate memorable yet secure authentication elements, which can include text, images, sounds, or other media. The system leverages natural language processing algorithms to assist users in crafting content that is both personally significant and resilient against unauthorized access attempts. [0073] At step 460, the Al system retrieves the user-generated questions, answers, and hints from the secure, encrypted database and evaluates them against the predefined security criteria established in step 420. Advanced algorithms analyze factors such as complexity, uniqueness, predictability, and adherence to administrator-defined standards. Techniques like graph theory, semantic analysis, and machine learning models — including transformer architectures such as GPT — are employed to assess the security and effectiveness of the user inputs.
[0074] Finally, in step 480, the system delivers immediate feedback to users based on the Al's assessment. This feedback is integrated into the user interface and may include pass/fail results, numerical scores, or actionable suggestions for improvement. Users can then refine their inputs to better align with the security standards, actively participating in enhancing their account recovery security. This dynamic interaction ensures that the authentication elements are both memorable to the user and meet stringent security criteria.
[0075] Throughout this process, the system maintains secure data handling and storage by utilizing advanced encryption techniques and may employ blockchain technology for immutable records of changes, ensuring data integrity and confidentiality. The Al system can adapt assessment criteria dynamically based on real-time threat intelligence and historical security data, using advanced machine learning algorithms like deep learning neural networks. This continuous learning approach ensures that security standards evolve to counter emerging threats effectively.
[0076] FIG. 5 illustrates a flowchart 500 that outlines a method for evaluating and enhancing security entries. The process commences at step 502, initiating the method, and subsequently proceeds to configuring the Al with assessment criteria at step 504. Administrators define and set customizable assessment criteria through the administrative interface, after which these criteria are securely stored within the database at step 506. [0077] Following the configuration and storage of criteria, users undergo authentication at step 508 to verify their identities prior to submitting their security entries at step 510. The submitted security questions, answers, and hints are then preprocessed at step 512, involving tokenization, normalization, and vectorization of the input data. The Al module evaluates the preprocessed inputs based on the established assessment criteria at step 514, subsequently computing security scores at step 516. These scores assess factors such as entropy, guessability, and complexity, and are then compared against predefined thresholds at step 518.
[0078] If the computed security scores meet the established thresholds at step 520, the security entries are securely stored in the database at step 528, and the process concludes at step 530. Conversely, if the scores do not meet the thresholds at step 522, the system generates real-time feedback at step 524, providing users with suggestions and recommendations for improving their security entries. Users are then prompted to revise their entries at step 526, and the revised entries are resubmitted to the input stage at step 510. This creates an iterative loop, allowing users to continuously enhance their security entries until they satisfy the required security standards, thereby ensuring robust protection of user information.
[0079] FIG. 6 illustrates an embodiment of the Al evaluation process 600 used to assess user-provided security entries, detailing the internal components and data flow. The process begins with user inputs 602, where users submit their security questions, answers, and hints via the user interface. These inputs are then processed by the data preprocessing module 604, which performs essential tasks such as tokenization, normalization, vectorization, stopword filtering, stemming, and lemmatization to prepare the data for analysis.
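A minimal sketch of such a preprocessing pass (lowercasing normalization, naive whitespace tokenization, and stopword filtering) might look as follows. The stopword list is a tiny illustrative subset; production embodiments would typically add stemming, lemmatization, and vectorization via an NLP library:

```python
# Hedged sketch of the preprocessing described for module 604.
# STOPWORDS is an illustrative stand-in for a real stopword corpus.

STOPWORDS = {"the", "a", "an", "of", "in", "my", "is", "was"}

def preprocess(entry: str) -> list[str]:
    """Normalize, tokenize, and filter stopwords from a security entry."""
    normalized = entry.lower().strip()
    tokens = [t for t in normalized.split() if t.isalnum()]  # naive tokenization
    return [t for t in tokens if t not in STOPWORDS]

preprocess("What was the name of my first pet")
# → ['what', 'name', 'first', 'pet']
```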
[0080] Following preprocessing, the feature extraction module 606 extracts critical features relevant to security evaluation, including measures of entropy, complexity, and predictability. These extracted features are then passed to the Al Models 608, which encompass transformer models 610 for semantic analysis, RNN-LSTM networks 612 for analyzing sequential dependencies, and clustering algorithms 614 for detecting patterns and grouping similar inputs to identify common or easily guessable entries.
[0081] The evaluation metrics computation 616 component calculates security scores based on the extracted features, focusing in part on entropy, guessability, and complexity. These computed metrics are then compared against the Assessment Criteria 618, which are customizable parameters set by administrators to define the required security standards. The Decision Module 620 evaluates whether the security entries meet the established thresholds. If the scores meet or exceed the thresholds, the entries are approved and stored in the secure database 624. Conversely, if the scores do not meet the thresholds, the process directs the flow to the feedback generation module 622, where real-time feedback and suggestions for improvement are created and sent back to the user interface.
[0082] The Al models 608 (transformer models 610, RNN-LSTM networks 612, and clustering algorithms 614) operate collectively, ensuring a cohesive analysis of the input data. The secure database 624 securely stores approved security entries, maintaining data integrity and confidentiality through advanced encryption techniques. This Al evaluation process ensures that user-generated security elements are thoroughly analyzed for their robustness and adherence to security standards.
[0083] FIG. 7 illustrates a flowchart 700 that delineates the security scoring mechanism employed, in embodiments, to compute the final security score (S) for user-provided security entries. The mechanism integrates multiple factors, including entropy, guessability, and complexity scores, each contributing to the overall assessment of security robustness.
[0084] The process commences with the entropy calculation module 702, which calculates the entropy score (H) by measuring the randomness and unpredictability of the security entries. This entropy score is then forwarded to the security score computation module 724. Concurrently, the guessability analysis module 704 determines the guessability score (G), evaluating how easily an attacker might predict the security answers. This module, in embodiments, encompasses three sub-modules: the public knowledge factor 706, which assesses the availability of the information through public records or social media; pattern detection 708, which identifies predictable sequences or repeated characters; and memory-based predictability 710, which evaluates the likelihood of an answer being guessed based on common knowledge or social engineering techniques. The resulting guessability score (G) is transmitted to the security score computation module 724.
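As one hedged illustration of the entropy calculation in module 702, the entropy score could be estimated as Shannon entropy over the characters of an entry, in bits per character; actual embodiments may use total entropy or other estimators:

```python
import math
from collections import Counter

# Illustrative Shannon entropy estimate from character frequencies.
# This is one possible realization of the entropy score, not the only one.

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character, estimated from character frequencies."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

shannon_entropy("aaaa")  # 0.0 — a fully predictable, repeated character
shannon_entropy("abcd")  # 2.0 — four equally likely symbols
```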
[0085] Simultaneously, the complexity assessment module 712 computes the complexity score (C) by analyzing the uniqueness and sophistication of the security questions. This module, in embodiments, includes sub-modules such as the uniqueness function 714, which measures the distinctiveness of the question compared to a database of known security questions; the number of unique words 716, which counts the unique words used to assess complexity; the entropy of words 718, which analyzes the unpredictability of word choices; and time-based uniqueness 720, which considers elements that change over time to enhance security. The complexity score (C) is then sent to the security score computation module 724.
[0086] The weight assignment module 722 assigns specific weights (α, β, γ — which may be managed by administrators) to each of the calculated scores — entropy (H), guessability (G), and complexity (C) — thereby adjusting their influence on the final security score (S). These weights are conveyed to the security score computation module 724.
[0087] The security score computation module 724 aggregates the weighted scores and calculates the final security score (S). This computed score is then passed to the threshold comparison module 726. The threshold comparison module 726 evaluates whether the final security score meets or exceeds predefined security thresholds. If the score satisfies the threshold criteria, the process proceeds to approval; however, if the score falls below the threshold, the recommendation module 728 is activated. The recommendation module 728 generates tailored suggestions and recommendations aimed at improving the security entries to achieve compliance with the required security standards.
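The weighted aggregation and threshold comparison above can be sketched as follows. Two conventions here are assumptions for illustration only: all scores are normalized to a 0-to-1 scale, and guessability counts against security (hence the 1 − G term, so that a hard-to-guess answer raises the score):

```python
# Illustrative computation of S from weighted entropy (H), guessability (G),
# and complexity (C). Scales, orientation of G, and weights are assumed.

def security_score(h: float, g: float, c: float,
                   alpha: float, beta: float, gamma: float) -> float:
    """Weighted security score; guessability is inverted so lower G helps."""
    return alpha * h + beta * (1.0 - g) + gamma * c

def meets_threshold(s: float, threshold: float = 0.7) -> bool:
    """Threshold comparison corresponding to module 726."""
    return s >= threshold

s = security_score(h=0.8, g=0.2, c=0.6, alpha=0.4, beta=0.3, gamma=0.3)
# s = 0.4*0.8 + 0.3*0.8 + 0.3*0.6 = 0.32 + 0.24 + 0.18 = 0.74
meets_threshold(s)  # True for the hypothetical 0.7 threshold
```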
[0088] FIG. 8 depicts an embodiment of the data flow and secure storage mechanism 800 employed within the system to ensure the protection of user-provided security entries. The process initiates with user devices 802, such as personal computers or mobile devices, where users originate their inputs. These inputs are transmitted through the secure transmission module 804, which in embodiments, utilizes robust encryption protocols like TLS/SSL to secure data during transit, preventing unauthorized interception or access.
[0089] Upon successful transmission, the data is processed by the data preprocessing module 806. This module is responsible for preparing the data for evaluation by performing tasks such as tokenization, which breaks down the input into manageable pieces; normalization, which standardizes the data format; and vectorization, which converts the data into a format suitable for analysis by the Al evaluation module 808. The Al evaluation module 808 then assesses the security entries against established criteria, determining their robustness and identifying any weaknesses.
[0090] Following evaluation, the feedback module 810 generates real-time feedback based on the Al’s assessment, providing users with actionable suggestions to enhance the security of their entries. Before any data is stored, the encryption module 812 encrypts the processed data to ensure that it remains confidential and protected from unauthorized access. The encrypted data is then securely stored in the encrypted database 814, which serves as a centralized repository for all user inputs, assessment criteria, and historical data, safeguarded against potential breaches.
[0091] Administrators access and manage the system through administrator devices 816, which connect to the administrator interface 818. This interface provides administrative functionalities, allowing for the configuration of assessment criteria, monitoring of system performance, and management of user accounts. To maintain comprehensive oversight and accountability, the audit log module 820 records all system interactions, including changes and access events, thereby facilitating auditing and ensuring compliance with security protocols.
[0092] All interactions and data exchanges within the system occur over encrypted, secure network connections 822. These secure connections maintain the integrity and confidentiality of data as it flows between various modules and components of the system, thereby ensuring a robust and secure environment for managing user security entries.
[0093] FIG. 9 illustrates an architecture of machine learning models 900 utilized within the system’s Al evaluation module. The architecture comprises several layers and components designed to process and analyze user inputs effectively. The process begins with the input layer 902, which receives preprocessed data from the data preprocessing module 906. This data is then transformed into vector representations by the embedding layer 904, facilitating semantic understanding.
[0094] Subsequently, the data progresses through multiple transformer blocks 906, each consisting of a self-attention mechanism 908, layer normalization 910, and a feed-forward neural network 912. These components work in tandem to capture contextual relationships and enhance feature extraction. Following the transformer blocks, the data is processed by the RNN-LSTM networks 914, which handle sequential dependencies and temporal patterns within the data, thereby improving the model's ability to understand the sequence and structure of user inputs.
[0095] The output layer 916 receives the processed data from the RNN-LSTM networks and generates evaluation results, including security scores and assessments. To ensure the models remain accurate and effective, the model training module 918 oversees the training processes using comprehensive datasets, while the hyperparameter tuning module 920 fine-tunes parameters such as learning rates and weight decay to optimize performance. Additionally, the model update mechanism 922 enables real-time updates to the models based on new data or evolving assessment criteria, ensuring the system adapts to emerging security threats and maintains high evaluation standards.
[0096] All components within the architecture are interconnected, with emphasis on the influences of training and tuning on the model's components. The output layer 916 loops back to the model training module 918, ensuring a continuous learning and updating process that allows the system to evolve and improve over time.
[0097] FIG. 10 illustrates an exemplary real-time feedback loop 1000 between the user and the system, showcasing the interactive process that provides immediate assessments and recommendations based on user inputs. The loop begins with user inputs 1002, where users submit their security questions, answers, and hints through the user interface 1008. These inputs are then processed by the Al evaluation module 1004, which assesses the strength and robustness of the security entries against predefined criteria using advanced machine learning algorithms.
[0098] Following the evaluation, the feedback generation module 1006 creates actionable feedback based on the Al's assessment. This feedback is displayed to the user via the user interface 1008, allowing users to understand the necessary improvements. Users then perform revision actions 1010 by modifying their entries in response to the feedback provided. This iterative process is represented by the loop arrow 1012, which directs the revised entries back to the Al evaluation module 1004 for re-assessment. The loop continues until the security entries meet or exceed the required security thresholds, at which point an approval confirmation 1014 is generated by the system, indicating successful compliance with the security standards.
[0099] The connections and flow within the feedback loop indicate the progression of data and feedback through the various modules. The loop arrow 1012 illustrates the continuous nature of the process, ensuring that users can repeatedly refine their security entries until they achieve the desired level of security. Directional arrows throughout the diagram illustrate the flow of information from user inputs to evaluation, feedback generation, and user revisions, culminating in the approval confirmation once the security criteria are satisfied.
[00100] An embodiment of a system and method for evaluating and strengthening security of authentication systems is described next. In a first step, the system configures the Al with personalized assessment criteria, establishing the standards for evaluating security questions, answers, and hints.
[00101] In embodiments, this configuration is performed by retrieving predefined security policies from a central database or rules engine containing parameters such as question complexity, answer length, and hint specificity.
[00102] System administrators may interact with a secure user interface to modify these standards. The secure user interface may incorporate multi-factor authentication (MFA) to verify the identity of system administrators. Access to this configuration, in embodiments, occurs through a secure communication channel using encryption protocols such as Transport Layer Security (TLS), ensuring the confidentiality and integrity of the transmitted data. Authorized personnel, typically system administrators, access the configuration through the secure user interface, which presents configurable parameters for security assessments.
[00103] System administrators may customize assessment standards according to their organization’s specific security requirements. Customizable standards parameters may include question, answer, and hint entropy, complexity, predictability, guessability, length, specificity, the degree of personalization based on historical user data, the weight given to each, and more. In embodiments, the system enforces role-based access control (RBAC) to ensure that only authorized personnel can modify these critical security parameters.
[00104] The system, in embodiments, validates administrator inputs in real time by employing both client-side (e.g., JavaScript) and server-side (e.g., API validation) mechanisms. Client-side validation ensures the format is correct before submission, while server-side validation checks compliance against preset boundaries stored in the backend schema. If an input falls outside the acceptable range, the system generates an alert or notification for correction, preventing non-compliant values from being saved.
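A server-side boundary check of this kind could be sketched as below. The schema keys and numeric ranges are hypothetical examples introduced for illustration, not boundaries defined by the disclosure:

```python
# Illustrative server-side validation against a preset backend schema.
# SCHEMA maps each parameter to an assumed (low, high) acceptable range.

SCHEMA = {
    "min_entropy_bits": (1.0, 256.0),
    "min_answer_length": (4, 64),
    "weight_entropy": (0.0, 1.0),
}

def validate_config(config: dict) -> list[str]:
    """Return alert messages for non-compliant values; empty means valid."""
    alerts = []
    for key, value in config.items():
        if key not in SCHEMA:
            alerts.append(f"unknown parameter: {key}")
            continue
        low, high = SCHEMA[key]
        if not (low <= value <= high):
            alerts.append(f"{key}={value} outside acceptable range [{low}, {high}]")
    return alerts

validate_config({"min_answer_length": 8, "weight_entropy": 1.5})
# → one alert: weight_entropy exceeds its upper bound
```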
[00105] Once the administrators submit changes through the secure user interface, the backend processes the configurations and updates the system’s database to store the new parameters. Each configuration parameter, in embodiments, is securely stored in a distributed database, leveraging technologies such as NoSQL databases (e.g., MongoDB) for high availability and fault tolerance.
[00106] Depending on the embodiment, when an update is made, the change is first committed to a primary node in the database cluster. Specifically, the primary node writes the new data to its local storage and then initiates replication to secondary nodes. This replication is achieved through mechanisms such as write-ahead logging or journaling, where the primary node records the changes and streams them to secondary nodes in real time. The secondary nodes apply these updates to their own copies of the database, ensuring consistency across the cluster. This process achieves redundancy and consistency across the distributed architecture, maintaining scalability and fault tolerance.
[00107] Additionally, in embodiments, dynamic data structures within the Al system, such as hash maps or binary trees, manage real-time updates to these criteria. These data structures allow the Al to quickly access and incorporate the new configurations into its assessment algorithms without significant latency.
[00108] The updated assessment criteria, in embodiments, are securely stored using advanced encryption techniques, such as AES-256 for symmetric encryption or RSA for asymmetric encryption. This ensures that even if the data is intercepted during transmission or storage, it remains encrypted and inaccessible without the correct decryption key. Role-based access control (RBAC) is also applied to protect sensitive data, limiting modification privileges to authorized personnel based on their roles and responsibilities. [00109] In embodiments, to maintain system uptime and ensure data availability, the system replicates data across a distributed, fault-tolerant architecture. In the event of a node failure, traffic is automatically rerouted to secondary nodes via load balancers or coordination services like Apache ZooKeeper. ZooKeeper monitors the health of each node and facilitates leader election in case the primary node becomes unavailable. This ensures continuous service with no data loss, allowing the system to maintain horizontal scalability and redundancy without compromising security.
[00110] In embodiments, the system integrates blockchain technology to further enhance the security and immutability of the configuration process. Each change to the assessment criteria is stored as a transaction in a decentralized, distributed ledger, ensuring that all modifications are permanently recorded and cannot be altered or deleted. Each block in the blockchain is cryptographically linked to the previous one, creating a chain of records that is resistant to tampering. Multiple nodes independently verify each transaction, maintaining the integrity of the assessment criteria.
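The hash-chaining idea behind this immutability can be shown with a simplified, single-node sketch: each record of a criteria change embeds the hash of the previous record, so tampering with any earlier entry invalidates every later hash. A real blockchain adds distributed consensus and independent verification by multiple nodes, which this toy omits:

```python
import hashlib
import json

# Simplified hash-chained ledger of configuration changes (illustrative only).

def block_hash(block: dict) -> str:
    """Deterministic SHA-256 hash of a block's canonical JSON form."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def append_block(chain: list, change: dict) -> None:
    """Append a change record linked to the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"change": change, "prev_hash": prev})

def verify_chain(chain: list) -> bool:
    """Recompute each link; any tampering breaks a downstream hash."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True

chain = []
append_block(chain, {"param": "min_entropy_bits", "value": 40})
append_block(chain, {"param": "min_answer_length", "value": 8})
verify_chain(chain)               # True for the untouched chain
chain[0]["change"]["value"] = 10  # tamper with the first record
verify_chain(chain)               # False: the second block's link no longer matches
```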
[00111] The system, in embodiments, logs every configuration change in an encrypted audit trail, which records the time of the change, the identity of the administrator who made the change, and the specifics of the updated parameters. The audit logs are stored using a secure logging system that ensures non-repudiation and tamper resistance, employing hashing algorithms such as SHA-256 to verify the integrity of the logs. These logs are used for compliance and accountability, enabling organizations to review changes and investigate any unauthorized modifications.
[00112] After receiving the desired criteria from administrators and updating the database, the system recalibrates the Al’s assessment algorithms to align with the new parameters. In embodiments, the configurations are translated into parameterized data models that the Al system uses to adjust its evaluation algorithms according to the new standards. In further embodiments, the Al models dynamically adjust their parameters based on the updated configurations. [00113] For neural networks, recalibration may involve adjusting weights and biases within the network. This is achieved through backpropagation, where the network computes the gradient of the loss function with respect to each weight, allowing for fine-tuning based on the new criteria. For decision trees, recalibration may modify split conditions or thresholds at each node. The system may adjust the criteria for splitting nodes based on the updated parameters, influencing how the tree classifies input data. For support vector machines (SVMs), recalibration may update hyperparameters such as the regularization parameter (C) or kernel parameters (e.g., gamma in an RBF kernel). Adjusting these parameters changes the margin and decision boundary, aligning the model with new evaluation standards.
[00114] The system, in embodiments, handles recalibration asynchronously by initiating a parallel process or thread. This approach prevents disruption of ongoing evaluations and maintains system performance. The recalibration process fine-tunes the Al's decision-making models to align with the new thresholds and weights, ensuring optimal functionality while integrating the new rules in real time.
[00115] Once recalibration is complete, the Al system instantaneously applies the updated criteria in real-time evaluations. The Al utilizes real-time data pipelines, such as Apache Kafka, to push the updated criteria into the system's inference engine. Kafka acts as a distributed streaming platform that allows the system to publish and subscribe to streams of records. The updated configurations are published to a Kafka topic, and the inference engine subscribes to this topic to receive updates instantaneously.
[00116] This mechanism allows the Al to dynamically adjust its decision-making process without requiring a full model retraining. For example, if a new threshold for question complexity is set, this change is applied immediately in subsequent security evaluations. The Al system assesses security questions, answers, and hints according to the updated standards, ensuring that evaluations are consistent with the administrators' specified requirements. By incorporating the new configurations promptly, the system remains responsive to evolving security needs and can adapt to new threat landscapes. This dynamic application of updated criteria enhances the overall resilience and effectiveness of the account recovery process, providing robust protection against unauthorized access.
[00117] In addition to standards set by administrators, in embodiments, the system incorporates advanced machine learning models, such as deep learning neural networks or decision trees. In embodiments, this is done to evaluate the effectiveness of the configured criteria. To assess performance, the system employs various metrics, including accuracy, precision, and recall, calculated against historical data using cross-validation techniques. Weighted averaging may be used to combine individual security metrics into an overall security score.
[00118] The system, in embodiments, utilizes reinforcement learning to continuously learn and fine-tune standards based on historical performance, user behavior, and other factors. This assigns rewards or penalties based on the success or failure of security assessments. This feedback loop allows the Al to dynamically fine-tune its parameters, ensuring the system adapts to evolving security landscapes. These dynamic adjustments ensure that the system remains resilient and adaptable, continually evolving in response to new threats and changing security requirements. For example, if a new type of attack targets security questions with specific characteristics, the system can detect this trend and adjust the weight assigned to related factors, enhancing the overall resilience of the account recovery process.
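One toy realization of such reward-driven adjustment: after each assessment outcome, nudge the weight of the implicated factor up (reward) or down (penalty) by a small learning rate, then renormalize so the weights still sum to one. This simple update rule is an assumption for illustration, not the reinforcement learning algorithm of the disclosure:

```python
# Illustrative reward-based weight adjustment with renormalization.
# The factor names, learning rate, and update rule are assumed.

def update_weights(weights: dict, factor: str,
                   reward: float, lr: float = 0.1) -> dict:
    """Nudge one factor's weight by lr * reward, then renormalize to sum 1."""
    adjusted = dict(weights)
    adjusted[factor] = max(0.0, adjusted[factor] + lr * reward)
    total = sum(adjusted.values())
    return {k: v / total for k, v in adjusted.items()}

weights = {"entropy": 0.4, "guessability": 0.3, "complexity": 0.3}
# A detected attack trend implicates the guessability factor, so reward it:
weights = update_weights(weights, "guessability", reward=1.0)
# The guessability weight rises relative to the others; the sum stays 1.0.
```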
[00119] In a second step, users create account recovery questions, answers, and hints within the system’s framework.
[00120] In embodiments, before proceeding, users are authenticated using multifactor authentication (MFA) to ensure that only authorized individuals can modify security information, thereby reducing the risk of unauthorized access. MFA may include a combination of methods such as passwords, time-based one-time passcodes (TOTP), biometric data (e.g., fingerprint or facial recognition), hardware tokens, and more. [00121] After successful authentication, users access a secure user interface to create personalized security questions, answers, and hints. The user interface, accessible through web, mobile applications, and more, provides secure input fields linked to the backend where data is validated and stored. The system, in embodiments, supports various data types for security entries, including text, sounds, images, and video. Multimedia files are encrypted and stored efficiently using appropriate formats, such as PNG for images or AAC for audio, ensuring efficient storage without compromising quality. For instance, an uploaded image might be compressed using the PNG format and then encrypted using AES-256 before being stored in the database.
[00122] The user interface transmits data securely to the server using protocols such as HTTPS, ensuring confidentiality and integrity during transmission. In embodiments, client-side encryption, such as AES-256, is applied to encrypt data before transmission. For example, when a user inputs a security entry, the system encrypts the entry on the user’s device using an AES-256 encryption key. The encrypted data is then transmitted to the server and stored in an encrypted database. Other methods as disclosed herein, or similar, may be utilized.
[00123] User data is securely stored in an encrypted database, with data integrity verified through hash functions such as SHA-256. Upon storage, a hash value is generated using SHA-256 to create a unique digital fingerprint of the input. This ensures that any alteration to the stored data can be detected by comparing the hash values during future retrievals. The database system employs replication to ensure redundancy and high availability. For example, the database may replicate the stored data across multiple geographically distributed nodes, ensuring that even if one node fails, the data remains accessible from another node.
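The SHA-256 integrity check described above may be illustrated with Python's standard hashlib module. This is a minimal sketch of the fingerprint-and-compare step only (the sample entry is hypothetical), not the full encrypted-storage or replication pipeline.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hexadecimal digest serving as the entry's
    unique digital fingerprint."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical stored entry:
stored = b"What was the name of my first school?"
digest_at_write = fingerprint(stored)

# On a later retrieval, recompute and compare to detect alteration:
print(fingerprint(stored) == digest_at_write)           # unchanged data
print(fingerprint(b"tampered entry") == digest_at_write)  # altered data
```

An unchanged entry reproduces the original digest, while any alteration yields a different digest and is therefore detectable at retrieval time.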
[00124] The systems and methods, in embodiments, preprocess user inputs. Preprocessing steps as discussed above include, but are not limited to, tokenization, normalization, vectorization, stopwords filtering, stemming, lemmatization, PII removal, creating context-sensitive feature vectors, and more. [00125] The system validates and evaluates tokenized user entries based on predetermined criteria, as described above. Operational in real-time, the system, in embodiments, deploys machine learning algorithms and LLMs to perform an in-depth analysis of user-created security entries, including applying semantic analysis to assess and interpret the semantic meaning of security entries. These algorithms include neural networks configured to analyze input complexity by assessing factors such as uniqueness, complexity, depth of knowledge demonstrated, length, diversity of characters, linguistic structure, and alignment with administrators' security criteria. The neural network identifies patterns that indicate low complexity, such as repetitive sequences or predictable character combinations.
[00126] The system, in embodiments, utilizes a dictionary lookup algorithm to compare user entries against a list of common phrases and dictionary words. In embodiments, decision tree algorithms are utilized to evaluate the number of unique “words” in an entry, checking whether the answer has sufficient personal relevance based on the user’s historical data, as may be stored in the system. This ensures that the authentication elements are both secure and memorable to the user.
[00127] The system determines the security score of the user entries in accordance with administrator criteria using any one of the several embodiments disclosed herein. The system applies predefined rules to validate user inputs and reject overly simple or guessable entries. Entries resembling common phrases or dictionary words may be flagged. For example, if the system utilizes a dictionary lookup algorithm to compare user entries against a list of common phrases and dictionary words, and a match is found, the system may flag the entry.
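The dictionary-lookup flagging rule described above may be sketched as a simple set-membership check. The word list, the all-tokens-match rule, and the sample answers below are illustrative assumptions, not the disclosed word list or matching rule.

```python
# Minimal sketch of a dictionary-lookup validation rule.
# The word list and matching rule are illustrative assumptions.
COMMON_WORDS = {"password", "titanic", "blue", "dog", "1234"}

def flag_entry(answer: str) -> bool:
    """Flag an answer when every token matches the common-word list."""
    tokens = [t.lower() for t in answer.split()]
    return all(t in COMMON_WORDS for t in tokens)

print(flag_entry("Titanic"))                   # True  -> flagged as guessable
print(flag_entry("Pirates of the Caribbean"))  # False -> passes this rule
```

A production rule set would combine this with the other criteria disclosed herein rather than rely on dictionary membership alone.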
[00128] If a user entry does not meet a specified threshold, the entry is flagged. If the entry does not meet the security criteria, the system provides suggestions to increase complexity, such as using multi-word answers or incorporating unique personal experiences. The system then prompts the user to provide a more secure entry. [00129] User-generated inputs are securely stored in the encrypted database and continuously analyzed against administrator-updated security standards, publicly available data, data breaches, and more. In some embodiments, the system continuously recalibrates its AI models by updating weights and biases based on newly observed user behavior and updated security policies. Machine learning models adjust based on new inputs, recalibrating weightings and thresholds to ensure compliance with evolving criteria. This adaptive approach balances memorability with security, enabling the creation of highly personalized and secure authentication elements.
[00130] At a third step, the system initiates an Al-driven evaluation process of user-provided security entries to assess their robustness and effectiveness in account recovery. This evaluation is based on the predefined security standards and protocols established by administrators or authorized users in the first step. By integrating multiple technical and software components, such as encryption modules and machine learning algorithms, the system performs a comprehensive security analysis to ensure that recovery mechanisms comply with required security and usability thresholds.
[00131] The AI assessment begins by securely retrieving security entries from an encrypted database. The retrieval process employs advanced data structures, such as hash maps and Merkle trees, in conjunction with encryption techniques like AES-256, to ensure the secure handling and protection of sensitive information.
[00132] The system then synchronizes the assessment criteria. The assessment criteria are dynamically adapted in real-time by leveraging an indexed database that manages criteria weights, threshold values, and user-specific security preferences, allowing immediate synchronization with the updated security parameters defined in the first step. In embodiments, the system integrates blockchain technology to ensure the immutability of security configuration changes. The blockchain acts as a distributed ledger that logs every modification made to the security settings, creating immutable records that are resistant to tampering. [00133] The system dynamically updates the security criteria based on historical data, detected security breaches, and evolving threat landscapes. This continuous adaptation is powered by machine learning models, including RNNs and deep reinforcement learning algorithms, which analyze patterns in security breaches and vulnerability reports to make real-time adjustments to the assessment criteria. For instance, RNNs process historical data sequences to identify trends in user behavior or attack patterns, while reinforcement learning applies real-time feedback to adjust security thresholds accordingly.
[00134] The system evaluates user security entries by employing advanced algorithms to assess factors such as complexity, specificity, and alignment with the security standards defined in the first step.
[00135] In embodiments, these algorithms utilize graph theory to model relationships between characters, subwords, words, or phrases within the security entries, identifying vulnerabilities by calculating edge connections and node weights. For example, a security question containing repetitive or common words may receive a lower complexity score. Network analysis methods assign complexity scores based on these relationships to ensure that security questions are both memorable to the user and sufficiently robust against attacks.
[00136] The AI system may be implemented using transformer-based models, such as BERT or GPT architectures, which rely on self-attention mechanisms to dynamically weigh input data based on contextual relevance. When processing a sentence, the model assigns varying attention scores to different words based on their relationships to surrounding words, allowing the system to capture long-range dependencies and complex relationships within the data. In further embodiments, the system leverages NLP techniques and semantic analysis to assess their depth, uniqueness, and adherence to the security criteria. In some embodiments, convolutional neural networks (CNNs) are employed to analyze image, audio, or video-based answer hints, enabling the system to process multimedia-based knowledge elements effectively. [00137] Transformer architectures, as utilized in embodiments, employ positional encoding to compensate for the lack of inherent sequential order awareness in the model. Positional encoding provides numerical values representing the position of each token in the sequence, allowing the model to process entries with an understanding of word order. This is useful for evaluating security questions, answers, and hints because context and sequence are relevant for meaningful assessment. In embodiments, multiple transformer blocks are stacked to create increasingly abstract representations of the input data, facilitating the extraction of high-level semantic information from complex inputs.
[00138] Each transformer block, in embodiments, includes self-attention mechanisms followed by layer normalization and feed-forward neural networks. Layer normalization ensures consistent activation outputs by normalizing neuron activations, while the feed-forward network applies mathematical transformations to further refine the data. These components enhance the model's ability to process sequential text and capture complex relationships.
[00139] To ensure the security of user-generated elements in embodiments, the system employs Al-driven probability distributions and mathematical thresholds. During assessment, the model generates a probability distribution over potential next words or tokens in the answer. If the distribution shows a high probability for a small set of words, the answer may be deemed too predictable. The system in further embodiments analyzes the entropy of the output distribution, where high entropy suggests less predictability and greater security, while low entropy may indicate that the answer is guessable. In embodiments, the system flags security entries that exhibit low entropy or predictable patterns, guiding users to create security questions and answers that are user-friendly yet maintain high-security standards.
[00140] In embodiments, the security questions, answers, and hints entries are assigned a security score. In embodiments, the security score is based in part on the entropy, a measure of randomness or unpredictability, of the assessed entry. Entries with more unique data points and higher variability, for example, will have greater entropy, making them more secure. In embodiments, the security score is based in part on the complexity of the entry. Entries that require multiple data points and cross-reference different aspects of a user's life are assigned higher complexity scores, making them harder for unauthorized individuals to guess. NLP algorithms may be utilized to further enhance complexity scoring by analyzing the syntactic and semantic structure of the question. The NLP model performs syntactic parsing to analyze grammatical structure and semantic parsing to understand meaning, ensuring that questions are specific to the user and resistant to guessing attacks.
[00141] In embodiments, the system implements a multi-faceted scoring process to evaluate the security strength of each security question-answer (QA) pair. In embodiments, the system determines a security score for each security question entry, answer entry, hint entry, or any combination thereof based on a combination of factors, including an entropy score, a guessability score, and a complexity score. Each component contributes to the overall security assessment of the entries, wherein the entropy score (H) measures the randomness and unpredictability of the answer; the guessability score (G) assesses how easily the answer can be guessed based on known attack patterns such as brute force, social engineering, common knowledge, and more; and the complexity score (C) evaluates the complexity and uniqueness of the question itself.
[00142] The security score, in embodiments, is calculated using a weighted average of these components, with the contribution of each factor adjustable to control their influence. These weights may be fine-tuned during model optimization to reflect specific security requirements or organizational policies. By adjusting the relative importance of each factor, the system can dynamically assess and optimize security questions for robustness, ensuring that weaker entries with higher guessability or lower complexity are flagged for improvement.
[00143] In embodiments, adjustable security parameters refer to configurable variables or settings that control how each artificial intelligence model operates, affecting subfactors that contribute to the determination of entropy, predictability, and complexity. These parameters not only dictate the behavior of the AI models by influencing how they process inputs, weigh security metrics, and generate security scores but also control the contribution of specific subfactors to the final evaluation. For instance, adjustable security parameters can modify the influence of factors such as the public knowledge factor, pattern detection, memory-based predictability, the number of unique words, and time-based uniqueness on the output of the models.
[00144] In embodiments, adjustable influence parameters refer to configurable variables that control the relative contribution of various factors, such as entropy, predictability, and complexity, to the final security score. These parameters allow for fine-tuning of how different models influence the outcome of the security evaluation. These parameters may adjust threshold levels, weight assignments for individual metrics, and impose limits on particular model outputs, allowing for precise control over the AI models' behavior and the overall security score calculation in response to varying security requirements.
[00145] In embodiments, each artificial intelligence model produces an assessment score based on the evaluation of user entries and the specific security parameters. The assessment scores include entropy, predictability, and complexity. For instance, the assessment score may include the calculated entropy of the user input based on the security parameter governing randomness or unpredictability, the predictability score derived from the likelihood of an input being easily guessed, and the complexity score reflecting the uniqueness and intricacy of the user input. Each of these scores, in embodiments, is generated by the corresponding AI model, such as the entropy assessment model, predictability analysis model, or complexity evaluation model. The combination of these assessment scores is used to generate the overall security score of the user inputs, with weights applied according to the adjustable influence parameters.
[00146] Additionally, these parameters may adjust threshold levels, weight assignments for specific metrics, and impose limits on particular model outputs, allowing for precise tuning of the AI models' performance in response to varying security requirements.
[00147] The entropy (H) of an entry or a pair is calculated using Shannon's entropy formula, which measures the unpredictability or randomness of the information contained in the entry or pair:

H = −Σ Pi x log2(Pi)

Where:

Pi represents the probability of occurrence of each possible entry or pair character.

H is the total entropy score, indicating the unpredictability of the entry or pair.
A higher entropy score corresponds to a more secure and unpredictable entry or pair, making it harder for attackers to guess or deduce. Lower entropy suggests vulnerability, especially to social engineering or brute force attacks. This entropy calculation, in embodiments, is used as part of a comprehensive scoring mechanism to assess the overall security strength of the QA pair.
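By way of a non-limiting illustration, the Shannon entropy calculation above may be sketched in Python over character frequencies. The sample entries are hypothetical and chosen only to show that low-variability inputs score near zero while longer, more varied inputs score higher.

```python
import math
from collections import Counter

def shannon_entropy(entry: str) -> float:
    """H = -sum(p_i * log2(p_i)) over the entry's character frequencies."""
    counts = Counter(entry)
    n = len(entry)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(round(shannon_entropy("aaaa"), 2))                  # 0.0 -> fully predictable
print(round(shannon_entropy("Titanic"), 2))               # low entropy
print(round(shannon_entropy("MountKilimanjaro2015"), 2))  # higher entropy
```

Entries with more distinct characters and greater length yield higher H, consistent with the scoring discussion above.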
[00148] In embodiments, the system generates a Guessability Index based on one or more guessability scores. The Guessability Index is a proprietary algorithm specifically designed to enhance the security of user authentication mechanisms by assigning a numerical "guessability" score to entries or pairs, such as security questions and their corresponding answers, while maintaining a historical log of scoring data in an indexed format. The Guessability Index, in embodiments, includes multiple components, including a public knowledge lookup module and a pattern-matching engine.
[00149] The system evaluates the provided entries by comparing them against common knowledge datasets (e.g., publicly available information such as birth dates, family names, or other easily accessible personal details). This comparison flags responses that are vulnerable to discovery via public records or social media, assigning them a lower guessability score. Additionally, the algorithm performs pattern recognition to identify predictable or weak response structures, such as repeating character sequences, dictionary words, or commonly used phrases. Entries exhibiting these patterns are deemed to be more susceptible to guessing and are thus assigned a proportionally lower guessability score.
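The pattern-recognition component described above may be sketched with regular expressions. The three patterns below (repeating character runs, common sequences, and a common phrase) are illustrative stand-ins for the disclosed proprietary rule set, and the sample answers are hypothetical.

```python
import re

# Illustrative weak-structure patterns; not the disclosed rule set.
WEAK_PATTERNS = [
    r"(.)\1{2,}",      # runs of three or more repeated characters, e.g. "aaa"
    r"(?:1234|abcd)",  # common sequences
    r"^password$",     # a commonly used phrase
]

def weak_pattern_count(answer: str) -> int:
    """Count how many weak-structure patterns the answer matches."""
    return sum(bool(re.search(p, answer.lower())) for p in WEAK_PATTERNS)

print(weak_pattern_count("password"))              # 1 (common phrase)
print(weak_pattern_count("aaa1234"))               # 2 (run + sequence)
print(weak_pattern_count("MountKilimanjaro2015"))  # 0
```

Answers matching more patterns would be treated as more susceptible to guessing in the scoring that follows.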
[00150] The guessability or predictability score (G) measures how easily an answer can be guessed based on attack patterns. It may be calculated using the following formula:
G = u1 x Pk + u2 x Fp + u3 x Mp
Where:
Pk represents a public knowledge factor, which is a measure of how likely the answer can be found in public databases (e.g., family names, birth dates, publicly available records). This factor is calculated by comparing the answer against a list of publicly accessible data.
Fp represents frequency of pattern detection, which is the number of detected predictable patterns, such as repeating characters, common dictionary words, or sequences like "1234" or "password." This is flagged using pattern matching algorithms.
Mp represents memory-based predictability, which is a score that indicates how predictable the answer is based on social engineering techniques (e.g., answers that could be easily inferred by knowing basic facts about the user). This may be determined using historical attack patterns and social engineering simulation. u1, u2, u3 are weights that balance the contribution of each factor, which may be adjusted during model optimization. [00151] In embodiments, the complexity score evaluates how unique, open-ended, and context-dependent an entry or pair is, calculated as:
C = f(Q) + λ1 x Nw + λ2 x Ew + λ3 x Tu
Where: f(Q) is a function that evaluates the uniqueness of the security entry or pair. This can be measured by comparing the question against a database of known security entries or pairs and assigning a score based on how unique or uncommon the entries or pairs are.
Nw represents the number of unique words in the entry or pair. The more unique words, the more complex the question becomes.
Ew represents entropy of words in the security question, measuring the unpredictability of word choice.
Tu represents time-based uniqueness — a factor that rewards questions with elements that change over time or are tied to time-sensitive or time-dependent events.
λ1, λ2, λ3 are weights to balance the contribution of each factor, which may be determined during model optimization.
[00152] In embodiments, the complexity score is calculated using graph theory, where relationships between words in the entries or pair are represented as nodes and edges. Graph theory enables the system to quantify complexity by analyzing the connectivity between elements in the entries or pair. More intricate connections between nodes, such as relationships that cross-reference multiple data points, result in a higher complexity score.
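A crude, non-limiting sketch of such a graph-based signal follows: words are nodes, and co-occurrence within the same entry creates edges. The node-plus-edge count used here is an illustrative stand-in for the node-weight and edge-connection analysis described above, and the sample questions are hypothetical.

```python
from itertools import combinations

def graph_complexity(entry: str) -> int:
    """Treat unique words as nodes and connect every pair of words that
    co-occur in the entry; return node count + edge count as a rough
    complexity signal."""
    words = {w.lower() for w in entry.split()}
    nodes = len(words)
    edges = len(list(combinations(sorted(words), 2)))
    return nodes + edges

print(graph_complexity("What is my favorite color"))  # 5 nodes + 10 edges = 15
print(graph_complexity(
    "What was the name of the first school I attended abroad"))  # 10 + 45 = 55
```

Entries with more distinct, cross-referenced elements yield denser graphs and therefore higher scores, in line with the paragraph above.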
[00153] The final security score is a weighted sum of one or more factors. In embodiments, the final security score (S) is computed as:
S = α x H + β x G + γ x C
Where:
H represents the entropy score, assessing the randomness of the answer.
G represents the guessability score, reflecting the likelihood of an attacker correctly guessing the answer based on known attack patterns.
C represents the complexity score, which evaluates the sophistication of the question. α, β, γ are adjustable weights that may be fine-tuned during model optimization to balance the contribution of each factor.
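The weighted sum above may be sketched directly in Python. The default weights and the sample component scores below are illustrative assumptions (each component is assumed pre-normalized to a 0-100 scale, as in the worked example later in this section).

```python
def security_score(h: float, g: float, c: float,
                   alpha: float = 0.4, beta: float = 0.3,
                   gamma: float = 0.3) -> float:
    """S = alpha*H + beta*G + gamma*C, with components on a 0-100 scale."""
    return alpha * h + beta * g + gamma * c

# Hypothetical component scores:
score = security_score(h=80, g=65, c=70)
print(round(score, 1))  # 0.4*80 + 0.3*65 + 0.3*70 = 72.5
threshold = 70
print("pass" if score >= threshold else "flag for revision")
```

Adjusting alpha, beta, and gamma changes each factor's influence, mirroring the model-optimization tuning described above.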
[00154] If the security score (S) falls below a predefined threshold, the system automatically generates recommendations for improvement. Thresholds may be dynamically determined based on the application's sensitivity. For example, in financial or healthcare applications, a score below 70/100 may trigger recommendations.
[00155] In embodiments, the system automatically generates recommendations for improving security entries or pairs when the overall security score falls below a predefined threshold. The threshold may be dynamically determined based on the specific type of application for which the security questions are intended. For instance, in highly sensitive environments such as financial or healthcare applications, a score below 70/100 may trigger recommendations.
[00156] In embodiments, the system evaluates several criteria to determine when to generate a recommendation (e.g., rephrasing the question, introducing time-based elements, or suggesting a more complex answer). Recommendations, in embodiments, may be triggered when the entropy score (H) falls below a certain threshold (e.g., less than 2.5); when the guessability score (G) exceeds a specified threshold (e.g., more than 60%); when the complexity score (C) indicates that the question lacks uniqueness or is overly ambiguous; when the security score is too low (e.g., less than 70/100); and/or any combination of factors. [00157] In embodiments, even if the final security score meets the minimum threshold, recommendations may still be generated if individual component scores fall outside acceptable ranges in subcategories of the final security score (e.g., entropy is low). This ensures that specific vulnerabilities are addressed, enhancing the robustness of the security questions by considering both holistic and component-specific weaknesses.
[00158] In embodiments, administrators may set an overall score threshold of 80/100, but also specify that entropy (H) must not fall below 2, and guessability (G) must not exceed 50%. In other embodiments, for example, in applications where an aging population is the primary end users, administrators may require not only a sufficient security score but also a relatively high memory-based predictability factor.
[00159] In an illustrative example, the system processes user input to evaluate and enhance the security of a security question and answer pair. A user enters the question "What's your favorite movie?" and the answer "Titanic." The system begins by preprocessing the input, tokenizing the question and answer, and removing stopwords. It categorizes the question as a "personal preference" type, which typically has lower security due to its generality.
[00160] The system's LLM processes the input, identifying the question as open-ended but lacking complexity. By referencing a database of common answers, it determines that "Titanic" is a highly guessable answer due to its popularity. The entropy analysis of the answer shows a low entropy value because "Titanic" is a single, easily predictable word. Similarly, the system determines that "Titanic" is a guessable answer in light of the question based on attack patterns. The system also determines that the entries or pair are not complex as they are not unique.
[00161] The system then calculates the security scores. For example, the entropy score (H) may be calculated as 1.8 out of 10, which is a low score reflecting that the word “Titanic” is short and simple. The guessability score (G) may be calculated as 75%, indicating a high likelihood of “Titanic” being guessed based on public knowledge. The complexity score (C) may be calculated as 2.0 out of 10, indicating a low score due to the open-ended nature of the question.
[00162] Assume the administrator sets weights α = 0.3, β = 0.5, and γ = 0.2, sets a final score threshold of 70, and normalizes H and C to a scale of 100. The final score (S), as an example, may be computed as a weighted sum, as follows: S = 0.3H + 0.5G + 0.2C = 46.9/100.
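The arithmetic of this worked example can be checked directly, using the component scores from the preceding paragraphs (H and C are normalized from the 0-10 scale to 0-100):

```python
# Values taken from the worked example above.
H = 1.8 * 10   # entropy 1.8/10, normalized to 18/100
G = 75         # guessability score, 75%
C = 2.0 * 10   # complexity 2.0/10, normalized to 20/100

S = 0.3 * H + 0.5 * G + 0.2 * C
print(round(S, 1))  # 46.9, below the 70 threshold
```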
[00163] This score falls below the acceptable threshold of 70. Consequently, the system triggers a recommendation. The LLM suggests rewording the question to make it more specific, such as “What was the last movie you watched in theaters?” It may also recommend using a more complex and unique answer, such as “Pirates of the Caribbean: The Curse of the Black Pearl,” which would increase the entropy due to its length and complexity. The system then outputs the reworded question and advises the user to provide longer, more distinctive answers to improve security, ensuring a more robust authentication mechanism.
[00164] To illustrate the practical application of the disclosed methods and systems, the following examples demonstrate how the Al evaluates security questions, answers, and hints using the customizable criteria set by administrators.
[00165] Example 1 — Evaluating a Security Question: A user submits the security question, "What is my favorite color?" The system begins by receiving the user input and normalizing and tokenizing the question into lowercase tokens: ["what," "is," "my," "favorite," "color"]. The AI models, trained on datasets containing security breaches, undesired user inputs, and user data, are configured to flag questions based on specific security parameters, such as entropy thresholds, predictability indices, and complexity weights.
[00166] The administrator configures one or more of the AI models with one or more adjustable security parameters to adjust the relative contribution of the factors within each model. For example, the "public knowledge factor," "number of predictable patterns," or "memory-based predictability" may be assigned lower weights and therefore contribute less to the predictability index. An administrator may assign the public knowledge factor a weight of 0.4, the number of predictable patterns a weight of 0.2, and the number of memory-based predictable elements a weight of 0.4. This results in a predictability model that determines a predictability score based on the following equation: G = 0.4 x Pk + 0.2 x Fp + 0.4 x Mp. Similarly, for complexity, the administrator assigns a weight of 0.5 to the number of unique words in the security question, a weight of 0.3 to the entropy of the words, and a weight of 0.2 to the time-based uniqueness of the security question. This results in a complexity model that determines the complexity score based on the following equation: C = f(Q) + 0.5 x Nw + 0.3 x Ew + 0.2 x Tu.
[00167] The overall contribution of each AI model is adjusted by the administrator to determine the contribution of the models to the final security score using one or more adjustable influence parameters. The administrator assigns a weight of 0.4 to entropy, 0.3 to predictability, and 0.3 to complexity. Based on these weights, the system computes the overall security score as follows: S = 0.4 x H + 0.3 x G + 0.3 x C.
[00168] For this question, the system evaluates the security using the predefined parameters, including entropy, predictability, and complexity. Due to the limited range of possible answers (common colors), the entropy calculation yields a low value. The phrase “favorite color” is identified as common and generic, in part because it does not contain a sufficient number of unique words, contributing to a low complexity score. The system then determines the security score, calculated as the weighted combination of entropy, predictability, and complexity, as follows:
S = 0.4 x H + 0.3 x (0.4 x Pk + 0.2 x Fp + 0.4 x Mp) + 0.3 x (f(Q) + 0.5 x Nw + 0.3 x Ew + 0.2 x Tu)
Based on these determinations, the system assigns a low security score, indicating a high-risk question, and flags the question for revision. [00169] The user revises the question, and the system receives the new input, normalizing and tokenizing it into tokens: ["what," "was," "the," "name," "of," "the," "first," "school," "I," "attended," "abroad"]. The AI re-evaluates the revised question, detecting increased specificity, uniqueness, and complexity, as well as a broader range of possible answers. The entropy calculation accordingly reflects a higher value, and the system calculates a new, higher security score, indicating lower risk and compliance with the security criteria. The system then confirms to the user: "Your security question meets the recommended security criteria."
[00170] Example 2 — Evaluating an Answer and Hint: A user provides the answer "Fluffy" with the hint "My pet's nickname." The AI analyzes the answer by checking "Fluffy" against a database of common pet names. It identifies that "Fluffy" falls into a high-risk cluster due to its popularity, resulting in a low entropy calculation. The hint directly relates to the answer, increasing the risk of guessability. According to the administrator's customizable criteria, common pet names and direct hints are flagged. As a result, the AI assigns a high-risk score to both the answer and the hint.
[00171] The system alerts the user: "Your answer is commonly used and may be easily guessed. Additionally, your hint directly reveals the nature of your answer. Please choose a more unique answer and a less direct hint." The user revises the answer to "MountKilimanjaro2015" and updates the hint to "First major hike and year." The AI re-evaluates the revised inputs. The answer now combines a specific location with a date, increasing uniqueness and complexity. The entropy calculation yields a high value due to the combination of letters and numbers. The hint provides context but is less directly tied to the answer, reducing guessability. The AI assigns a low-risk score to the revised answer and hint. The system confirms: "Your answer and hint meet the recommended security criteria." [00172] Example 3 — Administrator Adjusting Customizable Criteria: An administrator or the system observes a trend where users frequently select security questions related to favorite sports teams, which are easily discoverable and pose security risks. In response, the administrator or the system updates the customizable criteria to include "favorite sports team" in the list of prohibited phrases. Additionally, the administrator increases the minimum complexity threshold for security questions. The AI models are updated in real-time to incorporate the new criteria. Consequently, future user inputs containing prohibited phrases are automatically flagged during evaluation. Users attempting to use now-prohibited phrases receive immediate feedback to modify their inputs, thereby enhancing overall security.
[00173] Example 4 — AI Adapting to Emerging Threats: A recent data breach reveals that the name "Charlie" is frequently used as an answer to security questions, indicating a potential vulnerability. In response, the AI system updates its high-risk answer database to include "Charlie." Clustering algorithms are retrained with the latest breach data to identify new patterns of common answers. When a user provides the answer "Charlie" with the hint "Best friend in high school," the AI identifies "Charlie" as a high-risk answer due to its prevalence. A high-risk score is assigned to the answer. The hint also provides additional context that could aid in guessing the answer, further increasing the risk. The system warns the user: "Your answer is commonly used and may be compromised. Please choose a more unique answer to enhance security." This proactive adaptation helps mitigate emerging threats by updating security assessments based on real-world data breaches.
[00174] The Al-driven assessment process provides real-time feedback to users via the user interface, as described in fourth step. Users receive immediate feedback on the security of their security questions, answers, and hints entries, along with actionable suggestions for improvement when necessary.
[00175] At a fourth step, the system delivers real-time assessment results to users through a user interface, including, for example, a GUI module, backend processing servers, and communication APIs. This immediate feedback includes the security scores, visual indicators, checkmarks and crosses, pass/fail statuses, numerical ratings, and/or other indicators of the security of their questions, answers, and hints, along with actionable recommendations for improvement if the entries do not meet the predefined security thresholds. The AI algorithms generate these suggestions based on the customizable criteria established in the first step, ensuring that users can enhance their authentication elements to comply with organizational security standards.
[00176] The real-time feedback mechanism is designed to be interactive and user-friendly, enabling users to refine their inputs until they satisfy the required security criteria. For instance, if a user's security question is deemed too generic or easily guessable, the system may prompt the user to rephrase it to increase complexity and specificity. The AI models provide context-aware suggestions, taking into account factors such as entropy, guessability, complexity scores, and more.
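As one hedged illustration of how entropy and complexity factors might enter such a score, the following sketch computes a character-level Shannon entropy and a toy composite complexity score; the normalization constant and the cap at 1.0 are assumptions for illustration, not disclosed system parameters.

```python
import math
from collections import Counter

def shannon_entropy_bits(text: str) -> float:
    """Empirical Shannon entropy of the character distribution, in bits/char."""
    n = len(text)
    if n == 0:
        return 0.0
    counts = Counter(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def complexity_score(text: str) -> float:
    """Toy composite: length-weighted entropy, capped at 1.0.
    The divisor 60.0 is an assumed normalization, not a system constant."""
    if not text:
        return 0.0
    return min(1.0, shannon_entropy_bits(text) * len(text) / 60.0)
```

Under this toy metric a longer, more specific question scores higher than a short, generic one, mirroring the refinement loop the paragraph describes.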
[00177] The system, in embodiments, employs Natural Language Generation (NLG) techniques to present recommendations in clear and understandable language. This approach ensures that users can easily comprehend the suggestions without requiring technical expertise. The NLG component translates complex security assessments into user-friendly advice, bridging the gap between advanced AI evaluations and user interactions.
[00178] For example, if a user submits the security question “What is your favorite color?” the system may flag it as insufficiently secure due to its commonality and simplicity. The AI would then recommend a more secure alternative, such as “What was the color of the car you learned to drive in?” which is more specific and less likely to be guessed by unauthorized individuals.
[00179] The system, in embodiments, also considers the balance between security and memorability. While enhancing complexity and uniqueness, the recommendations aim to ensure that users can easily recall their security answers during account recovery processes. This is achieved by personalizing suggestions based on user data and historical interactions, thereby creating security elements that are both robust and user-friendly.
[00180] Additionally, the system provides feedback on answer hints to ensure they do not inadvertently compromise the security of the answers. If an answer hint is too revealing, the AI may suggest making it more abstract or indirectly related to the answer, thereby preserving security while still aiding user recall.
[00181] The interactive process continues until the user’s security questions, answers, and hints meet or exceed the security thresholds defined in the first step. The system then securely stores the final approved entries in the encrypted database, ensuring they are protected against unauthorized access.
[00182] By delivering real-time assessments and tailored recommendations, the system empowers users to create highly secure and personalized authentication elements. This proactive approach enhances the overall security of account recovery mechanisms, reducing the risk of unauthorized access due to weak or predictable security questions and answers.
[00183] In another example, if a user initially provides the question “What is your mother’s maiden name?,” the system recognizes this as a common and easily discoverable piece of information. The AI would then suggest rephrasing it to something more secure, such as “Describe the location where you proposed to your spouse,” which requires a more detailed and personalized response, thereby increasing security. This allows users, in embodiments, to gain access to the account when answering with an answer that is not an exact match, but is sufficiently close, such that the NLP determines that the user demonstrated sufficient knowledge.
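The close-but-not-exact matching described in this paragraph can be sketched with a simple string-similarity check; the use of `difflib` and the 0.8 threshold are assumptions standing in for the NLP-based semantic comparison the disclosure contemplates.

```python
import difflib

def sufficiently_close(stored: str, given: str, threshold: float = 0.8) -> bool:
    """Accept a recovery answer that is not an exact match but close enough.
    difflib's character-level ratio is a stand-in for semantic NLP matching;
    the threshold is an assumed tuning parameter."""
    a = stored.strip().lower()
    b = given.strip().lower()
    return difflib.SequenceMatcher(None, a, b).ratio() >= threshold
```

A production system would likely compare sentence embeddings rather than raw characters, so that paraphrases of the stored answer also demonstrate sufficient knowledge.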
[00184] While the disclosure herein in several instances discusses the one or more security entries (e.g., questions, answers, and answer hints), it will be understood that the methods and systems described are applicable interchangeably to all these entries, collectively, individually, or in any combination. [00185] FIG. 11 illustrates an embodiment for the integration 1100 of the AI-based evaluation system with existing authentication and account recovery systems 1102. The integration is facilitated by an API gateway 1104, which enables seamless communication between the existing authentication system and the AI evaluation module 1106. The API gateway 1104 handles the transmission of data, ensuring that user inputs and security entries are efficiently routed to the appropriate modules.
[00186] The AI evaluation module 1106 serves as an intermediary for evaluating security entries, assessing their strength and robustness before they are stored or utilized within the user database 1108. This evaluation process leverages advanced machine learning algorithms to ensure that security entries meet the predefined criteria for entropy, guessability, and complexity. The integration layer 1110 acts as middleware, ensuring compatibility and smooth operation between the AI evaluation module 1106 and the existing authentication infrastructure.
[00187] Security protocols 1114, represented by padlock symbols on the arrows and modules, highlight the encryption and authentication methods employed during the integration. These protocols ensure that all data transmissions are secure, protecting sensitive user information from unauthorized access and potential breaches.
[00188] The existing authentication system 1102 communicates with the API gateway 1104, which then passes data to the integration layer 1110. The integration layer 1110 interfaces with the AI evaluation module 1106, and results are returned to the integration layer. User data is subsequently stored in the user database 1108. All communication lines are secured with encryption, as indicated by the padlock symbols 1114, ensuring that data flow between components remains protected and confidential.
[00189] In the detailed description for the specification, Figure 11 demonstrates an embodiment of how the AI-based evaluation system (1100) integrates with existing authentication and account recovery systems (1102). The API gateway (1104) facilitates communication between the existing authentication system and the AI evaluation module (1106), ensuring that data is accurately and securely transmitted. The integration layer (1110) acts as middleware, translating and formatting data as necessary to maintain compatibility between the systems. The AI evaluation module (1106) assesses the security entries using advanced algorithms before they are stored in the user database (1108). Security protocols (1114) are employed throughout the data flow arrows (1112) to maintain encrypted and authenticated communication, safeguarding user credentials and security information during the integration process. This integration allows organizations to enhance their current authentication mechanisms with AI-driven security evaluations without necessitating a complete overhaul of their existing systems.
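The FIG. 11 data flow can be summarized in a short sketch; the function names, toy scoring rule, and plaintext list storage below are illustrative assumptions, since the disclosure calls for trained AI models and encrypted storage rather than the stand-ins shown here.

```python
# Hypothetical end-to-end flow for FIG. 11: existing auth system (1102) ->
# API gateway (1104) -> integration layer (1110) -> AI evaluation module
# (1106) -> user database (1108). All names and logic are illustrative.

def ai_evaluation_module(entry: str) -> float:
    """Stand-in for module 1106: returns a security score in [0, 1]."""
    return 0.9 if len(entry) > 20 else 0.4  # toy rule, not the real model

def integration_layer(entry: str) -> dict:
    """Middleware 1110: normalizes the entry and invokes the evaluator."""
    score = ai_evaluation_module(entry.strip())
    return {"entry": entry, "score": score, "approved": score >= 0.5}

def api_gateway(request: dict, user_db: list) -> dict:
    """Gateway 1104: routes a request from the auth system (1102) to the
    integration layer and stores approved entries in the database (1108).
    In the disclosed system these hops are encrypted (protocols 1114)."""
    result = integration_layer(request["security_entry"])
    if result["approved"]:
        user_db.append(result)  # the disclosure stores these encrypted
    return result
```

Keeping the gateway, middleware, and evaluator as separate functions mirrors the figure's point that the AI evaluation can be bolted onto an existing authentication system without rewriting it.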
[00190] FIG. 12 shows, in accordance with aspects of the present disclosure, an example describing a data processing system 1200. In this example, data processing system 1200 is an illustrative data processing system suitable for implementing aspects of systems and methods for evaluating and strengthening security of authentication systems. More specifically, in some examples, devices that are embodiments of data processing systems, e.g., smartphones, tablets, personal computers, may be used by one or more users such as retailers, customers, advertisers, consumers, patients, healthcare providers, etc. Further, devices that are embodiments of data processing systems, e.g., smartphones, tablets, personal computers, may be used as one or more server(s) in encoding, decoding, and communicating data with one or more mobile communication devices.
[00191] In this illustrative example, data processing system 1200 includes communications framework 1202. Communications framework 1202 provides communications between processor unit 1204, memory 1206, persistent storage 1208, communications unit 1210, input/output (I/O) unit 1212, and display 1214. Memory 1206, persistent storage 1208, communications unit 1210, input/output (I/O) unit 1212, and display 1214 are examples of resources accessible by processor unit 1204 via communications framework 1202.
[00192] Processor unit 1204 serves to run instructions that may be loaded into memory 1206. Processor unit 1204 may be a number of processors, a multi-processor core, or some other type of processor, depending on the particular implementation. Further, processor unit 1204 may be implemented using a number of heterogeneous processor systems in which a main processor is present with secondary processors on a single chip. As another illustrative example, processor unit 1204 may be a symmetric multi-processor system containing multiple processors of the same type.
[00193] Memory 1206 and persistent storage 1208 are examples of storage devices 1216. A storage device is any piece of hardware that is capable of storing information, such as, for example, without limitation, data, program code in functional form, and other suitable information either on a temporary basis or a permanent basis.
[00194] Storage devices 1216 also may be referred to as computer-readable storage devices in these examples. Memory 1206, in these examples, may be, for example, a random access memory or any other suitable volatile or non-volatile storage device. Persistent storage 1208 may take various forms, depending on the particular implementation. In embodiments, the logging module, as described with respect to various embodiments, may be carried out by the processor unit 1204 and stored in one or more storage devices 1216 as may be appropriate.
[00195] For example, persistent storage 1208 may contain one or more components or devices. For example, persistent storage 1208 may be a hard drive, a flash memory, a rewritable optical disk, a rewritable magnetic tape, or some combination of the above. The media used by persistent storage 1208 also may be removable. For example, a removable hard drive may be used for persistent storage 1208.
[00196] Communications unit 1210, in these examples, provides for communications with other data processing systems or devices. In these examples, communications unit 1210 is a network interface card. Communications unit 1210 may provide communications through the use of either or both physical and wireless communications links. [00197] Input/output (I/O) unit 1212 allows for input and output of data with other devices that may be connected to data processing system 1200. For example, input/output (I/O) unit 1212 may provide a connection for user input through a keyboard, a mouse, and/or some other suitable input device. Further, input/output (I/O) unit 1212 may send output to a printer. In embodiments, the input module, the output module, and the feedback module, as described with respect to various embodiments, may be carried out by the input/output (I/O) unit 1212 as may be appropriate. Display 1214 provides a mechanism to display information to a user. In embodiments, the user interface module, as described with respect to various embodiments, may be carried out by the display 1214 as may be appropriate.
[00198] Instructions for the operating system, applications, and/or programs may be located in storage devices 1216, which are in communication with processor unit 1204 through communications framework 1202. In these illustrative examples, the instructions are in a functional form on persistent storage 1208. These instructions may be loaded into memory 1206 for execution by processor unit 1204. The processes of the different embodiments may be performed by processor unit 1204 using computer- implemented instructions, which may be located in a memory, such as memory 1206.
[00199] These instructions are referred to as program instructions, program code, computer usable program code, or computer-readable program code that may be read and executed by a processor in processor unit 1204. The program code in the different embodiments may be embodied on different physical or computer-readable storage media, such as memory 1206 or persistent storage 1208.
[00200] Program code 1218 is located in a functional form on computer-readable media 1220 that is selectively removable and may be loaded onto or transferred to data processing system 1200 for execution by processor unit 1204. Program code 1218 and computer-readable media 1220 form computer program product 1222 in these examples. In one example, computer-readable media 1220 may be computer-readable storage media 1224 or computer-readable signal media 1226. [00201] Computer-readable storage media 1224 may include, for example, an optical or magnetic disk that is inserted or placed into a drive or other device that is part of persistent storage 1208 for transfer onto a storage device, such as a hard drive, that is part of persistent storage 1208. Computer-readable storage media 1224 also may take the form of a persistent storage, such as a hard drive, a thumb drive, or a flash memory, that is connected to data processing system 1200. In some instances, computer-readable storage media 1224 may not be removable from data processing system 1200.
[00202] In these examples, computer-readable storage media 1224 is a physical or tangible storage device used to store program code 1218 rather than a medium that propagates or transmits program code 1218. Computer-readable storage media 1224 is also referred to as a computer-readable tangible storage device or a computer-readable physical storage device. In other words, computer-readable storage media 1224 is non- transitory.
[00203] Alternatively, program code 1218 may be transferred to data processing system 1200 using computer-readable signal media 1226. Computer-readable signal media 1226 may be, for example, a propagated data signal containing program code 1218. For example, computer-readable signal media 1226 may be an electromagnetic signal, an optical signal, and/or any other suitable type of signal. These signals may be transmitted over communications links, such as wireless communications links, optical fiber cable, coaxial cable, a wire, and/or any other suitable type of communications link. In other words, the communications link and/or the connection may be physical or wireless in the illustrative examples.
[00204] In some illustrative embodiments, program code 1218 may be downloaded over a network to persistent storage 1208 from another device or data processing system through computer-readable signal media 1226 for use within data processing system 1200. For instance, program code stored in a computer-readable storage medium in a server data processing system may be downloaded over a network from the server to data processing system 1200. The data processing system providing program code 1218 may be a server computer, a client computer, or some other device capable of storing and transmitting program code 1218.
[00205] The different components illustrated for data processing system 1200 are not meant to provide architectural limitations to the manner in which different embodiments may be implemented. The different illustrative embodiments may be implemented in a data processing system including components in addition to and/or in place of those illustrated for data processing system 1200. Other components shown in FIG. 12 can be varied from the illustrative examples shown. The different embodiments may be implemented using any hardware device or system capable of running program code. As one example, data processing system 1200 may include organic components integrated with inorganic components and/or may be comprised entirely of organic components excluding a human being. For example, a storage device may be comprised of an organic semiconductor.
[00206] In another illustrative example, processor unit 1204 may take the form of a hardware unit that has circuits that are manufactured or configured for a particular use. This type of hardware may perform operations without needing program code to be loaded into a memory from a storage device to be configured to perform the operations.
[00207] For example, when processor unit 1204 takes the form of a hardware unit, processor unit 1204 may be a circuit system, an application specific integrated circuit (ASIC), a programmable logic device, or some other suitable type of hardware configured to perform a number of operations. With a programmable logic device, the device is configured to perform the number of operations. The device may be reconfigured at a later time or may be permanently configured to perform the number of operations. Examples of programmable logic devices include, for example, a programmable logic array, a field programmable logic array, a field programmable gate array, and other suitable hardware devices. With this type of implementation, program code 1218 may be omitted, because the processes for the different embodiments are implemented in a hardware unit. [00208] In still another illustrative example, processor unit 1204 may be implemented using a combination of processors found in computers and hardware units. Processor unit 1204 may have a number of hardware units and a number of processors that are configured to run program code 1218. With this depicted example, some of the processes may be implemented in the number of hardware units, while other processes may be implemented in the number of processors. In embodiments, the preprocessing, processing, feedback, and extraction modules, as described with respect to various embodiments, may be carried out using any configuration of processing unit 1204 as may be appropriate.
[00209] In another example, a bus system may be used to implement communications framework 1202 and may be comprised of one or more buses, such as a system bus or an input/output (I/O) bus. Of course, the bus system may be implemented using any suitable type of architecture that provides for a transfer of data between different components or devices attached to the bus system.
[00210] Additionally, communications unit 1210 may include a number of devices that transmit data, receive data, or both transmit and receive data. Communications unit 1210 may be, for example, a modem or a network adapter, two network adapters, or some combination thereof. Further, a memory may be, for example, memory 1206, or a cache, such as that found in an interface and memory controller hub that may be present in communications framework 1202.
[00211] The flowcharts and block diagrams described herein illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various illustrative embodiments. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function or functions. It should also be noted that, in some alternative implementations, the functions noted in a block may occur out of the order noted in the drawings. For example, the functions of two blocks shown in succession may be executed substantially concurrently, or the functions of the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
[00212] FIG. 13 shows an example describing a general network data processing system 1300, interchangeably termed a network, a computer network, a network system, or a distributed network, aspects of which may be included in one or more illustrative embodiments of methods and systems of scrutinizing questions, answers, and answer hints based on customizable criteria. For example, one or more mobile computing devices or data processing devices may communicate with one another or with one or more servers(s) through the network. It should be appreciated that FIG. 13 is provided as an illustration of one implementation and is not intended to imply any limitation with regard to environments in which different embodiments may be implemented. Many modifications to the depicted environment may be made.
[00213] Network data processing system 1300 is a network of computers, each of which is an example of data processing system 1200, and other components. Network data processing system 1300 may include network 1302, which is a medium configured to provide communications links between various devices and computers connected together within network data processing system 1300. Network 1302 may include connections such as wired or wireless communication links, fiber optic cables, and/or any other suitable medium for transmitting and/or communicating data between network devices, or any combination thereof.
[00214] In the depicted example, a first network device 1304 and a second network device 1306 connect to network 1302, as does an electronic storage device 1308. Network devices 1304 and 1306 are each examples of data processing system 1200, described above. In the depicted example, devices 1304 and 1306 are shown as server computers. However, network devices may include, without limitation, one or more personal computers, mobile computing devices such as personal digital assistants (PDAs), tablets, and smart phones, handheld gaming devices, wearable devices, tablet computers, routers, switches, voice gates, servers, electronic storage devices, imaging devices, and/or other networked-enabled tools that may perform a mechanical or other function. These network devices may be interconnected through wired, wireless, optical, and other appropriate communication links.
[00215] In addition, client electronic devices, such as a client computer 1310, a client laptop or tablet 1312, and/ or a client smart device 1314, may connect to network 1302. Each of these devices is an example of data processing system 1200, described above regarding FIG. 12. Client electronic devices 1310, 1312, and 1314 may include, for example, one or more personal computers, network computers, and/or mobile computing devices such as personal digital assistants (PDAs), smart phones, handheld gaming devices, wearable devices, and/or tablet computers, and the like. In the depicted example, server 1304 provides information, such as boot files, operating system images, and applications to one or more of client electronic devices 1310, 1312, and 1314. Client electronic devices 1310, 1312, and 1314 may be referred to as "clients" with respect to a server such as server computer 1304. Network data processing system 1300 may include more or fewer servers and clients or no servers or clients, as well as other devices not shown.
[00216] Client smart device 1314 may include any suitable portable electronic device capable of wireless communications and execution of software, such as a smartphone or a tablet. Generally speaking, the term "smartphone" may describe any suitable portable electronic device having more advanced computing ability and network connectivity than a typical mobile phone. In addition to making phone calls (e.g., over a cellular network), smartphones may be capable of sending and receiving emails, texts, and multimedia messages, accessing the Internet, and/or functioning as a web browser. Smartdevices (e.g., smartphones) may also include features of other known electronic devices, such as a media player, personal digital assistant, digital camera, video camera, and/or global positioning system. Smartdevices (e.g., smartphones) may be capable of connecting with other smartdevices, computers, or electronic devices wirelessly, such as through near field communications (NFC), BLUETOOTH®, Wi-Fi, or mobile broadband networks. Wireless connectivity may be established among smartdevices, smartphones, computers, and other devices to form a mobile network where information can be exchanged. [00217] Program code located in system 1300 may be stored in or on a computer recordable storage medium, such as persistent storage 1308 in FIG. 13, and may be downloaded to a data processing system or other device for use. For example, program code may be stored on a computer recordable storage medium on server computer 1304 and downloaded for use to client 1310 over network 1302 for use on client 1310.
[00218] Network data processing system 1300 may be implemented as one or more of a number of different types of networks. For example, system 1300 may include an intranet, a local area network (LAN), a wide area network (WAN), or a personal area network (PAN). In some examples, network data processing system 1300 includes the Internet, with network 1302 representing a worldwide collection of networks and gateways that use the transmission control protocol/Internet protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines between major nodes or host computers. Thousands of commercial, governmental, educational and other computer systems may be utilized to route data and messages. FIG. 13 is intended as an example, and not as an architectural limitation for any illustrative embodiments.
[00219] It will be appreciated that the invention is not restricted to the particular embodiment that has been described, and that variations may be made therein without departing from the scope of the invention as defined in the appended claims, as interpreted in accordance with principles of prevailing law including the doctrine of equivalents or any other principle that enlarges the enforceable scope of a claim beyond its literal scope. Unless the context indicates otherwise, a reference in a claim to the number of instances of an element, be it a reference to one instance or more than one instance, requires at least the stated number of instances of the element but is not intended to exclude from the scope of the claim a structure or method having more instances of that element than stated. The word “comprise” or a derivative thereof, when used in a claim, is used in a nonexclusive sense that is not intended to exclude the presence of other elements or steps in a claimed structure or method.

Claims

1. A method for enhancing security of user account recovery, the method comprising: a. receiving one or more user inputs; b. training one or more artificial intelligence models to identify low-security user inputs based, in part, on patterns recognized in datasets containing security breaches, undesired user inputs, and user data; c. configuring the one or more artificial intelligence models using one or more adjustable security parameters, wherein the security parameters affect entropy, predictability, and complexity outputs; d. configuring relative contribution of each of the one or more artificial intelligence models using one or more adjustable influence parameters; e. evaluating security of the one or more user inputs based, in part, on the one or more adjustable security parameters and the one or more adjustable influence parameters applied to one or more outputs from the artificial intelligence models; and f. generating one or more security scores of the one or more user inputs based, in part, on the evaluated user inputs.
2. The method of claim 1, further comprising prompting the user to provide one or more revised user inputs based on the generated security scores of the one or more user inputs.
3. The method of claim 2, wherein prompting the user includes presenting to the user one or more recommendations that are each predicted to achieve a security score that meets or exceeds a predetermined threshold.
4. The method of claim 1, further comprising rejecting one or more of the one or more user inputs when a respective security score of the one or more generated security scores is below a predetermined threshold.
5. The method of claim 1, further comprising classifying each of the one or more user inputs based on its public availability.
6. The method of claim 4, further comprising storing the one or more user inputs, the classification of each of the one or more user inputs, and the generated security scores of the one or more user inputs in a memory.
7. The method of claim 1, wherein the one or more generated security scores are updated in real-time in response to training of the one or more artificial intelligence models with newly identified security breach data or new publicly available data related to the user.
8. A system for enhancing the security of user account recovery, comprising: a. A neural network comprising network layers including an input layer, one or more hidden layers, and an output layer, wherein i. the input layer is configured to receive one or more tokenized user inputs; ii. the one or more hidden layers are configured to evaluate dependencies and semantic relationships of the one or more tokenized user inputs and extract patterns from the one or more tokenized user inputs, using one or more pre-trained models, wherein the one or more pre-trained models are trained on datasets of security breaches and user behavior patterns, wherein the one or more pre-trained models are configured using one or more adjustable security parameters, wherein the one or more configured pre-trained models are configured to generate one or more outputs, wherein relative contribution of the one or more generated outputs is configured using one or more adjustable influence parameters; and iii. the output layer is configured to generate one or more security scores based on the one or more generated outputs and the relative contribution of the one or more generated outputs from the one or more hidden layers.
9. The system of claim 8, wherein the one or more hidden layers are further configured to categorize each tokenized user input based on its level of public availability.
10. The system of claim 8, wherein the adjustable influence parameters include a pre-determined threshold for minimum entropy, a minimum complexity, or a maximum predictability index.
11. The system of claim 8, wherein the one or more hidden layers comprise a deep learning transformer network configured to detect interdependencies between tokens in two or more user inputs.
12. A system for evaluating the security of user account recovery comprising: a. an input module configured to receive two or more user inputs; b. a preprocessing module configured to tokenize each of the two or more user inputs; c. a processing module including one or more artificial intelligence models, trained, in part, using known undesired user inputs, and configured to assess the security of the tokenized two or more user inputs based on adjustable security parameters, wherein the assessment of each user input is, in part, dependent on semantic and temporal relationships of the two or more of the other user inputs, wherein each of the models determines an assessment score; d. an output module configured to determine one or more security scores based on the adjustable security parameters and the determined assessment scores; e. a feedback module configured to provide suggested improvements to the one or more user inputs based on the one or more security scores; and f. a user interface module configured to display the one or more security scores and the suggested improvements to a user.
13. The system of claim 12, wherein the processing module further includes a public availability assessment sub-module configured to identify publicly available personal data related to the user.
14. The system of claim 13, further comprising a logging module configured to store the two or more tokenized user inputs, the corresponding one or more security scores, and the corresponding public availability assessments.
15. The system of claim 12, wherein the feedback module generates suggested improvements by comparing the two or more tokenized user inputs with a database of high-security inputs.
16. The system of claim 13, wherein the feedback module is further configured to rank the suggested improvements based on predicted impact to an overall security score.
17. A method for training a deep neural network configured to enhance the security of user account recovery, the method comprising: a. receiving a training dataset comprising authentication data, including questions and corresponding answers, wherein each question and corresponding answer is annotated with an associated security score based, in part, on correlation between each of the questions and semantics of the corresponding answers to the questions; b. tokenizing the dataset into tokens; c. processing the dataset using a feature extraction module to identify relevant features that include entropy, complexity, and predictability; d. training the deep neural network using a supervised learning algorithm, wherein:
i. a forward propagation algorithm is applied through the deep neural network to generate one or more security scores, and ii. a back-propagation algorithm is applied to adjust weights within the deep neural network based on the generated one or more security scores; e. repeating application of forward- and back-propagation algorithms until the deep neural network meets a predefined threshold for generating security scores; and f. refining the deep neural network based on new datasets from new known security breach datasets and new user behavior patterns.
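Steps d–e of claim 17 describe a standard supervised loop: forward propagation produces scores, back-propagation adjusts weights, and iteration stops at a predefined threshold. A minimal sketch with a single-layer network on synthetic `[entropy, complexity, predictability]` features (the feature values, labels, learning rate, and threshold below are all illustrative assumptions, not the claimed training regime) is:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: rows of [entropy, complexity, predictability] features;
# labels are annotated security scores in roughly [0, 1]. Values are illustrative.
X = rng.random((64, 3))
y = (0.5 * X[:, 0] + 0.3 * X[:, 1] - 0.2 * X[:, 2] + 0.4).reshape(-1, 1)

W = rng.normal(scale=0.1, size=(3, 1))  # weights to be adjusted by back-propagation
b = np.zeros((1, 1))
lr = 0.5

for epoch in range(2000):
    scores = X @ W + b                  # forward propagation: generate security scores
    err = scores - y
    loss = float(np.mean(err ** 2))
    if loss < 1e-4:                     # predefined threshold for score quality
        break
    W -= lr * (X.T @ err) / len(X)      # back-propagation: MSE gradient w.r.t. weights
    b -= lr * err.mean(axis=0, keepdims=True)

print(f"final loss: {loss:.6f}")
```

A deep network per the claim would stack nonlinear layers and propagate gradients through each, but the forward/backward/repeat-until-threshold structure is the same.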
18. The method of claim 17, wherein the training dataset further comprises user behavior data correlated with security breach outcomes.
19. The method of claim 17, wherein the tokenization is performed using subword tokenization.
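Claim 19's subword tokenization splits text into vocabulary pieces smaller than whole words, so rare answers still map to known tokens. One simple scheme (greedy longest-match over a fixed vocabulary; the vocabulary and function name here are hypothetical stand-ins for a learned BPE- or WordPiece-style vocabulary) can be sketched as:

```python
def subword_tokenize(text: str, vocab: set) -> list:
    """Greedy longest-match subword tokenization over a fixed vocabulary.
    Characters not covered by the vocabulary fall back to single-character tokens."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest piece starting at position i; single chars always match.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

# Hypothetical subword vocabulary covering common security-answer fragments.
vocab = {"pass", "word", "moth", "er", "maid", "en", "name"}
print(subword_tokenize("mothersmaidenname", vocab))
# → ['moth', 'er', 's', 'maid', 'en', 'name']
```

In practice the vocabulary would be learned from the training corpus rather than hand-written, but the decomposition into reusable subword units is the point of the claim.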
20. The method of claim 17, further comprising validating the trained neural network using a hold-out test set derived from real-world user data, wherein the test set includes known security breaches and user behavior patterns.
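Claim 20's hold-out validation reserves a portion of real-world records (including known breaches) that the network never trains on. A minimal split sketch, assuming illustrative `(features, breached)` records and a hypothetical `holdout_split` helper:

```python
import random

def holdout_split(records, test_frac=0.2, seed=42):
    """Shuffle and split records into a training set and a hold-out test set."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]

# Records are (features, breached?) pairs; all values here are illustrative only.
data = [([0.2 * k, 0.1], k % 3 == 0) for k in range(50)]
train, test = holdout_split(data)
print(len(train), len(test))  # → 40 10
```

The trained network would then be scored only on `test`, so its threshold check in claim 17 reflects generalization to unseen breach and behavior patterns rather than memorization.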
PCT/US2024/050838 2023-10-10 2024-10-10 Methods and systems of scrutinizing questions, answers, and answer hints based on customizable criteria Pending WO2025080884A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363589261P 2023-10-10 2023-10-10
US63/589,261 2023-10-10

Publications (1)

Publication Number Publication Date
WO2025080884A1 true WO2025080884A1 (en) 2025-04-17

Family

ID=95396418

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/050838 Pending WO2025080884A1 (en) 2023-10-10 2024-10-10 Methods and systems of scrutinizing questions, answers, and answer hints based on customizable criteria

Country Status (1)

Country Link
WO (1) WO2025080884A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180302423A1 (en) * 2015-08-31 2018-10-18 Splunk Inc. Network security anomaly and threat detection using rarity scoring
US20190327237A1 (en) * 2016-03-31 2019-10-24 Microsoft Technology Licensing, Llc Personalized Inferred Authentication For Virtual Assistance
US20200153855A1 (en) * 2016-02-26 2020-05-14 Oracle International Corporation Techniques for discovering and managing security of applications
US20200364366A1 (en) * 2019-05-15 2020-11-19 International Business Machines Corporation Deep learning-based identity fraud detection
US20220253871A1 (en) * 2020-10-22 2022-08-11 Assent Inc Multi-dimensional product information analysis, management, and application systems and methods


Similar Documents

Publication Publication Date Title
US11882118B2 (en) Identity verification and management system
US11829486B1 (en) Apparatus and method for enhancing cybersecurity of an entity
US20240265114A1 (en) An apparatus and method for enhancing cybersecurity of an entity
US20240411896A1 (en) End-to-end measurement, grading and evaluation of pretrained artificial intelligence models via a graphical user interface (gui) systems and methods
Asmar et al. Integrating machine learning for sustaining cybersecurity in digital banks
Mahmood et al. Optimizing network security with machine learning and multi-factor authentication for enhanced intrusion detection
US11663397B1 (en) Digital posting match recommendation apparatus and method
KR20180105688A (en) Computer security based on artificial intelligence
US20250124001A1 (en) Apparatus and method for data ingestion for user-specific outputs of one or more machine learning models
KR20220109418A (en) neural flow proof
US20230252416A1 (en) Apparatuses and methods for linking action data to an immutable sequential listing identifier of a user
US11586766B1 (en) Apparatuses and methods for revealing user identifiers on an immutable sequential listing
US11886403B1 (en) Apparatus and method for data discrepancy identification
Gonaygunta Factors influencing the adoption of machine learning algorithms to detect cyber threats in the banking industry
US20250168003A1 (en) Systems and methods for encrypting data
US11573986B1 (en) Apparatuses and methods for the collection and storage of user identifiers
Breve et al. Hybrid prompt learning for generating justifications of security risks in automation rules
Román‐Gallego et al. Artificial Intelligence Web Application Firewall for advanced detection of web injection attacks
Tsinganos et al. Cse-ars: Deep learning-based late fusion of multimodal information for chat-based social engineering attack recognition
Zuo et al. Federated TrustChain: Blockchain-enhanced LLM training and unlearning
US12125410B1 (en) Apparatus and method for data ingestion for user specific outputs of one or more machine learning models
Kim et al. PassREfinder: Credential stuffing risk prediction by representing password reuse between websites on a graph
Riza et al. Leveraging Machine Learning and AI to Combat Modern Cyber Threats
US11847660B2 (en) Apparatus for automatic credential classification
US11461652B1 (en) Apparatus and methods for status management of immutable sequential listing records for postings

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24878033

Country of ref document: EP

Kind code of ref document: A1