ANTI-PHISING LOGON AUTHENTICATION OBJECT ORIENTED SYSTEM AND METHOD CROSS-REFERENCE TO RELATED APPLICATIONS The present application claims priority under 35 U.S. C. §119(e) of the following U.S. Patent Provisional Applications: Serial Number 60/633,364, filed December 4, 2004, entitled "User- friendly mutual authentication with protection against man in the middle attacks"; and Serial Number 60/647,065, filed January 28, 2005, entitled "Automated comparison with protection against man in the middle attacks".
BACKGROUND OF THE INVENTION
Field of the Invention.
The invention relates to Internet web site software. More specifically, the field of the invention is that of authentication software for prpviding a high level of confidence in on-line transactions.
Description of the Related Art.
When doing business on the Internet, it is important to verify the identity of the individuals and organizations that one deals with. Let's consider an example that, rather unfortunately, hits a little too close to home for a number of online banking clients. Suppose you wake up one morning, and, while checking your email, you notice you have received an email from your bank stating that they have a new free online security product and that they are asking all users to sign up for it, all you have to is click on the link to the bank's web-page and log-in. Being a security conscious user, you're happy to see your bank finally taking some proactive security measures and you click on the link and log on to your online account; after logging on, you're told that due to security concerns with the previous security system, the bank needs some additional information to confirm your identity. The site asks for you current mailing address, social
security number and the credit card number associated with the overdraft protection on your checking account in order to verify your identity. You comply, at which point the bank's web server informs you that the extra security features have been activated on your account, and it logs you on to the system. Several days later you discover some rather odd charges on your credit card bill that originate in Eastern Europe. You call your bank and they explain that you've been phished (with a 'ph'). It seems that the email you thought you had received from the bank was actually from a phisher (attacker), and the link in the email that you clicked had directed you to a web-site controlled by the phisher, not the bank's web-site as you had thought. Therefore, you had provided your useπiame, password, address, social security number and credit card number to the phisher. This has rather dire consequences for you, as you use the same username and password at several popular e-commerce sites, and it turns out several fraudulent purchases have been made from your account on several of these sites, using other credit cards that belong to you. After several days of making phone calls to different merchants, credit-card companies, credit-rating agencies and local authorities, you close all accounts to which the phisher might have access, have new credit-cards issued in your name, and seriously consider never using the Internet for e-commerce or online banking again.
At its heart, phishing is an attack that combines social engineering, web spoofing and often spamming in an attempt to trick users out of confidential information for a variety of nefarious reasons. Normally the information of interest to attackers includes usernames and passwords for financial institutions' web sites, or currently valid credit-card numbers; in both cases the goal is generally fraud. In its simplest guise, an attacker phishes as follows: firstly, the phisher produces a fraudulent web site that mimics a well known legitimate site, such as a large bank; secondly, the phisher attempts to lure a large number of users into divulging the desired information to the fraudulent mimicked web site. In the second step, the phisher generally lures users to its mimicked site by indiscriminately sending out spam that requests users to authenticate or provide information to the fraudulent web page that's masquerading as the legitimate page. The mimicked web site is linked to in the spam.
The term phishing is derived from the term fishing, as you can think of the attacker (a.k.a. the phisher) as casting out a lure with the sending of the spam messages, and then waiting for the victims (a.k.a. the phish) to bite, by providing information to the illegitimate site. The spelling of phishing with a 'ph' comes from the historical tendency to replace T by 'ph' in the hacking community, a tendency that began with the creation of the work "phreakin" that combined the words "phone" and "frea", and which continues in 133t speak. L33t speak is a typed dialect of English that is used by certain segments of the hacking community. In this dialect, words often shorted, words are purposefully misspelled, and different letters are often replaced with numerals or other letter combinations that have similar appearance or sound. For example, 133t comes from the word elite, which is misspelled as eleet, shortened to leet and finally the letter V is replaced with the numeral '3'. The original idea was that 133t speak was the language of the elite hacking community, but the usage has become popularized to many online forums.
The attack described above is, at the time of this writing, the most common variant of the phishing attack. Of course, there are many different variants that have been spotted in the wild and many others that are likely to occur in the future. An example of variants that have already been seen or are predicted to be seen in the near future are attacks that lure phishes to the mimicked or spoofed sites by way of instant messaging, text messaging on cellular phones or comments sections on blogs, as opposed to the traditional use of email based spam. Additionally, while phishing attacks are not new, they have been around in one form or another since the 1990's (Generally the first phishing attacks are believed to have occurred on AOL, with phishers trying to phish for passwords to legitimate AOL accounts), the attack is becoming increasing popular. The Anti-PhishingWorking Group reported in their August 2005 report that phishing had increased some 59% since the beginning of the year, with some 5259 unique phishing sites being discovered in the month of August alone. This increase in popularity can be at least partially attributed to the low risks for the perpetrators of such an attack and the possibility of large rewards. Further, the larger (and less technically savvy) segment of society that now uses the Internet combined with the growth of ecommerce suggests that the pool of people that can actively be taken advantage by phishers has grown significantly over the last few years. This is disturbing for both users and commercial e-commerce operations: users do not
wish to be the victims of fraud, and e-commerce operations are afraid of both the direct costs of covering client fraud and, probably more importantly, the opportunity cost of potential and current customers not performing online commerce due to the fear of being victimized by phishing attacks.
There are a number of cryptographic protocols dedicated to keeping information both private and authenticated, so you may be wondering why phishing is a problem at all. Assuming that web sites are constructed with appropriate cryptographic protocols, shouldn't everything be secure? It is true that there are many cryptographic protocols that are quite secure, and mostly unbreakable, except at large financial and/or computational costs, and these are to some degree what are used to currently secure web sites. Given the large costs in breaking these schemes, if you're a thief and you can't attack the cryptographic protocols, then you should go around them! How is this done? Cryptographers and security engineers, who design and implement the protocols that are used to make sites secure, begin with the reasonable assumption that the sites' users actually invoke the intended security protocols. After all, the security protocol is there to protect the user, and generally the use of the security protocol is not optional from the sites' points of view, so surely the users will run the security protocol, as they have seemingly no choice.
Unfortunately for the cryptographers and security engineers, one of the main benefits of phishing for an attacker is that it bypasses all of the standard security protocols on a legitimate server by ignoring them. When a phisher spoofs a legitimate site, the resulting fraudulent site either does not implement the security protocols of the legitimate site (but does go to some trouble to make it appear as though the correct protocols are working), or implements the security protocols, but in a fashion that guarantees that the protocols don't prevent the phisher from learning the information of interest. For instance, the fraudulent site may implement a public key encryption protocol such as SSL, but it will do so with a keypair for which the phisher knows the secret key, and he can therefore decrypt any messages received by his fraudulent site. Therefore, if phishers can lure a victim to their fraudulent sites, the traditional security measures are unimportant, as the victim is not using them or is using them in a fashion that does not protect the data.
Historically, the average user has had no practical way of discerning this lack or incorrect use of security protocols. To make matters worse, it's an unfortunate fact that many users use the same password to authenticate themselves to multiple sites, so once a victim has given her password to a phisher she has not only compromised her security at the spoofed site, but also at the many other sites where she uses the same password. This also makes life difficult for many site operators, as the security of their site is often at least partially dependent of the security of other sites with which they have a shared user base. This is at least true for the majority of sites that rely on user selected passwords as their means of authenticating users.
Because of the large growth in the popularity of phishing attacks over the last year or two, a number of different researchers have brought a number of different tools to the table in order to attempt to provide security from such attacks. There are two clear areas where we can try and prevent phishing attacks from occurring: first, by making it more difficult for phishers to lure their pray to a spoofed or mimicked site in the first place; and second, by making it more difficult for a user to unknowingly divulge their information to a spoofed site once they have arrived. Pairing is the term used to describe the scenario of when two or more devices determine whether they have successfully established some shared and essentially identical value, which may be a secret key, or whether they may have inconsistent versions of the same, the latter which may have been caused by a man-in-the-middle attack. Such pairing is known to be difficult to perform; a recent survey of proposals is given by Gehrmann, Mitchell and Nyberg in their CryptoBytes Spring 2004 article. This article demonstrates the current state in the art, and emphasizes the long-felt need for good pairing methods. Here, good may mean secure against active attacks. It may also mean efficient, and it may also mean resistant against leaks of information, whether accidental or intended (by an attacker). An example of a piece of information that we wish not to leak is that corresponding to the two or more values to be compared using the pairing algorithm. Another example of information is a PIN or a password, as is often used to secure such an exchange, and which is given examples of in the article by Gehrmann et al. The reason the latter should not be leaked is many fold. For one thing, users may reuse PINs, especially for devices that have fixed PINs that cannot be modified by users - headphones are typical examples of such. Another reason is that an active attacker may disrupt
one pairing attempt and in the action of doing so learn some part of the PIN; when the attempt is repeated, he may disrupt it again, and learn some further portion; until he finally can guess the PIN with a very good probability, and then successfully impersonate one or more protocol participants; this would correspond to the lack of security of the protocol, and a failure to securely perform pairing. It would not, however, with necessity be discoverable by the victims of the attack. This emphasizes the need for secure pairing algorithms that do not leak information of any type. We can see from the survey by Gehrmann et al that all known approaches to date have some flaw, vulnerability, or drawback.
Like the spam problem, it is unlikely that one silver bullet solution that will solve the phishing problem. However, further defensive measures are desired.
SUMMARY OF THE INVENTION
The present invention is a authentication system and method which allows for the user to authenticate an Internet web site without giving out secret information.
The present invention is particularly designed to prevent the man-in-the-middle attack. What makes the man-in-the-middle attack so difficult to protect against is the following dilemma: (1) The client machine and the server machine do not have any prior shared secret. (If they did, there would be no problem with either authentication or man-in-the-middle attacks.); and (2) There is a shared secret (the personal identifying information) between the user and the server. Secrets of this kind might be known in an unencoded format or "cleartext" (the way the user knows his password, and the server may have a stored picture) or somehow be encoded in some verifiable but cryptographiclly one-way method (the way a server stores a hashed password, and therefore can recognize the password, or the way a user can recognize a picture). In order to verify shared secrets, they would have to be sent over the network between the client machine and server. Neither machine can verify that it is actually sending or receiving the information to or from the intended machine. Instead both the client and the server may be communicating with a man-in- the-middle attacker. They server and client cannot encrypt the information with a known key
(since, due to 1 above there is no shared secret between the machines). The client or server could encrypt with a key provided by the alleged sever or client respectively, but this is open to an attack by a man-in-the-middle attacker, who could provide his victims his own keys.
To ground and clarify our exposition, we use a bank as our running example of a network service provider. This example should not be seen as limiting in any method the type of network service providers which could make use of our solution, nor should it any way be seen as necessary for our exemplary embodiment. The present invention works by training users to recognize official web-sites from impostor web-sites in a simple and transparent manner.
In order to do this, each time a user logs on to the banks official web site, at which the user has an account, a series of pictures will be displayed to the user. For a given user, the selected pictures are unique to each service provider, but the library from which such pictures are selected is assumed to be public knowledge. Further, the selection displayed to each user is hard to predict without prior knowledge of the user's Personally Identifiable Information ("PII"), and therefore are nearly impossible for an attacker, often called a phisher in these instances, to duplicate. Through a simple education campaign, users will be instructed that every time they log-on to the bank's official site, the same series of pictures will be displayed, and should the users ever log-on to a site claiming to be "official", but where the site is unable to provide verification through the display of the appropriate images, then the users should immediately stop providing any information, but especially his PII, to the web-site.
The present invention requires no changes to the existing infrastructure (e.g., the use of different types of PIIs, and their use). In the method of the present invention, the user is sent a first picture, and then, again assuming that the picture presented is the one that the user intended to see, enters a first digit or character (if the picture is correct). This is sent across in a particular protected format. The server responds by a second picture that is displayed to the user, who then gives the second digit or character in his PII (again this assuming the picture is the anticipated on). This is repeated for each character or digit. If at any time, the wrong picture is displayed, then the user stops entering his PII (i.e. PIN/password). That makes the man-in-the-middle
attack not possible, as the impostor cannot obtain all the correct pictures beforehand. In the method of the present invention, the phising party posing as the server will not learn any portion of the secret held by the user until the user has authenticated the server. Further, the party posing as the user will not learn any portion of the server authentication until it has provided a correct input, which is a prefix of the PII (i.e. PIN, password, or other authenticating information). Moreover, the information sent from client to server is readable to neither the server nor an impostor until after the user's correct sequence of pictures has been received. This means that it will not be possible for an attacker to obtain a partial PIN or password in one attack, and then extend this in a later attack: he will not get more characters until he sends the correct pictures over. The method of the present invention makes sure that the selection of the pictures depends on the protected characters or digits of the PII sent over, even without these being possible to read by the server.
The present invention, in one form, relates to a method of authenticating a user and a server over computer network. The server sends a logon interface to a client device of a user through a communication channel. Information from the user is sent to the server. The server sends stimuli to the user, the stimuli having been previously associated with the user. The user then sends secret information to the server for authentication of the user. Futher, exchange of information and stimuli may be repeated a sufficient number of times to convince the remote user of the authenticity of the server.
The present invention, in another form, is a server computer for authenticating a user of a client device. The server computer includes a logon procedure for sending a logon interface to the user through a communication channel with the client device. A receiving procedure is also used for receiving information from the user. The server includes an association of a series of stimuli with the user and is adapted to provide associated stimuli to the logon interface of the user. An authentication procedure authenticates the user through the logon interface.
A further aspect of the present invention involves a computing device for a user to communicate with a server over a network. The computing device includes a logon interface for the user to
access the server through a network communication channel. The computing device has transmission facilities for sending information from the user to the server; stimuli presentation facilities for providing stimuli indicated by information received from the server through the logon interface; and an authentication procedure for authenticating the server through the logon interface.
BRIEF DESCRIPTION OF THE DRAWINGS
The above mentioned and other features and objects of this invention, and the manner of attaining them, will become more apparent and the invention itself will be better understood by reference to the following description of an embodiment of the invention taken in conjunction with the accompanying drawings, wherein:
Figure 1 is a schematic diagrammatic view of a man-in-the-middle scenario of the present invention.
Figure 2 is a flow chart diagram of the operation of the present invention relating to authenticating a logon event using the method of the present invention.
Figure 3 is a schematic diagrammatic view of a transfer protocol used in one embodiment of the present invention.
Corresponding reference characters indicate corresponding parts throughout the several views. Although the drawings represent embodiments of the present invention, the drawings are not necessarily to scale and certain features may be exaggerated in order to better illustrate and explain the present invention. The exemplification set out herein illustrates an embodiment of the invention, in one form, and such exemplifications are not to be construed as limiting the scope of the invention in any manner.
DESCRIPTION OF THE PRESENT INVENTION
The embodiment disclosed below is not intended to be exhaustive or limit the invention to the precise form disclosed in the following detailed description. Rather, the embodiment is chosen and described so that others skilled in the art may utilize its teachings.
The detailed descriptions which follow are presented in part in terms of algorithms and symbolic representations of operations on data bits within a computer memory representing alphanumeric characters or other information. These descriptions and representations are the means used by those skilled in the art of data processing arts to most effectively convey the substance of their work to others skilled in the art.
An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It proves convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, symbols, characters, display data, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely used here as convenient labels applied to these quantities.
Some algorithms may use data structures for both inputting infoπnation and producing the desired result. Data structures greatly facilitate data management by data processing systems, and are not accessible except through sophisticated software systems. Data structures are not the information content of a memory, rather they represent specific electronic structural elements which impart a physical organization on the information stored in memory. More than mere abstraction, the data structures are specific electrical or magnetic structural elements in memory which simultaneously represent complex data accurately and provide increased efficiency in computer operation.
Further, the manipulations performed are often referred to in terms, such as comparing or adding, commonly associated with mental operations performed by a human operator. No such capability of a human operator is necessary, or desirable in most cases, in any of the operations described herein which form part of the present invention; the operations are machine operations. Useful machines for performing the operations of the present invention include general purpose digital computers or other similar devices. In all cases the distinction between the method operations in operating a computer and the method of computation itself should be recognized. The present invention relates to a method and apparatus for operating a computer in processing electrical or other (e.g., mechanical, chemical) physical signals to generate other desired physical signals.
The present invention also relates to an apparatus for performing these operations. This apparatus may be specifically constructed for the required purposes or it may comprise a general purpose computer as selectively activated or reconfigured by a computer program stored in the computer. The algorithms presented herein are not inherently related to any particular computer or other apparatus. In particular, various general purpose machines may be used with programs written in accordance with the teachings herein, or it may prove more convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these machines will appear from the description below.
The present invention deals with "object-oriented" software, and particularly with an "object-oriented" operating system. The "object-oriented" software is organized into "objects", each comprising a block of computer instructions describing various procedures ("methods") to be performed in response to "messages" sent to the object or "events" which occur with the object. Such operations include, for example, the manipulation of variables, the activation of an object by an external event, and the transmission of one or more messages to other objects.
Messages are sent and received between objects having certain functions and knowledge to carry out processes. Messages are generated in response to user instructions, for example, by a user activating an icon with a "mouse" pointer generating an event. Also, messages may be generated
by an object in response to the receipt of a message. When one of the objects receives a message, the object carries out an operation (a message procedure) corresponding to the message and, if necessary, returns a result of the operation. Each object has a region where internal states (instance variables) of the object itself are stored and where the other objects are not allowed to access. One feature of the object-oriented system is inheritance. For example, an object for drawing a "circle" on a display may inherit functions and knowledge from another object for drawing a "shape" on a display.
A programmer "programs" in an object-oriented programming language by writing individual blocks of code each of which creates an object by defining its methods. A collection of such objects adapted to communicate with one another by means of messages comprises an object-oriented program. Object-oriented computer programming facilitates the modeling of interactive systems in that each component of the system can be modeled with an object, the behavior of each component being simulated by the methods of its corresponding object, and the interactions between components being simulated by messages transmitted between objects. Objects may also be invoked recursively, allowing for multiple applications of an objects methods until a condition is satisfied. Such recursive techniques may be the most efficient way to programmatically achieve a desired result.
An operator may stimulate a collection of interrelated objects comprising an object-oriented program by sending a message to one of the objects. The receipt of the message may cause the object to respond by carrying out predetermined functions which may include sending additional messages to one or more other objects. The other objects may in turn carry out additional functions in response to the messages they receive, including sending still more messages. In this manner, sequences of message and response may continue indefinitely or may come to an end when all messages have been responded to and no new messages are being sent. When modeling systems utilizing an object-oriented language, a programmer need only think in terms of how each component of a modeled system responds to a stimulus and not in terms of the sequence of operations to be performed in response to some stimulus. Such sequence of
operations naturally flows out of the interactions between the objects in response to the stimulus and need not be preordained by the programmer.
Although object-oriented programming makes simulation of systems of interrelated components more intuitive, the operation of an object-oriented program is often difficult to understand because the sequence of operations carried out by an object-oriented program is usually not immediately apparent from a software listing as in the case for sequentially organized programs. Nor is it easy to determine how an object-oriented program works through observation of the readily apparent manifestations of its operation. Most of the operations carried out by a computer in response to a program are "invisible" to an observer since only a relatively few steps in a program typically produce an observable computer output.
In the following description, several terms which are used frequently have specialized meanings in the present context. The term "object" relates to a set of computer instructions and associated data which can be activated directly or indirectly by the user. The terms "windowing environment", "running in windows", and "object oriented operating system" are used to denote a computer user interface in which information is manipulated and displayed on a video display such as within bounded regions on a raster scanned video display. The terms "network", "local area network", "LAN", "wide area network", or "WAN" mean two or more computers which are connected in such a manner that messages may be transmitted between the computers. In such computer networks, typically one or more computers operate as a "server", a computer with large storage devices such as hard disk drives and communication hardware to operate peripheral devices such as printers or modems. Other computers, termed "workstations", provide a user interface so that users of computer networks can access the network resources, such as shared data files, common peripheral devices, and inter-workstation communication. Users activate computer programs or network resources to create "processes" which include both the general operation of the computer program along with specific operating characteristics determined by input variables and its environment.
The terms "desktop", "personal desktop facility", and "PDF" mean a specific user interface which presents a menu or display of objects with associated settings for the user associated with the desktop, personal desktop facility, or PDF. When the PDF accesses a network resource, which typically requires an application program to execute on the remote server, the PDF calls an Application Program Interface, or "API", to allow the user to provide commands to the network resource and observe any output. The term "Browser" refers to a program which is not necessarily apparent to the user, but which is responsible for transmitting messages between the PDF and the network server and for displaying and interacting with the network user. Browsers are designed to utilize a communications protocol for transmission of text and graphic information over a world wide network of computers, namely the "World Wide Web" or simply the "Web". Examples of Browsers compatible with the present invention include the Navigator program sold by Netscape Corporation and the Internet Explorer sold by Microsoft Corporation (Navigator and Internet Explorer are trademarks of their respective owners). Although the following description details such operations in terms of a graphic user interface of a Browser, the present invention may be practiced with text based interfaces, or even with voice or visually activated interfaces, that have many of the functions of a graphic based Browser. Often a browser is used as an API to an application program on a server.
Browsers display information which is formatted in a Standard Generalized Markup Language ("SGML") or a HyperText Markup Language ("HTML"), both being scripting languages which embed non- visual codes in a text document through the use of special ASCII text codes. Files in these formats may be easily transmitted across computer networks, including global information networks like the Internet, and allow the Browsers to display text, images, and play audio and video recordings. The Web utilizes these data file formats to conjunction with its communication protocol to transmit such information between servers and workstations. Browsers may also be programmed to display information provided in an extensible Markup Language ("XML") file, with XML files being capable of use with several Document Type Definitions ("DTD") and thus more general in nature than SGML or HTML. The XML file may be analogized to an object, as the data and the stylesheet formatting are separately contained
(formatting may be thought of as methods of displaying information, thus an XML file has data and an associated method).
The terms "personal digital assistant" or "PDA", as defined above, means any handheld, mobile device that combines computing, telephone, fax, e-mail and networking features. The terms "wireless wide area network" or "WWAN" mean a wireless network that serves as the medium for the transmission of data between a handheld device and a computer. The term "synchronization" means the exchanging of information between a handheld device and a desktop computer either via wires or wirelessly. Synchronization ensures that the data on both the handheld device and the desktop computer are identical.
In wireless wide area networks, communication primarily occurs through the transmission of radio signals over analog, digital cellular, or personal communications service ("PCS") networks. Signals may also be transmitted through microwaves and other electromagnetic waves. At the present time, most wireless data communication takes place across cellular systems using second generation technology such as code-division multiple access ("CDMA"), time division multiple access ("TDMA"), the Global System for Mobile Communications ("GSM"), personal digital cellular ("PDC"), or through packet-data technology over analog systems such as cellular digital packet data (CDPD") used on the Advance Mobile Phone Service ("AMPS"). The terms "wireless application protocol" or "WAP" mean a universal specification to facilitate the delivery and presentation of web-based data on handheld and mobile devices with small user interfaces.
The present invention relates to a system and method for authenticating client users without providing user secrets until a sufficiently high level of assurance of the authenticity of the server is achieved. The user provides identifying information to the server in a series of secure communications. The server responds to the user with a series of stimuli that is known to the user. Only once a sufficient number of rounds of communications are exchanged to authenticate the server does the user send the server a secret which then authenticates the user.
Figure 1 depicts a schematic representation of the wide area network arrangement of the present invention, using Internet 100 as the exemplary wide area network. As well known in the software arts, there are several protocols for a remote user at client device 102 to register with, or logon, server 104, for example a bank customer logon to a bank web site. The present invention is particularly concerned with preventing a third party, represented by man-in-the-middle 106, from obtaining information fraudulently via communications with client device 102. Client device 102 may a personal computer, a PDA, a cellular telephone, or other telecommunications device in communication, either by a physical line or a wireless connection, to Internet 100. The remote user at client device 102 may only observe a web site through the browser pages displayed by client device 102 caused by communications received from Internet 100. Server 104 includes software that provides a logon interface for the remote user on a browser operating on client device 102, along with associated information about the remote user relating to accepted stimuli (as defined below) and authentication information (also defined below). Unfortunately, man-in-the-middle 106 may include a close facsimile of the software on server 104, and possibly hardware and/or software enabling man-in-the-middle 106 to intercept communications between client device 102 and server 104. Without the use of an authenticating protocol, client device 102, and consequently the remote user, will not be reliably able to distinguish between communications with server 104 or man-in-the-middle 106.
Figure 2 shows a flow chart diagram of the operation of the present invention in the context of the configuration of Figure 1. Generally, the process starts at step 200 where server 104 sends a logon interface for a browser on client device 102. The remote user would then start the logon process at step 202 by sending identifying information to server 104, for example a user name, and ID number, or other identifier that is not necessarily private or secret. Further, in step 204, the remote user continues the logon process by entering prefix information at the logon screen. The prefix information may be simply a character from a password, or other portions of the authentication information of the remote user, packaged in a manner that makes the source of the information apparent but not necessarily the content. Server 104 then responds to client device 102 by providing a stimuli, for example an image, sound, or other reproducible output that is known to the remote user (but is not publicly known) in step 206. In step 208 the remote user
decides whether the stimuli received so far have authenticated server 104. If in step 208 there are further rounds of stimuli possible in the protocol (in embodiments where multiple rounds are possible) then the remote user proceeds with the logon process (for example by typing in an additional character) and returns to step 204. If in step 208 the stimuli from server 104 authenticates server 104 to the remote user (either because the embodiment involves a single stimuli, or by exhausting the multiple rounds of stimuli), then in step 210 where client device 102 convinces or confirms server 104 of the correctness of the secret information, such as by client device 102 simply sending the secret information to server 104 to authenticate the remote user, or alternatively conducting a PAKE protocol to create the same effect. The secret information may be in the form of a password, a biometric signature, a hardware token, or other non-public personally identifying information ("PII" hereinafter). Finally, if in step 208 client device 102 does not receive expected stimuli, then the logon is rejected and the process terminates at end 212.
Figure 3 shows one exemplary method of transferring certain prefix information in the form of a character based password in a manner that makes the source of the information apparent without revealing the content. A user of client device 102 starts entering characters of a password on a logon screen provided by server 104. Client device 102 creates array 300, which is generally filled with random data, but with a predetermined location of the array containing a value based on the first character of the password. Server 104 recieves array 300 and by virtue of its knowledge of the user's identity and other values, then accesses the portion of array 300 having the first character based value and returns stimuli 302 to client device 102. Client device 102 presents stimuli 302 to the user, and assuming that the user continues entering characters, creates a new array 304 that has two predetermined locations having character based data, with the remainder of array 304 containing random data. Server 104 uses the two character values in array 304 to calculate the appropriate stimuli 306 to return. Client device then encodes the first through third characters in locations of array 308. This process continues iteratively for the entire password, so that prior to the last character server 104 returns stimuli 310 based on several character values and the user enters the last character on client device 102 which sends all the character values in locations of array 312. Upon the last stimuli 314 being transmitted to and
presented by client device 102 and confirmed by the user, client device 102 sends the encoded PII 316 to server 104 which would then use conventional techniques to confirm the password and authenticate the user.
Let us begin by describing by example how to protect PIIs using the disclosed invention. A simplified version of an exemplary embodiment of our invention would do the following:
SETUP OF SERVER (performed once for each server): The server selects a secret and random element of sufficient length to make it infeasible to guess in any reasonable amount of time, using an existing number of computers or other computational devices. We refer to the secret as the seed.
SETUP OF CLIENT (performed once for each client): The client user downloads some software to be used during login attempts, registers a user name with the server, and provides the server with the PII. Alternatively, the server may generate the PII which is sent to the client over a secure channel (i.e., telephone or mail). The PII may be stored in an encoded obfuscated manner, (this is commonly done by storing the output of a one-way function along with some random bits called the salt, where the input to the one-way function is a appropriate encoded version of the PII concated together with the salt bits) applied to the secret (e.g., the server may store a salted hash of it). This has traditionally been the practice for how to store passwords in practice, but is not common in the context of providing security against phishing or man-in-the-middle attacks. In the following first example, though, we assume that the PII is stored securely but in an unobfuscated manner (i.e clear-text). After this, the server, following our protocol, selects and displays a sequence of images for the client, and provides the client with these. The selection can be performed as described below.
If the client user wishes to install the above mentioned software on multiple machines, he would do so, but he would not perform the remaining part of the setup process more than once. We note that the client software may be certified and its authenticity verified using standard procedures,
which are not specific to the disclosed invention. The software may also be included as part of a browser
SELECTION OF IMAGES: The server computes a value that is a one-way function, or at least partially a one-way function, of the client identity and the server seed. We can call this value the client seed, or cs for short. Here, we may have cs=hash(seed,identity), where identity represents the client identity, such as a user name, a static IP address, or other information unique to a client. We treat cs as the seed to a pseudo-random generator that outputs a sequence of bits that is in turn used to select a sequence of pictures from a database. If the database were to contain 2Λ20 (a million images) then the output of the pseudo-random generator would be grouped in segments of 20 bits, each one of which would index one of the pictures (potentially with repetition). The number of pictures selected would be a parameter of the software, and may be selected by the client and server at setup time, but may be altered at a later time.
LOGIN ATTEMPT:
The user enters his user name or other identifying information into the client software. This is sent to the server.
The user begins to enter his PII into the client software. While the user is entering the PII, which may occur quickly to human perception, the computer may perform the following steps prior to entering any PII, or may perform several steps in between each key stroke of the user. Alternatively, the computer may allow the user to enter all of the PII into the client software prior to starting the exchange with the remote server, although not disclosing the entire PII until the completion of the logon process, but this methodology may be circumvented by a phising site whereas interaction upon each key stroke prevents such circumvention. The client and the server establish a secure connection using a standard key establishment protocol, such as El Gamal, or a commercially available alternative such as SSL. All communication between the server and client after this step is encrypted and authenticated
using the established key, which we refer to by the variable name K. This prevents an eavesdropper from learning any useful information.
The PII is decomposed into n bits representing the PII, where some bits may be set to zero if the PII is shorter than n bits. (Thus, n represents the maximum allowed length of the PII in this example). The ith such bit is p_i.
Using the bits p_i and bits of K as indices, the client initiates an oblivious transfer (which preferably is non-malleable) with the server. An oblivious transfer is a well-known method intended to download elements from a database without having to identify to the database owner what elements are downloaded, and without having to send over the database in its entirety.
The server deteπnines the cs of the client, and breaks this up into N pieces, where any true subset of these pieces is a collection of random strings, but where the entire set represents the value cs. Here, N is the sum of n and a large portion of the length the key K. If the XOR function is used to break up cs into pieces, then the server would select N-I random elements of an appropriate length, and then select the last one so that the XOR of all of the N elements would equal cs. In addition, the server selects N psuedo-random elements, to ensure that subsequent executions have the same elements. The server places the N first-mentioned elements in the positions of a matrix corresponding to the PII that the server has stored for the client. The truly random elements are entered in the other positions. This is illustrated in more detail below. This way, exactly one element is entered in each position in the matrix used for the oblivious transfer. The server completes its part of the oblivious transfer on the entered values. In other words, we perform an Oblivious Transfer (OT) that either returns random bits or a string representing the pictures, depending on the user/client input. This is described in an example implementation below. The client extracts values from the N positions indexed by his PII and K, and combines these values, e.g., using the XOR function. The resulting value is called cs'. Note that this value should equal cs if the correct PII was entered and there was no man in the middle who caused inconsistent values of K to be understood by the client and the server. However, otherwise the value cs' would be substantially different from cs. Using cs', the client produces the
selection of the sequence of images, as described above, and displays these for the user to see.
The user either accepts or rejects the images. If he rejects, then the protocol is terminated, otherwise it is continued. The client sends over to the server either a encoded(encrypted) proof of what elements he selected in the oblivious transfer, or sends over his PII, also encrypted, or engages a separate proof that succeeds only on the input of the correct PII.
The server verifies that the information received in step 9 corresponds to the correct PII of the user; if this is the case, then the server accepts the login attempt.
MULTIPLE ROUND LOGIN ATTEMPT: Below we describe an alternate method for logging on. While based on similar principles to the above single-round login attempt, this technique allows for visual feedback to occur interactively with the user, while he is entering his PII. The user is given by server, over a secure fashion during set-up (via mail delivery, telephone conference, or other secure channel), a series of stimuli such as images represented by the strings v[l ..n] which may be selected from the output of psuedo-random generator. This can lead to more practical security due to human-computer interfacing. For the purposes of this description, we will assume that the PII consists of m-characters, each character of which takes q bits to represent. Therefore, the total representation of the PII requires qm=n bits. For simplicity of description we will assume that there are m-rounds in the protocol, one for each character. However, this can easily be modified to range from 1 round to n rounds, using known techniques obvious to an expert in the field. We describe by N the security parameter for this protocol. N=n + |K|, where |K| is the length in bits of K (K is defined in (2) below).
The user enters his user name or other identifying information into the client software. This is sent to the server.
The client and the server establish a secure connection using a standard key establishment protocol, such as El-Gamal, or a commercially available alternative such as SSL. All communication between the server and client after this step is encrypted using the established key, which we refer to by the variable name K. This prevents a passive eavesdropper from
learning any useful information, but by itself is not sufficient to prevent man-in-the-middle attacks.
The server determines the cs of the client, and uses it as a seed to a pseudo-random number generator in order to create m random-strings vi,...,vm> where each string Vj has the same length as cs. The server breaks each vj into N pieces, where any true subset of these pieces is a collection of pseudo-random strings determined in a deterministic manner from the seed cs, but where the entire set represents the value Vj. If the XOR function is used to break up Vj into pieces, then the server would select N-I pseudo-random elements of an appropriate length using a pseudo-random generator on cs, and then select the last one so that the XOR of all of the N elements would equal Vj. In addition, the server selects N other pseudo-random elements determined in a deterministic fashion from cs. The server places the N first-mentioned elements in the positions of a matrix corresponding to the PII and K that the server has stored for the client. The other psuedo-random elements are entered in the other positions. This is illustrated in more detail in the section entitled "Description of Possible Oblivious Transfer Protocol" below. This way, exactly one element is entered in each position in the matrix is used for the oblivious transfer. The server completes his part of the oblivious transfer using the entered values. The following steps are repeated for each character of the n-character PII:
The jth character of the PII is input into the client software. This character is denoted by Cj.
The characters ci ...Cj is decomposed into j*q bits representing cj ...Cj. For this round we will represent the the ith such bit of ci ...c,- as pi. Bits Cjq+i through pn are set to zero for padding.
Using the bits pi and bits of K as indices, the client initiates an oblivious transfer (that preferably is non-malleable) with the server. An oblivious transfer is a well-known method intended to download elements from a database without having to identify to the database owner what elements are downloaded, and without having to send over the database in its entirety.
The client extracts values from the m positions indexed by his PII and K, and combines these values, e.g., using the XOR function. The resulting value is called v/. Note that
this value should equal Vj if the correct PII was entered and there was no man in the middle who caused inconsistent values of K to be understood by the client and the server. However, otherwise the value Vj1 would be substantially different from y,-. Using v/, the client selects the image from the database to be displayed in the sequence of images, as described above, and displays these for the user to see.
The user either accepts or rejects the current image. If he rejects, then the protocol is terminated, otherwise it is continued, and the next character in the PII is entered, and we continue with step 4a again. Alternatively, the computer may allow the user to enter all of the PII into the client software prior to starting the exchange with the remote server, although not disclosing the entire PII until the compeletion of the logon process. If the character that was accepted was the last character to be entered, then we continue with step 5.
The client sends over to the server either a proof of what elements he selected in the oblivious transfer, or sends over his PII. The server verifies that the information received in step 5 corresponds to the correct PII of the user; if this is the case, then the server accepts the login attempt.
DESCRIPTION OF POSSIBLE OBLIVIOUS TRANSFER PROTOCOL: We let N denote the maximum accepted length of the PII added to the length of the portion of the key K that is used in the oblivious transfer protocol. We let t_i represent a bit of either the PII or K. For example, we may let t_i=p_i for 0<i<n, and let t_i = K_i for i>n, where K__i is the ith bit of K. Note: to avoid maximum lengths of PII, we can let unusually long PIIs replace portions of the K, which is trivial for a person skilled in the art. N is known by both client and the server. We let g and h be known generators of a prime group, where the group as well as the generators are system parameters, and log_g h is not known to any client. A simplified version of the oblivious transfer protocol is as follows: (1) The client selects N random secret keys x_i, and computes y_i=gΛ{x_i}, where A denotes modular exponentiation over a known prime group; (2) If t_i=0 then y_{iθ}=y_i and y_{il }=y_i h; otherwise, y_{il }=y_i and y_{iθ}=y__i / h; (3) The list of keys (y_{10} ... y_{N0}, y_{l l} ... y_{Nl}) is transmitted to the server (only the half of these values has to be transmitted, as knowledge of y_{iθ} determines y_{il } and vice versa, and each
selected value is of the form gΛ{x} for some known x, where the selection is based on the PII and K); (4) A matrix of 2 by N plaintext elements is prepared, where an element is random if the position is not selected, but otherwise is selected in a special way (this special way ensures any true subset of the selected elements is a set of random elements, but the entire set has the property that it can be used to derive cs, e.g, by XORing all the selected elements together, where each element is of the same length, and is preferably a group element); (5) the matrix of plaintext elements is encrypted using the matrix of public keys, where the first row of public keys is the list (y_{ 10} ... y_{N0} }) and the second is the list (y_{ 11} ... y_{Nl }), with the resulting ciphertexts being sent to the client; and (6) the client decrypts each ciphertext for which he knows the secret key. This corresponds exactly to the selection of positions made using PII and K. The client combines all the portions, e.g., using XOR, thereby obtaining cs'. As an alternative, the plaintext elements can be selected truly at random and not from the group, but merely of the same length as the modulo defining the group. This would have to apply both to the random and the special elements.
The above protocol is not what is referred to as non-malleable. In order to make it non- malleable in the random oracle model, which is a model well-known to those skilled in the art, one could augment the above protocol in the following manner: As the client sends over the list of keys in step 3, he would also send over one or more so-called ring signatures that are computed in the following manner: (A) The client knows either the discrete logarithm of y_{iθ} or y_{il } for each 0<i<N, but not for both of these. Using a ring signature, which is a type of digital signature known in the art, the client constructs a value that authenticates the entire list of encryption public-keys (i.e. y_{00},...,y_{ln}, or a digest of these. This is done for each one of the N keys. Or: The client constructs a ring signature that proves knowledge of exactly N of the secret keys associated with the 2N public keys that are transmitted. In case only half of these are actually transmitted, the other half can be derived from the first half, and the ring signature would still prove knowledge of N of the 2N secret keys. The secret keys are the discrete logarithms of the public key with respect to the generator g. Or: The client selects one more secret key and public key, includes said public key with the matrix of public keys used for the OT, and uses said secret key to sign a digest of the entire transcript. This signature would later
be verified by the server, who would abort if it is incorrect. The server would sign the entire transcript, including said public key and signature, and include the resulting signature in his next response.
The ring signature or ring signatures, as described above, would be verified by the server upon receipt. If any of the verifications fail, then the server would abort the protocol. Moreover, as the server prepares the response in step 5 above, it would also send over one or more digital signatures. If El Gamal encryption is employed for the encryption in step 5 (as has been used in the explanation above), then a ciphertext would be of the form (yAa * m, gΛa), where y is the public key used, a is a random value between 1 and the size of the group, and m is the message or plaintext value to be encrypted, encoded as a member of the group. In that case, the secret key associated with a given entry in the matrix would be the random value a (which is specific to this entry), and the public key would be the value yΛa. This could be used to construct standard digital signatures, such as RSA or Schnorr signatures. A total of 2N such signatures would be generated and transmitted. For each of these signatures, the signed message would be the entire matrix of ciphertexts or a digest thereof. Alternatively, and as will be appreciated by a person skilled in the art, the server can construct one digital signature that is a function of all the known secret keys, and for which again the message is either the entire matrix of ciphtertexts or a digest thereof. Upon receiving one or more such signatures, the client software would verify these, and abort the execution if not all of these were correct.
The above techniques prevent any manipulation of the oblivious transfer by a third party.
Alternative embodiments of the invention would have the following alternative features: (1) Use different types of encryption schemes than described above; (2) Use different types of signature schemes than described above; (3) Use different types of oblivious transfer schemes than described above; (4) Let only a small portion of K, or no portion of K at all, be used to select entries in the oblivious transfer protocol; (5) Combine the PII and K before using the combined value to perform a selection; (6) Use two or more interactive steps in which one or more symbols of the PII are used to obtain one or more pictures; where the protocol only continues after the user has approved the already displayed pictures, and as it continues, the user would enter some
more symbols of the PII, after which he would obtain further pictures; (7) Where instead of pictures, some combination of pictures, movies, sounds, physical features and other communicable information is deployed as stimuli; (8) Instead of XOR use another function to combine elements because a large variety of such functions maybe deployed, as will be appreciated by a person skilled in the art; (9) Use a function f instead of oblivious transfer, where f takes as input the same type of information as the oblivious transfer protocol takes, and results in an output of similar type, but where the transcripts may allow certain partial information about the inputs to be inferred by either the server, client or a third party, in excess of the information already stated that the client can infer; (10) Use a function f instead that with a large likelihood results in the correct computation, as described above, but with a certain not too large probability may result in another output; (11) Use another technique for protecting against malleability of the transcripts of the oblivious transfer.
The above holds for any situation where the server stores the PII of the user in plaintext, or where he can compute this information from data that he has stored in plaintext; here, the term server is meant to indicate one or more computers or other devices that when collaborating can act as one entity. Thus, the server may in actuality be distributed, and may consist in part of computers and in part of other devices with properties that allow either storage, retrieval or processing of information, as will be appreciated by a person skilled in the art.
In yet another embodiment, the server would not store the PII of a client in plaintext, but would instead store a processed version of this, such as a salted hash of the value. In this case, the elements entered in the matrix would be deterministically determined from the seed and the client user name, using a one way function of choice. Any two sessions for one and the same client user would result in the same matrix of plaintext values for any portion of the matrix that corresponds to the PII selected values. The portion of the matrix corresponding to the values selected by the value K would also be deterministically computed (and in the same manner) but then arranged in a way that causes the same set of values of this part of the matrix always to be extracted by the client, assuming the client and the server uses the same value K. This is achieved plainly by permuting these values, as will be appreciated by a person skilled in the art.
An Alternative Solution for a Slightly Different Scenario: the methods described above work well so long as users do not frequently change their PII. However, a common implementation of PII in use today is the SecureID system. This system consist of 2 synchronized parts, first there is software running on the server which has a clock that is tightly synchronized to clocks running on a number of tokens which clients carry. Each token generates a different pseudo-random sequences that changes over time (typically about once every 30 seconds). The server keeps track of which tokens are owned by which clients, and is able to generate the same pseudorandom sequence of each token. Therefore, when a client wishes to authenticate himself to a server, the client provides his name (or other unique identifier), and then enters the current pseudo-random sequence that is displayed on his token. The server takes the user's name, and looks up which token belongs to the client in question. He then determines if the sequence entered by the client corresponds to the sequence that would have been displayed by the client's token at the time the user tried to authenticate. If correct, the server authenticates the user. In this protocol, the pseudo-random sequence plays the role of the PII above. Notice that the current secure-Id protocol still permits phishing attacks. This is because an attacker could set-up a fraudulent service provider, which masquerades as a legitimate provider. When an unsuspecting client authenticates itself to the phishers site, the phisher may (and actually must) immediately logon to the authentic service providers site, providing the authentic site with the user-name and pseudo-random sequence provided by the user. If the phisher is sufficiently fast at doing this, then the pseudo-random sequence entered by the user will still be valid at the authentic site, and the phisher will gain access to the authentic service provider's site, with the identity of the victimized user.
Unfortunately, because the the PII changes so frequently, a user would be unable to constantly memorize the sequence of images corresponding to the PII (remember the sequence changes every 30 seconds). There is a clear extension of the methods of image authentication which have the images displayed with each number in the sequence on the token, and this would permit a user to verify the images returned, and to use the present invention. Unfortunately, this process would require changing existing infrastructure, and so a solution that uses only current infrastructure is desirable.
In order to thwart phishing attacks on the SecureID System, we need to have the server authenticate itself to the user in some fashion. This is actually easily done, as should be seen by someone skilled in the area, if one allows multiple pseudo-random numbers from the pseudorandom sequences to be displayed during the authentication process. However, due to the length of time this entails, this is viewed as an undesirable trait. Therefore, the goal is to authenticate users in a time period where only one pseudo-random number is displayed on the token, even though this goal may not provide sufficient protection against Man-in-the-middle attacks.
We begin the description of this alternative embodiment of the present invention assuming that the server already has given a SecureID token to the client under consideration, and that the server knows which pseudo-random sequence that user's token produces. First the client and the server establish a secure SSL connection (or some other encrypted and authenticated channel). Next, the client sends his username to the server over the encrypted channel. The server generates the current n-digit pseudo-random number that is currently displayed on that user's token. The server chooses exactly n/2 positions of the n-digit number, at random. For each of the m=n/2 randomly selected positions, the server replaces the corresponding digit with a randomly selected different digit. The server sends the modified pseudo-random number to the client. After receiving a modified pseudo-random number from the server, the user checks to ensure that exactly n-m digits in the number correspond to the number in the same position currently on the token. If not, the user rejects the pseudo-random number, and the protocol is halted. Otherwise the user then modifies the pseudo-random number, and corrects the remaining locations in the random-number, that the server replaced with random incorrect digits. The client then sends to corrected pseudo-random number to the server. If the server receives the corrected random-number, then it authenticates the client, and otherwise it rejects.
We note that the value m can be changed in-order to change the degree to which both client and server authenticate each other. A key component of this system is the fact that random locations in the pseudo-random sequence are modified in a random manner. This adds significant security to our system, as proposed to systems which denote fixed or indicated positions that a user must correct or supply in order to authenticate himself. Since in our proposed system, the location of
the errors is hidden to anyone who does not know what the correct pseudo-random sequence is, there is quantitatively more security in our proposed system. Just to correctly guess which locations need to be fixed, requires a chance of l/(n choose m), and then the chance of correctly guessing the correction is l/9Λm. In a system where there are m fixed locations that need to be supplied, the chance of guessing the correct number is l/(10Λm).
A further aspect of the present invention relates to Pairing. To ground and clarify our exposition, we consider two two parties, A and B, who have secret values K_A and K_B that they wish to compare. These secret values may be passwords or functions thereof (oneway or otherwise); they may be PINs; they may be cryptographic keys, such as Diffie-Hellman keys, or keys communicated by key transport; or keys obtained using other means; or values resulting from computation; or combinations of any of these. Two common cases are (1) for the values to be functions of PIIs (personal identifying information); and (2) for the values to be Diffie-Hellman keys augmented by short shared secrets, such as PINs. In the latter case, the PINs would be used to authenticate the established Diffie-Hellman keys.
In this alternative embodiment of the present invention, an attacker, potentially mounting a man- in-the-middle attack, will not learn any portion of the secret values to be compared except for with a negligible probability that is essentially equal to the probability of simply guessing the secret value in question given no information about it. Let us begin by describing by example how to protect secret information while still enable a comparison using the disclosed invention. A simplified version of this embodiment of the present invention involves the following: We assume that the two parties have established some secret values that they wish to compare. These values may be established securely, such as in the case of passwords, or may be established in a way that is susceptible to man-in-the-middle attacks, such as shared keys using Diffie-Hellman key exchange. Parts of the secret values may have been generated and established using some secure channel, such as communicating a short PIN by voice between the users of two devices attempting to perform pairing. As mentioned before, we let K_A and K_B represent the two secret values which are held by A and B. Furthermore, we let p and q be two large prime numbers, where p=kq+l for some integer k; and g and h be distinct generators of
G_p. Here, p, q, g and h are system parameters, and are known to all participants. We will assume that there is a simple function that maps all possible values of K_A and KJB to a number in the range [CLp-I], and how to do this is obvious to someone familiar with the field. We also have f_l, f_2 and f_3 which are different random oracles that are implemented as distinct one-way functions based on a hash function such as SHA-I. All of f_l, f_2 and f_3 are system parameters that are publicly known.
COMPARISON: Party B selects a value z uniformly at random from [O..q-1] and computes y_B=gAz mod p, which is sent over to party A. (This is a temporary secret key - public key pair.) Party A selects a value x uniformly at random from [O..q-1] and computes y={y_B}Λx hΛ{K_A}mod p, which is sent over to party B. (This is a D-H shared key offset by K_A.) Party B selects values r and w independently and uniformly at random from [O..q-1] and computes c=gΛw mod p, (a,b)=({(y * hΛ{-K_B} }Λ{r/z}*c) mod p, gAr mod p). (This is an encryption of c using public key gAx if K_A=K_B; otherwise, a random number.) Party B also computes K=f_l(c) and chk=f_2(y,a,b,K). (This is a MAC on the transcript from and to A.) Party B sends (a,b,chk) to party A. Party A computes K'=f_l(a/bΛx mod p) and verifies whether chk=f_2(y,a,b,K'). If this does not hold then party A outputs an error message and halts. Otherwise, party A outputs an accepting message, and computes a response d=f_3(y,a,b,K') that is sent to party B. Party B receives d and verifies if d=f_3(y,a,b,K). If that holds, then party B outputs an accepting message, otherwise an error message.
If a party outputs an accepting message, that means that it believes that K_A=K_B. If an intermediary attacker replaces any part of the transcript sent in step 1 before it is received by party B, or any part of the transcript sent in step 2 before it is received by party B, then the value chk will not equal the value f_2(y,a,b,K') computed by party A, and it will therefore output an error message and halt. If any part of the transcript sent in step 3 is changed by the intermediary attacker before being received by party B, then the value d will not equal the value f_3(y,a,b,K) computed by party B and party B will output an error message and halt. If an attacker attempts to play the role of party B but does not know K_B, then, with an overwhelming probability, the values K and K' will be different, and the attacker will not be able to guess the value K' except
with an increadibly small probability, and so it will not be able to compute a value chk that is equal to the value f_2(y,a,b,K') computed by party A. Similarly, if an attacker attempts to play the role of party A, but does not know the value K_A, then, again, the values K and K' will not be identical with an overwhelming probability, and the attacker will not be able to guess the value K with a reasonably large probability, and so, will not be able to compute a value d that will equal f_3(y,a,b,K). Moreover, if the values K_A and K_B are not equal, then the above comparisons will also result in error messages with an overwhelming probability. The exact probabilities depend on the lengths of K_A and K_B and the distributions from which they are drawn.
We note that K_A and K_B may be arbitrary values; they may be PINs; passwords; other personal identifying information; they may be cryptographic keys; or outputs of authentication devices, such as the SecurID device by RSA Security; or any combination of the above. They may also be functions of any combination of the above, where a possible choice of function is salting and hashing (as done to prepare passwords to be stored). If we wish to compare two keys established using a key exchange or key transport protocol, then K_A and KJB are these keys, as perceived by the two participants, augmented with some short PIN or password that has been exchanged out of band, such as over a voice channel.
We also note that while the above protocol can be used to perform pairing, as stated, it may also be used to perform both key exchange and pairing. In particular, this would correspond to K_A and K_B simply consisting of a short PIN or password that has been exchanged out of band; here K (resp. K') would be the resulting shared key.
One does not have to compare equality of the two values K_A and K_B, but may instead want to verify that some particular relation between them holds. One particular version of the above protocol is as follows: Party A knows a value K_A and party B knows a value K_B, where K_A=f_4(K_B,salt), where f_4 may correspond to a one-way function such as SHA-I, and where the value salt may be publicly known. We presume that the salt has been communicated before the following comparison protocol is executed. Party B selects a value z uniformly at
random from [Cq-I] and computes y_B=gΛz mod p, which is sent over to party A. (This is a temporary secret key - public key pair.) Party A selects a value x uniformly at random from [O..q-1] and computes y={y_B}Λx hΛ{K_A}mod p, which is sent over to party B. (This is a D-H shared key offset by K_A.) Party B selects values r and w independently and uniformly at random from [CLq-I] and computes c=gΛw mod p, (a,b)=({(y * hΛ{- f_4(K_B,salt)} }Λ{r/z}*c) mod p, gΛr mod p). (This is an encryption of c using pk gΛx if K_A= f_4(K_B,salt); otherwise, a random number.) Party B also computes K=f_l(c) and chk=f_2(y,a,b,K). (This is a MAC on the transcript from and to A.) Party B sends (a,b,chk) to party A. Party A computes K'=f_l(a/bΛx mod p) and verifies whether chk=f_2(y,a,b,K'). If this does not hold then party A outputs an error message and halts. Otherwise, party A outputs an accepting message, and computes a response d=f_3(y,a,b,K') that is sent to party B. Party B receives d and verifies if d=f_3(y,a,b,K). If that holds, then party B outputs an accepting message, otherwise an error message. After the above has been run, the parties may, if they both accept, perform an exchange of K B. More in particular, party B may send K_B, encrypted using the shared key K=K', to party A. Party A would decrypt this value and compute f_4(K_B,salt), comparing the result to the known value K_A.
An alternative to the above version would be to eliminate all computation and transmission of chk, as party A would perform his verification based on the transmitted encrypted and then decrypted version of K_B.
We note that the computational requirements of our described solutions are very low, and that they are suitable for a wide variety of devices. We also note that the described embodiments are only exemplary. Alternatives may use other computational components, such as elliptic curve cryptography instead of modular artithmetic; and alternative functions instead of SHA-I. One could also use oblivious transfer instead of El Gamal encryption, and signatures or other authentication mechanisms instead of message authentication codes or symmetric authentication. One may also use other encryption methods, as will be appreciated by a person skilled in the art.
While this invention has been described as having an exemplary design, the present invention may be further modified within the spirit and scope of this disclosure. This application is therefore intended to cover any variations, uses, or adaptations of the invention using its general principles. Further, this application is intended to cover such departures from the present disclosure as come within known or customary practice in the art to which this invention pertains.