WO2012090027A1 - Language models for predicting entered text - Google Patents
- Publication number
- WO2012090027A1 (PCT/IB2010/056138)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- text
- language model
- language
- appropriateness
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/02—Input arrangements using manually operated switches, e.g. using keyboards or dials
- G06F3/023—Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
- G06F3/0233—Character input methods
- G06F3/0237—Character input methods using prediction or retrieval techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/274—Converting codes to words; Guess-ahead of partial word inputs
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/451—Execution arrangements for user interfaces
- G06F9/454—Multi-language systems; Localisation; Internationalisation
Definitions
- the present disclosure relates to the field of language models, associated methods, computer programs and apparatus.
- Background: electronic devices that allow for text entry from a user also provide for text correction and/or prediction.
- mobile phones can often provide language prediction for alphanumeric keypads.
- language models are often integrated into the system. These are utilised by text prediction/correction processes to enable word completion, disambiguation of ambiguous input, and automatic correction of spelling errors.
- the smaller the keypad of such devices, the more important smart text entry solutions become, because users perform increasingly complex operations on their mobile devices that require one or more of correction, prediction, and disambiguation (for example, using web browsers, writing e-mails, and participating in social networks).
- an apparatus comprising a processor and memory including computer program code, the memory and the computer program code being configured to, with the processor, cause the apparatus to:
- determine, in response to input by a user, likelihood of appropriateness for two or more respective language models, wherein each language model comprises a set of entries that govern text prediction and/or correction of user text; and provide for selection of a language model for use in text prediction and/or correction of user entered text by comparing the determined likelihood of appropriateness for said language models. Provision for selection may comprise one or more of:
- selecting a particular language model, allowing an electronic device to select a particular language model, selecting a particular language model for an electronic device, or providing an indication of a particular language model.
- the memory and the computer program code may be configured to, with the processor, cause the apparatus to:
- the determined likelihood of appropriateness may indicate the correlation between: a plurality of text prediction/correction suggestions provided by respective language models in response to user input; and the user input itself.
- the language models may be considered to be lexicons containing sets and/or subsets of entries. These entries may be words that reflect encoded data such as n-grams, distinct languages, categorized information etc.
- a language model could contain all US states but nothing else as it is a 'US state' language model.
- a language model could contain all words in the English dictionary, or a distinct language model could contain all email addresses within a company's internal directory.
- the memory and the computer program code may be configured to, with the processor, cause the apparatus to: determine the likelihood of appropriateness of respective language models by calculating the cumulative probability of a plurality of text prediction/correction suggestions for respective models.
- the memory and the computer program code may be configured to, with the processor, cause the apparatus to: provide for selection of the language model with the greatest likelihood of appropriateness compared to the other language models.
- the memory and the computer program code may be configured to, with the processor, cause the apparatus to: provide for selection of a language model in response to predetermined interval criterion.
- the predetermined interval criterion may comprise one or more of the following:
- new line in user text entry, new paragraph in user text entry, new sentence in user text entry, a change in keyboard language and/or layout, a pause between periods of user text entry, user edit of already entered text, user exiting/closing of an application in which text has been entered, user entering/opening an application, and user returning to an application in which text has been entered.
- the memory and the computer program code may be configured to, with the processor, cause the apparatus to: provide one or more suggestions from the selected language model in response to user text entry.
- the memory and the computer program code may be configured to, with the processor, cause the apparatus to: provide for selection of a different language model to the currently selected language model in response to a predetermined frequency of occurrence of a particular language model having the greatest likelihood of appropriateness compared to the other language models.
- the apparatus may comprise one or more language models that make up the plurality, and/or may have access to one or more language models stored remotely from the apparatus.
- Respective language models may comprise one or more of the following: different languages (English, French, German, Spanish, and the like), locations (such as street names, countries, counties, states, districts, and the like), n-grams (bigrams, trigrams, and the like), and other sets of data.
- an apparatus comprising:
- a determinator configured to determine, in response to input by a user, likelihood of appropriateness for two or more respective language models, wherein each language model comprises a set of entries that govern text prediction and/or correction of user text;
- a selector configured to provide for selection of a language model for use in text prediction and/or correction of user entered text by comparing the determined likelihood of appropriateness for said language models.
- an apparatus comprising:
- a means for determining configured to determine, in response to input by a user, likelihood of appropriateness for two or more respective language models, wherein each language model comprises a set of entries that govern text prediction and/or correction of user text;
- a means for selecting configured to provide for selection of a language model for use in text prediction and/or correction of user entered text by comparing the determined likelihood of appropriateness for said language models.
- a non-transitory computer readable medium comprising computer program code stored thereon, the computer program code being configured to, when run on a processor, perform at least:
- determining, in response to input by a user, likelihood of appropriateness for two or more respective language models, wherein each language model comprises a set of entries that govern text prediction and/or correction of user text; and providing for selection of a language model for use in text prediction and/or correction of user entered text by comparing the determined likelihood of appropriateness for said language models.
- a method comprising: determining, in response to input by a user, likelihood of appropriateness for two or more respective language models, wherein each language model comprises a set of entries that govern text prediction and/or correction of user text; and providing for selection of a language model for use in text prediction and/or correction of user entered text by comparing the determined likelihood of appropriateness for said language models.
- the method may provide for selection of a language model by performing one or more of:
- selecting a particular language model, allowing an electronic device to select a particular language model, selecting a particular language model for an electronic device, or providing an indication of a particular language model.
- the determination of the likelihood of appropriateness of respective language models may be based on a comparison of the likelihood of appropriateness of a plurality of text prediction and/or correction suggestions made by the respective language models in response to user [text] entry.
- the determined likelihood of appropriateness may indicate the correlation between: a plurality of text prediction/correction suggestions provided by respective language models in response to user [text] entry; and the user [text] entry itself.
- the determination of the likelihood of appropriateness of respective language models may comprise calculating the cumulative probability of a plurality of text prediction/correction suggestions for respective models.
- the method may comprise providing for selection of the language model with the greatest likelihood of appropriateness compared to the other language models.
- the method may comprise providing for selection of a language model in response to predetermined interval criterion.
- the method may comprise providing one or more suggestions from the selected language model in response to user [text] entry.
- the method may comprise providing for selection of a different language model to the currently selected language model in response to a predetermined frequency of occurrence of a particular language model having the greatest likelihood of appropriateness compared to the other language models.
- the predetermined interval criterion may comprise one or more of the following:
- new line in user text entry, new paragraph in user text entry, new sentence in user text entry, a change in keyboard language and/or layout, a pause between periods of user text entry, user edit of already entered text, user exiting/closing of an application in which text has been entered, user entering/opening an application, and user returning to an application in which text has been entered.
- the present disclosure includes one or more corresponding aspects, examples or features in isolation or in various combinations whether or not specifically stated (including claimed) in that combination or in isolation.
- Corresponding means for performing one or more of the discussed functions are also within the present disclosure.
- Figure 1 shows an apparatus according to a first example.
- Figure 2 shows an electronic device comprising the apparatus of the first example.
- Figure 3a shows a language model hierarchy.
- Figure 3b shows an illustration of the nature of language models.
- Figure 4 shows an example where an apparatus has access to three language models.
- Figure 5a shows example screenshots of possible types of text entry.
- Figure 5b illustrates determination of likelihood of appropriateness according to one example.
- Figure 6 illustrates different comparisons of likelihood of appropriateness for different language models.
- Figure 7 illustrates selection of a language model according to likelihood of appropriateness for different language models.
- Figure 8 illustrates a method of operation according to one or more examples.
- Figure 9 illustrates schematically a computer readable media providing a program according to an example of the present disclosure.
- an apparatus comprising a processor and memory including computer program code.
- the memory and the computer program code are configured to, with the processor, cause the apparatus to determine, in response to [text] entry by a user, likelihood of appropriateness for two or more respective language models.
- Each language model comprises a set of entries that govern text prediction and/or correction of user text.
- the memory and computer program code are also configured to, with the processor, cause the apparatus to provide for selection of a language model for use in text prediction and/or correction of user entered text by comparing the determined likelihood of appropriateness for said language models.
- Vocabulary and phrase structure can differ across applications or even within a single application.
- a personal text message may utilise more casual or colloquial language (and thus corresponding language model), while searching for products on the internet may use less structured noun-based language (and thus corresponding language model).
- a user may need to manually select which language model they wish to use, and the same language model would then be used in all contexts, whether sending a text message, writing an email, writing a text document, etc., unless the user again changes to a different language model by manual selection.
- a given user may be multi-lingual and may need to be able to switch between different languages (and thus different corresponding language models) to send messages to friends of different nationalities.
- language models are not necessarily static in their contents. For example, Barack Obama was elected the president of the United States, but his name is significantly atypical in current Western language and would be unknown to language models devised a year prior to his election.
- the beginning of an e-mail may consist of language that is defined as 80% "personal communication" and 20% "business", as this would begin with an initial greeting, whereas the content may then progress throughout the email to be 20% "personal communication" and 80% "business" as the real issue is brought up in the content of the email.
- Figure 1 schematically illustrates apparatus 100, which comprises an input 110, a processor 120, a memory 130, and an output 140.
- the apparatus 100 is integrated as part of a mobile electronic device such as a mobile telephone. This is illustrated in Figure 2. In other examples it may be a module attached to a mobile device/telephone, or a mobile telephone device per se.
- While this example relates to a mobile telephone device, one or more other examples can relate to portable electronic devices, in particular, so-called hand-portable electronic devices which may be hand-held in use (although they may be placed in a cradle in use).
- hand-portable electronic devices include so-called Personal Digital Assistants (PDAs).
- the portable electronic devices/apparatus may provide one or more audio/text/video communication functions (for example, tele-communication, video-communication, and/or text transmission (Short Message Service (SMS)/ Multimedia Message Service (MMS)/emailing) functions), interactive/non-interactive viewing functions (for example, web-browsing, navigation, TV/program viewing functions), music recording/playing functions (for example, MP3 or other format and/or (FM/AM) radio broadcast recording/playing), downloading/sending of data functions, image capture function (for example, using a (for example, in-built) digital camera), and gaming functions.
- the input 110 and output 140 of this example are part of an input/output (I/O) interface for receiving signalling at the apparatus 100 and for providing signalling to and from the processor 120.
- the input/output interface is formed as part of a physical electrical connector (not shown) to allow a module-type example of the apparatus 100 to be attachable to an electronic device.
- the processor 120 is a general purpose processor that is capable of performing the task of processing incoming signalling from the input 110 relating to text entry, but is also capable of performing other tasks as well.
- the processor is an Application Specific Integrated Circuit (ASIC) that is specifically configured to perform the same tasks.
- the processor 120 is formed from a plurality of separate integrated circuits that co-operatively perform said tasks. These integrated circuits may be provided locally to one another, or in other examples can be provided separately and remotely from one another.
- the memory 130 is a computer readable medium that has computer program code/instructions stored thereon, and, in this example, several different language models that govern text prediction/correction.
- the memory is readable/writeable solid state memory.
- the memory may be read only memory (ROM), or may even be a physical hard drive.
- the memory 130 is to be electrically connected to the processor 120 to allow the processor 120 to access the computer program code and language models stored on the memory 130.
- the memory 130 may comprise multiple different memories that contain different language models, for example, remotely stored language models accessed via the internet or via a network server, or the like.
- the memory can actually be distinct and separate from the apparatus 100, and may be just remotely accessed by the processor 120 of the apparatus.
- the skilled person will appreciate that other such variations are also within the scope of the disclosure.
- the computer program code allows the processor 120, after receiving signalling from the input 110 relating to text entry by a user, to operate on that signalling in such a way as to determine the likelihood of appropriateness for the respective language models stored on the memory.
- each language model is distinct from the others, and will therefore contain different entries (although there may be, or may not be, a degree of overlap in the entries).
- Language models are not merely limited to distinct languages, but relate to groupings of different 'words' linked by a given category. For example, a language model may contain all UK post codes and nothing more, another may contain a list of US states and nothing more. Each language model will therefore be particularly appropriate for particular types of text entry because the text prediction/correction suggestions that a given model will make will be particularly suited/relevant to the context of related text entry.
- the language model consists of a list of words with weights (corresponding to the frequencies of the words in a large text body).
- a language model will suggest words from its vocabulary/set of entries to the user. There is some adaptation, such that the weight of a word is increased when the user uses the word, and the user can add new words to the vocabulary.
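To make the weighted word-list model above concrete, here is a minimal sketch; the class name, weighting scheme, and example vocabulary are illustrative assumptions rather than the patent's own implementation:

```python
class WordListModel:
    """Lexicon of words with frequency-based weights, as described above."""

    def __init__(self, word_weights):
        # word -> weight (roughly, the word's frequency in a large text body)
        self.weights = dict(word_weights)

    def suggest(self, prefix, limit=3):
        """Return the highest-weighted vocabulary words starting with prefix."""
        matches = [w for w in self.weights if w.startswith(prefix)]
        return sorted(matches, key=lambda w: self.weights[w], reverse=True)[:limit]

    def adapt(self, word):
        """Increase the weight of a used word, or add it as a new entry."""
        self.weights[word] = self.weights.get(word, 0) + 1

model = WordListModel({"hello": 120, "hill": 40, "hell": 25})
print(model.suggest("h"))  # ['hello', 'hill', 'hell']
model.adapt("hull")        # the user adds a new word to the vocabulary
```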
- the language models may be considered to be lexicons containing sets and/or subsets of entries. These entries may be words that reflect encoded data such as n-grams, distinct languages, categorized information etc. For example, a language model could contain all US states but nothing else as it is a 'US state' language model. A language model could contain all words in the English dictionary, or a distinct language model could contain all email addresses within a company's internal directory.
- the text prediction/correction suggestions will only be from the selected language model. This illustrates why it can be advantageous to provide a robust automated selection mechanism for changing the language model used. By determining the likelihood of appropriateness for respective language models for given text entry, it is possible to identify which is the most appropriate language model to be selected for a given point in time. This is explained in more detail below.
- the memory 130 actually has different language models stored thereon (discussed in more detail below in some example scenarios).
- other language models may also be stored separately and remotely accessed by the apparatus 100, so as to take account of a wider number of language models, thereby locating the most appropriate language model for given text entry.
- the input 110 is electrically connected to the processor 120, the output 140 is electrically connected to the processor 120, and the memory 130 is electrically connected to the processor 120. This is to allow the processor 120 to be in electrical communication with each of these elements.
- the apparatus is formed on a printed circuit board (PCB, but this may also be a printed wiring board - PWB).
- the processor 120 and memory 130 are actually soldered directly to the PCB, and the respective electrical connections between the processor 120 and the input 110, output 140, and memory 130 are also formed via soldering.
- a multi-bit bus is provided to electrically connect the processor 120 to the respective components 110, 130 and 140.
- the PCB provides a push-fit integrated circuit (IC) connector (not shown) for receiving the processor 120, and soldering lugs connected to this connector allow direct connection of the other components to the processor 120 (once push-fitted into the IC connector).
- screw terminals may be used to connect the various components together.
- the processor 120 is actually formed within a module (for example, being formed within epoxy resin) and is connected to external wire conductors that allow for connection to the other components (input 110, output 140 and memory 130).
- FIG. 4 illustrates an apparatus 400 in accordance with the apparatus 100 of the first example.
- This apparatus 400 has access to three different language models, A, B and C stored on the memory (represented figuratively by the dashed lines connected to blocks A, B and C).
- Model A relates to casual language such as 'hello, hi, hey, how are you, what are you up to' etc.
- Model B relates to nouns such as 'dog, cat, me, you, box, hill', etc.
- Model C relates to acronyms or abbreviations such as 'URL, DLL, USB' etc.
- the apparatus 400 will consider each of these to determine their likelihood of appropriateness in response to user text entry.
- each of the language models is an n-gram language model that takes the entire word history up to the currently active word into account. This affects the probability of a particular word suggestion being correct.
- no manual or user-explicit switching between the different models is required - the switching between models or even using a combination of multiple models can take place automatically.
- word bigrams are two-word sequences, for example "that's fine", "I'm ok", "thank you", etc. This is helpful in situations where a user repeatedly enters certain phrases, for example, "Let's run to school. Better yet, let's drive to..." By remembering the repetition of such phrases the system can predict that the user is going to enter the word "school", because the last time "to" was entered, "school" followed "to". A minimal sketch of this bigram memory is given after this item.
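A minimal sketch of the bigram memory just described, with hypothetical function names and training text:

```python
from collections import defaultdict

# prev_word -> {next_word: count}, built from phrases the user has entered
bigram_counts = defaultdict(lambda: defaultdict(int))

def observe(text):
    """Record which word followed which in entered text."""
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        bigram_counts[prev][nxt] += 1

def predict_next(word):
    """Predict the most frequent follower of a word, if any."""
    followers = bigram_counts[word.lower()]
    return max(followers, key=followers.get) if followers else None

observe("let's run to school")
print(predict_next("to"))  # 'school'
```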
- in Figure 5a, a user has started to enter text and has entered 'hrll'.
- the apparatus 400 receives signalling via the input to the apparatus 400 that represents the text entry and, by virtue of the code on the memory 130 and the processor 120, operates on the text entry.
- the apparatus 400 calls on the three different language models and requests text prediction/correction suggestions from each in response to the user text entry.
- in effect, the apparatus is asking each of the accessible language models: what do you think 'hrll' should be?
- more than one suggestion is often returned, and they are ordered based on the language model's own ranking.
- the model can be seen to rank suggestions higher based on their respective probabilities of a given suggestion being correct within that specific model.
- Model A returns just one suggestion of 'hello'
- model B returns three suggestions of 'hill, hell, hull'
- model C returns two suggestions of 'URL, DLL'.
- Each language model associates its own probability of correctness with each suggestion according to the information contained within the model itself. For model A, it is 50% certain that 'hello' is the word the user meant to input; for model B, it is 10% sure of each suggestion; and for model C, it is 20% sure of each suggestion. Other models may be more or less sure of different entries and they need not all have the same probability of being correct.
- the likelihood of appropriateness is measured by a cumulative probability value obtained by summing 'log(x)' over each suggestion, where x is the probability for each suggestion. For model A this is log(50), giving a likelihood of appropriateness of '3.91'; for model B this is log(10)+log(10)+log(10), giving '3'; and for model C this is log(20)+log(20), giving '2.6'.
- cumulative probability is used to measure the level of appropriateness of a language model.
- a suitable alternative measure may be used instead.
- Model A can therefore be seen to be the most appropriate language model for the text entry 'hrll'. In this example, language model A would then be selected for use in text correction/prediction.
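A minimal sketch of this selection step, assuming the per-suggestion probabilities are expressed as values in (0, 1] and natural logarithms are summed (the exact scores then differ from the figure's, but model A again comes out highest); the model names and helper functions are illustrative assumptions, not the patent's code:

```python
import math

def appropriateness(suggestions):
    """Cumulative log-probability score for one model's suggestion list."""
    return sum(math.log(p) for _, p in suggestions)

def select_model(models):
    """Pick the model whose suggestions score highest overall."""
    return max(models, key=lambda name: appropriateness(models[name]))

models = {
    "A (casual)":   [("hello", 0.50)],
    "B (nouns)":    [("hill", 0.10), ("hell", 0.10), ("hull", 0.10)],
    "C (acronyms)": [("URL", 0.20), ("DLL", 0.20)],
}
print(select_model(models))  # 'A (casual)'
```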
- the apparatus is provided as part of a network server and is intended to respond to language model queries from electronic devices.
- the server apparatus operates substantially as per the example described immediately above, except that it does not actually select a language model, but merely provides for selection by providing an indication of the most appropriate language model after having performed said determination. Other examples are discussed below.
- in the example above, the language model was changed after a single word.
- language model A was selected and its suggestions used for text prediction/correction, while the other language models and their suggestions were ignored for the purposes of providing text prediction/correction suggestions to a user.
- suggestions from the other language models can still be included in suggestions provided to a user for text prediction/correction, but with a lower ranking than the suggestions from the selected language model.
- the apparatus 400 can also utilise rules that mean that language model selection is only performed at certain intervals, based on language model interval selection criterion/criteria.
- the provision of selection of a particular language model can provide for switching of models to take place automatically, if the system knows that, for instance, the current text entry field must contain an address. This can be considered to be a type of interval or context criterion that denotes that a particular type of text entry is about to be received.
- Figure 6 illustrates how the apparatus 400 can operate under different intervals and scenarios.
- each time text is entered, a language model is identified as 'most appropriate'. While this may not result in a new language model selection for text prediction/correction, each time a model is determined to be most appropriate it can be recorded as such. For example, in the first row of Figure 6, model A has been identified as most appropriate (via the process discussed in the example above) 10 times, while model B has only been identified as most appropriate 8 times, and model C only 2 times. In this example, the occurrences of being most appropriate are stored. When a certain model exceeds a predetermined threshold (for example, 10 instances of being most appropriate within a given application, or text document/message) the apparatus then performs language model selection and switches to that model.
- the language model would be switched to model A as it has reached the predetermined threshold of 10.
- the scoring of the occurrences can be reset, or at least a new count begun as a new interval has begun. A running total of all occurrences could be kept, for later use.
- the interval may be defined by some other measure, such as the end of a line, end of a paragraph, new sentence, or the like. A tally could therefore be kept for each sentence and the language model switched whenever a particular language model outranks the others as having the most occurrences within that interval.
- the apparatus is configured to select a different language model when a predetermined threshold is exceeded (in this case, 10), and to select a new language model whenever a new paragraph is started, based on the model with the highest recorded occurrence up to that point.
- in this example, the most appropriate language model is identified per line rather than for every word.
- model A ranks as most appropriate for the first three lines based on the number of occurrences. Once the user starts a new paragraph, the occurrences are compared and model A is identified as having the highest number of occurrences. Model A is therefore selected, and the occurrence score reset to 0.
- model A and model B have equal occurrence scores. As a result, the language model is not changed but the status quo is maintained given there is no reason to change the language model. Nevertheless, the scores are then reset.
- model A then racks up more occurrences than any other model and quickly reaches the predetermined threshold of 10. The apparatus then switches to model A. This occurs just before the end of the paragraph and the end of the message. At this point the scores are reset ready for the next message.
- the scores are kept, or constantly tallied up with each interval being monitored.
- applications start up with the last used model selected, or the most used model selected (either in last message or over a period of time), or with no model selected.
- model A is then identified and selected for text prediction/correction when the interval criterion of a new paragraph occurs based on the highest number of occurrences of model A being most appropriate.
- model B is selected at this point.
- model A then reaches the predetermined threshold before any other model and so model A is then selected.
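The interval-based switching walked through above could be sketched as follows; the threshold of 10, the keep-status-quo-on-ties rule, and the per-interval reset follow the Figure 6 example, while the class and method names are illustrative assumptions:

```python
class ModelSwitcher:
    """Tallies 'most appropriate' occurrences and switches language model."""

    def __init__(self, models, threshold=10):
        self.counts = {m: 0 for m in models}
        self.threshold = threshold
        self.selected = None

    def record_most_appropriate(self, model):
        """Tally an occurrence; switch immediately if the threshold is reached."""
        self.counts[model] += 1
        if self.counts[model] >= self.threshold:
            self.selected = model
            self.reset()

    def on_new_paragraph(self):
        """Interval criterion: switch to a clear leader, keep status quo on a tie."""
        leader = max(self.counts, key=self.counts.get)
        tied = [m for m, c in self.counts.items() if c == self.counts[leader]]
        if len(tied) == 1:
            self.selected = leader
        self.reset()

    def reset(self):
        """Start a new count for the next interval."""
        self.counts = {m: 0 for m in self.counts}

switcher = ModelSwitcher(["A", "B", "C"])
for _ in range(10):
    switcher.record_most_appropriate("A")
print(switcher.selected)  # 'A'
```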
- Figures 3a and 3b illustrate four different categories of language model according to the present example: (a) Online language model, shared over the network - such models are used when the data connection is fast enough. Each user can gain access to many different language models that are updated continuously based on data from thousands or millions of users. Such models are not stored locally on the user's device;
- (d) User-specific language model, stored on the device - these models do not contain n-grams of words, but rather n-grams of characters or other sub-word units, such as morphemes. If the text entry does not match any words in the default models (a) and (b) combined with (c), the system switches to the backup model (d).
- the backup model mimics typical strings of the target language, although the exact string proposed may not be available in the default word-based models (a) or (b). That is, the backup model makes an "educated guess" of what words there are in the language and how well this new word fits in the current text context based on, for example, word history, application history, context, the user entering text, or the like.
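A minimal sketch of such an "educated guess": scoring how language-like an out-of-vocabulary string is from character trigram counts. The trigram order and the training text are illustrative assumptions rather than the patent's implementation:

```python
from collections import defaultdict

char_trigrams = defaultdict(int)

def train(text):
    """Count character trigrams in text of the target language."""
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        char_trigrams[padded[i:i + 3]] += 1

def plausibility(word):
    """Higher means the character sequence better mimics the target language."""
    padded = f"  {word.lower()} "
    return sum(char_trigrams[padded[i:i + 3]] for i in range(len(padded) - 2))

train("hello hill hell hull help held")
print(plausibility("hellu") > plausibility("hxqzv"))  # True: 'hellu' looks word-like
```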
- the automatic switching to the backup model (d) in response to certain scenarios may be notified to the user using a sound signal, a change of the visual appearance (font or color) or something of the kind, thereby signalling that the suggested word is not necessarily a true word.
- with an ITU-T keypad, if the user is not happy with the new word proposed by the character n-gram model, s/he can switch to multi-tap mode as a last resort.
- each language model comprises a set of entries that govern text prediction and/or correction of user text
- Figure 9 illustrates schematically a computer/processor readable media 900 providing a program according to an example of the present invention.
- the computer/processor readable media is a disc such as a digital versatile disc (DVD) or a compact disc (CD).
- the computer readable media may be any media that has been programmed in such a way as to carry out an inventive function.
- Another advantage of the above examples is that they are not dependent on the different types of language model. Based on the probabilities provided by the different language models, the likelihood of appropriateness is determined and a language model selection can be performed accordingly.
- any mentioned apparatus/device/server and/or other features of particular mentioned apparatus/device/server may be provided by apparatus arranged such that they become configured to carry out the desired operations only when enabled, for example, switched on, or the like. In such cases, they may not necessarily have the appropriate software loaded into the active memory in the non-enabled state (for example, switched off) and only load the appropriate software in the enabled state (for example, switched on).
- the apparatus may comprise hardware circuitry and/or firmware.
- the apparatus may comprise software loaded onto memory. Such software/computer programs may be recorded on the same memory/processor/functional units and/or on one or more memories/processors/functional units.
- a particular mentioned apparatus/device/server may be preprogrammed with the appropriate software to carry out desired operations, and wherein the appropriate software can be enabled for use by a user downloading a "key", for example, to unlock/enable the software and its associated functionality.
- Advantages associated with such examples can include a reduced requirement to download data when further functionality is required for a device, and this can be useful in examples where a device is perceived to have sufficient capacity to store such pre-programmed software for functionality that may not be enabled by a user.
- any mentioned apparatus/circuitry/elements/processor may have other functions in addition to the mentioned functions, and that these functions may be performed by the same apparatus/circuitry/elements/processor.
- One or more disclosed aspects may encompass the electronic distribution of associated computer programs and computer programs (which may be source/transport encoded) recorded on an appropriate carrier (for example, memory, signal).
- any "computer” described herein can comprise a collection of one or more individual processors/processing elements that may or may not be located on the same circuit board, or the same region/position of a circuit board or even the same device. In some examples one or more of any mentioned processors may be distributed over a plurality of devices. The same or different processor/processing elements may perform one or more functions described herein.
- signal may refer to one or more signals transmitted as a series of transmitted and/or received signals.
- the series of signals may comprise one, two, three, four or even more individual signal components or distinct signals to make up said signalling. Some or all of these individual signals may be transmitted/received simultaneously, in sequence, and/or such that they temporally overlap one another.
- processors and memory may comprise a computer processor, Application Specific Integrated Circuit (ASIC), field-programmable gate array (FPGA), and/or other hardware components that have been programmed in such a way to carry out the inventive function.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- Human Computer Interaction (AREA)
- Machine Translation (AREA)
Abstract
According to one or more examples described herein, an apparatus is provided. The apparatus comprises a processor and memory including computer program code. The memory and the computer program code are configured, with the processor, to cause the apparatus to determine, in response to text entered by a user, the likelihood of appropriateness for two or more respective language models. Each language model comprises a set of entries that govern text prediction and/or correction of the user's text. The memory and the computer program code are also configured, with the processor, to cause the apparatus to provide for selection of a language model for text prediction and/or correction of the user-entered text by comparing the determined likelihood of appropriateness for said language models.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2010/056138 WO2012090027A1 (fr) | 2010-12-30 | 2010-12-30 | Modèles de langages destinés à prédire un texte saisi |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/IB2010/056138 WO2012090027A1 (fr) | 2010-12-30 | 2010-12-30 | Modèles de langages destinés à prédire un texte saisi |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2012090027A1 true WO2012090027A1 (fr) | 2012-07-05 |
Family
ID=43806893
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/IB2010/056138 Ceased WO2012090027A1 (fr) | 2010-12-30 | 2010-12-30 | Modèles de langages destinés à prédire un texte saisi |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2012090027A1 (fr) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2014159298A1 (fr) * | 2013-03-14 | 2014-10-02 | Microsoft Corporation | Dictionnaires de modèle linguistique pour prédictions de texte |
| WO2015087075A1 (fr) * | 2013-12-11 | 2015-06-18 | Touchtype Limited | Système et procédé de saisie de texte dans des dispositifs électroniques |
| WO2015142967A1 (fr) * | 2014-03-20 | 2015-09-24 | Microsoft Technology Licensing, Llc | Reformatage d'une entrée sensible au contexte |
| EP2972690B1 (fr) * | 2013-03-14 | 2017-04-19 | Microsoft Technology Licensing, LLC | Prédiction de textes basée sur de multiples modèles de languages |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6876963B1 (en) * | 1999-09-24 | 2005-04-05 | International Business Machines Corporation | Machine translation method and apparatus capable of automatically switching dictionaries |
| WO2005036413A1 (fr) * | 2003-09-19 | 2005-04-21 | America Online, Inc. | Prevision contextuelle de mots utilisateurs et d'actions utilisateurs |
| EP1569071A1 (fr) * | 2004-02-24 | 2005-08-31 | Research In Motion Limited | Procédé pour faciliter l'entrée de données avec un clavier réduit en employant de l'information contextuelle |
| WO2008120033A1 (fr) * | 2007-03-29 | 2008-10-09 | Nokia Corporation | Prioritisation des mots sur la base du contenu d'une entrée |
| US20100268725A1 (en) * | 2009-04-20 | 2010-10-21 | Microsoft Corporation | Acquisition of semantic class lexicons for query tagging |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6876963B1 (en) * | 1999-09-24 | 2005-04-05 | International Business Machines Corporation | Machine translation method and apparatus capable of automatically switching dictionaries |
| WO2005036413A1 (fr) * | 2003-09-19 | 2005-04-21 | America Online, Inc. | Prevision contextuelle de mots utilisateurs et d'actions utilisateurs |
| EP1569071A1 (fr) * | 2004-02-24 | 2005-08-31 | Research In Motion Limited | Procédé pour faciliter l'entrée de données avec un clavier réduit en employant de l'information contextuelle |
| WO2008120033A1 (fr) * | 2007-03-29 | 2008-10-09 | Nokia Corporation | Prioritisation des mots sur la base du contenu d'une entrée |
| US20100268725A1 (en) * | 2009-04-20 | 2010-10-21 | Microsoft Corporation | Acquisition of semantic class lexicons for query tagging |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2014159298A1 (fr) * | 2013-03-14 | 2014-10-02 | Microsoft Corporation | Dictionnaires de modèle linguistique pour prédictions de texte |
| CN105190489A (zh) * | 2013-03-14 | 2015-12-23 | 微软技术许可有限责任公司 | 用于文本预测的语言模型词典 |
| EP2972690B1 (fr) * | 2013-03-14 | 2017-04-19 | Microsoft Technology Licensing, LLC | Prédiction de textes basée sur de multiples modèles de languages |
| WO2015087075A1 (fr) * | 2013-12-11 | 2015-06-18 | Touchtype Limited | Système et procédé de saisie de texte dans des dispositifs électroniques |
| CN105917327A (zh) * | 2013-12-11 | 2016-08-31 | 触摸式有限公司 | 用于将文本输入到电子设备中的系统和方法 |
| US10706230B2 (en) | 2013-12-11 | 2020-07-07 | Touchtype Limited | System and method for inputting text into electronic devices |
| CN105917327B (zh) * | 2013-12-11 | 2020-11-27 | 微软技术许可有限责任公司 | 用于将文本输入到电子设备中的系统和方法 |
| WO2015142967A1 (fr) * | 2014-03-20 | 2015-09-24 | Microsoft Technology Licensing, Llc | Reformatage d'une entrée sensible au contexte |
| US9672202B2 (en) | 2014-03-20 | 2017-06-06 | Microsoft Technology Licensing, Llc | Context-aware re-formating of an input |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US9015036B2 (en) | Automatic context sensitive language correction using an internet corpus particularly for small keyboard devices | |
| US9715489B2 (en) | Displaying a prediction candidate after a typing mistake | |
| CN105229575B (zh) | 基于多个语言模型的文本预测 | |
| EP2972691B1 (fr) | Dictionaires de modèles de language pour prédiction de texte | |
| AU2014212844B2 (en) | Character and word level language models for out-of-vocabulary text input | |
| US8677236B2 (en) | Contact-specific and location-aware lexicon prediction | |
| JP5372148B2 (ja) | モバイルデバイス上で日本語テキストを処理する方法およびシステム | |
| US20080126075A1 (en) | Input prediction | |
| CN101782833B (zh) | 一种智能操作系统及方法 | |
| JP2019504413A (ja) | 絵文字を提案するためのシステムおよび方法 | |
| CN108073292B (zh) | 一种智能组词方法和装置、一种用于智能组词的装置 | |
| CN101673186A (zh) | 一种基于关键词输入的智能操作系统及方法 | |
| CN109545223B (zh) | 应用于用户终端的语音识别方法及终端设备 | |
| CN109101505B (zh) | 一种推荐方法、推荐装置和用于推荐的装置 | |
| WO2012090027A1 (fr) | Modèles de langages destinés à prédire un texte saisi | |
| AU2003236049A1 (en) | Conversation control system and conversation control method | |
| ZA200408277B (en) | Conversation control system and conversation control method | |
| WO2012072854A1 (fr) | Surveillance des erreurs | |
| CN115409031A (zh) | 分词方法以及装置、电子设备及存储介质 | |
| KR20200119029A (ko) | 가변적인 답변이 필요한 질문에 대한 답변 세트의 제공 방법 및 이를 이용하는 장치 | |
| WO2024121731A1 (fr) | Système d'affichage d'une signification visuelle contextuelle d'une interrogation et procédé associé | |
| CN112000233A (zh) | 联想候选的处理方法、装置和用于处理联想候选的装置 | |
| KR102219728B1 (ko) | 키패드를 이용한 키워드 검색 방법 및 장치 | |
| KR20210020980A (ko) | 키패드를 이용한 키워드 검색 방법 및 장치 | |
| CN115602159A (zh) | 资源名权重确定方法、语音信息处理方法以及装置 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10816451; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 10816451; Country of ref document: EP; Kind code of ref document: A1 |