WO2015112250A1 - Visual-kinesthetic language construction - Google Patents
Visual-kinesthetic language construction Download PDFInfo
- Publication number
- WO2015112250A1 (Application PCT/US2014/066693)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- language
- lexeme
- lexemes
- assembly
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B19/00—Teaching not covered by other main groups of this subclass
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B17/00—Teaching reading
- G09B17/003—Teaching reading electrically operated apparatus or devices
Definitions
- a language may be considered a system of communication encoded in auditory, visual, and/or tactile stimuli that may be within any one of the known families of natural human languages or within any system of artificially constructed, signed, or non-human languages.
- the term "language” may also include terminology specific to a professional or technical field.
- a "target language” may be considered a language being learned or used in communication by one or more users.
- the system provides a novel method and potential applications for visually and kinesthetically constructing language.
- a visual-kinesthetic language construction application may be one type of embodiment of the novel system implemented as a software application operating on an electronic device, without excluding other forms of derivative works.
- the user operates the system using one or more controllers.
- a "controller” may include any mechanism that allows the user to manipulate visual objects on an electronic device screen.
- Such mechanism may take the form of a computer mouse movement, finger movement or touch, keystroke, mechanical switch, gesture, remote control such as an infrared light signal, camera input, voice command or other vocalization, facial movement or expression (e.g., mouth movement), neurobiologic impulse such as a brainwave or eye motion, any other "kinesthetic" action including the actual or perceived sense of motion or touch, or any combination of mechanisms thereof.
- the system may enable a user to acquire or communicate the meaning of the words, phrases and sentences that such user constructs, to internalize the phonology and syntax of the target language, and/or to enable a user to express himself or herself in a target language without the necessity of vocalization, transcription, signing, or any other means of language encoding that requires knowledge of or skill in such language.
- the system further enables pseudo-verbal expression and interpersonal or mechanical interactions by a user.
- the system therefore has potential application to educational, communications, assistive and therapeutic purposes and technologies.
- Some embodiments of the novel system avoid forcing the user to produce (e.g., vocalize, write, or type) the target language. Instead, users may use the system to produce meaning without the perceived risk of failure associated with vocalization or transcription.
- the novel system offers a hybrid between reception (comprehension) and production (expressing vocally or by text), where the user assembles phrases visually and "kinesthetically" (using touch or motion). This enables language learners to transition from reception to production in a low-stress, self-paced way with built-in instructional supports and answer tracking.
- a user may be unable or hesitant to attempt a natural rendition of a target language, so the novel system can facilitate two-way communication between the user and the system or between or among users.
- Some embodiments of the novel system avoid or mitigate environmental, technical and usability obstacles related to speaking into a microphone and/or typing. Instead, users may primarily interact with the system by manipulating visual icons.
- the potential to use neurobiologic impulses or facial movements to manipulate the visual icons also opens up a new set of possibilities for persons with disabilities or trauma to learn or recover language (in either educational, assistive or therapeutic senses) and/or to use the system to communicate with others.
- Some embodiments of the system enable emergent and challenged readers to visually and kinesthetically construct phrases in a target language without requiring them to read any text. Knowledge of a language need not be a requirement for learning a new language. Some embodiments of the novel system do not require reading text. Furthermore, it may abstract words as visual icons so that the variations in word forms do not muddle comprehension or add cognitive load. Graphical presentation reduces cognitive load because it directly translates concept-to-word, rather than the conventional text-to-concept-to-word, and thus requires only a single cognitive processing channel, rather than two channels.
- Some embodiments of the novel system pre-process words into lexemes, each being a unit of un-parsable or unambiguous meaning.
- the system then expresses the meaning of each lexeme in visual and kinesthetic ways through an unambiguous graphical element, which may help to minimize difficulties in learning that are due to the fact that languages have words with dual or ambiguous meanings.
- Some embodiments of the novel system use graphical elements, including animation of the word concept, which can trigger memory of the word sound and convey semantic value (meaning), word class (part of speech) and other key information more effectively than text.
- Some embodiments of the novel system sequentially present lexemes that are typically used together in a target language in a way that illustrates both how the lexemes are distinct and how they can be combined. One way the system achieves this is to provide multiple pairing options (e.g., eat an apple + eat an orange + eat noodles).
- That variance builds "packages” that the user's brain uses to differentiate word boundaries until the user becomes cognizant that "eat,” for example, is the process of consuming something and “apple,” “orange,” and “noodles” exemplify items that can be consumed.
- the discrete visual nature of graphical lexemes helps language learners recognize word boundaries, interpret meaning, and observe the formation of phrase structures using basic syntax, all of which aid statistical learning (a phenomenon wherein the human mind interprets the patterns of speech sounds and attaches meaning based on situational contexts and what the brain unconsciously perceives as the statistical relevance).
- Some embodiments of the novel system provide modular building blocks that can be combined to form a large number of phrases, which may be an easier approach to learning than a strict didactic approach. For example, given a possible set of 10 such building blocks and the ability to assemble them into a string of 1, 2 or 3 lexemes, one could make up to 1,110 possible combinations.
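- As a worked check (assuming a building block may repeat within an assembly), the count of ordered assemblies of length 1, 2 or 3 drawn from 10 building blocks is:

$$\sum_{k=1}^{3} 10^{k} = 10 + 100 + 1000 = 1110$$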
- This process empowers users to experiment with various forms, orders and structures and therefore consciously or sub-consciously begin to internalize the patterns that represent the target language's grammatical rules.
- the system can then provide corrective feedback that communicates whether a constructed phrase structure is well formed or ill-formed. Users build their knowledge of syntax over time through experimentation and corrective feedback. At the same time, the system may auto-complete or auto-correct grammar for users with cognitive or learning impairments to facilitate augmentative and alternative communications.
- Some embodiments of the novel system use vocabulary, syntax and phonological factors to scaffold with a fine-grained level of detail. Scaffolds may be specialized instructional supports that facilitate acquisition of just enough knowledge and skill to reach the next step in a learning sequence. This enables language learners to always be at the right level.
- the novel solution is possible for two reasons: One, the system can track which lexemes, word classes and phrase forms a user uses to construct phrases or sentences and further determines whether they are well formed or ill-formed syntactically and whether they are semantically intelligible. Two, the system can provide a fine-tuned sequence for how the target language progresses from simple to complex sounds, words and forms. Once this sequence is programmed, the system can present level-appropriate lexemes individualized for a user to use to construct phrases and sentences.
- Prosody, or attributes of speech such as emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume that are not encoded in words or grammar, is an important element of meaning in language.
- Some embodiments of the novel system may vocalize prosodic attributes such as emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume as part of communication between user and machine or between two or more users, to better assist a learner who might not yet understand a language's rules for expression (e.g., meaning of an exclamation mark).
- the system may enable a user a balanced degree of freedom to experiment and construct phrases or sentence in a target language using a visual-kinesthetic application without being forced to speak, read or type. It abstracts the variation in word forms, reduces cognitive load, and clarifies meaning by using the smallest possible discrete element of language, the "lexeme.” And, it facilitates statistical learning and scaffolding that can gradually transition a user from target language reception to production at increasing levels of expressive sophistication and accuracy.
- Some embodiments of the novel system presented herein enable users to construct and express language by visual, kinesthetic or other means without requiring (but not inherently excluding) vocalization, reading or transcription and without prior language knowledge or skill.
- This form of pseudo-verbal and pseudo-physical expression enables users to interact with other persons and/or software programs.
- users construct a word, phrase or sentence in a target language in an interactive engagement with the novel system by using a controller to visually and kinesthetically combine one or more "Lex-Icons" on a graphical user interface.
- Each Lex-Icon represents one lexeme, which is the minimal unit of meaning in any language.
- Lex-Icons may be assembled into a phrase or sentence of varying length called an "assembly."
- the result is a string of 1 to N lexemes, each lexeme being represented visually by a Lex-Icon, sequenced in a particular order and direction by visual and/or kinesthetic means that, together, compose a word, phrase or sentence in a target language.
- a user might select one Lex-Icon that represents a dog, a second that represents the act of running, and a third that represents performing an action quickly.
- These three Lex-Icons could be arranged in the assembly DOG, RUN, QUICKLY.
- each constituent Lex-Icon will be converted into natural language with a process that ultimately results in the sentence form "The dog runs quickly.”
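- A minimal sketch of this idea (not the patented conversion process itself): mapping an unordered lexeme assembly to an English sentence using a hypothetical word-form table and a naive Noun-Verb-Adverb ordering rule. All lexeme identifiers and word forms below are illustrative assumptions.

```python
# Minimal sketch: converting a lexeme assembly into an English sentence.
# Lexeme IDs, word forms, and the ordering rule are illustrative only.

ASSEMBLY = ["DOG_animal", "RUN_move", "QUICKLY_speed"]  # user-built assembly

# Hypothetical per-lexeme data: word class and a 3rd-person-singular word form.
LEXEMES = {
    "DOG_animal":    {"class": "Noun",   "form": "dog"},
    "RUN_move":      {"class": "Verb",   "form": "runs"},
    "QUICKLY_speed": {"class": "Adverb", "form": "quickly"},
}

def render_sentence(assembly):
    """Order lexemes as Noun-Verb-Adverb, add an article, and punctuate."""
    order = {"Noun": 0, "Verb": 1, "Adverb": 2}
    ordered = sorted(assembly, key=lambda lex: order[LEXEMES[lex]["class"]])
    words = [LEXEMES[lex]["form"] for lex in ordered]
    words[0] = "the " + words[0]          # naive article insertion for the subject
    sentence = " ".join(words) + "."
    return sentence[0].upper() + sentence[1:]

print(render_sentence(ASSEMBLY))  # -> "The dog runs quickly."
```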
- the system may then execute one or more programmed actions based on the composition of the assembly. Depending upon the particular application derived from the system, it may also produce auditory output, such as a spoken voice that narrates the phrase or sentence aloud in the target language. Or, it may cause animated or static displays to demonstrate or illustrate the intended meaning of the assembly. For example, visually constructing the sentence "The dog runs quickly” could cause a dog to run rapidly across the screen.
- Design time may refer to the preparatory work needed to create and package a language- specific application prior to use.
- Runtime may refer to the time during which the system operates, i.e., a user loads an application wherein interactions between the user and the application may occur.
- each target language may be analyzed and pre-processed in order to perform the following:
- enumerate phrase form patterns, any intermediate word classes allowed within the phrase forms, and any exceptions
- generate an optimized output file set, which may include a local (client-side) database that is constrained and/or flattened in order to reduce complexity and processing at runtime, resulting in more rapid runtime rendering.
- runtime workflow may be summarized below in simplified form:
- the system loads user profile data and preferences, to the extent available.
- the system presents a set of lexical items based on data previously collected about the user and based on contextual factors that could include spatial-temporal entities.
- the user arranges Lex-Icons to create an assembly, using controller mechanisms provided by the specific hardware device.
- the system parses the assembly and runs through a validation process.
- the system converts it into natural language by determining word form, tone, and other linguistic elements. It transcribes the full phrase or sentence.
- the transcription may be narrated as audio or provided in another form of output.
- the system may generate a programmed action such as an animation or other visualization of the assembly.
- the system may also generate a response assembly in the form of dialogue with other users or in the form of a virtual dialogue with an application.
- Data are logged about the event, some of which may be displayed to the user.
- the assembly is cleared and the user may create a new assembly.
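- The simplified workflow above can be summarized as a control-flow skeleton. The sketch below assumes hypothetical device and LexDB interfaces (load_profile, validate, narrate, and so on); it is a structural outline, not the system's actual API.

```python
# Minimal sketch of the simplified runtime loop described above. All method
# names are hypothetical placeholders standing in for device and LexDB services.

def runtime_loop(device, lexdb):
    profile = device.load_profile()                      # user profile and preferences
    while device.session_active():
        lexicons = lexdb.select_lexemes(profile,         # level-appropriate Lex-Icons
                                        context=device.spatial_temporal_context())
        device.present_lexicons(lexicons)
        assembly = device.collect_assembly()             # user arranges Lex-Icons
        result = lexdb.validate(assembly)                # syntactic + semantic check
        if result.valid:
            sentence = lexdb.to_natural_language(assembly)
            device.narrate(sentence)                     # audio or other output
            device.animate(assembly)                     # optional visualization
        else:
            device.show_corrective_feedback(result)
        profile.log_event(assembly, result)              # data logged about the event
        device.clear_assembly()                          # ready for a new assembly
```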
- This process may allow users to be able to naturally acquire spoken language in a "non-instructional" sequence, i.e., without any explicit discussion of word meanings or grammatical rules. Instead, the human mind interprets the patterns of sounds in a target language and attaches meaning based on situational contexts and what the brain unconsciously perceives as the statistical relevance.
- the process may also facilitate system-mediated communication among users using one or more languages or users requiring assistive technology. The system may be applied to any target language, regardless of writing system or phonology.
- FIG. 1 is a flow chart illustrating a process for how linguistic elements of a target language may be pre-processed and stored at design time.
- FIG. 2 is an illustrative graph diagram depicting the interrelationships among lexeme pairs and the direction of each relationship.
- FIG. 3 is a block diagram that functionally illustrates an embodiment of the runtime system.
- FIG. 4 is a flow chart illustrating a process for how a user may input a lexeme assembly at runtime.
- FIG. 5 provides a conceptual illustration of how the system may determine which Lex-Icons to present to a particular user.
- FIG. 6 provides a logical diagram for an organizational schema of the area used to visualize lexeme assemblies-in-progress.
- FIGS. 7 A, 7B and 7C are a set of illustrative diagrams depicting how a user may rearrange a lexeme assembly at starting, intermediate and ending states, respectively.
- FIG. 8 is a logical diagram illustrating the syntactic hierarchy of a sample sentence, originally provided to the system by the user as an unordered string of lexemes.
- FIGS. 9A and 9B are a set of illustrative diagrams depicting how a user may associate one or more prosodic attributes with a lexeme or a lexeme assembly.
- FIG. 10 is a flow chart illustrating a process for how a lexeme assembly that is submitted by a user may be validated.
- FIG. 11 is a flow chart illustrating a process for the output of results and/or programmatic actions in response to a valid lexeme assembly submitted by a user.
- FIG. 12 is a flow chart illustrating a process for provision of corrective feedback in response to an invalid lexeme assembly submitted by a user.
- FIG. 13 is a flow chart illustrating a lexical rank-ordering algorithm.
- FIG. 14 is a flow chart illustrating a process for how a user may customize preferences.
- FIG. 15 is a block diagram showing the basic components of a general computing hardware device usable in conjunction with one or more embodiments of the system.
- FIG. 16 is a block diagram showing how the system may determine spatial-temporal context.
- FIGS. 17A, 17B, 17C, 17D, 17E and 17F are a set of illustrations depicting an embodiment of the system as a language learning application.
- FIGS. 18A, 18B, 18C, 18D, and 18E are a set of illustrations depicting an embodiment of the system as an augmentative and alternative communication software application.
- FIG. 19 describes a process for visualization of a word concept such as by means of a video or animation clip that demonstrates or models enacting the word concept.
- the system may be embodied by one or more software applications that operate on an electronic device (hereafter a "device") including but not limited to a computer, tablet, phone, game console or any other desired type of computing device.
- the applications may be embodied as a series of instructions stored in one or more computer-readable media, such as solid-state disks or drives such as NAND-based flash memory, random access memory (RAM) such as Dynamic RAM (DRAM) and Synchronous DRAM (SDRAM), read-only memory (ROM), hybrid drives, optical disks, or a plurality of other types of computing device memory storage, whether or not yet invented, and which may be executed by one or more processors in the computing device to achieve the results described.
- FIG. 1 is a flow diagram of a method 100 for pre-processing a target language to create a system that operates using a series of methods 200 through 700. The steps shown in FIG. 1 may be performed once for each desired target language.
- a "core vocabulary” which may be a set of most frequently spoken words and phrases in a target language, is enumerated by the publisher of a given application and stored as a flat file. Note that there may be a significant variance in frequency of usage between written and spoken language, especially in ideogram-based languages, so it is important to make the distinction that the novel system use the spoken language as the sole basis for determining core vocabulary.
- a Common Development Inventory ("CDI") wordlist may serve as a starting point for the core vocabulary.
- the application publisher may narrow down or expand the CDI wordlist based on the particular learning objectives, level and intended audience for a given application.
- CDIs may also contain indicators of word use frequency such as the percentage of sampled children at a specific age who have each word in their receptive or expressive vocabulary. (For example: At 30 months of age, 85% of sampled native Mandarin speakers recognized the word PAO3, translated to English as "run" in the sense of "moving forward rapidly on two or four legs.")
- polysemic words in the core vocabulary may be separated into lexemes, still in a flat file format.
- a "lexeme” may refer to a word or phrase in the target language that has the following properties: It holds a single semantic value (meaning); it may assume multiple word or tone forms; it is, in its entirety, a minimal unit of meaning and therefore un-parsable; it occupies one word class; and it is bound by rules of grammar in a given target language.
- the English noun CRANE_N would be separated into the lexemes CRANE_bird and CRANE_construction.
- the lexeme is represented by a capitalized root word, an underscore, and a second word elaborating on the meaning. Other representations may be used as well.
- each lexeme is stored in the system's lexeme relational or graph database, as further described herein and referred to hereafter as the "LexDB.”
- Each lexeme is a unique database entry and is programmatically assigned a unique ID.
- the application publisher enters relevant information about each lexeme such as the root word in each target language, the lexeme's semantic value, its word class (e.g., verb) in each language and, if applicable, a word sub-class (e.g., stative verb) or a meta-class (e.g., subject or object). It could also include one or more lexical categories for the lexeme that either apply globally to all languages or only to one or more specific languages.
- TABLE 1 illustrates conceptually one way of storing such data in the LexDB for four English words broken into six sample lexemes. In reality, the list could be considerably longer; RUN has at least 33 meanings. Only a subset of the relevant data is shown in the sample table.
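- A minimal sketch of how a LexDB entry might be modeled in code, loosely following TABLE 1. The field names, IDs and sample values are illustrative assumptions, not the patent's schema.

```python
# Minimal sketch of a LexDB lexeme entry, loosely following TABLE 1.
# Field names and sample values are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Lexeme:
    lexeme_id: int                    # programmatically assigned unique ID
    key: str                          # e.g., "CRANE_bird"
    root_word: str                    # root word in the target language
    semantic_value: str               # single meaning held by the lexeme
    word_class: str                   # e.g., "Noun", "Verb"
    word_subclass: str = ""           # e.g., "Dynamic Verb", if applicable
    categories: list = field(default_factory=list)   # lexical categories

LEXDB = {
    101: Lexeme(101, "CRANE_bird", "crane", "a long-necked wading bird", "Noun",
                categories=["animals"]),
    102: Lexeme(102, "CRANE_construction", "crane", "a machine for lifting loads",
                "Noun", categories=["machines"]),
    103: Lexeme(103, "RUN_move", "run", "to move forward rapidly on legs", "Verb",
                "Dynamic Verb", ["motion"]),
}
```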
- the LexDB may store a value for each lexeme entry in a given target language that enables the system to rank-order the lexemes and thus determine a "lexical sequence," which may be a sequence in which said lexemes may be presented to a user such as through leveling.
- a lexical sequence value could be derived for a given lexeme by the application publisher as follows: First, find in a CDI (inventory) a frequency value of 1-100 for the lexeme at a given age, representing the percentage of a sampled population that understands the lexeme.
- For example, because PAO3 (RUN) in Mandarin is understood by 85% of 30-month-olds, a frequency value of 85 could be stored in the LexDB, which could store multiple frequency values for a particular lexeme, depending on contexts such as age or other audience factors. Then the application publisher could apply weighting factors, such as based on the lexeme's word class, tone class, or length in syllables. For example, in English, nouns could be weighted more heavily than verbs because nouns are generally learned first in English, whereas Mandarin would weight both word classes equally. Lexemes with fewer syllables tend to be acquired first, so lengthy lexemes could be underweighted.
- Tone classes may be weighted based on the sequence in which they are learned in the target language. For example, in Mandarin, lexemes with tone values of "1" and "4" might be assigned a higher weight, whereas tones "2" and "3," which are typically acquired later, might receive a lower weight. Multiplying the raw frequency value (1-100) by one or more such weighting factors would result in a lexical sequence value, which would be stored in the LexDB for a given lexeme as applied to a given target language.
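- A minimal sketch of the derivation described above: a raw CDI frequency multiplied by word-class, tone-class and syllable-length weighting factors. The weight values below are illustrative assumptions, not values prescribed by the system.

```python
# Minimal sketch of computing a lexical sequence value (frequency x weights).
# All weighting factors are illustrative assumptions.

CLASS_WEIGHTS_EN = {"Noun": 1.2, "Verb": 1.0}           # e.g., English favors nouns
TONE_WEIGHTS_ZH = {1: 1.1, 2: 0.9, 3: 0.9, 4: 1.1, 5: 1.0}

def lexical_sequence_value(frequency, word_class, tone=None, syllables=1,
                           class_weights=CLASS_WEIGHTS_EN, tone_weights=None):
    """Multiply the raw CDI frequency (1-100) by the applicable weighting factors."""
    value = frequency * class_weights.get(word_class, 1.0)
    if tone is not None and tone_weights:
        value *= tone_weights.get(tone, 1.0)
    if syllables > 2:                  # lengthy lexemes could be underweighted
        value *= 0.9
    return round(value, 1)

# PAO3 ("run"): understood by 85% of sampled 30-month-olds, verb, tone 3, one syllable
print(lexical_sequence_value(85, "Verb", tone=3, syllables=1,
                             class_weights={"Noun": 1.0, "Verb": 1.0},
                             tone_weights=TONE_WEIGHTS_ZH))  # -> 76.5
```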
- a LexDB table may store one or more prerequisite lexemes that are to be presented to a user in the runtime application prior to presenting another specified lexeme.
- a specified lexeme's prerequisite lexeme may be a lexeme that, according to the rules of the target language, should be learned before the specified lexeme. Prerequisites are likely to be applicable to educational applications but not to communications applications.
- In step 106, one or more lexeme presentation sequences are created, each sequence consisting of lexemes in the LexDB, wherein such sequences are to be presented to a user in various forms as part of the user input method 200.
- a presentation sequence may also include other parameters including but not limited to a number of lexemes to be presented at any given time, a possibility set of word classes that may be presented at any given time, a permitted assembly length (the number of lexemes allowed in a phrase or sentence), and a permitted grammar set, as defined in step 107.
- TABLE 2 illustrates conceptually one possible way of expressing a lexeme presentation sequence, wherein "N" refers to a noun, "Adj" to an adjective and "V" to a verb.
- the "Maximum Number of Lexemes Displayed" parameter provides the maximum number of lexemes that will be concurrently available at a given level or point in the presentation sequence.
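- A minimal sketch of how a presentation sequence entry might be represented, loosely following TABLE 2. Level numbers, word-class sets and limits are illustrative assumptions.

```python
# Minimal sketch of a lexeme presentation sequence, loosely following TABLE 2.
# Levels, word-class sets, and limits are illustrative assumptions.

PRESENTATION_SEQUENCE = [
    {"level": 1, "max_lexemes_displayed": 4, "word_classes": {"N", "V"},
     "max_assembly_length": 2, "grammar_set": ["N V"]},
    {"level": 2, "max_lexemes_displayed": 6, "word_classes": {"N", "V", "Adj"},
     "max_assembly_length": 3, "grammar_set": ["N V", "Adj N V"]},
]

def level_parameters(user_level):
    """Return the presentation parameters for the user's current level."""
    for entry in PRESENTATION_SEQUENCE:
        if entry["level"] == user_level:
            return entry
    return PRESENTATION_SEQUENCE[-1]      # fall back to the highest defined level
```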
- a grammar set may be defined as a series of phrase forms, with each phrase form stored in the LexDB database as a sequence of word classes, such as Noun, Verb, and Adverb, and/or sub-classes such as Dynamic or Stative Verbs. Note that each language may apply only a subset of all the word classes enumerated in the LexDB, since word classes vary by language. TABLE 3 illustrates conceptually how the grammar set could be stored in a relational or graph database. The limitation of three word class columns and two language columns is for brevity of illustration.
- the grammar set may be further defined as a series of rules establishing the permissibility of an intermediate word class between any two given word classes as shown in TABLE 4. The purpose is to reduce the number of permutations in TABLE 3.
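- A minimal sketch of checking an assembly's word-class sequence against a grammar set of phrase forms, with a permitted intermediate class in the spirit of TABLE 4. The phrase forms and rules shown are illustrative assumptions.

```python
# Minimal sketch of validating a word-class sequence against a grammar set of
# phrase forms (TABLE 3-style) plus permitted intermediate classes (TABLE 4-style).
# The phrase forms and the intermediate-class rule below are illustrative.

PHRASE_FORMS = [("Noun", "Verb"), ("Noun", "Verb", "Adverb")]

# Intermediate classes permitted between two given classes.
INTERMEDIATE_OK = {("Noun", "Verb"): {"Adverb"}}   # e.g., "the dog quickly runs"

def matches_phrase_form(classes, form):
    """True if `classes` realizes `form`, allowing permitted intermediate classes."""
    it = iter(classes)
    prev = None
    for expected in form:
        for cls in it:
            if cls == expected:
                prev = expected
                break
            if prev and cls in INTERMEDIATE_OK.get((prev, expected), set()):
                continue               # allowed intermediate class, keep scanning
            return False
        else:
            return False               # ran out of classes before matching the form
    return next(it, None) is None      # no trailing, unmatched classes

def is_well_formed(classes):
    return any(matches_phrase_form(classes, form) for form in PHRASE_FORMS)

print(is_well_formed(["Noun", "Verb", "Adverb"]))   # True
print(is_well_formed(["Noun", "Adverb", "Verb"]))   # True (permitted intermediate)
print(is_well_formed(["Verb", "Noun"]))             # False
```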
- a "Lex-Icon” is created for each lexeme in the Core Vocabulary and stored in or linked from the LexDB as a set of one or more files associated with each lexeme's unique ID.
- a "Lex-Icon” is a unit of meaning that represents one lexeme generally in a graphical form such as an image, sign (of a signed language gesture) and/or animation, but may alternatively be conveyed in text form, audio form, as neurobiologic signals, or as any other form capable of conveying the intended meaning.
- the means of representation may vary based on word class or other factors.
- Lex-Icons may identify the word class of the lexeme by having an additional differentiated visual element, such as variation of the graphical background and/or border color, style, texture or other visual effect.
- the Lex-Icon is the minimal unit of language used in the runtime visual-kinesthetic language construction process that begins with step 201. During this process, multiple Lex-Icons may be assembled into a phrase or sentence of varying length called an "assembly." The result is a string of 1 to N lexemes, each represented by a Lex-Icon and sequenced in a particular order and direction that, together, compose a word, phrase or sentence in the target language.
- In step 110, any exceptions to the target language grammar set that are specific to a given lexeme are defined.
- Each phrase form exception is stored in the LexDB as a sequence of word classes, such as {Noun, Verb, Adverb}.
- TABLE 5 illustrates conceptually how phrase form exceptions could be stored in a relational or graph database, in the case of a fictional lexeme that is a noun. Null fields default to the global conditions set in steps 107 and 108 and any exceptions override the global condition. There may be 1 to N word classes; three classes are shown in TABLE 5 purely for illustrative purposes.
- a relationship is defined, whether manually or by means of a predictive subroutine, between each lexeme in the LexDB that may logically be paired. All subject nouns will be paired with all verbs; all adjectives will be paired with all nouns; all adverbs will be paired with all verbs; and so forth for any other parts of speech that may govern or be dependent upon (e.g., modify) one another.
- Each pairing relationship has two attributes: intelligibility and dependency. Intelligibility has two possible values: true or false, depending on whether the particular pair of lexemes makes sense in the target language.
- Pairing values are the language-specific frequencies with which lexemes are juxtaposed in assemblies submitted by users. The values for each attribute are stored in the LexDB and associated with a lexeme's unique ID.
- Dependency also has two possible values: governor and dependent.
- a governor is a dominant lexeme that requires or allows another lexeme, the dependent, to exist in a sentence or phrase.
- One method of determining the direction of a dependency relationship for a lexeme pair is to use a hierarchical approach called "Dependency Grammar," a widely accepted linguistic method of hierarchically analyzing language that places the verb at the top of the hierarchy. Other methods may also be used.
- Under dependency grammar, all lexemes associated with a verb are dependent upon that verb.
- Other rules include the following: A lexeme is syntactically dependent upon another lexeme if it requires that other lexeme in order to express meaning. In English, for example, an adverb is dependent on a verb.
- Conversely, a verb governs an adverb. A lexeme that modifies another lexeme, such as an adjective modifying a noun, is likewise dependent upon the lexeme it modifies.
- FIG. 2 illustrates a directed graph of the interrelationships among lexeme pairs, represented by vertices, and the governing or dependency direction of each relationship, represented by edges (arrows).
- the directions originate from dependent lexeme to governing lexeme, in this case using the dependency grammar method.
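- A minimal sketch of the FIG. 2 idea: a directed graph whose edges point from dependent lexeme to governing lexeme, with an intelligibility flag on each pairing. The sample pairs are illustrative assumptions.

```python
# Minimal sketch of a lexeme dependency graph (FIG. 2 style): each edge runs
# from dependent to governor and carries an intelligibility flag. Sample data only.

# edge: (dependent, governor) -> intelligible (True/False)
PAIRINGS = {
    ("DOG_animal", "RUN_move"): True,        # a dog can run -> intelligible
    ("QUICKLY_speed", "RUN_move"): True,     # an adverb depends on a verb
    ("NOODLES_food", "RUN_move"): False,     # "noodles run" -> unintelligible
}

def governors_of(lexeme):
    """Return the lexemes that govern the given dependent lexeme."""
    return [gov for (dep, gov) in PAIRINGS if dep == lexeme]

def is_intelligible_pair(dependent, governor):
    return PAIRINGS.get((dependent, governor), False)

print(governors_of("DOG_animal"))                       # ['RUN_move']
print(is_intelligible_pair("NOODLES_food", "RUN_move")) # False
```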
- a list of predefined allowable assemblies may be created and stored in the LexDB. Such lists enumerate all allowable assemblies for a given application of the system as sequences of lexemes in a delimited format such as %lexemeA%, %lexemeB%, %lexemeC%, and so forth. In other words, if a user were to express an assembly of lexemes in a sequence identical to one of the predefined sequences in the LexDB, the assembly would be decided as valid.
- TABLE 8 illustrates conceptually a partial predefined list for one application.
- an inflectional paradigm of each lexeme is enumerated and stored in the LexDB.
- An "inflectional paradigm” is a set of rules that determine when to apply specific "word forms,” which are variations of a word, in order to express different grammatical categories such as tense, person, number, or mood, to a given lexeme. This concept is applicable only to "inflective languages,” which are target languages that have variable word forms, such as declensions and conjugations, for a given lexeme. Each word form will have a unique ID, with a common portion (e.g., the "065" in the example below) indicating the relationship.
- tone is enumerated for each lexeme as a numerical value 1 to N.
- Tone values may represent pitches such as high or low or changing in pitch such as rising or falling, for example.
- the system may generate and store rules for tonal transformations, which are instances where in natural speech words are vocalized differently based on context, and may be stored as logical arguments for each language.
- lexeme A and lexeme B may, in isolation, each have a certain tone when spoken individually, but when they are used together in a sequence, the tone of one or both lexemes A and/or B may change.
- Tone sandhi occurs when a Mandarin lexeme with tone 3 (dipping) is positioned before another lexeme.
- In Mandarin, tones have a value of 1 (high level tone), 2 (rising), 3 (dipping), 4 (falling) or 5 (neutral), and tone 3 may have the modifier "F" for a falling half-tone, so that "3F" refers to a "half-third" tone which falls but does not rise in tonal inflection.
- TABLE 10 illustrates a series of transformations. For each transformation listed, there may be a starting state sequence of tones, which may indicate the sequence of tones found in a sequence of lexemes that may be input by the user.
- the transformation with the identifier value of 1 has a starting state sequence of ⁇ 3,3 ⁇ , which indicates that the sequence has a first lexeme with tone 3 followed by another lexeme with tone 3.
- the resulting state sequence of tones shows the changed sequence of tones due to a tonal transformation.
- the sequence of lexemes that originally were both tone 3 lexemes would be transformed in tone, such that the first lexeme now carries tone 2 and the second lexeme carries tone 3F.
- the tones may also transform based on the number of syllables found in a particular word, so the tonal transformation rules may account for this as well.
- Table 10, ID number 2 shows an example in which a lexeme having two syllables, each of tone 3 (represented as "3_2", where the tone number is followed by an underscore and the number of syllables), is followed by a tone 3 lexeme having just one syllable (annotating the single syllable is optional, since tones lacking the underscore suffix may be assumed to be one-syllable tones).
- the tones in this sequence of lexemes would be transformed to a tone 2, followed by another tone 2, and then a tone 3.
- In step 115, the entry for each lexeme in the LexDB may enumerate any exceptions to the tonal transformation rules using a method such as established in TABLE 5. Any exceptions will override the language tonal transformation rules.
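- A minimal sketch of applying tonal transformation rules like those in TABLE 10, using the two example transformations described above. The rule table and notation handling are illustrative assumptions.

```python
# Minimal sketch of applying tonal transformation (sandhi) rules, TABLE 10 style.
# "3F" is the falling half-third tone; "3_2" denotes a two-syllable lexeme whose
# syllables both carry tone 3. Rules mirror the two examples given in the text.

TONE_RULES = [
    (["3", "3"], ["2", "3F"]),        # TABLE 10, ID 1
    (["3_2", "3"], ["2", "2", "3"]),  # TABLE 10, ID 2
]

def apply_tone_rules(tones):
    """Rewrite a lexeme tone sequence using the first matching rule, if any."""
    for start, result in TONE_RULES:
        if tones == start:
            return result
    return tones                      # no transformation applies

print(apply_tone_rules(["3", "3"]))      # -> ['2', '3F']
print(apply_tone_rules(["3_2", "3"]))    # -> ['2', '2', '3']
print(apply_tone_rules(["1", "4"]))      # -> ['1', '4'] (unchanged)
```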
- In step 116, for click languages, the click consonants (obstruents) are enumerated for each lexeme as a numerical value 1 to N.
- clicks have a value 1, 2, or 3.
- Click values may correspond to different types of click releases such as dental, palatal, alveolar, lateral and bilabial, for example.
- In step 117, where applicable, the required order of word sub-classes is defined for each target language and stored in the database. For example, in English, if multiple adjectives are used contiguously to modify a noun, they must occur in the following sequence of sub-classes in order to be valid natural language: Determiners, Observation, Size, Shape, Age, Color, Origin, Material Composition, Qualifier.
- Step 118 enumerates any articles, classifiers (e.g., measure words), copulae (e.g., linking verbs), ergatives, evidential markers, reflexive markers, and various possessive, tense and other grammatical particles that need to be encapsulated with a given lexeme in various contexts and stores these in the lexeme's LexDB entry to generate an "extended word form.”
- TABLE 12 provides a second example in Mandarin Chinese for the lexeme WO5, which means I/ME in English. (The number superscripts refer to tones.)
- extended word forms could be determined by virtue of grammatical categories, such as tense, person, number, or mood, that apply to a given lexeme.
- the "running" form of the lexeme RUN V may be enumerated as follows:
- TABLE 13 shows how TABLE 9 could be extended to capture this information for the word form "running" of the lexeme RUN V .
- the field "preceding form” contains a word form that may precede the lexeme. Likewise, the succeeding form would contain a word that follows the lexeme. If the lexeme RUN V had the form "running" (Word Form ID 065 3 in this example) and was identified as being in a first-person- singular context, it would meet the condition for Extended Form ID 065 3 1 and thus return the preceding form "am.” In this way, the result would be the extended form "am running.”
- In step 119, all possible assemblies (sequential combinations of lexemes) that may occur at runtime are pre-computed using the various rules described above.
- the pre-computing process executes the runtime validation process outlined in method 300 (see FIG. 5) and generates a value of valid or invalid on both syntactic and semantic dimensions. (Providing a value on both dimensions serves as a sort of annotation as to the specific reason for failure, if invalid.)
- As a LexDB increases in size, it may become less practical to compute the full set of possible assemblies, in which case a limit may be placed on assembly length. For example, one could specify that only word pairs would be pre-computed or that each lexeme would be pre-computed to one relationship distant, i.e., its immediate governor and immediate dependent(s).
- a set of audio files, text files or other file types consisting of natural words or phrases for each variant of a lexeme's 1 to N word forms, extended word forms, and/or pre-computed assemblies in each target language is associated with the lexeme entry in the LexDB.
- the audio files may be recordings of a person speaking the words and text files may display or encode the spelling or appearance of the words.
- Other file types may include images, videos and animations visually representing a word concept.
- This output file set is further associated with each relevant word form, extended word form or pre-computed assembly in the database.
- By "natural” it is intended that audio files shall approximate natural speech by using natural speech segmentation and vocal inflection.
- the LexDB should include (a) a collection of metadata that describes each lexeme and the rules by which it operates, as a kind of "dictionary" consisting of TABLES 1-13, (b) word classifications that may include one or more lexical categories that apply globally or to a specific target language (such as words relating to family, transportation, greetings, measurement, emotions, computers, or any variety of other lexical categories), (c) a grammar set for the target language, (d) an inflectional paradigm, tone mapping and transformation system, and/or click consonant mapping system, as applicable, (e) a graph defining relationships among lexemes per FIG. 2 (or else a predefined list of valid assemblies), and (f) audio, text or other files (or links to such files) for each word/tone form and extended form.
- Sample output-file associations stored in the LexDB (table excerpt): Copula <null>, %audio_file_link%; Copula "am" {1st-person-singular}, %audio_file_link%; Copula "is" {3rd-person-singular}, %audio_file_link%; Copula "are" {1st-, 2nd-, 3rd-person-plural; 2nd-person-singular}, %audio_file_link%; PAO3 root, %audio_file_link%; PAO3F transformed, %audio_file_link%; Animation key frame, %animation_file_link%.
- In step 121, the application publisher or system administrator defines default settings for a given application and, optionally, for a given individual user or group of users.
- settings may include but are not limited to the following application and user profile settings:
- Custom Lex-Icons: a set of 1 to N custom Lex-Icons, each of which may represent a specific person, place or thing, real or fictional, including proper nouns.
- An application may, by default, incorporate custom Lex-Icons. For example, if Johnny Appleseed were a character in a game that used the system, the application publisher could create a custom "Johnny" Lex-Icon at design time.
- Custom Visualizations: a set of 1 to N custom images, videos or animations that correspond to any custom Lex-Icons and that may be used in the Visualization Area. The publisher may create a default set of custom visualizations at design time, or method 1000 may be used to employ preexisting video or animation clips to demonstrate word concepts.
- Target Language: the language that the system outputs, which could vary based on the current user profile, speaker (the user of the application), listener (a party with whom the speaker is communicating using the application), spatial and temporal contexts, device, or other pre-defined contextual factor. For example, a user could choose to output her submitted assemblies in English, except that when she communicates with Juan she has the application set to output Spanish. The default language will generally be the target language of the application.
- An application publisher may define a set of 1 to N languages available for output at design time.
- Output Voice Pattern: a voice pattern, which may vary based on attributes such as age, gender, tone and mood, to use for narrated audio output based on the current user profile, speaker, listener, spatial and temporal contexts, device, or other pre-defined contextual factor. For example, a user could choose to associate a calm older male voice with one speaker, but change it to an excited young female voice when directed at a particular listener.
- the publisher may create a set of voice patterns at design time and assign specific patterns as default values.
- An application may allow users to select from available patterns, modify patterns or create new patterns and associate them with various attributes such as speaker, listener, spatial and temporal contexts, device, etc.
- Spatial and temporal contexts are important for both language learning and communication. For a language learner, for example, it may be helpful to prioritize vocabulary relevant to the time of day or the learner's current location. For augmentative and alternative communication, prioritizing the relevant vocabulary could enable more rapid communication rates for persons with disabilities. For example, during the course of a school day, a child is likely to spend time in places that each have a well-defined function: classrooms, the cafeteria, the nurse's office, the Speech-Language Pathologist's (SLP) office, the media center, etc. Each place requires different vocabularies. Spatial location may improve symbol prediction by matching word usage with specific locations and then presenting symbols of relevance.
- Temporal events can also be used to differentiate situations where location stays constant but context changes. For example, a room may be used for math and language activities at different times of day. Thus, the system may adjust lexical item prediction to that change. It also may predict the use of time-dependent words and phrases such as "good morning" or "lunch," as opposed to "good evening" or
- the first method is for a system user to enter data that associates a spatial-temporal entity with the combination of a time (e.g., a time of day on a day of the week) and/or a location (e.g., predefined geo-coordinates or a recognized positioning system location and a radius from either such point). For example, a user could identify an area of 20 meters in radius from a single point and label it as "home."
- the system may thus record the location of where particular assemblies were submitted, store these data in a relational or graph database, and allow authorized users to retrieve and display such data.
- the default location may be null or may default to the country or region in which the user is located. This may be known by location services or assumed based on the point of distribution of the application, for example.
- a user may define one or more temporal events and set user-defined parameters such as a label, time ranges, recurrence and conditionals.
- the second method is for the system to query against the corpus of assemblies submitted by the user and/or a set of users in the aggregate in a given spatial or temporal context. These two methods are not mutually exclusive. For example, the system could first check if the user has predefined a category and, if not, attempt to find the result with the second method.
- TABLE 14 below sets forth an example of how a user could enter data to define spatial and temporal contexts and associate the combination with a spatial-temporal entity for said user or for a different user.
- the first row shows spatial contexts
- the first column shows temporal contexts
- the cells in the center of the table contain spatial-temporal entities that may be the result of a combination of spatial locations and temporal events. For example, at 10:00 on Mondays in the spatial location classroom 101, the spatial-temporal entity would be "General Studies.”
- This spatial-temporal entity may serve as a basis upon which to organize, store and retrieve lexical data and allow authorized users to retrieve and display such data.
- the system may assign a semantic category based on spatial or temporal contexts or the combination of the two.
- a semantic category is defined as a grouping of vocabulary within a language that consists of lexical items that are interrelated in terms of meaning. For example, car and truck would both belong to a "vehicles” or "transportation” category. Alternatively, the system could return a list of lexical items.
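- A minimal sketch of resolving a spatial-temporal entity from a temporal event and a spatial location, in the spirit of TABLE 14, and then mapping it to a semantic category used to prioritize vocabulary. All entries are illustrative assumptions.

```python
# Minimal sketch of spatial-temporal context resolution (TABLE 14 style).
# Entities, events, locations and categories are illustrative assumptions.

SPATIAL_TEMPORAL = {
    # (temporal_event, spatial_location): spatial_temporal_entity
    ("Mon 10:00", "Classroom 101"): "General Studies",
    ("Mon 12:00", "Cafeteria"): "Lunch",
}

SEMANTIC_CATEGORY = {
    "General Studies": "school subjects",
    "Lunch": "food",
}

def resolve_context(temporal_event, spatial_location):
    """Return (spatial-temporal entity, semantic category) for the given context."""
    entity = SPATIAL_TEMPORAL.get((temporal_event, spatial_location))
    category = SEMANTIC_CATEGORY.get(entity) if entity else None
    return entity, category

print(resolve_context("Mon 12:00", "Cafeteria"))  # -> ('Lunch', 'food')
```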
- the pre-processed language data and pre-computed validation results may be packaged for a particular application on a particular device class in order to optimize access speeds at runtime and/or reduce or remove the need for runtime communication with a server.
- the packaging process may include (a) transforming the LexDB database for faster processing (for example, if the LexDB were relational, it could at this step be flattened into a non-relational database), (b) constraining the set of data packaged to only the data required for a particular application (for example, only Arabic Level 1) or even a subset of an application, and (c) any other procedures that may optimize for an application or a device.
- a specific application of the novel system for a specific target language may then operate.
- Such application may integrate or interface with other applications involving the same or different target languages.
- Such applications may be loaded onto a device and operated by a user. In such event, the runtime workflow process begins, as described in the succeeding section.
- FIG. 3 provides a sample screen that functionally illustrates an embodiment of the runtime system that will be used for explanatory purposes herein.
- the runtime system may take any number of visual forms.
- FIG. 3 provides a concrete visual example of one application of the system.
- some embodiments of the FIG. 3 interface may omit the features shown in dotted lines in FIG. 3.
- the other elements shown in solid lines may also be omitted, rearranged, and/or presented differently.
- Block 1, the Speaker and Listener Selection, in some embodiments allows a user to view and/or modify the user profile of the speaker, i.e., the user who is composing assemblies, and the user profile of the listener(s), i.e., the user(s) with whom a speaker is communicating, if applicable.
- the process for and implications of such selections are described in method 700.
- Block 2, the Assembly Parameters Panel, in some embodiments contains parameters that a user may pass to the system when submitting an assembly. These may include parameters that identify the language in which output is to be narrated, signed and/or transcribed, tense and time indicators, plurality, contextual indicators such as place or season, or other options. The process for and implications of adding parameters are described in method 700.
- Block 3, the Settings, in some embodiments allow the user to change default preferences for the entire application, for a user profile or for other more specific conditions. The process for and implications of settings changes are described in method 700.
- Block 4, the Lexical Categories Panel, in some embodiments allows a user to choose a semantic category (or meaning-based grouping) and load 1 to N Lex-Icons within that category as one way of navigating a potentially large number of Lex-Icons in the system. (The category icon with the thick dotted line represents the currently selected icon.)
- Block 5, the Lex-Icon Panel, contains Lex-Icons that a user may select, place and rearrange to construct assemblies. If a semantic category has been selected in Block 4, as illustrated by the thick, dark dotted lines, the Lex-Icons will be filtered to reflect only those Lex-Icons within said category. A user in some embodiments may use voice commands, text search or various other means to further filter the selection of Lex-Icons.
- Block 6, the Visualization Area, in some embodiments is a graphical area with two purposes, dependent on context: (a) the area may be used to display real-time visualizations during assembly construction and (b) subsequent to assembly submission, the area may output images, signs, videos or animations depicting the assembly's meaning.
- the user may select Lex-Icons from the Lex-Icon Panel, and the system may display a corresponding image, animation, sign or text string in a pre-defined location in the Visualization Area.
- Block 7, the Prosodic Attributes Panel, present in some embodiments, contains a set of graphical icons that may be activated with a controller and thus allow a user to choose one or more prosodic attributes and assign them to an individual Lex-Icon or to an entire assembly.
- Block 8, the Assembly Bar, provides a linear view of an assembly-in-progress. In some embodiments a user may manipulate the arrangement of Lex-Icons on the assembly bar, whereas in other embodiments the assembly-in-progress may be pre-validated at each step and automatically re-arranged by the system to reflect Lex-Icon choices with or without proper word order.
- Block 9, the Logged Assembly 1, illustrates that in some embodiments a user may retrieve a previously submitted assembly stored in the system's database in order to repeat or revise it.
- Block 10, the Logged Assembly N, illustrates that any number of logged assemblies may be stored for later repetition or revision.
- Block 11, the Predictive Panel, in some embodiments presents symbols representing lexical categories or individual Lex-Icons based on predictive analyses such as spatial, temporal, frequency and other factors outlined in method 600.
- FIG. 4 is a flow diagram of a method 200 illustrating one of many possible sequences through which a user may input a lexical assembly in a target language.
- the various steps depicted may be performed by one or more computing devices, such as a personal computer, tablet, smartphone, desktop computer, etc., as described further below.
- In step 201, a computing device may load an application program and may initialize values from the LexDB based on the choice of a target language and type of application.
- a given application may have one or more scenes displaying a visual environment, as well as logic for what results should occur based upon the set of possible actions that a user can perform by manipulation with a controller, which can include a computer mouse movement, finger movement or touch, keystroke, mechanical switch, gesture, remote control such as an infrared light signal, camera input, voice command or other vocalization, facial movement or expression (e.g., mouth movement), neurobiologic impulse such as a brainwave or eye motion, any other kinesthetic action including the actual or perceived sense of motion or touch, or any combination of mechanisms thereof.
- In step 202, the computer may attempt to identify the user and to load his or her "user profile," which is a data repository stored on either the device and/or on a server that contains all pertinent data previously collected from or about or associated with the user. (For example, if a locally stored learning profile is outdated, the system may attempt to contact an application server to synchronize the user profile with a central database over a network or may use a locally cached database without connectivity.)
- User profile data may include a record of each assembly submitted and data such as whether it was semantically intelligible or unintelligible and what errors were made, if any; whether it was syntactically well formed or ill-formed and what errors were made, if any; how much time was expended to compose it; the lexemes composing each submitted assembly and whether each was used correctly; the word class of such lexemes and whether they were applied accurately; the phrase length and how that correlates to validity; the phrase form pattern; the word form or toneme each lexeme assumed; and the spatial and temporal context in which it was submitted.
- In step 203, the computing device may load any current preferences from the profile, such as Speaker and Listener Selection, Assembly Parameters, or Settings, as resolved in step 706.
- the user may modify these at any time before submission, as described in the method 700.
- the computing device analyzes the user profile in order to determine the lexemes to be presented to an individual user at any given time in the form of Lex-Icons.
- when the system accesses the user profile, it may determine the user's current state of competency with regard to the relevant interactive scene and/or to the target language overall. This state of competency may be based on factors such as demonstrated comprehension and received instruction. "Demonstrated comprehension" is measured on the basis of previous assemblies composed by the user that are well formed and/or intelligible. Such demonstrated comprehension may be for a specific assembly, a general word class, a particular phrase form, a particular lexeme, or other dimension and can be specific or general in nature, as required.
- "Received instruction" means that the system demonstrated a lexeme, phrase form pattern, phonological element, etc. to the user and is measured solely by frequency. Together, these measurements can be combined algorithmically to form a score or an explicit, sequential level (e.g., level 1, 2, or 3, etc.).
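- A minimal sketch of combining demonstrated comprehension and received instruction into a level. The weighting, saturation point and thresholds are illustrative assumptions, not the system's actual algorithm.

```python
# Minimal sketch of a competency level derived from the two measurements above.
# Weights, the exposure saturation point, and level thresholds are illustrative.

def competency_level(well_formed_assemblies, total_assemblies, times_instructed):
    """Blend demonstrated comprehension with received-instruction frequency."""
    comprehension = (well_formed_assemblies / total_assemblies) if total_assemblies else 0.0
    instruction = min(times_instructed / 10.0, 1.0)     # saturates after 10 exposures
    score = 0.7 * comprehension + 0.3 * instruction     # weighted blend
    if score < 0.4:
        return 1
    if score < 0.7:
        return 2
    return 3

print(competency_level(well_formed_assemblies=8, total_assemblies=10,
                       times_instructed=12))            # -> 3
```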
- In step 205, the computing device retrieves predictive data using the steps set forth in the method 600.
- the computing device may filter the lexical items to determine which lexical items (lexical categories, lexemes and assemblies) in the LexDB (together, the "candidate pool") may be presented to the current user in the current context in the application.
- Lexical items may be disqualified on the basis of application-level considerations, such as the scope of the application (meaning its learning or communication intent, which may, for example, be bounded by grade level, skill level or curriculum considerations) or irrelevance to the selected lexical categories (in some embodiments, Lex-Icons in the Lex-Icon Panel in FIG. 3, Block 5 will only load if associated with the current semantic category), or they may be disqualified on a user-specific basis, the latter of which generally applies only to educational scenarios.
- Information in the user profile may disqualify, from the candidate pool, those lexical items that fail to meet certain requirements. For example: lexical items that do not belong to a required word class specified in settings or other rules discussed above are disqualified. Lexical items that require a prerequisite identified in step 105 to be presented, but where the prerequisite is not yet in the assembly, are disqualified. Where an instance of the system limits the use of certain lexemes to a predetermined number of times set by the application, such lexemes will be disqualified if the number of submissions as part of valid assemblies exceeds the predetermined limit.
- Lexical items that have an associated level in a lexical sequence (as determined in step 106 and stored in TABLE 2, for example) that is of equal or greater value than the level achieved by a particular user and stored in such user's profile will be disqualified. There may be any other combination of factors that may be specific to an application and which may disqualify lexical items based on similar types of rules.
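- A minimal sketch of the disqualification filtering described above, covering word class, prerequisites, usage limits and level. Field names, the usage cap and the sample records are illustrative assumptions.

```python
# Minimal sketch of candidate-pool filtering: disqualify lexical items by word
# class, unmet prerequisites, usage limits, and lexical-sequence level.

CANDIDATES = [
    {"key": "DOG_animal", "class": "Noun", "level": 1, "prereqs": [], "uses": 0},
    {"key": "RUN_move", "class": "Verb", "level": 1, "prereqs": [], "uses": 3},
    {"key": "QUICKLY_speed", "class": "Adverb", "level": 3,
     "prereqs": ["RUN_move"], "uses": 0},
]

def filter_candidates(items, allowed_classes, user_level, presented, max_uses=5):
    qualified = []
    for item in items:
        if item["class"] not in allowed_classes:
            continue                                    # wrong word class
        if any(p not in presented for p in item["prereqs"]):
            continue                                    # prerequisite not yet presented
        if item["uses"] >= max_uses:
            continue                                    # usage limit reached
        if item["level"] >= user_level:
            continue                                    # at or above the user's achieved level
        qualified.append(item["key"])
    return qualified

print(filter_candidates(CANDIDATES, {"Noun", "Verb", "Adverb"},
                        user_level=2, presented={"RUN_move"}))
# -> ['DOG_animal', 'RUN_move']
```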
- In step 207, the computing device loads and presents the lexical items in the candidate pool that were not disqualified in the previous step in the context of the application and the user profile.
- Predicted lexical items may be presented within a Predictive Area, per FIG. 3, Block 11, or may be integrated into the Lexical Categories Panel (FIG. 3, Block 4) and/or the Lex-Icon Panel (FIG. 3, Block 5).
- the predictive data may also trigger the system to load the Lex-Icons in Block 5 for a particular semantic category. For example, if a spatial location and temporal event are combined into a single spatial-temporal entity such as "lunch," the semantic category "lunch” could load, along with all the associated lexical items for Block 5.
- the panel may display graphical elements representing available lexical categories.
- Lexical categories are containers for 1 to N lexemes or predefined or user-generated assemblies.
- a lexeme or assembly may belong to 1 to N lexical categories.
- Categories may include any form of lexeme or assembly categorization, such as word classes, word sub-classes, multi-class groupings (e.g., lexemes or assemblies about people, food, play, emergencies, etc.), or system-generated categories such as aggregations of the lexemes or assemblies used most frequently by the user or by other users, those used most frequently in a particular spatial location or temporal event, or other factors discussed in method 600.
- Lexical categories are presented in the Lexical Categories Panel based on a pre-determined order, numeric rank order based on predictive data, a combination of the two, or any other means of ordering.
- FIG. 5 provides a conceptual illustration of one possible way in which the system may determine which lexical items (in this case limited only to Lex-Icons, solely for the sake of simplicity of explanation) to present to a particular user and how to rank-order them.
- first lexemes that are not in the currently selected semantic category (if applicable) are disqualified from the candidate pool shown on the left side of FIG. 5. In the illustration these disqualified Lex-Icons are grayed out.
- step 208 the user may retrieve or select a new semantic category (other than the default) using a controller or other input device (e.g., touch screen, mouse, stylus, etc.).
- step 209 the user retrieves or selects a Lex-Icon using a controller.
- the system enables the user to combine and order Lex-Icons into an assembly by manipulating the graphical representations. This is done by selecting Lex-Icons one at a time, by means of which they are moved to an area on the graphical user interface termed the "assembly bar," which is a graphical construction wherein one or more Lex-Icons are arranged by the user in a specific order.
- the user has the freedom to use any of the available Lex-Icons in any order and may therefore create a variety of possible assemblies.
- Information stored in the user profile may set a maximum number of Lex-Icons permitted in an assembly; add or subtract specific requirements such as time performance; or implement any number of other ways to sequence the presentation.
- A user could also add lexemes not represented by the set of Lex-Icons initially presented. For example: a user could convert a graphical image shown elsewhere on the device screen into a Lex-Icon by selecting that image using a controller (e.g., mouse pointer, touch screen, etc.); a user could perform a signed gesture and the system could use camera-based recognition to first digitize the sign and then convert it into a Lex-Icon, placing said Lex-Icon into the assembly bar; a user could type in text; a user could speak and have such speech converted into Lex-Icons; or any number of other inputs and conversion methods could be used.
- step 210 as each Lex-Icon is selected by a user using a controller (e.g., touched, clicked, flicked, pressed, or otherwise indicated by a controller), the system may execute multiple response actions in rapid succession (preferably within a few milliseconds of activation so that the user's brain perceives the events concurrently).
- These actions may include: (a) parsing the Lex-Icon as an individual unit (without context) in its root form wherein, for example, the RUN lexeme would be parsed by default as "run” by looking up the lexeme ID in the LexDB and finding the root form in the target language (in this case English) as per TABLE 1 , whereas in the context of a full assembly, which occurs in a later step, it might be interpreted as "runs” or “ran” or “running” by doing a condition-based lookup per TABLE 9; (b) narrating the root form by playing the audio file associated with the root form for such lexeme ID; (c) displaying a graphic or animation of the root word being signed in a sign language; (d) moving the selected Lex-Icon to a predetermined position in the assembly bar; (e) adding the image, animation, sign or text corresponding to the lexeme to a predefined place in the Visualization Area (see FIG.
- step 211 (f) pre-validating the assembly currently under construction in its then-current state (as discussed in step 213); or any combination thereof.
- Information in a user's profile may impact the nature of the response actions.
- For languages that are normally transcribed in a left-to-right direction (such as European languages), the assembly is left-aligned and the Lex-Icon moves to the right end of an existing assembly.
- the opposite approach may or may not be used for right-to-left languages (such as Semitic and Persian languages).
- user action may convert graphical elements representing lexemes to and from signs, text, objects and/or animations, rather than keeping them in the form of graphical elements.
- narration or other form of output of a selected Lex-Icon in step 210 may vary based on application type and preferences loaded in step 203.
- For example, a communication application may output the lexeme in the user or speaker's preferred language, whereas an educational application may output in the target language at this step and only apply preferences to output after assembly submission.
- step 210 the computing device may re-run Steps 206 and 207 each time a Lex-Icon is selected and added to the assembly bar, depending upon a given application's design.
- the computing device in step 210 would first perform a lookup in the LexDB that examines the frequency of prior pairings of the currently selected lexeme with each other lexeme in the LexDB that has a governing-dependent relationship with the currently selected lexeme.
- the frequency of such prior pairings may be stored as a numeric "pairing value" (per TABLES 5 and 6): For example, say RED has a dependent relationship with APPLE. If RED was used to modify APPLE 8 times it will have a score of 8.
- The system may then look up the pairing value of all lexemes in the candidate pool and use such data to replace or augment the predictive algorithm in method 600.
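- The following sketch illustrates, under assumed names and data structures, how stored pairing values could augment the predictive scores from method 600:

```python
def augment_with_pairing_values(candidate_scores, selected_lexeme_id, pairing_table,
                                pairing_weight=1.0):
    """Boost ranking scores using prior pairing frequencies (e.g., RED->APPLE = 8).

    candidate_scores: {candidate_id: predictive score from method 600}
    pairing_table: {(selected_lexeme_id, candidate_id): pairing frequency}
    pairing_weight: assumed tuning factor for how strongly pairings influence ranking
    """
    augmented = dict(candidate_scores)
    for candidate_id in candidate_scores:
        pairing_value = pairing_table.get((selected_lexeme_id, candidate_id), 0)
        augmented[candidate_id] += pairing_weight * pairing_value
    return augmented
```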
- step 211 for those embodiments that build a "visualization," by which is meant a set of static and/or animated images that depict the meaning of the assembly-in-progress in a nonlinear fashion, this occurs as follows: Let us assume that the user constructs an unordered set of lexemes: GIRL, TALL, EAT, QUICKLY, APPLE, RED. The system first performs a LexDB lookup and returns the word class for each lexeme: GIRL(N), TALL(ADJ), EAT(V), QUICKLY(ADV), APPLE(N), and RED(ADJ).
- the system places static or animated images (or signs) for each lexeme in a predefined place in the visualization area based on that lexeme's word class and role in the sentence.
- predefined places may vary by application or target language, such as by whether a language predominantly uses SVO (Subject-Verb-Object), SOV (Subject-Object-Verb), or another basic syntactic structure.
- Word classes that modify the subject, verb and object, such as adverbs and adjectives, would be shown in a subsidiary position.
- One example of a visualization schema is illustrated in FIG. 6, wherein each space would generally contain an image, animated image or sign, rather than text, which is shown for explanatory purposes only.
- the images, signs or animations, for example may be similar or identical to those used in the corresponding Lex-Icon.
- step 212 in those embodiments using linear assembly arrangements, such as with educational applications, the user may rearrange Lex-Icons.
- a user may optionally modify or self-correct an assembly-in-progress (one not yet submitted) by reordering the Lex-Icons, as well as by adding Lex-Icons to the assembly (returning to step 205) or removing Lex-Icons from the assembly.
- a user may reorder Lex-Icons by using a controller to move them to the left or right in the assembly, thus displacing adjacent Lex-Icons one space to the left or to the right. The illustrations below demonstrate this operation. In the initial state, shown in FIG. 7A, assume there are three Lex-Icons ordered from left to right: A, B and C.
- the user moves Lex-Icon A along the trajectory represented by the curved arrow below to the target represented by the dotted line in FIG. 7B.
- the activated Lex-Icon A is inserted in the middle between Lex-Icons B and C, resulting in the ending state in FIG. 7C.
- step 213 in embodiments allowing non-linear assembly arrangements, such as with augmentative and alternative communication applications or cross-language communication between speakers of two different languages, the system may perform a pre-validation analysis of the assembly-in-progress using non-linear rules that ignore phrase forms. In these scenarios, the system cannot assume that the user is capable of arranging well-formed phrase forms. Therefore, in "nonlinear mode," the system will interpret assemblies as unordered sets of lexemes.
- TABLE 15 illustrates this type of subroutine, wherein the function S represents the subject, O represents the object and M represents a modifier.
- the process may seek to identify the subject of the sentence. To do so, the system would find a lexeme of the word class Noun and sub-class Subject Noun that is a dependent of the verb(s) in the assembly. Since TALL, EAT, QUICKLY, APPLE and RED all fail to meet this condition, the resulting value must be GIRL. Since GIRL is the subject and EAT is the verb, we can eliminate those lexemes from further steps. Next, the system identifies the object of the sentence, which must also be a dependent of the verb EAT.
- the system identifies any noun modifiers such as adjectives or measure words.
- the remaining lexemes TALL and RED are both adjectives and thus potential modifiers of both nouns in the assembly: GIRL and APPLE.
- the system must perform lookups for both lexemes to check the associated word pairs. It would find that, while GIRL could be modified by both TALL and RED, the noun APPLE may only be modified by RED and is not paired with TALL. Therefore, the system is able to identify RED as the resulting value for APPLE and, by process of elimination, identify TALL as the resulting value for GIRL.
- the sentence can be further deduced to take the form TALL GIRL _ EAT _ RED APPLE _.
- Adverbs in English may have multiple placements such as before the verb or at the end of a sentence without changing meaning and thus the system may randomly determine a location for this word in the sentence to arrive at TALL GIRL QUICKLY EAT RED APPLE. This will later (in the method 400, post-submission process) be translated into natural language using extended word forms.
- the system interprets this sentence hierarchically as [[TALL] GIRL] [QUICKLY] EAT [[RED] APPLE] before adding extended word forms at a later step.
- FIG. 8 illustrates this hierarchy.
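- The following sketch illustrates one possible, simplified implementation of this non-linear resolution for the running GIRL/TALL/EAT/QUICKLY/APPLE/RED example; the lookup tables and function are hypothetical stand-ins for the LexDB and arrive at the same assignments by elimination.

```python
# Hypothetical, simplified LexDB fragments for the running example.
WORD_CLASS = {"GIRL": "N", "APPLE": "N", "EAT": "V",
              "TALL": "ADJ", "RED": "ADJ", "QUICKLY": "ADV"}
SUB_CLASS = {"GIRL": "SUBJECT_NOUN", "APPLE": "OBJECT_NOUN"}
# Pairing relations: (dependent, governing) pairs known to be intelligible.
PAIRS = {("GIRL", "EAT"), ("APPLE", "EAT"), ("QUICKLY", "EAT"),
         ("TALL", "GIRL"), ("RED", "GIRL"), ("RED", "APPLE")}

def interpret_nonlinear(lexemes):
    """Resolve an unordered set of lexemes into subject/verb/object/modifiers."""
    verb = next(l for l in lexemes if WORD_CLASS[l] == "V")
    subject = next(l for l in lexemes
                   if SUB_CLASS.get(l) == "SUBJECT_NOUN" and (l, verb) in PAIRS)
    obj = next(l for l in lexemes
               if WORD_CLASS[l] == "N" and l != subject and (l, verb) in PAIRS)
    modifiers = {}
    for adj in (l for l in lexemes if WORD_CLASS[l] == "ADJ"):
        # Prefer the noun the adjective is exclusively paired with.
        nouns = [n for n in (subject, obj) if (adj, n) in PAIRS]
        modifiers[adj] = nouns[0] if len(nouns) == 1 else None
    # Process of elimination for ambiguous adjectives (e.g., RED -> APPLE).
    taken = {n for n in modifiers.values() if n}
    for adj, noun in modifiers.items():
        if noun is None:
            modifiers[adj] = next(x for x in (subject, obj) if x not in taken)
            taken.add(modifiers[adj])
    adverbs = [l for l in lexemes if WORD_CLASS[l] == "ADV"]
    return {"subject": subject, "verb": verb, "object": obj,
            "modifiers": modifiers, "adverbs": adverbs}

# interpret_nonlinear(["GIRL", "TALL", "EAT", "QUICKLY", "APPLE", "RED"])
# -> TALL modifies GIRL and RED modifies APPLE,
#    i.e., [[TALL] GIRL] [QUICKLY] EAT [[RED] APPLE]
```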
- step 214 based on the pre-validation analysis results of step 213, the system may auto- arrange the assembly into a well-formed sentence on the Assembly Bar.
- steps 213 and 214 may repeat in those embodiments using non-linear pre-validation.
- other embodiments of the system could perform non-linear validation only after submission of an assembly.
- a user may select one or more prosodic attributes, such as an emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume, or other prosodic attribute, in order to augment the meaning of a Lex-Icon or assembly.
- a user makes this selection from the Prosodic Attributes Panel and then associates such attribute(s) with a specific Lex-Icon or with an entire assembly-in-progress.
- a user may create, modify or remove such associations at any time prior to assembly submission.
- one prosodic attribute could be an angry tone of voice.
- a user could indicate a prosodic attribute, graphically represented with an icon (e.g., an emoticon) displayed on the Prosodic Attributes Panel, Block 7 in FIG. 9A, and then indicate a target for that attribute in Block 8.
- FIG. 9B shows two examples of associations for the sample attribute "angry”:
- the circle marked “A” in Block 8 represents an association between the "angry” prosodic attribute and a Lex-Icon and "B" represents an association with the entire assembly.
- the result of such associations may be as follows: (a) re-narrate the lexeme or assembly, as per step 210, (b) re-visualize the lexeme or assembly, as per step 211, if applicable, and/or (c) modify the post-validation visual, auditory or other form of output described in the method 400.
- association of a prosodic attribute such as "angry” may cause a Lex-Icon to be narrated or animated in step 406 with the relevant emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume, etc. or may provide visual indicators of mood such as images, symbols or signs.
- Prosodic attributes in some languages may also generate attitudinal particles such as "la," "ya," and "a" in Mandarin. (There are 6 such particles in Mandarin and 30 in Cantonese, for example.)
- step 216 the user submits an assembly in the target language.
- the user determines when an assembly is complete and indicates this by activating a trigger using the controller, e.g., pressing a button. By doing so, the user "submits" said assembly, which sends it to the system to be evaluated.
- the system may auto-submit an assembly on behalf of a user under conditions defined at design time or by user settings. For example, in an assistive technology application, such application might be programmed to auto-submit an assembly after a certain time elapse if a semantic category called "911 -Emergency" was active.
- FIG. 10 is a flow diagram of a method 300 for the system to validate a lexeme assembly that has been arranged by a user.
- step 301 the computing device determines the assembly validation method.
- the workflow diverges based on the method selected. Two methods may entail (a) using a predefined checklist that provides a binary decision about whether an assembly is valid, which begins at step 302, or (b) using a set of logical conditions that determine whether an assembly is valid, which begins at step 303. Other methods are possible.
- Under the checklist method, the predefined assembly order of lexemes must match the submitted assembly order. Alternatively, predefined and submitted assemblies may simply be required to contain the same lexemes, without regard to order.
- Step 303 determines whether the assembly is well formed or not. For applications using non-linear grammar rules, the process skips ahead to step 305.
- "Well formed" means that a phrase form is ordered in a valid sequence according to the rules described above. To determine this, the system looks up each lexeme in the LexDB to find its word class and sub-class and compares their relative positions in the context of the submitted assembly and against the allowable phrase form patterns and intermediate word classes in the LexDB, as per TABLES 1 and 2, as well as any sub-class required order defined in step 117.
- each check may be a simple "yes” or “no" value (well formed or not) or may provide a score based on gradations of how well formed the assembly is. In the former case, all checks must return a "yes" value for any assembly to be considered well formed. In the latter case, the assembly will be considered well formed if the score exceeds a pre-defined threshold. If the assembly is well formed, proceed to step 304. Otherwise, it is invalid and will proceed to step 306 (logging) and then to step 501.
- the Speaker and Listener Selection in Block 1 of FIG. 3 could impact phrase form rules. For example, selection of a speaker could eliminate the need for a pronoun. If a speaker is the user (I/ME), then first person is assumed and I/ME could be used as the default pronoun if one is not explicitly selected by the user. Thus, the rule change in this sample case would be to allow phrase forms that are missing a subject.
- step 304 the computing device analyzes the distribution of the lexemes in the assembly and uses the lexeme pairing relationship data defined in step 111 to find the syntactic dependencies between each lexeme. This will result in a hierarchical structure wherein each lexeme in the assembly governs or is dependent on the other lexemes.
- the hierarchy may have multiple tiers such that a lexeme "A” may be dependent on a lexeme "B” which is dependent upon a lexeme “C.” This would be expressed as follows: [[A] B] C.
- step 305 the computing device checks against the LexDB to see whether the specific lexeme pairs are semantically intelligible. Two criteria are met in order to return a "Yes" value: (1) a pairing relationship must exist in the LexDB between each lexeme in the assembly that has been syntactically identified as governing or dependent on one another and (2) the relationship direction in the LexDB, governing or dependent, must match the relationship direction in the assembly. If either criterion fails, a value of "No" is returned. All lexeme pairings within an assembly must be intelligible in order for the assembly to be deemed intelligible. If any pairings return a value of "No,” the assembly will be deemed either fully or partially unintelligible, depending upon whether gradations of intelligibility are permitted by the system.
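- A minimal sketch of this intelligibility check, assuming the syntactic dependencies from step 304 and the LexDB pairing relationships are available as simple collections (names are illustrative):

```python
def assembly_intelligible(dependencies, pairing_db):
    """Check semantic intelligibility of an assembly's lexeme pairs.

    dependencies: [(dependent_id, governing_id), ...] identified syntactically in step 304
    pairing_db: set of (dependent_id, governing_id) relationships stored in the LexDB
    Returns True only if every identified pair exists in the LexDB with the same direction.
    """
    return all(pair in pairing_db for pair in dependencies)

# Example: {("TALL", "GIRL"), ("RED", "APPLE"), ("QUICKLY", "EAT")} checked against
# the stored pairings; any missing or reversed pair makes the assembly unintelligible.
```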
- step 306 the computing device logs information about the event and adds it to the user profile.
- These data may include the following: The contents of the submitted assembly, i.e., the specific sequence of lexemes; whether the submitted assembly was semantically intelligible or unintelligible and what errors were made, if any; whether the submitted assembly was syntactically well formed or ill-formed and what errors were made, if any; how much time was expended to compose it; the unique IDs of each lexeme composing the submitted assembly and whether each was used correctly; the word class of such lexemes and whether they were applied accurately; the phrase length and how that correlates to validity; the grammatical pattern (Noun-Verb, Adjective-Noun, Verb- Adverb, etc.); the word form or toneme each lexeme assumed; a timestamp; the spatial and temporal contexts of the device at the time of submission, if location services are enabled on the device; and/or any other relevant information.
- Valid assemblies proceed to step 401. Invalid assemblies proceed to step 501.
- FIG. 11 is a flow diagram of a method 400 for the system to output results and/or perform programmatic actions in response to a valid lexeme assembly in a target language.
- step 401 the computing device determines the word form for each lexeme in an assembly for the target language. This is done by (a) looking up its word class in the LexDB, then by (b) checking the lexeme's relative position and context in the assembly to determine lexeme usage variables such as person and tense, and finally by (c) selecting the correct word form for the lexeme's inflectional paradigm that was enumerated in step 112 or the tonal transformation paradigm enumerated in step 114, depending on the target language. Other paradigms may be enumerated and checked, as necessary for the target language.
- step 402 the computing device performs a lookup of the word form extension table in the LexDB, for each lexeme in the assembly, and may return one or more words or morphemes to append either before (preceding) or after (succeeding) the word form that is determined in step 118, per the example in TABLE 13.
- This step loops until all lexemes in the assembly either: 1) have any required extended word form appended; or 2) are determined not to require any change, as applicable.
- the subject noun DOG may have an extended word form "a dog" stored in the LexDB, per TABLE 13, where the preceding form has a null value.
- step 403 the computing device determines each lexeme's tone form for tonal languages. First, (a) the application performs a LexDB lookup to recall the lexeme's tone value. Next, (b) the tonal transformation table created in step 114 is examined to determine if any transformations are to be applied to the lexeme. Finally, (c) the tonal exceptions table created in step 115 is checked to apply any exceptions that override the tonal transformation. This results in a final tone form. This step loops until each lexeme in the assembly has a final tone value.
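- One possible sketch of this three-part tone lookup is shown below; the lex_db interface names are assumptions, not the actual LexDB schema.

```python
def final_tone_form(lexeme_id, context, lex_db):
    """Determine a lexeme's final tone form for a tonal target language.

    lex_db is assumed to expose three lookups (hypothetical names):
      base_tone(lexeme_id)          -> citation tone value
      tone_transformations(context) -> list of (predicate, new_tone) rules (step 114)
      tone_exceptions(lexeme_id)    -> optional override tone (step 115)
    """
    tone = lex_db.base_tone(lexeme_id)                      # (a) recall the tone value
    for predicate, new_tone in lex_db.tone_transformations(context):
        if predicate(lexeme_id, tone, context):             # (b) apply transformation rule
            tone = new_tone
    exception = lex_db.tone_exceptions(lexeme_id)           # (c) exceptions override
    return exception if exception is not None else tone
```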
- step 404 the computing device reconstructs the full assembly as natural language in text form. All lexemes have in previous steps been converted to extended word forms (if applicable) and tone classes. This step simply reassembles them in the order submitted by the user. The resultant phrase or sentence is stored in memory as a transcription in the target language. Text may or may not be displayed to the user at this point. In the earlier example of DOG QUICKLY RUN, the reconstructed assembly would be converted to "The dog quickly runs." Note: For ideogram-based languages, a Latinized text may be used as an alternative to ideograms. For example, with Mandarin, pinyin could be used and numbers could then be inserted to represent tones.
- step 405 the computing device determines what, if any, programmed action to perform based on the submitted assembly.
- Programmed actions are discussed below, and are generally a programmed response to elements in the assembly.
- the system first (a) determines the course of action based on pre-defined logic and then (b) stores that decision in memory. This step does not necessarily execute the programmed action.
- Programmed actions may include but are not limited to the following: (a) modification or presentation of images, animations, video, texts, narrated audio, or other visual or auditory elements, among which could be a visualization as described in method 1000; (b) launching or branching of a new programmed process or any other number of visual, auditory or kinesthetic actions to occur on or through the device; (c) changing the set of Lex-Icons made available to the user; (d) presenting Lex-Icons back to the user in the form of a dialogue with the system wherein the application presents system-generated Lex-Icons on the assembly bar and may output them in any of the ways a user-submitted assembly could; and/or (e) presenting the assembly or various forms of output to another user who may then respond in kind as in (d).
- Examples of launching a new programmed process could include sending the assembly output as transcribed, visual or voice-recorded communications to one or more external applications or devices as an email, audio file, text file, image file, animation file, instant message, SMS (a.k.a. text message), computer-generated voice message, voice call, video call, or other electronic communication medium or combination thereof.
- the application could also initiate a text, voice, video or multimedia communication session and assemblies could be transformed into the output type(s) required by such communication medium.
- the system supports at least three modes of application-mediated interpersonal communication: (i) two or more users may synchronously or asynchronously communicate each using the system on a separate device, each running the application; (ii) two users may synchronously communicate using the system on the same device, wherein each user interacts with separate controls or screen areas; and (iii) two or more users may asynchronously communicate using the system on the same device, each taking turns as a speaker or listener by indicating such with the Speaker and Listener Selection.
- the system also supports communication between a user and the system, i.e., a machine, whether pre-programmed or artificially intelligent.
- step 406 the computing device outputs the parsed assembly in natural language form, which may include narrated audio, text, signs, symbols, animation, video, neurobiologic impulses, or other means of conveying language.
- Each lexeme word/tone form or extended word form may have a unique audio file or other unique output, as discussed in step 120.
- Step 407 executes the programmed action(s) determined in step 405.
- the programmed actions may entail a variety of responses, such as an animation of an object, a playing of a sound, or any other desired response as noted above.
- Step 408 outputs the reconstructed assembly text transcription to a log that may be displayed or accessed immediately and/or at a later time. Using such log, a user may repeat or retract submission of a previously submitted assembly or otherwise retrieve and revise such arrangement.
- Step 409 clears the submitted assembly and then presents a new set of Lex-Icons and allows the user to input a new assembly. The process returns to step 203 (unless the presentation sequence contains logic that ends interactivity after certain point or level).
- FIG. 12 is a flow diagram of a method 500 for the system to provide corrective feedback based on invalid lexeme assembly input. It illustrates one possible example of how error- handling protocols may diverge based on application type or other factors in application design.
- step 501 the computing device determines or customizes the error handling rules based on the application type. For example, cross-language, augmentative, and other communications applications may each have unique error-handling protocols, as might educational applications. Based on the application type, the system may ignore, auto-correct or inform the user through corrective feedback of errors related to syntax or semantics.
- step 502 the computing device determines corrective feedback for the given application type. Such feedback may vary based on 1 to N conditions such as the number of preceding failed attempts. Steps 503, 504 and 505 each represent possible alternative types of corrective feedback based on conditions X, Y and Z, respectively. (There may be any number of conditions.) For example, in an educational application, a counter could be set that increments by a value of 1 with each incorrect assembly submission. If the value reaches 1 (condition "X”), the process would proceed to step 503. If the value reaches 2 (condition "Y”), it would proceed to step 504. If a correct assembly was submitted, the counter would reset to 0. This example might not have a condition "Z" and therefore no step 505.
- condition "X" may be a scenario where an assembly-in-progress or submitted assembly has two or more possible correct phrase forms that cannot be differentiated without affecting meaning, whereas condition "Y” may only be triggered if user settings are set to provide such feedback, and condition "Z” may be the default otherwise.
- condition "Y" may only be triggered if user settings are set to provide such feedback
- condition "Z” may be the default otherwise.
- step 503 the computing device runs one or more pre-programmed "wrong answer" cues, such as audio or visual feedback. Thereafter, the process returns to step 212 for user self-correction.
- step 504 the computing device auto-corrects the submitted assembly.
- the Lex-Icons could move into the correct order, using a minimal distance principle in the case of multiple possible right answers in the answers array. For example, if the user constructed an assembly consisting of three lexemes, but did so in the wrong order, the system could perform a LexDB lookup of any and all possible correct orders for an assembly consisting of the same three Lexemes and return the most similar assembly (requiring the fewest and shortest reordering movements) and then proceed to reorder the Lex-Icons to match the returned assembly. Thereafter, the process returns to step 401.
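- A minimal sketch of the minimal-distance selection, assuming total lexeme displacement as the distance metric (the metric and the names below are illustrative only):

```python
def minimal_distance_correction(submitted, correct_orders):
    """Pick the valid ordering closest to the user's submission.

    submitted: list of lexeme IDs as arranged by the user
    correct_orders: all valid orderings of the same lexemes (from a LexDB lookup)
    Distance here is the total displacement of each lexeme, an assumed metric.
    """
    def displacement(candidate):
        return sum(abs(i - candidate.index(lex)) for i, lex in enumerate(submitted))
    return min(correct_orders, key=displacement)

# Example: for submitted ["APPLE", "EAT", "GIRL"] and valid orders
# [["GIRL", "EAT", "APPLE"], ["APPLE", "GIRL", "EAT"]], the order requiring the
# fewest and shortest reordering movements is returned, and the Lex-Icons are
# then moved to match it.
```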
- Step 505 in our example provides no feedback and simply executes step 409.
- FIG. 13 is a flow chart illustrating a lexical rank-ordering algorithm for the purpose of determining which lexical items to present most prominently to a user.
- the FIG. 13 algorithm may be performed by one or more computing devices, executing computer-readable instructions, to implement the language learning system described herein.
- the system may rank lexical categories, lexemes and assemblies (hereafter "lexical items"), in those embodiments where a predictive algorithm is desirable, by calculating a numeric ranking value using as a basis factors including but not limited to the following ranking factors: (a) the frequency and recency with which the current speaker has used each lexeme, (b) the frequency and recency with which other system users have used each lexeme, (c) the frequency and recency with which the current speaker has used each lexeme in the current spatial location, if known, (d) the frequency and recency with which the current speaker has used each lexical item in a given temporal setting, (e) the frequency and recency with which the current speaker has used each lexeme with the current listener, if known, and/or (f) any variety of other factors that may weigh in a predictive algorithm.
- The output of this process, a numeric ranking value, is used by the system in step 206 when sorting the lexical items for presentation.
- the LexDB may be pre-populated at design time by an application publisher with frequency values, which may be based on a source such as a CDI, as per step 101.
- the LexDB may be provided with frequency values indicating, for a given lexeme and for a given population (e.g., native Mandarin speakers of 30 months of age), a percentage of the population that understands the lexeme.
- a system administrator prepares an automated batch process, such as a script, that runs when scheduled or when otherwise directed by the administrator.
- step 603 the batch process executes and initiates the ranking procedure for all lexical items that are stored in the LexDB for a given target language or a subset of them, as determined by parameters set by the system administrator in preparation of the batch process, per the previous step.
- step 604 the computing device retrieves data from a system database about prior submission of each lexical item, including the number of submissions (frequency), the amount of time lapsed since the most recent submission (recency), and which user profile, spatial, temporal and other attributes were associated with each submission. Any locally shared usage data may be uploaded to the system's central database at this time.
- step 605 the computing device scores lexical items by the frequency with which the speaker has submitted the lexical item in the past and whether that submission was valid. For example, if EAT was submitted and validated 6 times it will be scored 6, higher than SCHOOL, which was submitted 6 times, 1 of which was invalid, and scores a 5. If EAT is, hierarchically speaking, a child of the semantic category "food” and SCHOOL a child of "places,” then "food" is scored higher than “places.”
- the computing device may apply a weighting factor based on recency. This factor would weight lexical items submitted in the recent past more heavily than ones submitted in the distant past. This is important at an aggregate level because language is in a state of constant evolution and terms can go in and out of use. In addition, many terms are seasonal. It is also important at an individual level because recent usage is a better predictor of current language needs than distant past usage is. In learning applications, moreover, weighting based on recency may be useful because depending on how often the user actually uses the learning software, and depending on the user's progress in learning, the user's recent submissions are a better indicator of the user's current skill level than older ones.
- For example, if a user submitted a lexeme incorrectly 80 times in the distant past but correctly 20 times yesterday, the 20 correct answers from yesterday are a good indicator that the user has actually learned the lexeme, even though they are outnumbered by the 80 earlier incorrect submissions.
- the system might use a variety of different algorithms to determine recency. One option: for each doubling of the time elapsed between the most recent use and a previous use of a lexeme such as EAT, the system might halve its impact on the score. Thus, if a user submitted EAT 48 hours ago, 24 hours ago and again 12 hours ago, the use at 24 hours would count for half the weight of the one from 12 hours ago, and the use at 48 hours would count for one-quarter the weight.
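- A sketch of this halving-by-doubling recency weighting, reproducing the 12/24/48-hour example (the function name and parameters are assumptions):

```python
def recency_weighted_frequency(hours_since_each_use):
    """Weight each prior use so that doubling the elapsed time halves its impact.

    hours_since_each_use: elapsed hours for every prior valid submission of a lexeme.
    The most recent use defines the anchor; a use twice as old counts half as much.
    """
    if not hours_since_each_use:
        return 0.0
    anchor = min(hours_since_each_use)
    return sum(anchor / max(h, anchor) for h in hours_since_each_use)

# Uses at 12h, 24h and 48h ago -> weights 1.0 + 0.5 + 0.25 = 1.75
```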
- the computing device may check if spatial and temporal criteria are met, so that spatial and temporal rules may be applied to the user's assemblies. These criteria may be as follows: (a) location services are available and enabled by the device currently in use, (b) a spatial location corresponding to current geographical coordinates or a temporal event or combination of the two has been defined by the user, and (c) the spatial-temporal entity identified in method 900 has prior data associated with it in a system database and such data are accessible to the application at the current time. If one or more of these criteria are not met, the process will skip to step 610.
- the system may use a default setting such as country or region or may use system-identified data clusters (for example: persons clustering near a bus stop at 9 a.m. may use certain types of transportation-related language that the system compiles over time), to the extent the system has relevant prior data.
- a default setting such as country or region
- system-identified data clusters for example: persons clustering near a bus stop at 9 a.m. may use certain types of transportation-related language that the system compiles over time
- the computing device may perform a lookup of a system database to find all lexical items previously used by the current user within the current spatial location radius or radii, temporal event range or combination thereof, and retrieve the frequency of each relevant lexical item.
- step 609 the computing device adds the spatial-temporal frequency value (or a multiple of the frequency value) to the lexical item's existing score.
- Users may use certain lexical items in greater frequency when in communication with certain other users, such as in an augmentative and alternative communication application, or with certain computer characters, such as in a game-based learning application. In such cases it may be beneficial to the user to weight lexical items higher if they are frequently used in communication with those other users. For example, a special needs child may generally use words related to food when speaking with a particular caregiver (a "listener") who normally helps that child eat lunch.
- the computing device may check if speaker and listener criteria are met. These criteria may be as follows: (a) a listener is defined in the Speaker and Listener Selection and (b) the listener has prior data associated with the current speaker, which defaults to the current user profile if not explicitly selected. If one or more criteria are not met, the process will skip to step 613.
- step 611 the computing device performs a lookup of a system database to find all lexical items previously communicated by the speaker to the selected listener.
- the system scores lexical items by the frequency with which the speaker has submitted the lexical item to the specific listener in the past and vice versa.
- step 612 the computing device adds the listener-specific frequency value (or a multiple of the frequency value) to the score from step 609.
- the computing device performs a lookup of a system database and finds average frequency and recency data for all users of the system within predefined demographic criteria, e.g., all users, all users of a specific age, all users of a specific gender, all users of a specific application type, etc. This average may be pre-computed at a less frequent interval than the ranking process due to the potentially large volume of data to analyze.
- step 614 scores lexical items by the frequency with which the aggregate set of system users has submitted the lexical item. For example, if all users average 20 submissions for EAT and 5 for SCHOOL, those averages become the aggregate scores for each.
- the algorithm may weight aggregate frequency scores less than individual scores, so the system may multiply the aggregate score by a fraction, such as .20, for example.
- step 615 the computing device may weight the aggregate user score for recency such that older data are weighted lower. (Our running example ignores this step.) This step is identical in purpose to step 606 except that it applies to aggregate usage data, rather than individual usage data.
- step 616 the computing device adds the aggregate weighted average score computed in steps 614-615 to the lexical item's existing score.
- the computing device may rank-order lexical items by score, with the highest scoring lexical items first in sequence and the lowest scoring lexical items last. Lexical items with a score of 0 fall to the bottom of the sort. Within the 0-score group or among any tied scores, choices may be sorted randomly, by CDI frequency in the LexDB, alphabetically or using any other default factor.
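- A minimal sketch of this rank-ordering with tie-breaking (the random jitter is only one of the default factors mentioned above; names are assumptions):

```python
import random

def rank_order(items, scores, cdi_frequency):
    """Sort lexical items by score (descending); break ties and 0-scores by CDI frequency.

    items: iterable of lexeme IDs; scores / cdi_frequency: {lexeme_id: number}
    Remaining ties are broken randomly as one assumed default factor.
    """
    return sorted(items,
                  key=lambda i: (-scores.get(i, 0),
                                 -cdi_frequency.get(i, 0),
                                 random.random()))
```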
- Step 618 stores the final numeric ranking value in a system database and associates it with the database record for the relevant lexical item.
- When called by the runtime application in step 205, the computing device in step 619 performs a lookup of a system database, which may be a flattened, constrained version of such database cached locally on the device and updated over a network from servers from time to time, to find the numeric ranking value for all lexical items relevant to the user at that time. For example, if the user selects a semantic category, the system will look up those lexemes that are, hierarchically speaking, children of the semantic category, and will load only those children lexemes into memory for purposes of sorting.
- the aforementioned algorithm is merely one illustrative example among a plurality of methods for how spatial, temporal, frequency, recency and other system-stored data may combine to determine the filtering and sorting mechanisms for presenting lexical items in one or more panels on the system user interface.
- FIG. 14 is a flow chart illustrating a process for how a user may customize preferences.
- Method 700 enables an authorized user to explicitly customize a user profile with preferences in addition to the implicit customizations made by the system on the basis of historical user data.
- the illustrated steps may be performed by one or more computing devices providing the language learning system herein.
- a user may access a user profile during the application runtime at or after step 203 by using a controller to view and/or modify the Speaker and Listener Selection, Assembly Parameters, or Settings, which are Blocks 1, 2 and 3 on FIG. 3, respectively.
- a user may take the following actions:
- (a) Speaker and Listener Selection: view and/or modify the user profile of the speaker, i.e., the user who is composing assemblies, and the user profile of the listener(s), i.e., the user(s) with whom a speaker is communicating, if applicable.
- a speaker or listener may be defined as representing a real person or a virtual avatar.
- An application may include default speakers and listeners.
- (b) Assembly Parameters: add or remove parameters that a user may pass to the system when submitting a specific assembly. These may include the language in which output is to be narrated, signed and/or transcribed, tense and time indicators, contextual indicators such as place or season, or other options.
- (c) Settings: view and/or modify preferences for the entire application, for a user profile or for other more specific conditions.
- step 702 if the user selected a speaker or listener(s) in step 701, the selection overrides the current state.
- the system assumes that the current user is the speaker and may assume that the listener is anonymous, the listener from the most recent application session, the most frequently user-selected listener, or other assumptions, depending upon application design.
- the system may assume that the user(s) or speaker(s) on the other devices are listener(s) for purposes of the device in focus (meaning a device on which an assembly is being constructed). State is maintained throughout an application session, unless explicitly changed by a user or business rule.
- step 703 if the user added or removed an assembly parameter in step 701, the selection overrides the current state. By default, the system uses no assembly parameters. State is maintained throughout an application session.
- step 704 if any settings changes are indicated in step 701, the changes override the system default settings and are stored in the user profile both for the current and future application sessions. Otherwise, the system uses the settings previously stored in the user profile.
- a user may assign settings to specific listeners, for example, such that the system will modify the form of output based on a particular listener's user profile in a system database, such as by changing which language is narrated, signed or transcribed for that particular listener, determining the voice pattern to use or by determining whether to output a response to a submitted assembly as narration, text, images, animation, neurobiologic impulses, or any other form or combination thereof.
- the system may record and display the listener associated with a submitted assembly.
- the computing device in step 705 resolves any conflicts among speaker and listener selection, assembly parameters and settings and aggregates all preferences relevant to a submitted assembly.
- Conflicts will be resolved by first (a) enumerating all settings, then (b) overriding application-level settings (set by a system administrator to apply to all users of an application) with user-level settings (set by an individual user to apply to all of his or her own usage of an application), then (c) overriding user-level settings with speaker-level settings (set by a user to apply to a particular user who may use the application to communicate with one or more listeners), then (d) overriding speaker-level settings with listener-level settings (set by a user to apply to his or her interactions with a particular user with whom the user may communicate using the application), and (e) overriding listener-level settings with assembly parameters (set by a user for a particular assembly prior to submission using the Assembly Parameters Panel, per FIG. 3, Block 2).
- By way of a theoretical example, if an application's default language was English, the current user's default language was Spanish, the current speaker was set to Japanese, and the current listener was set to Arabic, the assembly output would be in Arabic. This could be useful, for example, if a user needed to communicate regularly in a certain language, but needed to communicate in a different language for a particular person or in a particular scenario. If the user set an assembly parameter to output in ASL (American Sign Language), that particular assembly would be displayed as ASL signs.
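- A minimal sketch of this layered override, assuming each settings layer is a simple dictionary (names are illustrative):

```python
def resolve_preferences(app_settings, user_settings, speaker_settings,
                        listener_settings, assembly_params):
    """Merge settings so later (more specific) layers override earlier ones."""
    resolved = {}
    for layer in (app_settings, user_settings, speaker_settings,
                  listener_settings, assembly_params):
        resolved.update(layer or {})
    return resolved

# resolve_preferences({"language": "English"}, {"language": "Spanish"},
#                     {"language": "Japanese"}, {"language": "Arabic"}, {})
# -> {"language": "Arabic"}; adding {"output": "ASL"} as an assembly parameter
#    would additionally set the output mode for that one assembly.
```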
- FIG. 15 is a block diagram showing the basic components of a general computing hardware device that may be used to implement the system and computing device described herein.
- the various features and elements described herein may be implemented by one or more computing hardware devices, such as a personal computer, tablet computer, smartphone, cellular phone, laptop computer, etc.
- a computing hardware device that is usable in conjunction with one or more embodiments of the system may include the following components:
- Block 801 The processing component may include a central processing unit (CPU), application processing unit (APU), graphics processing units (GPU), a baseband (wireless communications) processor, other specialized processors, or any combination thereof. While such processors may potentially function independently, FIG. 15 simplifies processing into one block on the diagram for the sake of clarity.
- the processor component may be coupled to Blocks 802 through 807.
- Block 802 Memory storage may include solid-state disks or drives such as NAND-based flash memory, random access memory (RAM) such as Dynamic RAM (DRAM) and Synchronous DRAM (SDRAM), read-only memory (ROM), hybrid drives, optical disks, or a plurality of other types of computing device memory storage, whether or not yet invented. Memory may be read by or, except in the case of ROM, written to by the processing component.
- the memory 802 may store instructions that, when executed by the processor 801, may cause the hardware device to perform any of the functions described herein.
- Audio input and output can include an audio codec, inputs such as microphones and outputs such as speakers.
- Microphone input could be decoded by the audio codec and processed for speech recognition, which can be used as a user interface controller.
- Block 804 The device hardware may have a camera with image or motion sensors such that camera input could trigger application behaviors based on recognition and processing of non-touch kinesthetic motion or facial movements.
- Block 805 The user interface adapter may interface with user input devices, including (a) peripherals such as a computer mouse, joystick, trackball, game controller, or keyboard, (b) a kinesthetic stimulus such as a finger touch that can be interpreted by the device touch-screen controller, and (c) electroencephalogram (EEG) or other devices that detect neurobiologic impulses and could potentially mimic or supplant use of peripheral devices or touch.
- Block 806 The display solution may be driven by the processing component to control the device display, which may also show input from I/O devices.
- Block 807 A connectivity and security solution can provide network connectivity such as WiFi, Bluetooth, and near-field communications, for example.
- Block 808 The system, e.g., a device application at Block 808, uses the network to securely communicate with the server infrastructure, if and when needed, and bridges the device and server-side components. It may also receive transmissions from positioning systems using technologies including but not limited to WiFi, Bluetooth and other sources of location services over a network. The positioning systems may be previously known to the system or may be discovered by the system, which over time generates a map of positioning systems and allows users to augment system-generated data.
- Applications may be designed to utilize a live network communication at certain points in time, or they may be self-contained and require no live network connection.
- Block 809 The server infrastructure may include firewalls, load balancers, redundancy mechanisms, as well as other hardware components that serve security, scalability, reliability and other service level needs. It houses application and database server components, as well as an application development environment.
- Block 810 The application server may be a remote hardware device that runs elements of the system that an application operating on a user's device may need to access at runtime.
- the application server is a layer between the user's device and the system's central database.
- Block 811 The database server houses the system's central, server-side relational or graph database, which contains data that may be used or accessed at runtime by a multitude of user devices at any given time. These data may include the user profile and the LexDB.
- the system may, in addition to or in place of the central database, use a local database on a given user's device and, from time to time, synch all or part of the local database with the central database over a network when available and desirable.
- Application and database servers may be physical machines in a colocation facility, or they may be virtualized machines in a cloud environment.
- Block 812 The application development environment provides a set of tools and interfaces for allowing instructional designers, linguists and developers to work together to build datasets applying the LexDB to languages and applications. This set of tools facilitates the design phase of operation and integrates with the build process (whether building for a client browser online, or building for a mobile or embedded standalone device) to supply the required subset of data for a given application to execute.
- FIG. 16 illustrates an example flow diagram for an algorithm to take into account spatial and temporal context. This may be used by an application to customize the presentation of lexical items to a user based on said user's specific location or type of location or based on the time of day, day of week, season of the year, specific time ranges or events, or any combination thereof. The various steps are described below:
- Block 901. Whether or not spatial or temporal entities have been defined in step 121, per the example in TABLE 14, the device application may request and return a set of latitudinal and longitudinal coordinates or other global positioning identifier for its current location, or a location for a user's use of language (e.g., the location where the user is using the language software described herein). It may receive these coordinates through any type of network connection, broadcast transmission or proximity system. This location request is made at regular intervals by the application (with the exact interval being set by the device or application settings, but generally updated in what the user would perceive as being a fraction of one second) without user intervention, where permitted by application and user settings.
- the device application identifies any predefined locations, which can be a set of coordinates or a positioning system, or any undefined location that meets system-defined or user-defined criteria. Predefined locations could be anyplace relevant to one or more users, such as "home" or "Washington Elementary School" or "a gas station."
- the locations may be a specific location (e.g., the school listed above), or a general type of location (e.g., schools, in general).
- Positioning systems are a class of low-powered, low-cost transmitters, such as beacons, that can notify nearby consumer electronic devices of their presence by, for example, broadcasting their precise location to the device application, if in range. They may use low-energy proximity sensing technologies such as Bluetooth Low Energy, a WiFi-based system that measures received signal strength, magnetic positioning using local variations of the earth's magnetic field, or any number of accepted positioning system technologies.
- coordinates may be returned by checking application servers for current defined locations.
- Each predefined location is treated as having an epicenter and a range.
- the epicenter represents the precise GPS coordinates or positioning system location.
- the range is a distance, generally measured in meters, from the epicenter in all directions and is set by a user who defines a location.
- Block 903. The device application calculates the distance between the device and the epicenter of all potential locations or positioning systems identified earlier in Block 902.
- Block 904. The device application determines if the distance is within range parameters.
- Each predefined location may have a range associated with it, per Block 902. If the device is inside that range, it is considered within the predefined location.
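- One way to perform this range check, assuming latitude/longitude coordinates and a range expressed in meters (the haversine distance is an assumed choice, not mandated by the specification):

```python
import math

def within_range(device_lat, device_lon, epicenter_lat, epicenter_lon, range_m):
    """Return True if the device is within a predefined location's range (meters).

    Uses the haversine great-circle distance; coordinates are in decimal degrees.
    """
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(device_lat), math.radians(epicenter_lat)
    dp = math.radians(epicenter_lat - device_lat)
    dl = math.radians(epicenter_lon - device_lon)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    distance = 2 * r * math.asin(math.sqrt(a))
    return distance <= range_m
```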
- Block 905. The device application may attempt to resolve any overlapping spatial and temporal entities (location zones and times), wherein the device is determined to be present within the radius of two or more predefined locations or within two or more time or date ranges.
- One method for resolving this is set forth below:
- the application would have a matching algorithm created by the system administrator that establishes a preferred order such as the following:
- For example, detection of a beacon means a shorter distance from the epicenter of a location, so a beacon-based match could take precedence.
- each predefined location could be manually assigned a priority value by a user. Locations with higher priority value would prevail over one with a lower value.
- the affinity levels could combine to make a cumulative score that governs which spatial-temporal entity will be dominant, ordered from weakest match to strongest. For example, a weak affinity could score as 10, medium as 20, and strong as 30, unless a closer location existed, in which case the affinity score would drop to 25. So, if one spatial-temporal entity were "school lunch" and it occurred in the cafeteria from noon to 12:40 each weekday, and the device were found to be in the cafeteria during that time, it would get a score of 30 (high affinity) for the location plus 10 for a recurring day (weak affinity), for a total score of 40.
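- A sketch of this cumulative affinity scoring, reproducing the "school lunch" example (the point values follow the example above; the data structure and function name are assumptions):

```python
AFFINITY_POINTS = {"weak": 10, "medium": 20, "strong": 30}  # values from the example above

def dominant_entity(candidates):
    """Pick the dominant spatial-temporal entity among overlapping matches.

    candidates: [{"name": ..., "spatial_affinity": ..., "temporal_affinity": ...,
                  "closer_location_exists": bool}, ...]
    Per the example, a strong spatial match is demoted to 25 if a closer location exists.
    """
    def score(c):
        spatial = AFFINITY_POINTS[c["spatial_affinity"]]
        if c["spatial_affinity"] == "strong" and c.get("closer_location_exists"):
            spatial = 25
        return spatial + AFFINITY_POINTS[c["temporal_affinity"]]
    return max(candidates, key=score)

# "school lunch": strong spatial (cafeteria) + weak temporal (recurring weekday)
# -> 30 + 10 = 40
```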
- Block 906 The device application checks and retrieves the current time on its system clock either before or after the device application has found matching location(s) and resolved which one(s) are considered valid, per the matching algorithm in Block 905.
- Block 907. The system determines the spatial-temporal entity by performing a lookup of a table such as TABLE 14 or analyzing data about frequency of assembly submissions. In the example of the two-dimensional table, if there is a null value for a given space-time combination or spatial or temporal contexts are indeterminate due to technical reasons, the lookup reverts to the default value for either spatial or temporal context (depending upon system or user setting priorities).
- the spatial-temporal entity determined in Step 907 may determine how the system renders symbols, images, text, audio and other user interface elements. It could trigger the system to navigate to a particular screen or display certain lexical items or refresh the existing screen with a new array of lexical categories or Lex-Icons, for example. Generally speaking, the system feeds the data about the current spatial-temporal entity into the prediction process of method 600.
- FIG. 17 illustrates an example of an application of the novel system intended for an elementary school audience learning Mandarin Chinese.
- An application could be considerably more complex and expansive or intended for other purposes than education.
- the following is intended only as one illustration of how the novel system could be implemented for one particular interactive scenario for one particular target language on one particular type of electronic device (a touch-screen tablet, in this case).
- the "Pet Training" application in FIGS. 17A-F is a game wherein the user constructs assemblies using Lex-Icons and said application converts the assemblies into Chinese narration of the meaning and runs an animation of pets performing a programmed action.
- Below is a storyboard that demonstrates the runtime workflow for one use case, wherein the user decides to make a pet rabbit jump, succeeds, and then progresses to a successive level of the game that allows more choices of Lex-Icons.
- the user selects an animal to "train.” Some animals are locked initially - until the user progresses enough to unlock them. This ties into the language sequencing prepared at design time.
- In FIG. 17D, the user decides to submit this assembly and taps the play button.
- the system validates the assembly, finds it valid, and then concurrently narrates tiao4 ("jump" in Chinese) and executes an animation of the rabbit jumping.
- Lex-Icons for verbs, nouns and adverbs may be illustrated in different colors.
- the example above shows the assembly "Eat a dumpling.” The user may drag Lex-Icons left or right to reverse the order, or add others before submitting.
- FIGS. 18A-E are a set of illustrations depicting an embodiment of the system as an augmentative and alternative communication software application.
- the user begins on a "home" screen. From here, he or she may select a semantic category such as Clothes in order to find a certain grouping of words and phrases such as those related to Clothes.
- FIG. 18B shows the Clothes category.
- the application displays a set of Lex-Icons related to the category Clothes.
- the user touches a Lex-Icon representing the phrase "I like to wear".
- In FIG. 18D, the user decides to submit the assembly in the assembly bar and taps the green button.
- the system validates the assembly, finds it valid, converts it into the natural language phrase "I like to wear a hat" and then narrates the full assembly.
- the Lex-Icons remain in the assembly bar so that the user may repeat the audio narration until the user decides to create a new assembly.
- On the left side of FIG. 18E is an area with tiles that are larger in size than the more numerous tiles displayed in the center of the screen. These larger tiles represent lexical items predicted by the system using the lexical ranking algorithm. In this case, the only factor used in the algorithm is relevance to a predefined spatial-temporal entity. The entity shown in the illustration is "food," so the tiles show food-related Lex-Icons.
- FIG. 19 describes a process for visualization of a word concept, such as by means of a video or animation clip that demonstrates or models enacting the word concept.
- This process, Method 1000, would occur after step 405 in the assembly validation process.
- the purpose of this visualization process is to assist with learning or communicating a word concept, particularly in the case of verbs, adverbs and spatial prepositions such as "in" or "above," the meaning of which may be difficult to perceive from a static image or may present ambiguity of meaning.
- Another reason for this approach is that, while an application of the novel system may pre-define the visualization associated with a given assembly at design time, such pre-definition requires the creation of numerous video or animation clips at significant expense. This becomes impractical with advanced language learning or communication activities as the number of potential assembly combinations increases exponentially.
- the system may instead use an automated method for determining the appropriate visualization or set of visualizations at runtime.
- the method may generate three resources at design time:
- a collection of video and/or animation files having a close-captioning or other file containing subtitle text.
- Such video or animation files need not have been created for the purpose of use in the application, but could have been created for an entirely different purpose.
- the files also do not need to reside on the system's hardware infrastructure, but could reside on third-party servers, for example.
- matching logic used to select clips. Options include text matching with various Boolean and wildcard parameters, grammatical patterns, and other logical criteria.
- the system may be instructed to find all video clips wherein the close-captioning text includes the pattern happy *day*, where * is a wildcard operator, and this would pull all video clips containing "happy birthday," "happy holidays," "happy today," and other possibilities.
- or it could find videos conforming to the grammatical structure Subject Noun followed by Verb that contain the string "football," which would return video clips containing phrases such as "He likes football," "They play football," and "That crazy dog ate my football." (See the illustrative sketch following this list.)
- playback logic. Options may include playing back the full sentence that contains the matching word or phrase or only the exact assembly, specifying a certain number of milliseconds before and after the assembly at which to start and stop, concatenating a certain number of clips together in a chain for playback, etc.
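The following non-limiting Python sketch illustrates the wildcard-style text matching described above, using regular expressions as one possible implementation; the clip names and caption strings are hypothetical:

```python
# Hypothetical sketch of wildcard-style matching against close-captioning text,
# implemented here with regular expressions. Clip names and captions are examples.
import re

captions = {
    "clip_001.mp4": "Happy birthday to you",
    "clip_002.mp4": "They play football on Sundays",
    "clip_003.mp4": "That crazy dog ate my football",
    "clip_004.mp4": "Happy holidays everyone",
}

def wildcard_to_regex(pattern):
    """Translate a simple * wildcard pattern into a case-insensitive regex."""
    return re.compile(".*".join(re.escape(p) for p in pattern.split("*")), re.IGNORECASE)

def find_clips(pattern):
    """Return the clips whose caption text matches the wildcard pattern."""
    rx = wildcard_to_regex(pattern)
    return [clip for clip, text in captions.items() if rx.search(text)]

print(find_clips("happy *day*"))  # ['clip_001.mp4', 'clip_004.mp4']
print(find_clips("*football*"))   # ['clip_002.mp4', 'clip_003.mp4']
```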
- the automated visualization process executes the following steps after an assembly is validated and ready to execute, per Step 405.
- Block 1001. The system analyzes the text of the assembly based on logic that was predefined at design time or based on user settings that tell the system how to use the text to find a matching visualization. If the logic requires a grammatical match, this is accomplished by matching the sequence of word classes in the assembly (already determined prior to Step 405) with the same sequence in the subtitle text.
- Block 1002. The system searches close-captioning files for matches, if any. Said files may be indexed in advance for better search performance. Logic may also require or allow multiple matches. Matches may be scored or rank-ordered based on quality of the match and/or quality of the visualization file.
- Block 1003. The system determines one or more best matches based on the scoring or rank-ordering. If there are no matches, a pre-determined visualization may be used instead; this automated visualization runtime method is not mutually exclusive of the design-time method. For example, in one embodiment, the system may first check whether a pre-determined visualization exists and, if not, use the runtime method.
- Block 1004. For a given match, the system loads the video at the right point in the timeline, based on pre-defined logic or settings. For example, if the logic states that playback should begin with the start of the full sentence in which the assembly match is found, the system will load the video at that point in the timeline. Note that this assumes that the close-captioning file includes timeline information, as is standard practice.
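As a non-limiting sketch of Block 1004, the following Python example reads the timeline information from an SRT-style close-captioning file to determine where playback should begin; the file contents and helper name are hypothetical and the parsing is deliberately simplified:

```python
# Simplified, hypothetical sketch of Block 1004: read timeline information from
# an SRT-style close-captioning file to decide where playback should begin.

SRT_SAMPLE = """1
00:00:03,000 --> 00:00:05,500
He likes football

2
00:00:12,250 --> 00:00:15,000
They play football
"""

def cue_start_seconds(srt_text, phrase):
    """Return the start time, in seconds, of the first cue containing the phrase."""
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) >= 3 and phrase.lower() in " ".join(lines[2:]).lower():
            hours, minutes, rest = lines[1].split(" --> ")[0].split(":")
            seconds, millis = rest.split(",")
            return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000.0
    return None

# Begin playback at the start of the full sentence that contains the match.
print(cue_start_seconds(SRT_SAMPLE, "play football"))  # 12.25
```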
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Educational Administration (AREA)
- Educational Technology (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
Features herein enable users to construct and express language by visual, kinesthetic or other means without necessarily requiring vocalization, reading or transcription. The system may aid in language acquisition and cross-language communication for augmentative and alternative communication. The system may pre-process language into unambiguous units of meaning, wherein elements that vary based on usage and context are abstracted. Each unit of meaning is converted into a graphical element. Once the language is simplified and rendered visually, a user uses software to interactively construct a word, phrase or sentence by manipulating the graphical elements in a user-defined sequence. After receiving visual, kinesthetic or other input from the user, the application executes validity checks on the input and may present corrective feedback. Next, it converts the input into audio, visualization and/or other output, and may execute other actions based on the combination of user input and business rules.
Description
VISUAL-KINESTHETIC LANGUAGE CONSTRUCTION
CROSS-REFERENCE TO RELATED APPLICATIONS
[01] The present application claims priority to U.S. Provisional Application No.
61/930,238, entitled "Visual-Kinesthetic Language Construction Method," filed on January 22, 2014, the contents of which are hereby incorporated by reference as example embodiments.
GOVERNMENT LICENSE RIGHTS STATEMENT
[02] This invention was made with government support under U.S. Federal Contract No.
W911QX-13-C-0042 awarded by the Defense Advanced Research Projects Agency (DARPA). The government has certain rights in the invention.
SUMMARY
[03] The following summary is for illustrative purposes only, and is not intended to limit or constrain the detailed description.
[04] Features herein relate generally to a method of aiding users to acquire and retain any language or to communicate using any language. A language may be considered a system of communication encoded in auditory, visual, and/or tactile stimuli that may be within any one of the known families of natural human languages or within any system of artificially constructed, signed, or non-human languages. The term "language" may also include terminology specific to a professional or technical field. A "target language" may be considered a language being learned or used in communication by one or more users.
[05] More particularly, the system provides a novel method and potential applications for visually and kinesthetically constructing language. A visual-kinesthetic language construction application may be one type of embodiment of the novel system implemented as a software application operating on an electronic device, without
excluding other forms of derivative works. The user operates the system using one or more controllers. A "controller" may include any mechanism that allows the user to manipulate visual objects on an electronic device screen. Such mechanism may take the form of a computer mouse movement, finger movement or touch, keystroke, mechanical switch, gesture, remote control such as an infrared light signal, camera input, voice command or other vocalization, facial movement or expression (e.g., mouth movement), neurobiologic impulse such as a brainwave or eye motion, any other "kinesthetic" action including the actual or perceived sense of motion or touch, or any combination of mechanisms thereof.
[06] The system may enable a user to acquire or communicate the meaning of the words, phrases and sentences that such user constructs, to internalize the phonology and syntax of the target language, and/or to enable a user to express himself or herself in a target language without the necessity of vocalization, transcription, signing, or any other means of language encoding that requires knowledge of or skill in such language. By the conversion of the output into audio and visualizations such as signs, images, video and animation, the system further enables pseudo-verbal expression and interpersonal or mechanical interactions by a user. The system therefore has potential application to educational, communications, assistive and therapeutic purposes and technologies.
[07] Some embodiments of the novel system avoid forcing the user to produce (e.g., vocalize, write, or type) the target language. Instead, users may use the system to produce meaning without the perceived risk of failure associated with vocalization or transcription. The novel system offers a hybrid between reception (comprehension) and production (expressing vocally or by text), where the user assembles phrases visually and "kinesthetically" (using touch or motion). This enables language learners to transition from reception to production in a low-stress, self-paced way with built-in instructional supports and answer tracking. In addition, a user may be unable or hesitant to attempt a natural rendition of a target language, so the novel system can
facilitate two-way communication between the user and the system or between or among users.
[08] Some embodiments of the novel system avoid or mitigate environmental, technical and usability obstacles related to speaking into a microphone and/or typing. Instead, users may primarily interact with the system by manipulating visual icons. The potential to use neurobiologic impulses or facial movements to manipulate the visual icons also opens up a new set of possibilities for persons with disabilities or trauma to learn or recover language (in either educational, assistive or therapeutic senses) and/or to use the system to communicate with others.
[09] Some embodiments of the system enable emergent and challenged readers to visually and kinesthetically construct phrases in a target language without requiring them to read any text. Knowledge of a language need not be a requirement for learning a new language. Some embodiments of the novel system do not require reading text. Furthermore, it may abstract words as visual icons so that the variations in word forms do not muddle comprehension or add cognitive load. Graphical presentation reduces cognitive load because it directly translates concept-to-word, rather than the conventional text-to-concept-to-word, and thus requires only a single cognitive processing channel, rather than two channels.
[10] Some embodiments of the novel system pre-process words into lexemes, each being a unit of un-parsable or unambiguous meaning. The system then expresses the meaning of each lexeme in visual and kinesthetic ways through an unambiguous graphical element, which may help to minimize difficulties in learning that are due to the fact that languages have words with dual or ambiguous meanings.
[11] Some embodiments of the novel system use graphical elements, including animation of the word concept, which can trigger memory of the word sound and convey semantic value (meaning), word class (part of speech) and other key information more effectively than text.
[12] Some embodiments of the novel system sequentially present lexemes that are typically used together in a target language in a way that illustrates both how the lexemes are distinct and how they can be combined. One way the system achieves this is to provide multiple pairing options (e.g., eat an apple + eat an orange + eat noodles). That variance builds "packages" that the user's brain uses to differentiate word boundaries until the user becomes cognizant that "eat," for example, is the process of consuming something and "apple," "orange," and "noodles" exemplify items that can be consumed. The discrete visual nature of graphical lexemes helps language learners recognize word boundaries, interpret meaning, and observe the formation of phrase structures using basic syntax, all of which aid statistical learning (a phenomenon wherein the human mind interprets the patterns of speech sounds and attaches meaning based on situational contexts and what the brain unconsciously perceives as the statistical relevance).
[13] Some embodiments of the novel system provide modular building blocks that can be combined to form a large number of phrases, which may be an easier approach to learning than a strict didactic approach. For example, given a possible set of 10 such building blocks and the ability to assemble them into a string of 1, 2 or 3 lexemes, one could make up to 1,110 possible combinations. This process empowers users to experiment with various forms, orders and structures and therefore consciously or sub-consciously begin to internalize the patterns that represent the target language's grammatical rules. The system can then provide corrective feedback that communicates whether a constructed phrase structure is well formed or ill-formed. Users build their knowledge of syntax over time through experimentation and corrective feedback. At the same time, the system may auto-complete or auto-correct grammar for users with cognitive or learning impairments to facilitate augmentative and alternative communications.
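As a check on the figure above, the count of 1,110 follows from allowing any of the 10 building blocks, with repetition, at each position of a 1-, 2- or 3-lexeme string, as the short Python calculation below shows:

```python
# Quick check of the combination count: strings of 1, 2 or 3 lexemes drawn,
# with repetition allowed, from 10 building blocks.
blocks = 10
total = sum(blocks ** length for length in (1, 2, 3))
print(total)  # 10 + 100 + 1000 = 1110
```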
[14] Some embodiments of the novel system use vocabulary, syntax and phonological factors to scaffold with a fine-grained level of detail. Scaffolds may be specialized instructional supports that facilitate acquisition of just enough knowledge and skill to
reach the next step in a learning sequence. This enables language learners to always be at the right level. The novel solution is possible for two reasons: One, the system can track which lexemes, word classes and phrase forms a user uses to construct phrases or sentences and further determine whether they are well formed or ill-formed syntactically and whether they are semantically intelligible. Two, the system can provide a fine-tuned sequence for how the target language progresses from simple to complex sounds, words and forms. Once this sequence is programmed, the system can present level-appropriate lexemes individualized for a user to use to construct phrases and sentences.
[15] "Receptive" vocabulary are words one understands, whereas "expressive" vocabulary are words one can produce. When first learning a target language, a language learner's receptive vocabulary far outstrips his or her expressive vocabulary. Most extant applications teach only receptive vocabulary or present receptive and expressive concurrently without any effective transition from receptive to expressive. Extant transitional techniques such as the "diglot weave," wherein users insert words from the foreign language words into native language sentences, offer users no choice of expression or room for experimentation. Some embodiments of the novel system instead provide a bridge between reception and production. A user who cannot fathom how to form text or speech into coherent words and phrases in the target language can kinesthetically manipulate graphical lexemes using a controller. The system may provide instructional supports that give real-time feedback to the user about his or her phrase experiments and allows the user to compose increasingly complex phrases in a scaffolded manner.
[16] "Prosody," or attributes of speech such as emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume that are not encoded in words or grammar, is an important element of meaning in language. Some embodiments of the novel system may vocalize prosodic attributes such as emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume as part of communication between user and machine or between two or more users, to better
assist a learner who might not yet understand a language's rules for expression (e.g., meaning of an exclamation mark).
[17] Some embodiments of the novel system described herein overcome or mitigate the problems inherent in previous learning methods. The system may afford a user a balanced degree of freedom to experiment and construct phrases or sentences in a target language using a visual-kinesthetic application without being forced to speak, read or type. It abstracts the variation in word forms, reduces cognitive load, and clarifies meaning by using the smallest possible discrete element of language, the "lexeme." And, it facilitates statistical learning and scaffolding that can gradually transition a user from target language reception to production at increasing levels of expressive sophistication and accuracy.
[18] Some embodiments of the novel system presented herein (individually and collectively referenced herein as "the system") enable users to construct and express language by visual, kinesthetic or other means without requiring (but not inherently excluding) vocalization, reading or transcription and without prior language knowledge or skill. This form of pseudo-verbal and pseudo-physical expression enables users to interact with other persons and/or software programs. In operation of some embodiments of the system, users construct a word, phrase or sentence in a target language in an interactive engagement with the novel system by using a controller to visually and kinesthetically combine one or more "Lex-Icons" on a graphical user interface. Each Lex-Icon represents one lexeme, which is the minimal unit of meaning in any language. During the construction process, multiple Lex-Icons may be assembled into a phrase or sentence of varying length called an "assembly." The result is a string of 1 to N lexemes, each lexeme being represented visually by a Lex-Icon, sequenced in a particular order and direction by visual and/or kinesthetic means that, together, compose a word, phrase or sentence in a target language. For example, a user might select one Lex-Icon that represents a dog, a second that represents the act of running, and a third that represents performing an action quickly. These three Lex-Icons could be arranged in the assembly DOG, RUN, QUICKLY. As
a result of the process described herein, each constituent Lex-Icon will be converted into natural language with a process that ultimately results in the sentence form "The dog runs quickly." The system may then execute one or more programmed actions based on the composition of the assembly. Depending upon the particular application derived from the system, it may also produce auditory output, such as a spoken voice that narrates the phrase or sentence aloud in the target language. Or, it may cause animated or static displays to demonstrate or illustrate the intended meaning of the assembly. For example, visually constructing the sentence "The dog runs quickly" could cause a dog to run rapidly across the screen.
[19] The system may operate in two temporal phases: design time and runtime. "Design time" may refer to the preparatory work needed to create and package a language- specific application prior to use. "Runtime" may refer to the time during which the system operates, i.e., a user loads an application wherein interactions between the user and the application may occur.
[20] In some embodiments, at design time, each target language may be analyzed and pre- processed in order to perform the following:
1. Enumerate the core vocabulary as a set of monosemic lexemes, each of which is stored in a database with a set of rules and metadata;
2. assign a value to each lexeme that can be used to determine the order in which lexemes are presented to a user;
3. define any prerequisite lexemes that are required to be presented prior to presenting another specified lexeme;
4. create lexical presentation sequences;
5. define phrase form patterns, any intermediate word classes allowed within the phrase forms, and enumerate any exceptions;
6. design and store visual objects to represent lexemes;
7. define pairing relationships among lexemes;
8. enumerate the inflectional paradigm for each lexeme in an inflective language, resulting in one or more word forms;
9. define tones and tonal transformations for lexemes in a tonal language and enumerate any exceptions;
10. define click consonants for lexemes in languages that use obstruents;
11. define the required order of sub-classes;
12. enumerate extended word forms to include elements that are particularly challenging to learn or translate such as articles, classifiers (e.g., measure words), copulae (e.g., linking verbs) and various possessive, tense and other grammatical particles;
13. pre-compute assembly validity checks;
14. define application settings such as spatial locations and temporal events; and
15. generate and package an optimized output file set, which may include a local (client-side) database that is constrained and/or flattened in order to reduce complexity and processing at runtime, resulting in more rapid runtime rendering.
[21] In some embodiments, the runtime workflow may be summarized below in simplified form:
1. The system loads user profile data and preferences, to the extent available.
2. The system presents a set of lexical items based on data previously collected about the user and based on contextual factors that could include spatial-temporal entities.
3. The user arranges Lex-Icons to create an assembly, using controller mechanisms provided by the specific hardware device.
4. The user or system submits an assembly.
5. The system parses the assembly and runs through a validation process.
6. If the assembly is invalid either semantically or syntactically, certain types of corrective feedback, such as visual evidence and/or audio feedback that the user's assembly is not well formed or is unintelligible, may be generated.
7. If an assembly is deemed valid, the system converts it into natural language by determining word form, tone, and other linguistic elements. It transcribes the full phrase or sentence.
8. The transcription may be narrated as audio or provided in another form of output.
9. The system may generate a programmed action such as an animation or other visualization of the assembly. The system may also generate a response assembly in the form of dialogue with other users or in the form of a virtual dialogue with an application.
10. Data are logged about the event, some of which may be displayed to the user.
11. The assembly is cleared and the user may create a new assembly.
This process, in the context of learning applications, may allow users to naturally acquire spoken language in a "non-instructional" sequence, i.e., without any
explicit discussion of word meanings or grammatical rules. Instead, the human mind interprets the patterns of sounds in a target language and attaches meaning based on situational contexts and what the brain unconsciously perceives as the statistical relevance. The process may also facilitate system-mediated communication among users using one or more languages or users requiring assistive technology. The system may be applied to any target language, regardless of writing system or phonology.
[23] The summary here is not an exhaustive listing of the novel features described herein, and is not limiting of the claims. These and other features are described in greater detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[24] These and other features, aspects, and advantages of the present disclosure will become better understood with regard to the following description, claims, and drawings. The present disclosure is illustrated by way of example, and not limited by, the accompanying figures in which like numerals indicate similar elements.
[25] FIG. 1 is a flow chart illustrating a process for how linguistic elements of a target language may be pre-processed and stored at design time.
[26] FIG. 2 is an illustrative graph diagram depicting the interrelationships among lexeme pairs and the direction of each relationship.
[27] FIG. 3 is a block diagram that functionally illustrates an embodiment of the runtime system.
[28] FIG. 4 is a flow chart illustrating a process for how a user may input a lexeme assembly at runtime.
[29] FIG. 5 provides a conceptual illustration of how the system may determine which Lex-Icons to present to a particular user.
[30] FIG. 6 provides a logical diagram for an organizational schema of the area used to visualize lexeme assemblies-in-progress.
[31] FIGS. 7A, 7B and 7C are a set of illustrative diagrams depicting how a user may rearrange a lexeme assembly at starting, intermediate and ending states, respectively.
[32] FIG. 8 is a logical diagram illustrating the syntactic hierarchy of a sample sentence, originally provided to the system by the user as an unordered string of lexemes.
[33] FIGS. 9A and 9B are a set of illustrative diagrams depicting how a user may associate one or more prosodic attributes with a lexeme or a lexeme assembly.
[34] FIG. 10 is a flow chart illustrating a process for how a lexeme assembly that is submitted by a user may be validated.
[35] FIG. 11 is a flow chart illustrating a process for the output of results and/or programmatic actions in response to a valid lexeme assembly submitted by a user.
[36] FIG. 12 is a flow chart illustrating a process for provision of corrective feedback in response to an invalid lexeme assembly submitted by a user.
[37] FIG. 13 is a flow chart illustrating a lexical rank-ordering algorithm.
[38] FIG. 14 is a flow chart illustrating a process for how a user may customize preferences.
[39] FIG. 15 is a block diagram showing the basic components of a general computing hardware device usable in conjunction with one or more embodiments of the system.
[40] FIG. 16 is a block diagram showing how the system may determine spatial-temporal context.
[41] FIGS. 17A, 17B, 17C, 17D, 17E and 17F are a set of illustrations depicting an embodiment of the system as a language learning application.
[42] FIGS. 18A, 18B, 18C, 18D, and 18E are a set of illustrations depicting an embodiment of the system as an augmentative and alternative communication software application.
[43] FIG. 19 describes a process for visualization of a word concept such as by means of a video or animation clip that demonstrates or models enacting the word concept.
DETAILED DESCRIPTION
[44] In the following description of various illustrative embodiments, reference is made to the accompanying drawings, which form a part hereof, and in which is shown, by way of illustration, various embodiments in which aspects of the disclosure may be practiced. It is to be understood that other embodiments may be utilized, and structural and functional modifications may be made, without departing from the scope of the present disclosure.
[45] The system may be embodied by one or more software applications that operate on an electronic device (hereafter a "device") including but not limited to a computer, tablet, phone, game console or any other desired type of computing device. The applications may be embodied as a series of instructions stored in one or more computer-readable media, such as solid-state disks or drives such as NAND-based flash memory, random access memory (RAM) such as Dynamic RAM (DRAM) and Synchronous DRAM (SDRAM), read-only memory (ROM), hybrid drives, optical disks, or a plurality of other types of computing device memory storage, whether or not yet invented, and which may be executed by one or more processors in the computing device to achieve the results described herein.
[46] LANGUAGE PRE-PROCESSING WORKFLOW
[47] FIG. 1 is a flow diagram of a method 100 for pre-processing a target language to create a system that operates using a series of methods 200 through 700. The steps shown in FIG. 1 may be performed once for each desired target language.
[48] In step 101, a "core vocabulary," which may be a set of the most frequently spoken words and phrases in a target language, is enumerated by the publisher of a given application and stored as a flat file. Note that there may be significant variance in frequency of usage between written and spoken language, especially in ideogram-based languages, so it is important to make the distinction that the novel system uses the spoken language as the sole basis for determining core vocabulary. The words in the core vocabulary will be used most naturally in conversation and are therefore of the most utility in trying to learn the target language and its phonology. The typical source for core vocabularies will be a type of parent-reported early vocabulary inventory referenced as a "Communicative Development Inventory" or CDI. The application publisher may narrow down or expand the CDI wordlist based on the particular learning objectives, level and intended audience for a given application. CDIs may also contain indicators of word use frequency such as the percentage of sampled children at a specific age who have each word in their receptive or expressive vocabulary. (For example: At 30 months of age, 85% of sampled native Mandarin speakers recognized the word PAO3, translated to English as "run" in the sense of "moving forward rapidly on two or four legs.")
[49] In step 102, polysemic words in the core vocabulary may be separated into lexemes, still in a flat file format. A "lexeme" may refer to a word or phrase in the target language that has the following properties: It holds a single semantic value (meaning); it may assume multiple word or tone forms; it is, in its entirety, a minimal unit of meaning and therefore un-parsable; it occupies one word class; and it is bound by rules of grammar in a given target language. By way of example, the English noun CRANE_N would be separated into the lexemes CRANE_bird and CRANE_construction. In those examples, the lexeme is represented by a capitalized root word, an underscore, and a second word elaborating on the meaning. Other representations may be used as well.
[50] In step 103, each lexeme is stored in the system's lexeme relational or graph database, as further described herein and referred to hereafter as the "LexDB." Each lexeme is a
unique database entry and is programmatically assigned a unique ID. The application publisher enters relevant information about each lexeme such as the root word in each target language, the lexeme's semantic value, its word class (e.g., verb) in each language and, if applicable a word sub-class (e.g., stative verb) or a meta-class (e.g., subject or object). It could also include one or more lexical categories for the lexeme that either apply globally to all languages or only to one or more specific languages. TABLE 1 illustrates conceptually one way of storing such data in the LexDB for four English words broken into six sample lexemes. In reality, the list could be considerably longer; RUN has at least 33 meanings. Only a subset of the relevant data is shown in the sample table.
TABLE 1:
In step 104, the LexDB may store a value for each lexeme entry in a given target language that enables the system to rank-order the lexemes and thus determine a "lexical sequence," which may be a sequence in which said lexemes may be presented to a user such as through leveling. Following is one illustrative example of how a lexical sequence value could be derived for a given lexeme by the application publisher: First, find a CDI (inventory) with a frequency value 1-100, representing a percentage of a population that understands the lexeme, for the lexeme for a given age. If PAO3 (RUN) in Mandarin is understood by 85% of 30-month-olds, a
frequency value of 85 could be stored in the LexDB, which could store multiple frequency values for a particular lexeme, depending on contexts such as age or other audience factors. Then the application publisher could apply weighting factors, such as based on the lexeme's word class, tone class, or length in syllables. For example, in English, nouns could be weighted more heavily than verbs because nouns are generally learned first in English, whereas Mandarin would weight both word classes equally. Lexemes with fewer syllables tend to be acquired first, so therefore lengthy lexemes could be underweighted. Tone classes may be weighted based on the sequence in which they are learned in the target language. For example, in Mandarin lexemes with tone value of "1" and "4" might be assigned a higher weight, whereas tones "2" and "3," which are typically acquired later, might receive a lower weight. Multiplying the raw frequency value (1-100) by one or more such weighting factors would result in a lexical sequence value, which would be stored in the LexDB for a given lexeme as applied to a given target language.
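By way of non-limiting illustration, the following Python sketch derives a lexical sequence value in the manner just described; the frequency and weighting values shown are hypothetical examples chosen by an application publisher, not values prescribed by the system:

```python
# Hypothetical sketch of deriving a lexical sequence value; the weighting
# factors are illustrative publisher choices, not fixed system values.

def lexical_sequence_value(frequency, word_class_weight=1.0, tone_weight=1.0,
                           syllable_weight=1.0):
    """Multiply a CDI frequency value (1-100) by publisher-defined weighting factors."""
    return frequency * word_class_weight * tone_weight * syllable_weight

# PAO3 ("run"): understood by 85% of sampled 30-month-olds; tone 3 is acquired
# later in Mandarin, so it is underweighted here; verbs are weighted neutrally.
value = lexical_sequence_value(85, word_class_weight=1.0, tone_weight=0.9)
print(value)  # 76.5 would be stored in the LexDB for this lexeme and language
```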
[52] In step 105, a LexDB table may store one or more prerequisite lexemes that are to be presented to a user in the runtime application prior to presenting another specified lexeme. A specified lexeme's prerequisite lexeme may be a lexeme that, according to the rules of the target language, should be learned before the specified lexeme. Prerequisites are likely to be applicable to educational applications but not to communications applications.
[53] In step 106, one or more lexeme presentation sequences are created, each sequence consisting of lexemes in the LexDB wherein such sequences are to be presented to a user in various forms as part of the user input method 200. A presentation sequence may also include other parameters including but not limited to a number of lexemes to be presented at any given time, a possibility set of word classes that may be presented at any given time, a permitted assembly length (the number of lexemes allowed in a phrase or sentence), and a permitted grammar set, as defined in step 107. TABLE 2 illustrates conceptually one possible way of expressing a lexeme presentation sequence, wherein "N" refers to a noun, "Adj" to an adjective and "V" to a verb. The
Maximum number of Lexemes Displayed provides the maximum number of lexemes that will be concurrently available in a given level or point in the presentation sequence.
TABLE 2:
In step 107, a grammar set may be defined as a series of phrase forms, with each phrase form stored in the LexDB database as a sequence of word classes, such as Noun, Verb, and Adverb, and/or sub-classes such as Dynamic or Stative Verbs. Note that each language may apply only a subset of all the word classes enumerated in the LexDB, since word classes vary by language. TABLE 3 illustrates conceptually how the grammar set could be stored in a relational or graph database. The limitation of three word class columns and two language columns is for brevity of illustration.
NOTE: The word classes 1, 2, 3, etc. in TABLE 3 could enumerate sub-classes such as Subject Noun, Common Noun, Proper Noun, etc. instead of simply Noun. The table, as shown, is a simplification.
[55] In step 108, the grammar set may be further defined as a series of rules establishing the permissibility of an intermediate word class between any two given word classes as shown in TABLE 4. The purpose is to reduce the number of permutations in TABLE 3.
TABLE 4:
[56] Other syntactic rules may be added likewise to the grammar set and stored in the LexDB for validation during the runtime process.
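The following non-limiting Python sketch illustrates how a grammar set might be stored as phrase forms (sequences of word classes, as in TABLE 3) together with intermediate-class permissibility rules (as in TABLE 4), and how a candidate sequence could be checked against them; the specific forms and rules shown are hypothetical:

```python
# Minimal sketch of a grammar set: phrase forms plus intermediate-class rules.
# The entries shown are illustrative examples only.

PHRASE_FORMS = {
    ("Noun", "Verb"),
    ("Noun", "Verb", "Adverb"),
    ("Noun", "Verb", "Noun"),
}

# (preceding class, following class) -> word classes permitted between them
INTERMEDIATE_ALLOWED = {
    ("Noun", "Verb"): {"Adverb"},
    ("Verb", "Noun"): {"Adjective"},
}

def matches_grammar(word_classes):
    """True if the sequence is a stored phrase form, or a stored phrase form
    with one permitted intermediate class inserted between adjacent members."""
    seq = tuple(word_classes)
    if seq in PHRASE_FORMS:
        return True
    # Drop a single intermediate class and re-test against the base forms.
    for i in range(1, len(seq) - 1):
        before, mid, after = seq[i - 1], seq[i], seq[i + 1]
        if mid in INTERMEDIATE_ALLOWED.get((before, after), set()):
            if (seq[:i] + seq[i + 1:]) in PHRASE_FORMS:
                return True
    return False

print(matches_grammar(["Noun", "Adverb", "Verb"]))  # True: Adverb allowed between N and V
print(matches_grammar(["Adverb", "Noun", "Verb"]))  # False
```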
[57] In step 109 a "Lex-Icon" is created for each lexeme in the Core Vocabulary and stored in or linked from the LexDB as a set of one or more files associated with each lexeme's unique ID. A "Lex-Icon" is a unit of meaning that represents one lexeme generally in a graphical form such as an image, sign (of a signed language gesture) and/or animation, but may alternatively be conveyed in text form, audio form, as neurobiologic signals, or as any other form capable of conveying the intended meaning. The means of representation may vary based on word class or other factors. In the earlier example of the English noun CRANE_N, the lexeme CRANE_bird would be represented by one Lex-Icon, an illustration and/or animation of a bird, whereas the lexeme CRANE_construction would be represented by a second Lex-Icon, an illustration and/or animation of a construction crane. Furthermore, Lex-Icons may identify the word class of the lexeme by having an additional differentiated visual
element, such as variation of the graphical background and/or border color, style, texture or other visual effect.
[58] The Lex-Icon is the minimal unit of language used in the runtime visual-kinesthetic language construction process that begins with step 201. During this process, multiple Lex-Icons may be assembled into a phrase or sentence of varying length called an "assembly." The result is a string of 1 to N lexemes, each represented by a Lex-Icon and sequenced in a particular order and direction that, together, compose a word, phrase or sentence in the target language.
[59] In step 110 any exceptions to the target language grammar set that are specific to a given lexeme are defined. Each phrase form exception is stored in the LexDB as a sequence of word classes, such as { Noun, Verb, Adverb } . TABLE 5 illustrates conceptually how phrase form exceptions could be stored in a relational or graph database, in the case of a fictional lexeme that is a noun. Null fields default to the global conditions set in steps 107 and 108 and any exceptions override the global condition. There may be 1 to N word classes; three classes are shown in TABLE 5 purely for illustrative purposes.
TABLE 5:
In step 111 a relationship is defined, whether manually or by means of a predictive subroutine, between each lexeme in the LexDB that may logically be paired. All subject nouns will be paired with all verbs; all adjectives will be paired with all
nouns; all adverbs will be paired with all verbs; and so forth for any other parts of speech that may govern or be dependent upon (e.g., modify) one another. Each pairing relationship has two attributes: intelligibility and dependency. Intelligibility has two possible values: true or false, depending on whether the particular pair of lexemes makes sense in the target language. (Pairing relationships are generally language-agnostic, but may have exceptions per language.) Pairing values are the language-specific frequencies in which lexemes are juxtaposed in assemblies submitted by users. The values for each attribute are stored in the LexDB and associated with a lexeme's unique ID.
Dependency also has two possible values: governor and dependent. A governor is a dominant lexeme that requires or allows another lexeme, the dependent, to exist in a sentence or phrase. One method of determining the direction of a dependency relationship for a lexeme pair is to use a hierarchical approach called "Dependency Grammar," a widely accepted linguistic method of hierarchically analyzing language that places the verb at the top of the hierarchy. Other methods may also be used. In dependency grammar, all lexemes associated with a verb are dependent upon that verb. Other rules include the following: A lexeme is syntactically dependent upon another lexeme if it requires that other lexeme in order to express meaning. In English, for example, an adverb is dependent on a verb. Conversely, a verb governs an adverb. Yet another example of dependency occurs when a lexeme modifies another lexeme, such as an adjective modifying a noun. Numerous other linguistic constituency tests exist for making governor-dependent distinctions, many of which are widely established.
TABLE 6 illustrates several sample pairing relationships for the lexeme EAT_V:
[62] FIG. 2 illustrates a directed graph of the interrelationships among lexeme pairs, represented by vertices, and the governing or dependency direction of each relationship, represented by edges (arrows). The directions originate from dependent lexeme to governing lexeme, in this case using the dependency grammar method.
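The following non-limiting Python sketch represents pairing relationships of the kind shown in FIG. 2 as a directed graph, with each edge running from the dependent lexeme to its governor and carrying an intelligibility flag; the lexeme names and entries are hypothetical:

```python
# Sketch of pairing relationships as a directed graph: edges run from the
# dependent lexeme to the governing lexeme, with an intelligibility flag.
# Lexeme names and values are illustrative only.

pairings = {
    # (dependent, governor): intelligible in the target language?
    ("QUICKLY_adv", "RUN_V"): True,    # an adverb depends on the verb it modifies
    ("DOG_N", "RUN_V"): True,          # a subject noun depends on the verb
    ("APPLE_N", "EAT_V"): True,
    ("APPLE_N", "RUN_V"): False,       # the pair exists but is not intelligible
}

def governors_of(lexeme):
    """All lexemes that govern the given dependent lexeme."""
    return [gov for (dep, gov) in pairings if dep == lexeme]

def is_intelligible(dependent, governor):
    """Intelligibility attribute for one dependent-governor pair."""
    return pairings.get((dependent, governor), False)

print(governors_of("APPLE_N"))              # ['EAT_V', 'RUN_V']
print(is_intelligible("APPLE_N", "RUN_V"))  # False
```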
[63] As an alternative to the pairing relationship, a list of predefined allowable assemblies may be created and stored in the LexDB. Such lists enumerate all allowable assemblies for a given application of the system as sequences of lexemes in a delimited format such as %lexemeA%, %lexemeB%, %lexemeC%, and so forth. In other words, if a user were to express an assembly of lexemes in a sequence identical to one of the predefined sequences in the LexDB, the assembly would be deemed valid.
TABLE 8 illustrates conceptually a partial predefined list for one application:
In step 112, an inflectional paradigm of each lexeme is enumerated and stored in the LexDB. An "inflectional paradigm" is a set of rules that determine when to apply specific "word forms," which are variations of a word, in order to express different
grammatical categories such as tense, person, number, or mood, to a given lexeme. This concept is applicable only to "inflective languages," which are target languages that have variable word forms, such as declensions and conjugations, for a given lexeme. Each word form will have a unique ID, with a common portion (e.g., the "065" in the example below) indicating the relationship.
TABLE 9 illustrates an example of such an enumeration for the English verb RUN_V:
[65] In step 113, for tonal languages, the tone is enumerated for each lexeme as a numerical value 1 to N. For example, in Mandarin Chinese, tones have a value 1, 2, 3, 4 or 5 (neutral), whereas in Zulu they may have a value 1, 2, or 3. Tone values may represent pitches such as high or low or changing in pitch such as rising or falling, for example.
[66] In step 114, the system may generate and store rules for tonal transformations, which are instances in which words are vocalized differently in natural speech based on context; these rules may be stored as logical arguments for each language. For example, in Chinese languages such as Mandarin and Cantonese, lexeme A and lexeme B may, in isolation, each have a certain tone when spoken individually, but when they are used together in a sequence, the tone of one or both lexemes A and/or B may change. To illustrate this in detail, TABLE 10 sets out a concrete example in Mandarin Chinese for the complex and difficult "tone sandhi" transformation rules. Tone sandhi occurs when a Mandarin lexeme with tone 3 (dipping) is positioned before another lexeme.
The possibilities may be enumerated as set forth in TABLE 10. In the table, Mandarin tones have a value 1 (high level tone), 2 (rising), 3 (dipping), 4 (falling) or 5 (neutral), and tone 3 may have the modifier "F" for a falling half-tone, so that "3F" refers to a "half-third" tone which falls but does not rise in tonal inflection. TABLE 10 illustrates a series of transformations. For each transformation listed, there may be a starting state sequence of tones, which may indicate the sequence of tones found in a sequence of lexemes that may be input by the user.
TABLE 10:
[67] For example, the transformation with the identifier value of 1 has a starting state sequence of {3,3}, which indicates that the sequence has a first lexeme with tone 3 followed by another lexeme with tone 3. The resulting state sequence of tones shows the changed sequence of tones due to a tonal transformation. In the first example in Table 10, the sequence of lexemes that originally were both tone 3 lexemes would be transformed in tone, such that the first lexeme now has tone 2 and the second lexeme has tone 3F.
[68] In some languages, the tones may also transform based on the number of syllables found in a particular word, so the tonal transformation rules may account for this as well. The second example in Table 10, ID number 2, shows an example in which a lexeme having two syllables each of tone 3 (represented as "3_2", where the tone
number is appended by an underscore and the number of syllables) is followed by a tone 3 lexeme having just one syllable (annotating the one is optional, since it may be assumed that the other tones lacking the underscore appendage are also one-syllable tones). The tones in this sequence of lexemes would be transformed to a tone 2, followed by another tone 2, and then a tone 3.
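By way of non-limiting illustration, the following Python sketch applies tone sandhi rules of the kind enumerated in TABLE 10 to a sequence of lexeme tones; only the two transformations discussed above are encoded, and the rule format is a hypothetical simplification:

```python
# Illustrative sketch of applying Mandarin tone sandhi rules (TABLE 10 style)
# to a sequence of lexeme tones. Only two transformations are encoded here.

TRANSFORMATIONS = [
    # (starting state sequence, resulting state sequence)
    (("3", "3"), ("2", "3F")),          # ID 1: two tone-3 lexemes in a row
    (("3_2", "3"), ("2", "2", "3")),    # ID 2: two-syllable tone-3 word, then tone 3
]

def apply_tone_sandhi(tones):
    """Replace the first matching starting-state sequence with its resulting state."""
    tones = list(tones)
    for start, result in TRANSFORMATIONS:
        n = len(start)
        for i in range(len(tones) - n + 1):
            if tuple(tones[i:i + n]) == start:
                return tones[:i] + list(result) + tones[i + n:]
    return tones

print(apply_tone_sandhi(["3", "3"]))    # ['2', '3F']
print(apply_tone_sandhi(["3_2", "3"]))  # ['2', '2', '3']
print(apply_tone_sandhi(["1", "4"]))    # unchanged: ['1', '4']
```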
[69] In step 115, the entry for each lexeme in the LexDB may enumerate any exceptions to the tonal transformation rules using a method such as established in TABLE 5. Any exceptions will override the language tonal transformational rules.
[70] In step 116, for click languages, the click consonants (obstruents) are enumerated for each lexeme as a numerical value 1 to N. For example, in Zulu, clicks have a value 1, 2, or 3. Click values may correspond to different types of click releases such as dental, palatal, alveolar, lateral and bilabial, for example.
[71] In step 117, where applicable, the required order of word sub-classes is defined for each target language and stored in the database. For example, in English if multiple adjectives are used contiguously to modify a noun, they must occur in the following sequence of sub-classes in order to be valid natural language: Determiners, Observation, Size, Shape, Age, Color, Origin, Material Composition, Qualifier.
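The following non-limiting Python sketch shows one way the required order of adjective sub-classes described in step 117 could be checked at validation time; the ranking list follows the English sequence above, while the function name and inputs are hypothetical:

```python
# Illustrative check of the required English adjective sub-class order.
ADJECTIVE_ORDER = ["Determiners", "Observation", "Size", "Shape", "Age",
                   "Color", "Origin", "Material Composition", "Qualifier"]
RANK = {sub_class: i for i, sub_class in enumerate(ADJECTIVE_ORDER)}

def sub_classes_in_order(sub_classes):
    """True if contiguous adjective sub-classes appear in non-decreasing rank."""
    ranks = [RANK[s] for s in sub_classes]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

print(sub_classes_in_order(["Size", "Age", "Color"]))  # True  (e.g., "big old red")
print(sub_classes_in_order(["Color", "Size"]))         # False ("red big" is ill-formed)
```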
[72] Step 118 enumerates any articles, classifiers (e.g., measure words), copulae (e.g., linking verbs), ergatives, evidential markers, reflexive markers, and various possessive, tense and other grammatical particles that need to be encapsulated with a given lexeme in various contexts and stores these in the lexeme's LexDB entry to generate an "extended word form."
[73] For example, in English, the lexeme I/ME may sometimes require the copula "am" to succeed it, whereas in other languages this copula is non-existent. Thus, the lexeme I/ME in English would have an entry as illustrated conceptually in TABLE 11.
TABLE 11:
In order to illustrate how word forms can be extended in different languages, TABLE 12 provides a second example in Mandarin Chinese for the lexeme WO3, which means I/ME in English. (The number superscripts refer to tones.)
TABLE 12:
Alternatively, or in addition to the predecessor/successor method described above, extended word forms could be determined by virtue of grammatical categories such as tense, person, number, or mood, as applied to a given lexeme. Using this method, the "running" form of the lexeme RUN_V may be enumerated as follows:
RUN_V
Running { verb_present-participle }
1. am { 1st-person-singular }
2. is { 3rd-person-singular }
3. are { 1st-person, 2nd-person, 3rd-person-plural; 2nd-person-singular }
TABLE 13 shows how TABLE 9 could be extended to capture this information for the word form "running" of the lexeme RUN_V. The field "preceding form" contains a word form that may precede the lexeme. Likewise, the succeeding form would contain a word that follows the lexeme. If the lexeme RUN_V had the form "running" (Word Form ID 065_3 in this example) and was identified as being in a first-person-singular context, it would meet the condition for Extended Form ID 065_3_1 and thus return the preceding form "am." In this way, the result would be the extended form "am running."
TABLE 13:
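In addition to the tabular representation of TABLE 13, the following non-limiting Python sketch illustrates how such an extended-form lookup might behave at runtime; the IDs, fields and values mirror the RUN_V example above but are purely illustrative:

```python
# Sketch of resolving an extended word form in the manner of TABLE 13: a word
# form plus a grammatical context returns any preceding form (e.g., a copula).
# IDs and fields are hypothetical.

EXTENDED_FORMS = {
    # (word form ID, context) -> (preceding form, succeeding form)
    ("065_3", "1st-person-singular"): ("am", None),   # Extended Form ID 065_3_1
    ("065_3", "3rd-person-singular"): ("is", None),
}

WORD_FORMS = {"065_3": "running"}

def extended_form(word_form_id, context):
    """Assemble the extended form by attaching preceding/succeeding forms, if any."""
    preceding, succeeding = EXTENDED_FORMS.get((word_form_id, context), (None, None))
    parts = [preceding, WORD_FORMS[word_form_id], succeeding]
    return " ".join(p for p in parts if p)

print(extended_form("065_3", "1st-person-singular"))  # "am running"
```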
[77] At step 119, all possible assemblies (sequential combinations of lexemes) that may occur at runtime are pre-computed using the various rules described above. The pre-computing process executes the runtime validation process outlined in method 300 (see FIG. 5) and generates a value of valid or invalid on both syntactic and semantic dimensions. (Providing a value on both dimensions serves as a sort of annotation as to the specific reason for failure, if invalid.) As a LexDB increases in size, it may become less practical to compute the full set of possible assemblies, in which case a limit may be placed on assembly length. For example, one could specify that only word pairs would be pre-computed or that each lexeme would be pre-computed to one relationship distant, i.e., its immediate governor and immediate dependent(s).
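The following non-limiting Python sketch illustrates step 119 with an assembly-length limit of two: every ordered lexeme pair is passed through stand-in validity checks and the result is stored so that runtime validation becomes a lookup. The lexicon, the valid-pair set and the placeholder checks are hypothetical:

```python
# Sketch of step 119 with an assembly-length limit of two: pre-compute a
# (syntactic, semantic) validity value for every ordered lexeme pair.
from itertools import permutations

lexicon = ["DOG_N", "RUN_V", "QUICKLY_adv"]
VALID_PAIRS = {("DOG_N", "RUN_V"), ("RUN_V", "QUICKLY_adv")}

def validate(pair):
    """Return (syntactic, semantic) validity flags for one candidate assembly."""
    syntactic = pair in VALID_PAIRS
    semantic = syntactic  # placeholder: real logic would use the pairing relationships
    return (syntactic, semantic)

precomputed = {pair: validate(pair) for pair in permutations(lexicon, 2)}
print(precomputed[("DOG_N", "RUN_V")])        # (True, True)
print(precomputed[("QUICKLY_adv", "DOG_N")])  # (False, False)
```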
[78] At step 120, a set of audio files, text files or other file types consisting of natural words or phrases for each variant of a lexeme's 1 to N word forms, extended word forms, and/or pre-computed assemblies in each target language are associated with the lexeme entry in the LexDB. The audio files may be recordings of a person speaking the words and text files may display or encode the spelling or appearance of the words. Other file types may include images, videos and animations visually representing a word concept. This output file set is further associated with each relevant word form, extended word form or pre-computed assembly in the database.
By "natural" it is intended that audio files shall approximate natural speech by using natural speech segmentation and vocal inflection.
As a result of the previous steps, the LexDB should include (a) a collection of metadata that describes each lexeme and the rules by which it operates, as a kind of "dictionary" consisting of TABLES 1-13, (b) word classifications that may include one or more lexical categories that apply globally or to a specific target language (such as words relating to family, transportation, greetings, measurement, emotions, computers, or any variety of other lexical categories), (c) a grammar set for the target language, (d) an inflectional paradigm, tone mapping and transformation system, and/or click consonant mapping system, as applicable, (e) a graph defining relationships among lexemes per FIG. 2 (or else a predefined list of valid assemblies), and (f) audio, text or other files (or links to such files) for each word/tone form and extended form. Below is one possible data set for one lexeme, greatly simplified and provided for illustrative purposes:
Lexeme ID: 065
Lexeme Name, English: RUN
Lexeme Name, Chinese: PAO
Global Semantic Value: to move forward quickly on two or four legs
Global Lexical Categories: actions, quick movements, motions
English Word Class: Verb
English Sub-Class: Action Verb
Mandarin Word Class: Verb
Mandarin Sub-Class: Action Verb
Word Forms Sub-Table, English:
Run: verb_present_1st-person,2nd-person,3rd-person-plural; verb_past-participle, %audio_file_link%
Copula: <null>, %audio_file_link%
Runs: verb_present_3rd-person-singular
Copula: <null>, %audio_file_link%
Running: verb_present-participle, %audio_file_link%
Copula: am { 1st-person-singular }, %audio_file_link%
Copula: is { 3rd-person-singular }, %audio_file_link%
Copula: are { 1st-, 2nd-, 3rd-person-plural; 2nd-person-singular },
%audio_file_link%
Ran: verb_past, %audio_file_link%
Copula: <null>, %audio_file_link%
To Run: infinitive, %audio_file_link%
Copula: <null>, %audio_file_link%
Tonemes Sub-Table, Mandarin:
PAO3: root, %audio_file_link%
PAO3F: transformed, %audio_file_link%
Pairing Relationships:
Quickly (Adverb): governing
He (Pronoun): governing
She (Pronoun): governing
Articles, English: <null>
Particles, Mandarin: de
De: Adverb-predecessor
Exceptions: <null>
Graphical Representations:
Lex-Icon static image, %image_file_link%
Animation key frame, %animation_file_link%
[80] In step 121, the application publisher or system administrator defines default settings for a given application and, optionally, for a given individual user or group of users. In some embodiments settings may include but are not limited to the following application and user profile settings:
(a) Custom Lex-Icons: a set of 1 to N custom Lex-Icons, each of which may represent a specific person, place or thing, real or fictional, including proper nouns. An application may, by default incorporate custom Lex-Icons. For example, if Johnny Appleseed were a character in a game that used the system, the application publisher could create a custom "Johnny" Lex-Icon at design time.
(b) Custom Visualizations: a set of 1 to N custom images, videos or animations that
correspond to any custom Lex-Icons and that may be used in the Visualization Area. As with custom Lex-Icons, the publisher may create a default set of custom
visualizations at design time. Alternatively, method 1000 may be used to use preexisting video or animation clips to demonstrate word concepts.
(c) Target Language: the language that the system outputs could be variable based on the current user profile, speaker (the user of the application), listener (a party with whom the speaker is communicating using the application), spatial and temporal contexts, device, or other pre-defined contextual factor. For example, a user could choose to output her submitted assemblies as English, except when she communicates with Juan she has the application set to output Spanish. The default language will generally be
the target language of the application. An application publisher may define a set of 1 to N languages available for output at design time.
(d) Output Voice Pattern: a voice pattern that may vary based on attributes such as age, gender, tone and mood, to use for narrated audio output based on the current user profile, speaker, listener, spatial and temporal contexts, device, or other pre-defined contextual factor. For example, a user could choose to associate a calm older male voice with one speaker, but change it to an excited young female voice when directed at a particular listener. The publisher may create a set of voice patterns at design time and assign specific patterns as default values. An application may allow users to select from available patterns, modify patterns or create new patterns and associate them with various attributes such as speaker, listener, spatial and temporal contexts, device, etc.
(e) Spatial and temporal contexts: Spatial and temporal contexts are important for both language learning and communication. For a language learner, for example, it may be helpful to prioritize vocabulary relevant to the time of day or the learner's current location. For augmentative and alternative communication, prioritizing the relevant vocabulary could enable more rapid communication rates for persons with disabilities. For example, during the course of a school day, a child is likely to spend time in places that each have a well-defined function: classrooms, the cafeteria, the nurse's office, the Speech-Language Pathologist's (SLP) office, the media center, etc. Each place requires different vocabularies. Spatial location may improve symbol prediction by matching word usage with specific locations and then presenting symbols of relevance. Temporal events can also be used to differentiate situations where location stays constant but context changes. For example, a room may be used for math and language activities at different times of day. Thus, the system may adjust lexical item prediction to that change. It also may predict the use of time-dependent words and phrases such as "good morning" or "lunch," as opposed to "good evening" or
"dinner."
[81] Spatial and temporal context can be defined by numerous methods, two of which are set forth here:
1. The first method is for a system user to enter data that associates a spatial-temporal entity with the combination of a time (e.g., a time of day on a day of the week) and/or a location (e.g., predefined geo-coordinates or a recognized positioning system location and a radius from either such point). For example, a user could identify an area of 20 meters in radius from a single point and label it as "home." The system may thus record the locations where particular assemblies were submitted, store these data in a relational or graph database, and allow authorized users to retrieve and display such data. The default location may be null or may default to the country or region in which the user is located. This may be known by location services or assumed based on the point of distribution of the application, for example. Similarly, a user may define one or more temporal events and set user-defined parameters such as a label, time ranges, recurrence and conditionals.
2. The second method is for the system to query against the corpus of assemblies submitted by the user and/or a set of users in the aggregate in a given spatial or temporal context. These two methods are not mutually exclusive. For example, the system could first check if the user has predefined a category and, if not, attempt to find the result with the second method.
[82] TABLE 14 below sets forth an example of how a user could enter data to define spatial and temporal contexts and associate the combination with a spatial-temporal entity for said user or for a different user. The first row shows spatial contexts, the first column shows temporal contexts, and the cells in the center of the table contain spatial-temporal entities that may be the result of a combination of spatial locations and temporal events. For example, at 10:00 on Mondays in the spatial location Classroom 101, the spatial-temporal entity would be "General Studies." This spatial-temporal entity may serve as a basis upon which to organize, store and retrieve lexical data and allow authorized users to retrieve and display such data. For example, the system may assign a semantic category based on spatial or temporal contexts or the combination of the two. A semantic category is defined as a grouping of vocabulary within a language that consists of lexical items that are interrelated in terms of meaning. For example, car and truck would both belong to a "vehicles" or "transportation" category. Alternatively, the system could return a list of lexical items.
TABLE 14:
| Monday or "A-Day" Schedule | Default Spatial Context | Classroom 101 | Cafeteria | Nurse's Office |
| Default Temporal Context | | General Studies | Food | Medical |
| 9:00 - 9:30 | Arriving | Arriving | {default} | {default} |
| 9:31 - 12:00 | Locker | General Studies | Assembly | {default} |
| 12:01 - 13:00 | Lunch | Lunch | Lunch | {default} |
| 13:01 - 14:00 | Locker | Art | Assembly | {default} |
| 14:01 - 15:30 | Locker | Math | Assembly | {default} |
| 15:30 - 16:00 | Packing Up | Packing Up | {default} | {default} |
[83] The list in the example given in TABLE 14 would continue for Tuesdays, Wednesdays, B-Days, C-Days, etc. and may offer a recurrence option (e.g., every Tuesday or every third Tuesday) or a date range option (e.g., for the dates 3/14/2018 until 3/17/2018). While the above table is two-dimensional (combination of time and space), the method also contemplates one-dimensional lookups, i.e., only time or only space being considered, as well as multi-dimensional lookups wherein other factors may be considered in addition to spatial and temporal contexts.
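By way of illustration only, the following Python sketch shows one possible way such a spatial-temporal lookup could be implemented. The table contents mirror TABLE 14, but the function names, data structures and fallback behavior are assumptions introduced for explanation and are not part of the described system.

```python
from datetime import datetime, time

# Hypothetical TABLE 14-style lookup: resolve a spatial-temporal entity from a
# (location, time) pair, falling back to a spatial default when no specific
# cell is defined. All names and values are illustrative.
CONTEXT_TABLE = {
    ("Classroom 101", (time(9, 31), time(12, 0))): "General Studies",
    ("Classroom 101", (time(13, 1), time(14, 0))): "Art",
    ("Cafeteria", (time(12, 1), time(13, 0))): "Lunch",
}
SPATIAL_DEFAULTS = {
    "Classroom 101": "General Studies",
    "Cafeteria": "Food",
    "Nurse's Office": "Medical",
}

def resolve_entity(location, now):
    """Two-dimensional lookup first; one-dimensional (space-only) fallback second."""
    t = now.time()
    for (place, (start, end)), entity in CONTEXT_TABLE.items():
        if place == location and start <= t <= end:
            return entity
    return SPATIAL_DEFAULTS.get(location)  # may be None if no default is defined

# 10:00 on a Monday in Classroom 101 resolves to "General Studies".
print(resolve_entity("Classroom 101", datetime(2018, 3, 12, 10, 0)))
```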
[84] At step 122, the pre-processed language data and pre-computed validation results may be packaged for a particular application on a particular device class in order to optimize access speeds at runtime and/or reduce or remove the need for runtime communication with a server. The packaging process may include (a) transforming the LexDB for faster processing (for example, if the LexDB were relational, it could at this step be flattened into a non-relational database), (b) constraining the set of data packaged to only the data required for a particular application (for example, only Arabic Level 1) or even a subset of an application, and (c) any other procedures that may optimize for an application or a device.
[85] Once the above preparation, data pre-processing and packaging steps are completed, a specific application of the novel system for a specific target language may operate. Such application may integrate or interface with other applications involving the same or different target languages. Such applications may be loaded onto a device and operated by a user. In such event, the runtime workflow process begins, as described in the succeeding section.
[86] RUNTIME SYSTEM
[87] FIG. 3 provides a sample screen that functionally illustrates an embodiment of the runtime system that will be used for explanatory purposes herein. The runtime system may take any number of visual forms. FIG. 3 provides a concrete visual example of one application of the system.
[88] Some alternative embodiments of the FIG. 3 interface may omit the features shown in dotted lines in FIG. 3. In other embodiments, the other elements shown in solid lines may also be omitted, rearranged, and/or presented differently. Following is an explanation of the FIG. 3 example elements:
[89] Block 1, the Speaker and Listener Selection, in some embodiments allows a user to view and/or modify the user profile of the speaker, i.e., the user who is composing assemblies, and the user profile of the listener(s), i.e., the user(s) with whom a speaker is communicating, if applicable. The process for and implications of such selections are described in method 700.
[90] Block 2, the Assembly Parameters Panel, in some embodiments contains parameters that a user may pass to the system when submitting an assembly. These may include parameters that identify the language in which output is to be narrated, signed and/or transcribed, tense and time indicators, plurality, contextual indicators such as place or season, or other options. The process for and implications of adding parameters are described in method 700.
[91] Block 3, the Settings, in some embodiments allow the user to change default preferences for the entire application, for a user profile or for other more specific conditions. The process for and implications of settings changes are described in method 700.
[92] Block 4, the Lexical Categories Panel, in some embodiments allows a user to choose a semantic category (or meaning-based grouping) and load 1 to N Lex-Icons within that category as one way of navigating a potentially large number of Lex-Icons in the system. (The category icon with the thick dotted line represents the currently selected icon.).
[93] Block 5, the Lex-Icon Panel, contains Lex-Icons that a user may select, place and rearrange to construct assemblies. If a semantic category has been selected in block 4, as illustrated by the thick, dark dotted lines, the Lex-Icons will be filtered to reflect only those Lex-Icons within said category. A user in some embodiments may use voice commands, text search or various other means to further filter the selection of Lex-Icons.
[94] Block 6, the Visualization Area in some embodiments is a graphical area with two purposes, dependent on context: (a) the area may be used to display real-time visualizations during assembly construction and (b) subsequent to assembly submission, the area may output images, signs, videos or animations depicting the assembly's meaning. During assembly, the user may select Lex-Icons from the Lex- Icon Panel, and the system may display a corresponding image, animation, sign or text string in a pre-defined location in the Visualization Area.
[95] Block 7, the Prosodic Attributes Panel, present in some embodiments, contains a set of graphical icons that may be activated with a controller and thus allow a user to choose one or more prosodic attributes and assign them to an individual Lex-Icon or to an entire assembly.
[96] Block 8, the Assembly Bar, provides a linear view of an assembly-in-progress. In some embodiments a user may manipulate the arrangement of Lex-Icons on the assembly bar, whereas in other embodiments the assembly-in-progress may be pre-validated at each step and automatically re-arranged by the system to reflect Lex-Icon choices with or without proper word order.
[97] Block 9, the Logged Assembly 1, illustrates that in some embodiments a user may retrieve a previously submitted assembly stored in the system's database in order to repeat or revise it.
[98] Block 10, the Logged Assembly N, illustrates that any number of logged assemblies may be stored for later repetition or revision.
[99] Block 11, the Predictive Panel, in some embodiments presents symbols representing lexical categories or individual Lex-Icons based on predictive analyses such as spatial, temporal, frequency and other factors outlined in method 600.
[100] USER INPUT WORKFLOW
[101] FIG. 4 is a flow diagram of a method 200 illustrating one of many possible sequences through which a user may input a lexical assembly in a target language. The various steps depicted may be performed by one or more computing devices, such as a personal computer, tablet, smartphone, desktop computer, etc., as described further below.
[102] Step 201 : A computing device may load an application program and may initialize values from the LexDB based on the choice of a target language and type of application. A given application may have one or more scenes displaying a visual environment, as well as logic for what results should occur based upon the set of possible actions that a user can perform by manipulation with a controller, which can include a computer mouse movement, finger movement or touch, keystroke, mechanical switch, gesture, remote control such as an infrared light signal, camera input, voice command or other vocalization, facial movement or expression (e.g.,
mouth movement), neurobiologic impulse such as a brainwave or eye motion, any other kinesthetic action including the actual or perceived sense of motion or touch, or any combination of mechanisms thereof.
[103] Step 202: The computer may attempt to identify the user and to load his or her "user profile," which is a data repository stored on the device and/or on a server that contains all pertinent data previously collected from or about or associated with the user. (For example, if a locally stored learning profile is outdated, the system may attempt to contact an application server to synchronize the user profile with a central database over a network or may use a locally cached database without connectivity.) User profile data may include a record of each assembly submitted and data such as whether it was semantically intelligible or unintelligible and what errors were made, if any; whether it was syntactically well formed or ill-formed and what errors were made, if any; how much time was expended to compose it; the lexemes composing each submitted assembly and whether each was used correctly; the word class of such lexemes and whether they were applied accurately; the phrase length and how that correlates to validity; the phrase form pattern; the word form or toneme each lexeme assumed; the spatial and temporal contexts of the submission; the listener, if identified at the time of submission; and any other data that may be captured by the system. The user profile may also contain the settings described in step 121. If a stored learning profile cannot be loaded, the process continues with step 205.
[104] If there is a user profile, then in step 203 the computing device may load any current preferences from the profile, such as Speaker and Listener Selection, Assembly Parameters, or Settings, as resolved in step 706. The user may modify these at any time before submission, as described in the method 700.
[105] In step 204, the computing device analyzes the user profile in order to determine the lexemes to be presented to an individual user at any given time in the form of Lex-Icons. When the system accesses the user profile, it may determine the user's current state of competency with regard to the relevant interactive and/or to the target
language overall. This state of competency may be based on factors such as demonstrated comprehension and received instruction. "Demonstrated comprehension" is measured on the basis of previous assemblies composed by the user that are well formed and/or intelligible. Such demonstrated comprehension may be for a specific assembly, a general word class, a particular phrase form, a particular lexeme, or other dimension and can be specific or general in nature, as required. "Received instruction" means that the system demonstrated a lexeme, phrase form pattern, phonological element, etc. to the user and is measured solely by frequency. Together, these measurements can be combined algorithmically to form a score or an explicit, sequential level (e.g., level 1, 2, or 3, etc.).
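The following Python fragment is a non-authoritative sketch of how such a competency score might be computed; the weights, the cap on received instruction, and the level thresholds are assumptions introduced for illustration rather than values taken from the specification.

```python
# Hypothetical combination of "demonstrated comprehension" (share of valid prior
# assemblies involving a lexeme) and "received instruction" (how often the system
# has demonstrated the lexeme) into a coarse competency level.
def competency_level(valid_uses, total_uses, times_presented):
    comprehension = valid_uses / total_uses if total_uses else 0.0
    instruction = min(times_presented / 5.0, 1.0)    # assumed cap: 5 presentations = fully instructed
    score = 0.7 * comprehension + 0.3 * instruction  # illustrative weighting
    if score >= 0.8:
        return 3
    if score >= 0.4:
        return 2
    return 1

print(competency_level(valid_uses=8, total_uses=10, times_presented=6))  # 3
```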
[106] In step 205, the computing device retrieves predictive data using the steps set forth in the method 600.
[107] In step 206, the computing device may filter the lexical items to determine which lexical items (lexical categories, lexemes and assemblies) in the LexDB (together, the "candidate pool") may be presented to the current user in the current context in the application. Lexical items may be disqualified on the basis of application-level considerations such as the scope of the application, by which is meant its learning or communication intent that may, for example, be bounded by grade level, skill level or curriculum considerations, or irrelevance to the selected lexical categories (in some embodiments, Lex-Icons in the Lex-Icon Panel in FIG. 3, Block 5 will only load if associated with the current semantic category) or may be disqualified on a user-specific basis, the latter of which generally applies only to educational scenarios. Information in the user profile may disqualify, from the candidate pool, those lexical items that fail to meet certain requirements. For example: lexical items that do not belong to a required word class specified in settings or other rules discussed above are disqualified. Lexical items that require a prerequisite identified in step 105 to be presented, but where the prerequisite is not yet in the assembly, are disqualified. Where an instance of the system limits the use of certain lexemes to a predetermined number of times set by the application, such lexemes will be disqualified if the
number of submissions as part of valid assemblies exceeds the predetermined limit. Lexical items that have an associated level in a lexical sequence (as determined in step 106 and stored in TABLE 2, for example) that is of equal or greater value than the level achieved by a particular user and stored in such user's profile will be disqualified. There may be any other combination of factors that may be specific to an application and which may disqualify lexical items based on similar types of rules.
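A minimal sketch of these disqualification rules, assuming hypothetical field names and a simplified rule set, might look as follows (the level test follows the FIG. 5 example, in which items requiring a level above the user's achieved level are disqualified):

```python
from dataclasses import dataclass, field

@dataclass
class LexicalItem:
    lexeme_id: str
    word_class: str
    categories: set
    required_level: int = 1
    prerequisites: set = field(default_factory=set)

def filter_candidates(pool, selected_category, allowed_classes, user_level, assembly_lexemes):
    """Drop items that fail category, word-class, level or prerequisite checks."""
    survivors = []
    for item in pool:
        if selected_category and selected_category not in item.categories:
            continue                                   # not in the current semantic category
        if allowed_classes and item.word_class not in allowed_classes:
            continue                                   # word class restricted by the application
        if item.required_level > user_level:
            continue                                   # requires a higher level than the user has
        if item.prerequisites and not item.prerequisites <= set(assembly_lexemes):
            continue                                   # prerequisite not yet in the assembly
        survivors.append(item)
    return survivors
```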
[108] In step 207, the computing device loads and presents the lexical items in the candidate pool that were not disqualified in the previous step in the context of the application and the user profile. Predicted lexical items may be presented within a Predictive Area, per FIG. 3, Block 11, or may be integrated into the Lexical Categories Panel (FIG. 3 Block 4) and/or the Lex-Icon Panel (FIG. 3 Block 5). The predictive data may also trigger the system to load the Lex-Icons in Block 5 for a particular semantic category. For example, if a spatial location and temporal event are combined into a single spatial-temporal entity such as "lunch," the semantic category "lunch" could load, along with all the associated lexical items for Block 5.
[109] In embodiments of the system where the Lexical Categories Panel is displayed to the user, the panel may display graphical elements representing available lexical categories. Lexical categories are containers for 1 to N lexemes or predefined or user-generated assemblies. A lexeme or assembly may belong to 1 to N lexical categories. Categories may include any form of lexeme or assembly categorization such as word classes, word sub-classes, multi-class groupings such as lexemes or assemblies about people, food, play, emergencies, etc. or system-generated categories such as aggregations of those lexemes or assemblies used most frequently by the user or by other users, lexemes or assemblies used most frequently in a particular spatial location, temporal event or factors discussed in method 600. Lexical categories are presented in the Lexical Categories Panel based on a pre-determined order, numeric rank order based on predictive data, a combination of the two, or any other means of ordering.
[110] FIG. 5 provides a conceptual illustration of one possible way in which the system may determine which lexical items (in this case limited only to Lex-Icons, solely for the sake of simplicity of explanation) to present to a particular user and how to rank-order them. In the illustration, first lexemes that are not in the currently selected semantic category (if applicable) are disqualified from the candidate pool shown on the left side of FIG. 5. In the illustration these disqualified Lex-Icons are grayed out. Next, assuming that our fictional use case refers to a learning application, lexemes that have not had their prerequisites presented to that user before are disqualified (marked "PR"). Next, suppose this particular level or activity in the application restricts one or more word classes, which disqualifies the corresponding lexemes from the candidate pool (marked "WC"). Then yet other lexemes are disqualified because the user profile indicates that the user is Level 2 in the application and some lexemes require a skill of Level 3 or above (marked "L3," etc. to indicate required level). The remaining lexemes are scored using a scoring algorithm set forth in method 600. In the conceptual illustration, scores are presented as integers on each icon wireframe. Finally, lexemes are sorted from high to low score and presented on a certain panel such as the Predictive Panel in numeric rank order. The rules described above are only an example and any number of rules may be applied to lexeme qualification, rank ordering and sorting.
[111] In step 208, the user may retrieve or select a new semantic category (other than the default) using a controller or other input device (e.g., touch screen, mouse, stylus, etc.).
[112] In step 209, the user retrieves or selects a Lex-Icon using a controller. The system enables the user to combine and order Lex-Icons into an assembly by manipulating the graphical representations. This is done by selecting Lex-Icons one at a time, by means of which they are moved to an area on the graphical user interface termed the "assembly bar," which is a graphical construction wherein one or more Lex-Icons are arranged by the user in a specific order. The user has the freedom to use any of the available Lex-Icons in any order and may therefore create a variety of possible
assemblies. Information stored in the user profile may set a maximum number of Lex-Icons permitted in an assembly; add or subtract specific requirements such as time performance; or implement any number of other ways to sequence the presentation.
[113] Alternatively in step 209, a user could add lexemes not represented by the set of Lex-Icons initially presented. For example: A user could convert a graphical image shown elsewhere on the device screen into a Lex-Icon by selecting that image using a controller (e.g., mouse pointer, touch screen, etc.); a user could perform a signed gesture and the system could use camera-based recognition to first digitize the sign and then convert it into a Lex-Icon, placing said Lex-Icon into the assembly bar; a user could type in text; or a user could speak and have such speech converted into Lex-Icons; or any number of other inputs and conversion methods.
[114] In step 210, as each Lex-Icon is selected by a user using a controller (e.g., touched, clicked, flicked, pressed, or otherwise indicated by a controller), the system may execute multiple response actions in rapid succession (preferably within a few milliseconds of activation so that the user's brain perceives the events concurrently). These actions may include: (a) parsing the Lex-Icon as an individual unit (without context) in its root form wherein, for example, the RUN lexeme would be parsed by default as "run" by looking up the lexeme ID in the LexDB and finding the root form in the target language (in this case English) as per TABLE 1, whereas in the context of a full assembly, which occurs in a later step, it might be interpreted as "runs" or "ran" or "running" by doing a condition-based lookup per TABLE 9; (b) narrating the root form by playing the audio file associated with the root form for such lexeme ID; (c) displaying a graphic or animation of the root word being signed in a sign language; (d) moving the selected Lex-Icon to a predetermined position in the assembly bar; (e) adding the image, animation, sign or text corresponding to the lexeme to a predefined place in the Visualization Area (see FIG. 3, Block 6) in step 211; (f) pre-validating the assembly currently under construction in its then-current state (as discussed in step 213); or any combination thereof. Information in a user's profile, such as level in a learning sequence, may impact the nature of the response
actions. In operation of the assembly bar in some embodiments, languages that are normally transcribed in a left-to-right direction (such as European languages) are left-aligned and the Lex-Icon moves to the right end of an existing assembly. The opposite approach may or may not be used for right-to-left languages (such as Semitic and Persian languages). Alternatively, user action may convert graphical elements representing lexemes to and from signs, text, objects and/or animations, rather than keeping them in the form of graphical elements.
[115] Further note that the narration or other form of output of a selected Lex-Icon in step 210 may vary based on application type and preferences loaded in step 203. For example, a communication application may output the lexeme in the user or speaker's preferred language, whereas an educational application may output in the target language at this step and only apply preferences to output after assembly submission.
[116] In step 210, the computing device may re-run Steps 206 and 207 each time a Lex-Icon is selected and added to the assembly bar, depending upon a given application's design. In such an event, the computing device in step 210 would first perform a lookup in the LexDB that examines the frequency of prior pairings of the currently selected lexeme with each other lexeme in the LexDB that has a governing-dependent relationship with the currently selected lexeme. The frequency of such prior pairings may be stored as a numeric "pairing value" (per TABLES 5 and 6): For example, say RED has a dependent relationship with APPLE. If RED was used to modify APPLE 8 times, it will have a score of 8. At runtime, the system looks up the pairing value of all lexemes in the candidate pool and uses such data to replace or augment the predictive algorithm in method 600.
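For illustration, a sketch of this pairing-value boost in Python might resemble the following; the pairing table and the weighting applied to it are assumptions rather than part of the specification.

```python
# Hypothetical pairing-value table: (dependent, governing) lexeme pairs and the
# number of prior pairings, as in the RED/APPLE example above.
PAIRING_VALUES = {
    ("RED", "APPLE"): 8,
    ("QUICKLY", "EAT"): 5,
    ("RED", "CAR"): 2,
}

def boost_scores(base_scores, selected, weight=1.0):
    """Add weighted pairing frequencies with the just-selected lexeme to predictive scores."""
    boosted = dict(base_scores)
    for (a, b), count in PAIRING_VALUES.items():
        partner = b if a == selected else (a if b == selected else None)
        if partner in boosted:
            boosted[partner] += weight * count
    return boosted

# Selecting RED boosts APPLE (by 8) relative to CAR (by 2).
print(boost_scores({"APPLE": 3.0, "CAR": 3.0, "EAT": 3.0}, selected="RED"))
```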
[117] In step 211, for those embodiments that build a "visualization," by which is meant a set of static and/or animated images that depict the meaning of the assembly-in-progress in a nonlinear fashion, this occurs as follows: Let us assume that the user constructs an unordered set of lexemes: GIRL, TALL, EAT, QUICKLY, APPLE, RED. The system first performs a LexDB lookup and returns the word class for each
lexeme: GIRLN, TALLADJ, EATV, QUICKLYADV, APPLEN, and REDADJ. Then the system places static or animated images (or signs) for each lexeme in a predefined place in the visualization area based on that lexeme's word class and role in the sentence. These predefined places may vary by application or target language, such as by whether a language predominantly uses SVO (Subject-Verb-Object), SOV (Subject-Object-Verb), or another basic syntactic structure. Word classes that modify the subject, verb and object, such as adverbs and adjectives, would be shown in a subsidiary position. One example of a visualization schema is illustrated in FIG. 6 wherein each space would generally contain an image, animated image or sign, rather than text, which is shown for explanatory purposes only. The images, signs or animations, for example, may be similar or identical to those used in the corresponding Lex-Icon.
[118] In step 212, in those embodiments using linear assembly arrangements, such as with educational applications, the user may rearrange Lex-Icons. A user may optionally modify or self-correct an assembly-in-progress (one not yet submitted) by reordering the Lex-Icons, as well as by adding Lex-Icons to the assembly (returning to step 205) or removing Lex-Icons from the assembly. A user may reorder Lex-Icons by using a controller to move them to the left or right in the assembly, thus displacing adjacent Lex-Icons one space to the left or to the right. The illustrations below demonstrate this operation. In the initial state, shown in FIG. 7A, assume there are three Lex-Icons ordered from left to right: A, B and C. Next, the user moves Lex-Icon A along the trajectory represented by the curved arrow below to the target represented by the dotted line in FIG. 7B. When the user completes the movement, the activated Lex-Icon A is inserted in the middle between Lex-Icons B and C, resulting in the ending state in FIG. 7C.
[119] Some embodiments may not require rearrangement, may not even allow it, or may auto-arrange the assembly based on context. These follow the non-linear workflow beginning with step 213. Applications using the linear workflow skip ahead to step 215.
[120] In step 213, in embodiments allowing non-linear assembly arrangements, such as with augmentative and alternative communication applications or cross-language communication between speakers of two different languages, the system may perform a pre-validation analysis of the assembly-in-progress using non-linear rules that ignore phrase forms. In these scenarios, the system cannot assume that the user is capable of arranging well-formed phrase forms. Therefore, in "nonlinear mode," the system will interpret assemblies as unordered sets of lexemes.
[121] For illustrative purposes, let us assume that a user has constructed an assembly-in-progress consisting of the lexemes GIRL, TALL, EAT, QUICKLY, APPLE, and RED. The system has in step 211 already performed a LexDB lookup and returned the word class for each lexeme: GIRLN, TALLADJ, EATV, QUICKLYADV, APPLEN, REDADJ. The system now executes a subroutine wherein it runs through a set of attempts to determine all possible permutations of the assembly that would lead to a well-formed sentence, as described in step 303 below. Where ambiguity exists, the system attempts to remove such ambiguity through a process of elimination. TABLE 15 illustrates this type of subroutine, wherein the function S represents the subject, O represents the object and M represents a modifier. First, the process may seek to identify the subject of the sentence. To do so, the system would find a lexeme of the word class Noun and sub-class Subject Noun that is a dependent of the verb(s) in the assembly. Since TALL, EAT, QUICKLY, APPLE and RED all fail to meet this condition, the resulting value must be GIRL. Since GIRL is the subject and EAT is the verb, we can eliminate those lexemes from further steps. Next, the system identifies the object of the sentence, which must also be a dependent of the verb EAT. Based on a lookup of the word pairing relationships, per TABLES 6 and 7, the system finds that TALL and RED have no pairing relationship with EAT, whereas QUICKLY and APPLE do. QUICKLY, however, is an adverb and not a common noun and so is eliminated, leaving APPLE as the resulting value. APPLE is now removed from further steps. Next, the system seeks an adverb that is a dependent of the verb EAT. Of the remaining lexemes, TALL, QUICKLY, and RED, only
QUICKLY is of the adverb word class, so the other two lexemes are eliminated and QUICKLY is the resulting value. QUICKLY is removed from further steps. Next, the system identifies any noun modifiers such as adjectives or measure words. The remaining lexemes TALL and RED are both adjectives and thus potential modifiers of both nouns in the assembly: GIRL and APPLE. Thus, the system must perform lookups for both lexemes to check the associated word pairs. It would find that, while GIRL could be modified by both TALL and RED, the noun APPLE may only be modified by RED and is not paired with TALL. Therefore, the system is able to identify RED as the resulting value for APPLE and, by process of elimination, identify TALL as the resulting value for GIRL.
TABLE 15:
*** For the sake of the example in TABLE 15, let us assume that RED can modify GIRL based on rules in the LexDB.
[122] Once the system has found values for each attribute, it positions each lexeme in relation to the verb, in this case EATV. For example, in English the subject noun precedes the verb and the object (also a noun) succeeds the verb. The system knows this based on the
grammar sets for a given target language determined at design-time, as per TABLES 3-4. Thus, the basic sentence structure for our example must be _ GIRL _ EAT _ APPLE _. Adjectives in English must precede the nouns they modify. Thus, the sentence can be further deduced to take the form TALL GIRL _ EAT _ RED APPLE _. Adverbs in English may have multiple placements such as before the verb or at the end of a sentence without changing meaning and thus the system may randomly determine a location for this word in the sentence to arrive at TALL GIRL QUICKLY EAT RED APPLE. This will later (in the method 400, post-submission process) be translated into natural language using extended word forms. For now, the system interprets this sentence hierarchically as [[TALL] GIRL] [QUICKLY] EAT [[RED] APPLE] before adding extended word forms at a later step. FIG. 8 illustrates this hierarchy.
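The adjective-attachment portion of this process of elimination could be sketched as follows in Python; the pairing set (including the footnoted RED-GIRL pairing) and the function names are assumptions used only to mirror the running example, not the actual TABLE 15 logic.

```python
# Known intelligible (modifier, noun) pairs for the running example (assumed).
PAIRS = {("TALL", "GIRL"), ("RED", "GIRL"), ("RED", "APPLE")}

def attach_modifiers(adjectives, nouns, pairs=PAIRS):
    """Assign each adjective to one noun, resolving ambiguity by elimination."""
    assignment, remaining, progress = {}, list(adjectives), True
    while remaining and progress:
        progress = False
        for noun in nouns:
            candidates = [a for a in remaining if (a, noun) in pairs]
            if len(candidates) == 1:        # only one adjective can still modify this noun
                assignment[candidates[0]] = noun
                remaining.remove(candidates[0])
                progress = True
    # Any leftover adjectives are ambiguous and would be routed to error-handling step 501.
    return assignment, remaining

print(attach_modifiers(["TALL", "RED"], ["GIRL", "APPLE"]))
# ({'RED': 'APPLE', 'TALL': 'GIRL'}, [])
```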
[123] If the pre-validation analysis fails to identify a possible valid arrangement of lexemes or identifies multiple valid arrangements that would result in different meanings, the system next executes error-handling step 501. See FIG. 12.
[124] In step 214, based on the pre-validation analysis results of step 213, the system may auto-arrange the assembly into a well-formed sentence on the Assembly Bar.
[125] With each new Lex-Icon added to the assembly, steps 213 and 214 may repeat in those embodiments using non-linear pre-validation. Alternatively, other embodiments of the system could perform non-linear validation only after submission of an assembly.
[126] In step 215, a user may select one or more prosodic attributes, such as an emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume, or other prosodic attribute, in order to augment the meaning of a Lex-Icon or assembly. A user makes this selection from the Prosodic Attributes Panel and then associates such attribute(s) with a specific Lex-Icon or with an entire assembly-in-progress. A user may create, modify or remove such associations at any time prior to assembly submission.
[127] For example, one prosodic attribute could be an angry tone of voice. A user could indicate a prosodic attribute, graphically represented with an icon (e.g., an emoticon)
displayed on the Prosodic Attributes Panel, Block 7 in FIG. 9A, and then indicate a target for that attribute in Block 8.
[128] FIG. 9B shows two examples of associations for the sample attribute "angry": The circle marked "A" in Block 8 represents an association between the "angry" prosodic attribute and a Lex-Icon and "B" represents an association with the entire assembly.
[129] The result of such associations may be as follows: (a) re-narrate the lexeme or assembly, as per step 210, (b) re-visualize the lexeme or assembly, as per step 211, if applicable, and/or (c) modify the post-validation visual, auditory or other form of output described in the method 400. For example, association of a prosodic attribute such as "angry" may cause a Lex-Icon to be narrated or animated in step 406 with the relevant emotional tone of voice, syllable length, syllable stress, voice pitch and voice volume, etc. or may provide visual indicators of mood such as images, symbols or signs. Prosodic attributes in some languages may also generate attitudinal particles such as "la," "ya," and "a" in Mandarin. (There are 6 such particles in Mandarin and 30 in Cantonese, for example.).
[130] In step 216, the user submits an assembly in the target language. The user determines when an assembly is complete and indicates this by activating a trigger using the controller, e.g., pressing a button. By doing so, the user "submits" said assembly, which sends it to the system to be evaluated. Alternatively, the system may auto-submit an assembly on behalf of a user under conditions defined at design time or by user settings. For example, in an assistive technology application, such application might be programmed to auto-submit an assembly after a certain time elapses if a semantic category called "911-Emergency" was active.
[131] The next several steps represent various stages in parsing the assembly, evaluating it against validity criteria, and converting it from a simple string of lexemes into natural language.
[132] ASSEMBLY VALIDATION WORKFLOW
[133] FIG. 10 is a flow diagram of a method 300 for the system to validate a lexeme assembly that has been arranged by a user.
[134] In step 301, the computing device determines the assembly validation method. The workflow diverges based on the method selected. Two methods may entail (a) using a predefined checklist that provides a binary decision about whether an assembly is valid, which begins at step 302, or (b) using a set of logical conditions that determine whether an assembly is valid, which begins at step 303. Other methods are possible.
[135] In step 302, the computing device performs a lookup of predefined assemblies stored in the LexDB, as per step 111, and returns a binary intelligibility value of true or false: If the user-arranged assembly exists, it returns valid = true. Otherwise, it is invalid. If valid, proceed to step 306 (logging) and then step 401 (VALID flow). If invalid, proceed to step 306 (logging) and then step 501 (INVALID flow). With applications using linear grammar rules, the predefined assembly order of lexemes must match the submitted assembly order. With applications using non-linear grammar rules, predefined and submitted assemblies may simply contain the same lexemes, without regard to order.
[136] Step 303, in those embodiments where linear assembly arrangements are used, determines whether the assembly is well formed or not. For applications using non-linear grammar rules, the process skips ahead to step 305. "Well formed" means that a phrase form is ordered in a valid sequence according to the rules described above. To determine this, the system looks up each lexeme in the LexDB to find its word class and sub-class and compares their relative positions in the context of the submitted assembly and against the allowable phrase form patterns and intermediate word classes in the LexDB, as per TABLES 1 and 2, as well as any sub-class required order defined in step 117. The result of each check may be a simple "yes" or "no" value (well formed or not) or may provide a score based on gradations of how well formed the assembly is. In the former case, all checks must return a "yes" value for any assembly to be considered well formed. In the latter case, the assembly will be considered well formed if the score exceeds a pre-defined
threshold. If the assembly is well formed, proceed to step 304. Otherwise, it is invalid and will proceed to step 306 (logging) and then to step 501.
[137] For some embodiments of the system, such as communications applications, the Speaker and Listener Selection in Block 1 of FIG. 3 could impact phrase form rules. For example, selection of a speaker could eliminate the need for a pronoun. If a speaker is the user (I/ME), then first person is assumed and I/ME could be used as the default pronoun if one is not explicitly selected by the user. Thus, the rule change in this sample case would be to allow phrase forms that are missing a subject.
[138] In step 304, the computing device analyzes the distribution of the lexemes in the assembly and uses the lexeme pairing relationship data defined in step 111 to find the syntactic dependencies between each lexeme. This will result in a hierarchical structure wherein each lexeme in the assembly governs or is dependent on the other lexemes. The hierarchy may have multiple tiers such that a lexeme "A" may be dependent on a lexeme "B" which is dependent upon a lexeme "C." This would be expressed as follows: [[A] B] C. For example: If the user submits a lexeme assembly in the sequence DOGN, QUICKLYADV, RUNV, then a distribution analysis based on the dependency grammar method will return the nested hierarchical structure of [DOG] QUICKLY [RUN], wherein DOGN and QUICKLYADV are dependent on RUNV. (Note: In natural language, this would read "The dog quickly runs.").
[139] In step 305 the computing device checks against the LexDB to see whether the specific lexeme pairs are semantically intelligible. Two criteria are met in order to return a "Yes" value: (1) a pairing relationship must exist in the LexDB between each lexeme in the assembly that has been syntactically identified as governing or dependent on one another and (2) the relationship direction in the LexDB, governing or dependent, must match the relationship direction in the assembly. If either criterion fails, a value of "No" is returned. All lexeme pairings within an assembly must be intelligible in order for the assembly to be deemed intelligible. If any pairings return a value of "No," the assembly will be
deemed either fully or partially unintelligible, depending upon whether gradations of intelligibility are permitted by the system.
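A compact sketch of this intelligibility check, with a stand-in pair set in place of the LexDB lookup, might read as follows; the set contents and function names are assumptions for illustration.

```python
# Assumed (dependent, governor) pairs recorded in the LexDB for the DOG/RUN example.
KNOWN_PAIRS = {("DOG", "RUN"), ("QUICKLY", "RUN")}

def assembly_intelligible(dependency_pairs, known_pairs=KNOWN_PAIRS):
    """True only if every (dependent, governor) pair exists, in that direction, in the LexDB."""
    return all(pair in known_pairs for pair in dependency_pairs)

print(assembly_intelligible([("DOG", "RUN"), ("QUICKLY", "RUN")]))  # True
print(assembly_intelligible([("RUN", "DOG")]))                      # False: direction reversed
```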
[140] In step 306 the computing device logs information about the event and adds it to the user profile. These data may include the following: The contents of the submitted assembly, i.e., the specific sequence of lexemes; whether the submitted assembly was semantically intelligible or unintelligible and what errors were made, if any; whether the submitted assembly was syntactically well formed or ill-formed and what errors were made, if any; how much time was expended to compose it; the unique IDs of each lexeme composing the submitted assembly and whether each was used correctly; the word class of such lexemes and whether they were applied accurately; the phrase length and how that correlates to validity; the grammatical pattern (Noun-Verb, Adjective-Noun, Verb- Adverb, etc.); the word form or toneme each lexeme assumed; a timestamp; the spatial and temporal contexts of the device at the time of submission, if location services are enabled on the device; and/or any other relevant information.
[141] Valid assemblies proceed to step 401. Invalid assemblies proceed to step 501.
[142] VALID ASSEMBLY WORKFLOW
[143] FIG. 11 is a flow diagram of a method 400 for the system to output results and/or perform programmatic actions in response to a valid lexeme assembly in a target language.
[144] In step 401 the computing device determines the word form for each lexeme in an assembly for the target language. This is done by (a) looking up its word class in the LexDB, then by (b) checking the lexeme's relative position and context in the assembly to determine lexeme usage variables such as person and tense, and finally by (c) selecting the correct word form for the lexeme's inflectional paradigm that was enumerated in step 112 or the tonal transformation paradigm enumerated in step 114, depending on the target language. Other paradigms may be enumerated and checked, as necessary for the target language.
[145] To continue the earlier example of the assembly [DOG] QUICKLY [RUN], the system would first analyze the English lexeme RUNV since step 304 found that it is the governor of the other two lexemes. First, RUN is (a) established to be a verb by a LexDB lookup. Then, analysis of the assembly in sub-step (b) finds that it is used in a 3rd-person singular context. This is known because the dependent subject noun DOG is inherently 3rd-person and singular is the default form for a noun unless plurality is indicated by a user such as via FIG. 3, Block 2. Finally, the system (c) runs through the list of possible word forms enumerated at design time, per TABLE 9, as set forth below:
1. Run { verb_present_1st-person,2nd-person,3rd-person-plural; verb_past-participle }
2. Runs { verb_present_3rd-person-singular }
3. Running { verb_present-participle }
4. Ran { verb_past }
[146] Since it was established that the context is { verb_present_3rd-person-singular } , the operation in the current step would return the word form value "runs." This step loops until all lexeme word forms are determined.
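As a hedged illustration, the lookup in sub-steps (a)-(c) could be modeled with a simple mapping keyed by usage context. The dictionary below paraphrases the TABLE 9 example for RUN and is not the actual LexDB schema; names are hypothetical.

```python
WORD_FORMS = {
    "RUN": {
        "verb_present_1st-person": "run",
        "verb_present_2nd-person": "run",
        "verb_present_3rd-person-plural": "run",
        "verb_past-participle": "run",
        "verb_present_3rd-person-singular": "runs",
        "verb_present-participle": "running",
        "verb_past": "ran",
    }
}

def word_form(lexeme_id, context_key):
    """Return the inflected form for the usage context determined in step 401(b)."""
    return WORD_FORMS[lexeme_id][context_key]

print(word_form("RUN", "verb_present_3rd-person-singular"))  # runs
```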
[147] In step 402, the computing device performs a lookup of the word form extension table in the LexDB, for each lexeme in the assembly, and may return one or more words or morphemes to append either before (preceding) or after (succeeding) the word form that is determined in step 118, per the example in TABLE 13. This step loops until all lexemes in the assembly either: 1) have any required extended word form appended; or 2) are determined not to require any change, as applicable. In the earlier example of DOG QUICKLY RUN, the subject noun DOG may have an extended word form "a dog" stored in the LexDB, per TABLE 13, where the preceding form has a null value.
[148] In step 403, the computing device determines each lexeme's tone form for tonal languages. First, (a) the application performs a LexDB lookup to recall the lexeme's tone value. Next, (b) the tonal transformation table created in step 114 is examined to determine if any transformations are to be applied to the lexeme. Finally, (c) the tonal exceptions table created in step 115 is checked to apply any exceptions that override the
tonal transformation. This results in a final tone form. This step loops until each lexeme in the assembly has a final tone value.
[149] In step 404, the computing device reconstructs the full assembly as natural language in text form. All lexemes have in previous steps been converted to extended word forms (if applicable) and tone classes. This step simply reassembles them in the order submitted by the user. The resultant phrase or sentence is stored in memory as a transcription in the target language. Text may or may not be displayed to the user at this point. In the earlier example of DOG QUICKLY RUN, the reconstructed assembly would be converted to "The dog quickly runs." Note: For ideogram-based languages, a Latinized text may be used as an alternative to ideograms. For example, with Mandarin, pinyin could be used and then numbers could be inserted to represent tones.
[150] In step 405, the computing device determines what, if any, programmed action to perform based on the submitted assembly. Programmed actions are discussed below, and are generally a programmed response to elements in the assembly. The system first (a) determines the course of action based on pre-defined logic and then (b) stores that decision in memory. This step does not necessarily execute the programmed action.
[151] Programmed actions may include but are not limited to the following: (a) modification or presentation of images, animations, video, texts, narrated audio, or other visual or auditory elements, among which could be a visualization as described in method 1000; (b) launching or branching of a new programmed process or any other number of visual, auditory or kinesthetic actions to occur on or through the device; (c) changing the set of Lex-Icons made available to the user; (d) presenting Lex-Icons back to the user in the form of a dialogue with the system wherein the application presents system-generated Lex-Icons on the assembly bar and may output them in any of the ways a user-submitted assembly could; and/or (e) presenting the assembly or various forms of output to another user who may then respond in kind as in (d). Examples of launching a new programmed process could include sending the assembly output as transcribed, visual or voice- recorded communications to one or more external applications or devices as an email,
audio file, text file, image file, animation file, instant message, SMS (a.k.a. text message), computer-generated voice message, voice call, video call, or other electronic communication medium or combination thereof. The application could also initiate a text, voice, video or multimedia communication session and assemblies could be transformed into the output type(s) required by such communication medium.
[152] For a given combination of Lex-Icons, there may potentially be multiple possible valid assemblies which each lead to a different set of potential pre-programmed responses. (For example, in the sample provided in FIGs 16A-D below, "jump" would lead to one response branch and "fly" to another, and the different responses may include different animations of the animal jumping or flying, respectively.).
[153] Per (e) above, the system supports at least three modes of application-mediated interpersonal communication: (i) two or more users may synchronously or asynchronously communicate each using the system on a separate device, each running the application; (ii) two users may synchronously communicate using the system on the same device, wherein each user interacts with separate controls or screen areas; and (iii) two or more users may asynchronously communicate using the system on the same device, each taking turns as a speaker or listener by indicating such with the Speaker and Listener Selection. Per (d) above, the system also supports communication between a user and the system, i.e., a machine, whether pre-programmed or artificially intelligent.
[154] In step 406, the computing device outputs the parsed assembly in natural language form, which may include narrated audio, text, signs, symbols, animation, video, neurobiologic impulses, or other means of conveying language. Each lexeme word/tone form or extended word form may have a unique audio file or other unique output, as discussed in step 120.
[155] Step 407 executes the programmed action(s) determined in step 405. As noted, the programmed actions may entail a variety of responses, such as an animation of an object, a playing of a sound, or any other desired response as noted above.
[156] Step 408 outputs the reconstructed assembly text transcription to a log that may be displayed or accessed immediately and/or at a later time. Using such log, a user may repeat or retract submission of a previously submitted assembly or otherwise retrieve and revise such arrangement.
[157] Step 409 clears the submitted assembly and then presents a new set of Lex-Icons and allows the user to input a new assembly. The process returns to step 203 (unless the presentation sequence contains logic that ends interactivity after a certain point or level).
[158] INVALID ASSEMBLY WORKFLOW
[159] FIG. 12 is a flow diagram of a method 500 for the system to provide corrective feedback based on invalid lexeme assembly input. It illustrates one possible example of how error- handling protocols may diverge based on application type or other factors in application design.
[160] In step 501, the computing device determines or customizes the error handling rules based on the application type. For example, cross-language, augmentative, and other communications applications may each have unique error-handling protocols as might educational applications. Based on the application type, the system may ignore, auto-correct or inform the user through corrective feedback of errors related to syntax or semantics.
[161] In step 502, the computing device determines corrective feedback for the given application type. Such feedback may vary based on 1 to N conditions such as the number of preceding failed attempts. Steps 503, 504 and 505 each represent possible alternative types of corrective feedback based on conditions X, Y and Z, respectively. (There may be any number of conditions.) For example, in an educational application, a counter could be set that increments by a value of 1 with each incorrect assembly submission. If the value reaches 1 (condition "X"), the process would proceed to step 503. If the value reaches 2 (condition "Y"), it would proceed to step 504. If a correct assembly was submitted, the counter would reset to 0. This example might not have a condition "Z" and therefore no
step 505. In another example of peer-to-peer communication, condition "X" may be a scenario where an assembly-in-progress or submitted assembly has two or more possible correct phrase forms that cannot be differentiated without affecting meaning, whereas condition "Y" may only be triggered if user settings are set to provide such feedback, and condition "Z" may be the default otherwise. This is to say that there may be a plurality of possible alternatives for providing corrective feedback as well as methods for determining which alternative to use. The following three steps are merely examples.
[162] In step 503, the computing device runs one or more pre-programmed "wrong answer" cues, such as audio or visual feedback. Thereafter, the process returns to step 212 for user self-correction.
[163] In step 504, the computing device auto-corrects the submitted assembly. There are a plurality of means by which auto-correction may occur. An illustrative example follows: The Lex-Icons could move into the correct order, using a minimal distance principle in the case of multiple possible right answers in the answers array. For example, if the user constructed an assembly consisting of three lexemes, but did so in the wrong order, the system could perform a LexDB lookup of any and all possible correct orders for an assembly consisting of the same three lexemes and return the most similar assembly (requiring the fewest and shortest reordering movements) and then proceed to reorder the Lex-Icons to match the returned assembly. Thereafter, the process returns to step 401.
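One possible (illustrative, not prescriptive) realization of this minimal-distance selection in Python follows; the candidate orderings and the distance metric (count of differing positions) are assumptions.

```python
def autocorrect(submitted, valid_orders):
    """Pick the valid ordering that differs from the submitted order in the fewest positions."""
    def displaced(order):
        return sum(1 for a, b in zip(submitted, order) if a != b)
    return min(valid_orders, key=displaced)

# Hypothetical example: two valid orderings exist for the same three lexemes;
# the second differs from the submission in fewer positions, so it is chosen.
valid = [["GIRL", "EAT", "QUICKLY"], ["GIRL", "QUICKLY", "EAT"]]
print(autocorrect(["QUICKLY", "GIRL", "EAT"], valid))  # ['GIRL', 'QUICKLY', 'EAT']
```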
[164] Step 505 in our example provides no feedback and simply executes step 409.
[165] LEXICAL ITEM RANKING ALGORITHM
[166] FIG. 13 is a flow chart illustrating a lexical rank-ordering algorithm for the purpose of determining which lexical items to present most prominently to a user. As with the other various algorithms described herein, the FIG. 13 algorithm may be performed by one or more computing devices, executing computer-readable instructions, to implement the language learning system described herein.
[167] In method 600, the system may rank lexical categories, lexemes and assemblies (hereafter "lexical items"), in those embodiments where a predictive algorithm is desirable, by calculating a numeric ranking value using as a basis factors including but not limited to the following ranking factors: (a) the frequency and recency with which the current speaker has used each lexeme, (b) the frequency and recency with which other system users have used each lexeme, (c) the frequency and recency with which the current speaker has used each lexeme in the current spatial location, if known, (d) the frequency and recency with which the current speaker has used each lexical item in a given temporal setting, (e) the frequency and recency with which the current speaker has used each lexeme with the current listener, if known, and/or (f) any variety of other factors that may weigh in a predictive algorithm. The output of this process, a numeric ranking value, is used by the system in step 206 when sorting the display of Lex-Icons or icons representing lexical categories or assemblies.
[168] At step 601, the LexDB may be pre-populated at design time by an application publisher with frequency values, which may be based on a source such as a CDI, as per step 101. As noted above in step 101, the LexDB may be provided with frequency values indicating, for a given lexeme and for a given population (e.g., native Mandarin speakers of 30 months of age), a percentage of the population that understands the lexeme.
[169] At step 602 a system administrator prepares an automated batch process, such as a script, that runs when scheduled or otherwise directed by a system administrator.
[170] At step 603 the batch process executes and initiates the ranking procedure for all lexical items that are stored in the LexDB for a given target language or a subset of them, as determined by parameters set by the system administrator in preparation of the batch process, per the previous step.
[171] In step 604, the computing device retrieves data from a system database about prior submission of each lexical item, including the number of submissions (frequency), the amount of time lapsed since the most recent submission (recency), and which user profile,
spatial, temporal and other attributes were associated with each submission. Any locally shared usage data may be uploaded to the system's central database at this time.
[172] In step 605, the computing device scores lexical items by the frequency with which the speaker has submitted the lexical item in the past and whether that submission was valid. For example, if EAT was submitted and validated 6 times it will be scored 6, higher than SCHOOL, which was submitted 6 times, 1 of which was invalid, and scores a 5. If EAT is, hierarchically speaking, a child of the semantic category "food" and SCHOOL a child of "places," then "food" is scored higher than "places."
[173] In step 606, the computing device may apply a weighting factor based on recency. This factor would weight lexical items submitted in the recent past more heavily than ones submitted in the distant past. This is important at an aggregate level because language is in a state of constant evolution and terms can go in and out of use. In addition, many terms are seasonal. It is also important at an individual level because recent usage is a better predictor of current language needs than distant past usage is. In learning applications, moreover, weighting based on recency may be useful because, depending on how often the user actually uses the learning software and on the user's progress in learning, the user's recent submissions are a better indicator of the user's current skill level than older ones. For example, if a user made 80 submissions that incorrectly used a lexeme one month ago, but submitted 20 correctly yesterday, the 20 correct answers from yesterday are a good indicator that the user has actually learned the lexeme, even though the 20 is outnumbered by the 80 earlier incorrect submissions. The system might use a variety of different algorithms to determine recency. One option: For each doubling of the distance in time elapsed between the most recent use and a previous use of a lexeme such as EAT, the system might halve its impact on the score. Thus, if a user submitted EAT 48 hours ago, 24 hours ago and again 12 hours ago, the use at 24 hours would count for half as much as the one from 12 hours ago and the use at 48 hours would count for one-quarter the weight. The purpose of this "half-life" type of approach would be to diminish the relative weight of older submissions versus more recent submissions.
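The half-life weighting described above might be sketched as follows; the reference window and the exact decay function are assumptions chosen to be consistent with the example (a submission twice as old contributes half as much).

```python
def recency_weighted_count(hours_since_submissions, reference_hours=12.0):
    """Sum submission weights; a submission twice as old contributes half as much."""
    return sum(reference_hours / max(hours, reference_hours)
               for hours in hours_since_submissions)

# Submissions 12, 24 and 48 hours ago count as 1, 0.5 and 0.25 respectively.
print(recency_weighted_count([12, 24, 48]))  # 1.75
```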
[174] In step 607, the computing device may check if spatial and temporal criteria are met, so that spatial and temporal rules may be applied to the user's assemblies. These criteria may be as follows: (a) location services are available and enabled by the device currently in use, (b) a spatial location corresponding to current geographical coordinates or a temporal event or combination of the two has been defined by the user, and (c) the spatial-temporal entity identified in method 900 has prior data associated with it in a system database and such data are accessible to the application at the current time. If one or more of these criteria are not met, the process will skip to step 610. Alternatively to criterion (b), if no spatial-temporal entity has been defined by the user in method 900, the system may use a default setting such as country or region or may use system-identified data clusters (for example: persons clustering near a bus stop at 9 a.m. may use certain types of transportation-related language that the system compiles over time), to the extent the system has relevant prior data.
[175] In step 608, the computing device may perform a lookup of a system database to find all lexical items previously used by the current user within the current spatial location radius or radii, temporal event range or combination thereof and retrieves the frequency of each relevant lexical item.
[176] In step 609, the computing device may add a spatial-temporal frequency value (or a multiple of the frequency value) to the weighted frequency-recency score from steps 605- 606. For example, assume EAT is used 1 time at Location A and SCHOOL is used 4 times at Location A. Now the system could modify the score to, say, add 1 to EAT (1+6=7) and 4 to SCHOOL (5+4=9), so that in Location A, SCHOOL gets a higher score. Different lexemes may have different weighting values for different locations, and applying these location-based weights may help further improve the system's ability to properly handle the context in interpreting the user's assemblies.
[177] Users may use certain lexical items in greater frequency when in communication with certain other users, such as in an augmentative and alternative communication application, or with certain computer characters, such as in a game-based learning
application. In such cases it may be beneficial to the user to weight lexical items higher if they are frequently used in communication with those other users. For example, a special needs child may generally use words related to food when speaking with a particular caregiver (a "listener") who normally helps that child eat lunch. In step 610, the computing device may check if speaker and listener criteria are met. These criteria may be as follows: (a) a listener is defined in the Speaker and Listener Selection and (b) the listener has prior data associated with the current speaker, which defaults to the current user profile if not explicitly selected. If one or more criteria are not met, the process will skip to step 613.
[178] In step 611, the computing device performs a lookup of a system database to find all lexical items previously communicated by the speaker to the selected listener. The system scores lexical items by the frequency with which the speaker has submitted the lexical item to the specific listener in the past and vice versa.
[179] In step 612, the computing device adds the listener-specific frequency value (or a multiple of the frequency value) to the score from step 609.
[180] In step 613, the computing device performs a lookup of a system database and finds average frequency and recency data for all users of the system within predefined demographic criteria, e.g., all users, all users of a specific age, all users of a specific gender, all users of a specific application type, etc. This average may be pre-computed at a less frequent interval than the ranking process due to the potentially large volume of data to analyze.
[181] When a new user begins to use the system, he or she will have no usage history.
Therefore, it may be helpful to use the aggregate usage histories of all other users (or users in a demographically similar group) as an initial basis for guessing how often the new user may need to access those lexical items. Even as the new user accumulates usage data for many lexical items, he or she may go to a location where there is no past data for such user and, again, aggregated data may be a useful starting point. Thus, step 614 scores lexical items by the frequency with which the aggregate set of system users has
submitted the lexical item. For example, if all users average 20 submissions for EAT and 5 for SCHOOL, those averages become the aggregate scores for each. The algorithm may weight aggregate frequency scores less than individual scores, so the system may multiply the aggregate score by a fraction, such as .20, for example. (The value .20 is arbitrary, but it is safe to assume that individual scores will usually be weighted greater than aggregate scores, since the goal of the system is generally to individualize instruction or communication.) In our running example, that adjusts the aggregate user scores to 4 for EAT and 1 for SCHOOL. Note that CDI frequency values identified in step 101 may factor into this aggregate frequency score.
[182] In step 615, the computing device may weight the aggregate user score for recency such that older data are weighted lower. (Our running example ignores this step.) This step is identical in purpose to step 606 except that it applies to aggregate usage data, rather than individual usage data.
[183] In step 616, the computing device adds the aggregate weighted average score computed in steps 614-615 to the lexical item's existing score. Now our example results in EAT being 7+4=11 and SCHOOL being 9+1=10.
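The running example can be recomputed in a few lines. The dictionary values follow the example above; the 0.20 aggregate multiplier is the arbitrary fraction mentioned in step 614, and the listener-specific term of step 612 is omitted because the example does not use one.

```python
# Illustrative recomputation of the running example (steps 605-616).
base_score = {"EAT": 6, "SCHOOL": 5}        # valid submissions only (step 605)
location_freq = {"EAT": 1, "SCHOOL": 4}     # uses at the current location (step 609)
aggregate_freq = {"EAT": 20, "SCHOOL": 5}   # all-user averages (step 614)
AGGREGATE_WEIGHT = 0.20                     # down-weights aggregate data (step 614)

scores = {}
for lexeme in base_score:
    score = base_score[lexeme]
    score += location_freq.get(lexeme, 0)
    score += aggregate_freq.get(lexeme, 0) * AGGREGATE_WEIGHT
    scores[lexeme] = score

print(scores)  # {'EAT': 11.0, 'SCHOOL': 10.0}
```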
[184] In step 617, the computing device may rank-order lexical items by score, with the highest scoring lexical items first in sequence and the lowest scoring lexical items last. Lexical items with a score of 0 fall to the bottom of the sort. Within the 0-score group, or among any tie scores, choices may be sorted randomly, by CDI frequency in the LexDB, alphabetically, or using any other default factor.
[185] Step 618 stores the final numeric ranking value in a system database and associates it with the database record for the relevant lexical item.
[186] When called by the runtime application in step 205, the computing device in step 619 performs a lookup of a system database, which may be a flattened, constrained version of such database cached locally on the device and updated over a network from servers from time to time, to find the numeric ranking value for all lexical items relevant to the user at
that time. For example, if the user selects a semantic category, the system will look up those lexemes that are, hierarchically speaking, children of the semantic category, and will load only those child lexemes into memory for purposes of sorting.
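As a minimal sketch of steps 617-619, the runtime lookup and ordering might resemble the following. The table and column names (lexemes, parent_category_id, rank_score, cdi_frequency) are hypothetical placeholders for whatever schema the LexDB actually uses.

```python
import sqlite3

def ranked_lexemes_for_category(conn: sqlite3.Connection, category_id: int):
    """Load only the child lexemes of the selected semantic category and order
    them by the cached ranking value, with 0-score items last and ties broken
    by a default factor such as CDI frequency."""
    rows = conn.execute(
        "SELECT lexeme, rank_score, cdi_frequency "
        "FROM lexemes WHERE parent_category_id = ?",
        (category_id,),
    ).fetchall()
    # Non-zero scores first, then descending score, then descending CDI frequency.
    return sorted(rows, key=lambda r: (r[1] == 0, -r[1], -r[2]))
```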
[187] The aforementioned algorithm is merely one illustrative example among a plurality of methods for how spatial, temporal, frequency, recency and other system-stored data may combine to determine the filtering and sorting mechanisms for presenting lexical items in one or more panels on the system user interface.
[188] USER PREFERENCE CUSTOMIZATION
[189] FIG. 14 is a flow chart illustrating a process for how a user may customize preferences.
[190] Method 700 enables an authorized user to explicitly customize a user profile with preferences in addition to the implicit customizations made by the system on the basis of historical user data. The illustrated steps may be performed by one or more computing devices providing the language learning system herein.
[191] In step 701, at or after step 203 during application runtime, a user may access a user profile by using a controller to view and/or modify the Speaker and Listener Selection, Assembly Parameters, or Settings, which are Blocks 1, 2 and 3 on FIG. 3, respectively. For example, a user may take the following actions:
(a) Speaker and Listener Selection: view and/or modify the user profile of the speaker, i.e., the user who is composing assemblies, and the user profile of the listener(s), i.e., the user(s) with whom a speaker is communicating, if applicable. Note that a speaker or listener may be defined as representing a real person or a virtual avatar. An application may include default speakers and listeners.
(b) Assembly Parameters: Add or remove parameters that a user may pass to the system when submitting a specific assembly. These may include the language in which output is to be narrated, signed and/or transcribed, tense and time indicators, contextual indicators such as place or season, or other options.
(c) Settings: view and/or modify preferences for the entire application, for a user profile or for other more specific conditions.
[192] If a user modifies any of the above elements at any time prior to submission of an assembly, it may impact the system's response to such submitted assembly.
[193] In step 702, if the user selected a speaker or listener(s) in step 701, the selection overrides the current state. By default, the system assumes that the current user is the speaker and may assume that the listener is anonymous, the listener from the most recent application session, the most frequently user-selected listener, or other assumptions, depending upon application design. In the case of a peer-to-peer communication application wherein the communication is occurring between multiple devices, each running the application, the system may assume that the user(s) or speaker(s) on the other devices are listener(s) for purposes of the device in focus (meaning a device on which an assembly is being constructed). State is maintained throughout an application session, unless explicitly changed by a user or business rule.
[194] In step 703, if the user added or removed an assembly parameter in step 701, the selection overrides the current state. By default, the system uses no assembly parameters. State is maintained throughout an application session.
[195] In step 704, if any settings changes are indicated in step 701, the changes override the system default settings and are stored in the user profile both for the current and future application sessions. Otherwise, the system uses the settings previously stored in the user profile.
[196] A user may assign settings to specific listeners, for example, such that the system will modify the form of output based on a particular listener's user profile in a system database, such as by changing which language is narrated, signed or transcribed for that particular listener, determining the voice pattern to use or by determining whether to output a response to a submitted assembly as narration, text, images, animation,
neurobiologic impulses, or any other form or combination thereof. For data collection purposes, the system may record and display the listener associated with a submitted assembly.
[197] When triggered by step 203, the computing device in step 705 resolves any conflicts among speaker and listener selection, assembly parameters, and settings, and aggregates all preferences relevant to a submitted assembly. Conflicts will be resolved by first (a) enumerating all settings, then (b) overriding application-level settings (set by a system administrator to apply to all users of an application) with user-level settings (set by an individual user to apply to all of his or her own usage of an application), then (c) overriding user-level settings with speaker-level settings (set by a user to apply to a particular user who may use the application to communicate with one or more listeners), then (d) overriding speaker-level settings with listener-level settings (set by a user to apply to his or her interactions with a particular user with whom the user may communicate using the application), and (e) overriding listener-level settings with assembly parameters (set by a user for a particular assembly prior to submission using the Assembly Parameters Panel, per FIG. 3, Block 2). By way of a theoretical example, if an application's default language was English, the current user's default language was Spanish, the current speaker was set to Japanese, and the current listener was set to Arabic, the assembly output would be in Arabic. This could be useful, for example, if a user needed to communicate regularly in a certain language, but needed to communicate in a different language for a particular person or in a particular scenario. If the user set an assembly parameter to output in ASL (American Sign Language), that particular assembly would be displayed as ASL signs.
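A minimal sketch of the override order in step 705 follows; the setting key and the layer contents reproduce the theoretical example from the text and are otherwise arbitrary.

```python
def resolve_preferences(app_settings, user_settings, speaker_settings,
                        listener_settings, assembly_params):
    """Merge the settings layers so that each later layer overrides any keys it
    defines in the earlier ones (application -> user -> speaker -> listener ->
    assembly parameters)."""
    resolved = {}
    for layer in (app_settings, user_settings, speaker_settings,
                  listener_settings, assembly_params):
        resolved.update(layer or {})
    return resolved

prefs = resolve_preferences(
    {"output_language": "English"},   # application default
    {"output_language": "Spanish"},   # user profile
    {"output_language": "Japanese"},  # speaker-level setting
    {"output_language": "Arabic"},    # listener-level setting
    {},                               # no assembly parameter set
)
print(prefs["output_language"])  # Arabic; an "ASL" assembly parameter would win instead
```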
[198] HARDWARE COMPONENTS
[199] FIG. 15 is a block diagram showing the basic components of a general computing hardware device that may be used to implement the system and computing device described herein.
[200] The various features and elements described herein may be implemented by one or more computing hardware devices, such as a personal computer, tablet computer, smartphone, cellular phone, laptop computer, etc. A computing hardware device that is usable in conjunction with one or more embodiments of the system may include the following components:
[201] Block 801: The processing component may include a central processing unit (CPU), an application processing unit (APU), a graphics processing unit (GPU), a baseband (wireless communications) processor, other specialized processors, or any combination thereof. While such processors may potentially function independently, FIG. 15 simplifies processing into one block on the diagram for the sake of clarity. The processing component may be coupled to Blocks 802 through 807.
[202] Block 802: Memory storage may include solid-state disks or drives such as NAND-based flash memory, random access memory (RAM) such as Dynamic RAM (DRAM) and Synchronous DRAM (SDRAM), read-only memory (ROM), hybrid drives, optical disks, or a plurality of other types of computing device memory storage, whether or not yet invented. Memory may be read by or, except in the case of ROM, written to by the processing component. The memory 802 may store instructions that, when executed by the processor 801, may cause the hardware device to perform any of the functions described herein.
[203] Block 803: Audio input and output can include an audio codec, inputs such as microphones and outputs such as speakers. Microphone input could be decoded by the audio codec and processed for speech recognition, which can be used as a user interface controller.
[204] Block 804: The device hardware may have a camera with image or motion sensors such that camera input could trigger application behaviors based on recognition and processing of non-touch kinesthetic motion or facial movements.
[205] Block 805: The user interface adapter may interface with user input devices, including (a) peripherals such as a computer mouse, joystick, trackball, game controller, or keyboard, (b) a kinesthetic stimulus such as a finger touch that can be interpreted by the device touch-screen controller, and (c) electroencephalogram (EEG) or other devices that detect neurobiologic impulses and could potentially mimic or supplant use of peripheral devices or touch.
[206] Block 806: The display solution may be driven by the processing component to control the device display, which may also show input from I/O devices.
[207] Block 807: A connectivity and security solution can provide network connectivity such as WiFi, Bluetooth, and near-field communications, for example.
[208] Block 808: The system, e.g., a device application at Block 808, uses the network to securely communicate with the server infrastructure, if and when needed, and bridges the device and server-side components. It may also receive transmissions from positioning systems using technologies including but not limited to WiFi, Bluetooth and other sources of location services over a network. The positioning systems may be previously known to the system or may be discovered by the system, which over time generates a map of positioning systems and allows users to augment system-generated data.
[209] Applications may be designed to utilize a live network communication at certain points in time, or they may be self-contained and require no live network connection.
[210] Block 809: The server infrastructure may include firewalls, load balancers, redundancy mechanisms, as well as other hardware components that serve security, scalability, reliability and other service level needs. It houses application and database server components, as well as an application development environment.
[211] Block 810: The application server may be a remote hardware device that runs elements of the system that an application operating on a user's device may need to access at runtime. The application server is a layer between the user's device and the system's central database.
[212] Block 811: The database server houses the system's central, server-side relational or graph database, which contains data that may be used or accessed at runtime by a multitude of user devices at any given time. These data may include the user profile and the LexDB. The system may, in addition to or in place of the central database, use a local database on a given user's device and, from time to time, synch all or part of the local database with the central database over a network when available and desirable.
[213] Application and database servers may be physical machines in a colocation facility, or they may be virtualized machines in a cloud environment.
[214] Block 812: The application development environment provides a set of tools and interfaces for allowing instructional designers, linguists and developers to work together to build datasets applying the LexDB to languages and applications. This set of tools facilitates the design phase of operation and integrates with the build process (whether building for a client browser online, or building for a mobile or embedded standalone device) to supply the required subset of data for a given application to execute.
[215] SPATIAL AND TEMPORAL CONTEXT
[216] FIG. 16 illustrates an example flow diagram for an algorithm to take into account spatial and temporal context. This may be used by an application to customize the presentation of lexical items to a user based on said user's specific location or type of location or based on the time of day, day of week, season of the year, specific time ranges or events, or any combination thereof. The various steps are described below:
[217] Block 901. Whether or not spatial or temporal entities have been defined in step 121, per the example in TABLE 14, the device application may request and return a set of latitudinal and longitudinal coordinates or other global positioning identifier for its current location, or a location for a user's use of language (e.g., the location where the user is using the language software described herein). It may receive these coordinates through any type of network connection, broadcast transmission or proximity system. This location request is made at regular intervals by the application (with the exact interval being set by the device or application settings, but generally updated in what the user would perceive as being a fraction of one second) without user intervention, where permitted by application and user settings. The advantages of knowing the location are 1) to feed into the algorithm of this method 900 and 2) to track location data for purposes of the lexical ranking algorithm of method 600.
[218] Block 902. Within a maximum range set by the system administrator (in order to avoid the clutter of identifying too many possible locations), the device application identifies any predefined locations, which can be a set of coordinates or a positioning system, or undefined locations that meet system-defined or user-defined criteria. Predefined locations could be anyplace relevant to one or more users, such as "home" or "Washington Elementary School" or "a gas station." The locations may be a specific location (e.g., the school listed above) or a general type of location (e.g., schools, in general). With positioning systems, which offer a high degree of precision, locations may also be rooms or areas within buildings or structures, such as "the kitchen" or "classroom 101" or "the cafeteria." This enables fine-tuned tailoring of lexical item sets to places that each have specific language needs. Positioning systems are a class of low-powered, low-cost transmitters, such as beacons, that can notify nearby consumer electronic devices of their presence by, for example, broadcasting their precise location to the device application, if in range. They may use low-energy proximity sensing technologies such as Bluetooth Low Energy, a WiFi-based system that measures the intensity of the received signal strength, magnetic positioning using local variations of the earth's magnetic field, or any number of accepted positioning system technologies. In cases where positioning systems are not used, coordinates may be returned by checking application servers for currently defined locations. Each predefined location is treated as having an epicenter and a range. The epicenter represents the precise GPS coordinates or positioning system location. The range is a distance, generally measured in meters, from the epicenter in all directions, and is set by the user who defines a location. The purpose of the epicenter and range concepts is to establish the edge of a location zone, i.e., the edge of a zone is the perimeter of the circle drawn around the epicenter at a distance (in meters) equal to the range from said epicenter.
[219] Block 903. The device application calculates the distance between the device and the epicenter of all potential locations or positioning systems identified earlier in Block 902.
[220] Block 904. The device application determines if the distance is within range parameters.
Each predefined location may have a range associated with it, per Block 902. If the device is inside that range, it is considered within the predefined location.
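One way to implement the distance test of Blocks 903-904 for GPS-style coordinates is a great-circle (haversine) calculation; the location dictionary structure and the sample coordinates are assumptions for illustration.

```python
import math

def haversine_meters(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinate pairs, in meters."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def within_location(device_coords, location):
    """A device is inside a predefined location when its distance from the
    epicenter does not exceed the location's range."""
    distance = haversine_meters(*device_coords, *location["epicenter"])
    return distance <= location["range_m"]

cafeteria = {"epicenter": (38.8895, -77.0353), "range_m": 30}  # hypothetical values
print(within_location((38.8896, -77.0352), cafeteria))         # True
```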
[221] Block 905. The device application may attempt to resolve any overlapping spatial and temporal entities (location zones and times), wherein the device is determined to be present within the radius of two or more predefined locations, or within two or more time or date ranges. One method for resolving this is set forth below:
The application would have a matching algorithm created by the system administrator that establishes a preferred order such as the following:
• If a location exists, that signals a strong affinity, with closer beacons being stronger than further-away ones. "Closer" means a shorter distance from the epicenter of a location. Alternatively, each predefined location could be manually assigned a priority value by a user. Locations with higher priority value would prevail over one with a lower value.
• If the current time falls into a predefined time range, that signals a medium affinity.
• If the current time falls into a recurring day or date, that signals a weak affinity.
• If the predefined location or time allows for any location or any time, that signals no affinity.
The affinity levels could combine to make a cumulative score that governs which spatial-temporal entity will be dominant. For example, a weak affinity could score 10, a medium affinity 20, and a strong affinity 30, unless a closer location existed, in which case the affinity score would drop to 25. So, if one spatial-temporal entity were "school lunch" and it occurred in the cafeteria from noon to 12:40 each weekday, and the device were found to be in the cafeteria during that time, it would get a score of 30 (strong affinity) for the location plus 10 for a recurring day (weak affinity), for a total score of 40. This would beat out "school-wide assembly," which also takes place in the cafeteria (score 30) but without any defined time (score 0), totaling a lesser score of 30. The strongest matching entity (in this case "school lunch") will then cause the display of whatever lexical data were defined for that entity in step 121 (in this case, food items would make sense).
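A compact sketch of this cumulative scoring follows. The point values reproduce the example above (30 for a location match, 25 if a closer location exists, 20 for a time range, 10 for a recurring day) and are not fixed by the system.

```python
def affinity_score(in_location=False, closer_location_exists=False,
                   in_time_range=False, on_recurring_day=False):
    """Combine spatial and temporal affinities into one cumulative score."""
    score = 0
    if in_location:
        score += 25 if closer_location_exists else 30  # strong affinity
    if in_time_range:
        score += 20                                    # medium affinity
    if on_recurring_day:
        score += 10                                    # weak affinity
    return score

# "School lunch": cafeteria plus a recurring weekday window -> 30 + 10 = 40.
# "School-wide assembly": cafeteria only, no defined time -> 30.
print(affinity_score(in_location=True, on_recurring_day=True))  # 40
print(affinity_score(in_location=True))                         # 30
```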
[222] In order of weakest match to strongest match, here is how the different scenarios would break down according to the above algorithm:
1. Match any day
2. Match a specific day
3. Match a specific time, any day
4. Match any day and location
5. Match a specific day and time
6. Match a specific day and location
7. Match a time and location
8. Match all 3 (time, day, location)
[223] Block 906. The device application checks and retrieves the current time on its system clock either before or after the device application has found matching location(s) and resolved which one(s) are considered valid, per the matching algorithm in Block 905.
[224] Block 907. The system determines the spatial-temporal entity by performing a lookup of a table such as TABLE 14 or analyzing data about frequency of assembly submissions. In the example of the two-dimensional table, if there is a null value for a given space-time combination or spatial or temporal contexts are indeterminate due to technical reasons, the lookup reverts to the default value for either spatial or temporal context (depending upon system or user setting priorities).
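Since TABLE 14 is not reproduced here, the following miniature lookup is purely hypothetical; it only illustrates the fallback behavior when a space-time combination is undefined or indeterminate.

```python
# Hypothetical miniature of a TABLE 14-style lookup keyed on (location, time slot).
SPATIAL_TEMPORAL_TABLE = {
    ("cafeteria", "weekday noon"): "school lunch",
    ("classroom 101", "weekday morning"): "math class",
}
DEFAULT_ENTITY = "general vocabulary"

def resolve_entity(location, time_slot):
    """Return the matching spatial-temporal entity, or the default when the
    combination is undefined or indeterminate."""
    return SPATIAL_TEMPORAL_TABLE.get((location, time_slot), DEFAULT_ENTITY)

print(resolve_entity("cafeteria", "weekday noon"))  # school lunch
print(resolve_entity("gym", "weekday noon"))        # general vocabulary
```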
[225] Block 908. The spatial-temporal entity determined in Step 907 may determine how the system renders symbols, images, text, audio and other user interface elements. It could trigger the system to navigate to a particular screen or display certain lexical items or refresh the existing screen with a new array of lexical categories or Lex-Icons, for example. Generally speaking, the system feeds the data about the current spatial-temporal entity into the prediction process of method 600.
[226] EXAMPLE OF APPLICATION
[227] FIG. 17 illustrates an example of an application of the novel system intended for an elementary school audience learning Mandarin Chinese. An application could be considerably more complex and expansive or intended for other purposes than education. Thus, the following is intended only as one illustration of how the novel system could be implemented for one particular interactive scenario for one particular target language on one particular type of electronic device (a touch-screen tablet, in this case).
[228] The "Pet Training" application in FIGS. 17A-F is a game wherein the user constructs assemblies using Lex-Icons and said application converts the assemblies into Chinese narration of the meaning and runs an animation of pets performing a programmed action. Below is a storyboard that demonstrates the runtime workflow for one use case, wherein the user decides to make a pet rabbit jump, succeeds, and then progresses to a successive level of the game that allows more choices of Lex-Icons.
[229] In FIG. 17A, the user selects an animal to "train." Some animals are locked initially - until the user progresses enough to unlock them. This ties into the language sequencing prepared at design time.
[230] In FIG. 17B, the user touches a Lex-Icon representing the verb "jump" in the sense of "propelling oneself upward with sudden muscular effort." This Lex-Icon may animate on initial presentation to help the user understand its meaning.
[231] In FIG. 17C, the Lex-Icon for "jump" moves to the assembly bar. The lexeme "jump" is narrated in Chinese as tiao4.
[232] In FIG. 17D, the user decides to submit this assembly and taps the play button. The system validates the assembly, finds it valid, and then concurrently narrates tiao4 ("jump" in Chinese) and executes an animation of the rabbit jumping.
[233] In FIG. 17E, the user is awarded points for each valid assembly submitted. Let us assume that the user at this point passes a game level. This unlocks a new animal to train and, in this case, adds new Lex-Icons that include new word classes (parts of speech).
[234] In FIG. 17F, now the user experiments with constructing a short phrase. In this implementation, Lex-Icons for verbs, nouns and adverbs may be illustrated in different colors. The example above shows the assembly "Eat a dumpling." The user may drag Lex-Icons left or right to reverse the order, or add others before submitting.
[235] Following is a second example of an application of the novel system, this one intended for an audience of persons with special communication needs, such as certain forms of Autism Spectrum Disorder, that require assistive technology in order to communicate with a speech-generating device. Such a communication application could vary considerably in functionality and presentation. Thus, the following is intended only as one illustration of how the novel system could be implemented for one particular interactive scenario for one particular target language on one particular type of electronic device (a touch-screen tablet, in this case).
[236] The "Speak Agent AAC" application is a mobile application wherein the user constructs assemblies using Lex-Icons and said application converts the assemblies into natural language output and vocalizes it for the benefit of a listener. Below is a storyboard that demonstrates the runtime workflow for one use case, wherein the user decides to make a sentence in English and have it output in English. With the novel system, the user could just as easily make the sentence in English and have it output in any target language.
[237] FIGS. 18A-E are a set of illustrations depicting an embodiment of the system as an augmentative and alternative communication software application.
[238] In FIG. 18A, the user begins on a "home" screen. From here, he or she may select a semantic category such as Clothes in order to find a certain grouping of words and phrases such as those related to Clothes.
[239] FIG. 18B shows the Clothes category. Here the application displays a set of Lex-Icons related to the category Clothes. The user touches a Lex-Icon representing the phrase "I like to wear..."
[240] In FIG. 18C, the Lex-Icon for "I like to wear..." moves to the assembly bar. (There is no narration in this particular application, however, until the user completes an assembly.) Next, assume that the user touches a Lex-Icon representing the common noun HAT.
[241] In FIG. 18D, the user decides to submit this assembly bar and taps the green button. The system validates the assembly, finds it valid, converts it into the natural language phrase "I like to wear a hat" and then narrates the full assembly. The Lex-Icons remain in the assembly bar so that the user may repeat the audio narration until the user decides to create a new assembly.
[242] On the left side of FIG. 18E is an area with tiles that are larger in size than the more numerous tiles displayed in the center of the screen. These larger tiles represent lexical items predicted by the system using the lexical ranking algorithm. In this case, the only factor
used in the algorithm is relevance to a predefined spatial-temporal entity. The entity shown in the illustration is "food," so the tiles show food-related Lex-Icons.
[243] The user could set the application to output French instead of English. So now when the user presses the green button, the application narrates "Je tiens à porter un chapeau," which is the French translation of "I like to wear a hat." Note that the display need not change in order to narrate in a different target language, and the text can remain in English, assuming that is said user's native or preferred language.
[244] AUTOMATED VISUALIZATION OF ASSEMBLIES
[245] FIG. 19 describes a process for visualization of a word concept, such as by means of a video or animation clip that demonstrates or models enacting the word concept. This process, Method 1000, would occur after step 405 in the assembly validation process. The purpose of this visualization process is to assist with learning or communicating a word concept, particularly in the case of verbs, adverbs and spatial prepositions such as "in" or "above," the meaning of which may be difficult to perceive from a static image or may present ambiguity of meaning. Another reason for this approach is that, while an application of the novel system may pre-define the visualization associated with a given assembly at design time, such pre-definition requires the creation of numerous video or animation clips at significant expense. This becomes impractical with advanced language learning or communication activities as the number of potential assembly combinations increases exponentially. Thus, the system may instead use an automated method for determining the appropriate visualization or set of visualizations at runtime.
[246] With automated visualization, while the decision of which visualization file(s) to access is executed at runtime, the method may generate three resources at design time:
1. A collection of video and/or animation files, each having a closed-captioning or other file containing subtitle text. Such video or animation files, however, need not have been created for the purpose of use in the application, but could have been created for an
entirely different purpose. The files also do not need to reside on the system's hardware infrastructure, but could reside on third-party servers, for example.
2. There may be pre-defined logic set by the system, or parameters editable by the user, that determine how the method will find matching visualizations. Options include text matching with various Boolean and wildcard parameters, grammatical patterns, and other logical criteria. For example, the system may be instructed to find all video clips wherein the closed-captioning text includes the pattern happy *day*, where * is a wildcard operator; this would pull all video clips containing "happy birthday," "happy holidays," "happy today," and other possibilities. Or, it could find videos conforming to a grammatical pattern such as subject noun followed by verb that contains the string "football," which would return video clips containing phrases such as "He likes football," "They play football," and "That crazy dog ate my football." (A sketch of such pattern matching follows this list.)
3. There may be equivalent logic for determining which clips to play back: options may include playing back the full sentence that contains the matching word or phrase or only the exact assembly, specifying a certain number of milliseconds before and after the assembly to start and stop, concatenating a certain number of clips together in a chain for playback, etc.
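The following sketch shows one plausible reading of the wildcard text matching described in item 2 above; the pattern syntax (a bare asterisk standing for any word characters) is an assumption for illustration, not a specification.

```python
import re

def wildcard_to_regex(pattern):
    """Translate an asterisk-wildcard pattern such as "happy *day*" into a
    case-insensitive regular expression."""
    escaped = re.escape(pattern).replace(r"\*", r"\w*")
    return re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)

def matching_captions(captions, pattern):
    """Return the subtitle lines whose text matches the wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [line for line in captions if regex.search(line)]

captions = ["Happy birthday to you!", "Happy holidays everyone", "He likes football"]
print(matching_captions(captions, "happy *day*"))
# ['Happy birthday to you!', 'Happy holidays everyone']
```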
[247] At runtime, the automated visualization process executes the following steps after an assembly is validated and ready to execute, per Step 405.
[248] Block 1001. The system analyzes the text of the assembly based on logic that was predefined at design time, or based on user settings that tell the system how to use the text to find a matching visualization. If the logic requires a grammatical match, this will be accomplished by matching the sequence of word classes in the assembly (already determined prior to Step 405) with the same sequence in the subtitle text.
[249] Block 1002. The system searches closed-captioning files for matches, if any. Said files may be indexed in advance for better search performance. Logic may also require or allow multiple matches. Matches may be scored or rank-ordered based on quality of the match and/or quality of the visualization file.
[250] Block 1003. The system determines one or more best matches based on the scoring or rank-ordering. If there are no matches, it should be noted that this automated visualization runtime method is not mutually exclusive of the design-time method. For example, in one embodiment, the system may first check to see if a pre-determined visualization exists and, if not, use the runtime method.
[251] Block 1004. For a given match, the system loads the video at the right point in the timeline, based on pre-defined logic or settings. For example, if the logic states that playback should begin with the start of the full sentence in which the assembly match is found, the system will load the video at that point in the timeline. Note that this does assume that the closed-captioning file includes timeline information, as is standard practice.
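As one example of using subtitle timeline information, a standard SRT file can be scanned for the cue containing the matched phrase and its start time returned as the playback offset; the SRT format is assumed here only because it is a common closed-captioning container.

```python
import re

# Matches one SRT cue: start timestamp, its milliseconds, and the cue text.
SRT_CUE = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> \d{2}:\d{2}:\d{2},\d{3}\s*\n(.*?)(?:\n\n|\Z)",
    re.DOTALL,
)

def playback_start(srt_text, phrase):
    """Return the start offset in seconds of the first cue containing the phrase."""
    for match in SRT_CUE.finditer(srt_text):
        timestamp, millis, text = match.groups()
        if phrase.lower() in text.lower():
            h, m, s = (int(x) for x in timestamp.split(":"))
            return h * 3600 + m * 60 + s + int(millis) / 1000.0
    return None

srt = """1
00:00:05,000 --> 00:00:07,500
He likes football.

2
00:01:12,250 --> 00:01:15,000
They play football every day.
"""
print(playback_start(srt, "they play football"))  # 72.25
```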
[252] Although example embodiments are described above, the various features and steps may be combined, divided, omitted, rearranged, revised and/or augmented in any desired manner, depending on the specific outcome and/or application. Various alterations, modifications, and improvements will readily occur to those skilled in art. Such alterations, modifications, and improvements as are made obvious by this disclosure are intended to be part of this description though not expressly stated herein, and are intended to be within the spirit and scope of the disclosure. Accordingly, the foregoing description is by way of example only, and not limiting. This patent is limited only as defined in the following claims and equivalents thereto.
Claims
1. A method, comprising:
generating, by a computing device, a display of a plurality of graphical elements, each graphical element corresponding to a unique monosemic lexeme according to a target language;
receiving user inputs to rearrange the graphical elements on a display into a sequence of the graphical elements;
interpreting the sequence of the graphical elements according to the target language; and
generating a response indicating whether the sequence of the graphical elements is a valid arrangement of the lexemes according to rules of the target language.
2. The method of claim 1, further comprising storing a database identifying, for each of a plurality of lexemes, a plurality of frequency values identifying, for a corresponding plurality of user demographics in a language, a proportion of the user demographic that understands the corresponding lexeme in the language.
3. The method of claim 1, further comprising identifying a plurality of different language learning levels, and for each level, a corresponding maximum quantity or category of lexemes whose graphical elements may be included in an assembly.
4. The method of claim 1, further comprising storing a listing of tonal transformations for corresponding lexemes.
5. The method of claim 1, further comprising storing information identifying a plurality of word forms for a first lexeme, and for each word form, information identifying one or more preceding or succeeding word forms permitted in the target language.
6. The method of claim 1, further comprising displaying a visualization of moving visual elements to convey a meaning of an assembly formed by the sequence, in response to determining that the sequence is a valid arrangement of lexemes according to the rules of the target language.
7. The method of claim 1, further comprising determining, by the computing device, that the sequence of the graphical elements is invalid, and in response, providing corrective feedback comprising audio, video, or audiovisual elements.
8. The method of claim 1, further comprising determining, by the computing device, that the sequence of the graphical elements is invalid, and in response, automatically rearranging the graphical elements in such a way as to express an alternative sequence of the graphical elements that is valid according to the rules of the target language.
9. The method of claim 1, further comprising determining, by the computing device, that the sequence of the graphical elements is valid, and in response, converting the sequence of graphical elements into a transcription in the target language using appropriate word forms and other linguistic elements according to the rules of the target language.
10. The method of claim 9, further comprising converting the sequence of graphical elements into a transcription in a second target language using appropriate word forms and other linguistic elements according to the rules of the second target language.
11. The method of claim 1, further comprising storing a user profile for a user, the user profile identifying validity test results for lexeme assemblies submitted by the user.
12. The method of claim 1, further comprising using the user profile to determine any restrictive parameters that may limit the quantity or category of lexemes whose graphical elements may be presented to the user.
13. The method of claim 1, further comprising:
receiving, by the computing device, a specification of one or more spatial locations; associating the one or more spatial locations with a semantic category;
determining a current location of a user; and
determining relevant semantic categories based on the current location of the user and the association of the one or more spatial locations.
14. The method of claim 1, further comprising:
receiving, by the computing device, a specification of one or more time ranges;
associating the one or more time ranges with a semantic category;
determining a current time; and
determining relevant semantic categories based on the current time and the association of the one or more time ranges.
15. The method of claim 1, further comprising ranking candidate graphical elements for a user based on the user's previous proficiency with corresponding lexemes.
16. The method of claim 1, further comprising using the arrangement to facilitate communications between two or more users who are communicating with one another using different languages.
17. A method, comprising:
associating, by a computing device, one or more spatial locations with a semantic category;
determining a spatial location corresponding to a user's use of language;
using the associated one or more spatial locations to retrieve predicted or suggested lexemes for presentation to the user; and
presenting the predicted or suggested lexemes to the user.
18. The method of claim 17, wherein the presenting the predicted or suggested lexemes to the user further comprises displaying graphical elements corresponding to the predicted or suggested lexemes.
19. The method of claim 17, wherein associating, by the computing device, one or more spatial locations with the semantic category further comprises identifying a spatial location type, and associating the semantic category with a plurality of locations matching the spatial location type.
20. The method of claim 19, further comprising storing, in a database, frequencies with which lexemes or semantic categories are used by other users in a first spatial location or in a plurality of locations matching a spatial location type.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201461930238P | 2014-01-22 | 2014-01-22 | |
| US61/930,238 | 2014-01-22 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2015112250A1 true WO2015112250A1 (en) | 2015-07-30 |
Family
ID=53681823
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2014/066693 Ceased WO2015112250A1 (en) | 2014-01-22 | 2014-11-20 | Visual-kinesthetic language construction |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2015112250A1 (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10019995B1 (en) | 2011-03-01 | 2018-07-10 | Alice J. Stiebel | Methods and systems for language learning based on a series of pitch patterns |
| CN109478106A (en) * | 2016-07-15 | 2019-03-15 | 微软技术许可有限责任公司 | Leverage ambient context for enhanced communication throughput |
| US11062615B1 (en) | 2011-03-01 | 2021-07-13 | Intelligibility Training LLC | Methods and systems for remote language learning in a pandemic-aware world |
| CN118750683A (en) * | 2024-08-30 | 2024-10-11 | 南昌大学第二附属医院 | Smart infusion stand with remote monitoring |
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020128819A1 (en) * | 2000-10-20 | 2002-09-12 | Andrea Jessee | Detection & Correction of errors in german grammatical case |
| US20040153557A1 (en) * | 2002-10-02 | 2004-08-05 | Joe Shochet | Multi-user interactive communication network environment |
| US20050137847A1 (en) * | 2003-12-19 | 2005-06-23 | Xerox Corporation | Method and apparatus for language learning via controlled text authoring |
| US20110076653A1 (en) * | 2005-04-05 | 2011-03-31 | Brent Culligan | Systems and Methods for Semantic Knowledge Assessment, Instruction, and Acquisition |
| US20080275694A1 (en) * | 2007-05-04 | 2008-11-06 | Expert System S.P.A. | Method and system for automatically extracting relations between concepts included in text |
| US20110320468A1 (en) * | 2007-11-26 | 2011-12-29 | Warren Daniel Child | Modular system and method for managing chinese, japanese and korean linguistic data in electronic form |
| US20130212563A1 (en) * | 2010-02-18 | 2013-08-15 | Johan Kraft | Method and a System for Searching for Parts of a Computer Program Which Affects a Given Symbol |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10019995B1 (en) | 2011-03-01 | 2018-07-10 | Alice J. Stiebel | Methods and systems for language learning based on a series of pitch patterns |
| US10565997B1 (en) | 2011-03-01 | 2020-02-18 | Alice J. Stiebel | Methods and systems for teaching a hebrew bible trope lesson |
| US11062615B1 (en) | 2011-03-01 | 2021-07-13 | Intelligibility Training LLC | Methods and systems for remote language learning in a pandemic-aware world |
| US11380334B1 (en) | 2011-03-01 | 2022-07-05 | Intelligible English LLC | Methods and systems for interactive online language learning in a pandemic-aware world |
| CN109478106A (en) * | 2016-07-15 | 2019-03-15 | 微软技术许可有限责任公司 | Leverage ambient context for enhanced communication throughput |
| CN109478106B (en) * | 2016-07-15 | 2022-11-15 | 微软技术许可有限责任公司 | Utilizing environmental context for enhanced communication throughput |
| CN118750683A (en) * | 2024-08-30 | 2024-10-11 | 南昌大学第二附属医院 | Smart infusion stand with remote monitoring |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| AU2024220683B2 (en) | Selective visual display | |
| Aryadoust | An integrated cognitive theory of comprehension | |
| Traxler | Trends in syntactic parsing: Anticipation, Bayesian estimation, and good-enough parsing | |
| Summers et al. | Bilingual performance on nonword repetition in Spanish and English | |
| Butler et al. | Exploring functional-cognitive space | |
| Tyler | Expanding and mapping the indexical field: Rising pitch, the uptalk stereotype, and perceptual variation | |
| Mumford | An analysis of spoken grammar: The case for production | |
| US20170287356A1 (en) | Teaching systems and methods | |
| Sadoski et al. | A dual coding theoretical model of decoding in reading: Subsuming the Laberge and Samuels model | |
| KR101348282B1 (en) | Method for generating animation from text, Apparatus thereof | |
| Sherwani et al. | Orality-grounded HCID: Understanding the oral user | |
| Morrow et al. | A multidisciplinary approach to designing and evaluating electronic medical record portal messages that support patient self-care | |
| Hwang et al. | Having a syntactic choice is not always better: the effects of syntactic flexibility on Korean production | |
| KR20140094919A (en) | System and Method for Language Education according to Arrangement and Expansion by Sentence Type: Factorial Language Education Method, and Record Medium | |
| Viana et al. | Creative writing and iconicity in English as a foreign language | |
| Alviar et al. | Complex communication dynamics: Exploring the structure of an academic talk | |
| WO2015112250A1 (en) | Visual-kinesthetic language construction | |
| Gotowski et al. | Using syntax and semantics to acquire subjective adjective meanings | |
| CN116975251A (en) | Language learning content display method, device, computer equipment and storage medium | |
| Graf Estes et al. | Flexibility in statistical word segmentation: Finding words in foreign speech | |
| Corley | Making predictions from speech with repairs: Evidence from eye movements | |
| Shih et al. | Virtual voice assistants | |
| da Silva et al. | QUERAI–A Smart Quiz Generator | |
| Tamis-LeMonda et al. | Language Development | |
| Gustafson et al. | The role of linguistic experience in the processing of probabilistic information in production |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14879881; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 14879881; Country of ref document: EP; Kind code of ref document: A1 |