
WO2019167052A1 - A system for augmentative and alternative communication for people with severe speech and motor disabilities - Google Patents


Info

Publication number
WO2019167052A1
Authority
WO
WIPO (PCT)
Prior art keywords
input
gaze
eye
command
user
Prior art date
Legal status
Ceased
Application number
PCT/IN2018/000034
Other languages
French (fr)
Inventor
Yogesh Kumar Meena
Hubert Cecotti
Kongfatt Wong-Lin
Girijesh PRASAD
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Publication of WO2019167052A1
Legal status: Ceased

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/014Hand-worn input/output arrangements, e.g. data gloves
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/015Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233Character input methods
    • G06F3/0236Character input methods using selection techniques to select from displayed items
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00Teaching, or communicating with, the blind, deaf or mute
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0381Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer

Definitions

  • the sEMG hand gestures can be combined with an eye-tracker to provide extra input modalities to the users.
  • the eye-tracker is used to point to a command on the screen.
  • the command is selected through a hand gesture by using predefined functions from the Myo SDK.
  • Five conditions were evaluated related to gesture control with the Myo: fist (hand close), wave left (wrist flexion), wave right (wrist extension), finger spread (hand open), and double tap (see Fig 5).
  • the color-based visual feedback is provided to the user during the searching of an item. Once the item is selected, the auditory feedback is provided to the user as an acoustic beep.
  • the hybrid system helps to overcome the Midas touch problem of the gaze controlled HCI system.
  • an adaptive gaze-based multimodal virtual keyboard / visual display unit can even handle a language as complex as the Hindi language which has 45 letters, 17 diacritics and killer strokes, 14 punctuation marks and special characters, and 10 numbers.
  • the virtual keyboard/visual display unit is capable of being operated using a portable non-invasive eye-tracker, an sEMG based hand gesture recognition device, and/or a soft-switch.
  • the invention intends to resolve the issues of HCI pertaining to the virtual keyboard while using adaptive parameters and low-cost input devices.
  • the speed and accuracy of the system can be increased for the virtual keyboard of any language.
  • the user can make use of eye-gaze detection, gesture recognition, and a single input switch, either alone as a single modality or in combination as multimodality device, for a very complex structured language like Hindi.
  • the performance of virtual keyboards can be greatly increased in terms of accuracy and speed.
  • the speed and accuracy of the system have been optimized by optimizing the size of the command boxes and the distance between the boxes, and by taking into consideration involuntary head and body movement.
  • This invention provides for the adaptation over time of gaze-based access controls for an eyetracking based multimodal virtual keyboard specifically created for a language with a very complex structure like the Hindi language.
  • the system is based on a tree selection method and it requires two consecutive steps for enabling a command. First, the user has to point to the item that must be selected. A red pointer on the visual display can be moved to the chosen location and a visual feedback is provided on the chosen location. Second, the user has to approve the location of the pointer in order to select the corresponding item. Once an item has been selected by the user, the system provides an audio feedback for the corresponding targeted item.
  • the use of the adaptive techniques in the invention can increase the accuracy of the eye-tracker, which may limit the number of commands that can be accessed at any moment as the calibration data should be updated when the user changes his/her head and body position over time. With or without adaptive methods this approach can be applied to any other language.
  • the proposed system can be directly used for Marathi/Konkani language users (70 million speakers) by including one additional letter. Therefore, the present research findings have potential application for a large user population (560 million).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Biomedical Technology (AREA)
  • Dermatology (AREA)
  • Neurology (AREA)
  • Neurosurgery (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention relates to a system and method for an augmentative and alternative communication (AAC) aid for people with severe speech and motor disabilities. The usability of such AAC systems is currently limited due to the lack of adaptive and user-centered approaches, leading to low accuracy and the need for frequent recalibration. The present invention relates to a system and method for the adaptation (over time) of the dwell time in asynchronous mode and of the fixed interval (trial period) in synchronous mode using gaze based virtual keyboards. The present invention also relates to the use of a gaze based virtual keyboard system which is designed for a structurally complex language and optimized for multimodality involving several portable, non-invasive, and low-cost input devices including a computer mouse; a touch screen; an eye-tracker or eye-tracking device; surface electromyography; and an access soft-switch. This invention is based on a menu-driven selection approach with 10 commands providing access to type 88 different characters of the Hindi language along with delete, clear-all, and go-back commands for corrections. The current invention relates to a first gaze based virtual keyboard system with multimodal input from various input devices, which aims to work for the Hindi language and which can be extended to various other complex language systems.

Description

Title of the invention:
A SYSTEM FOR AUGMENTATIVE AND ALTERNATIVE COMMUNICATION FOR PEOPLE WITH SEVERE SPEECH AND MOTOR DISABILITIES
Field of Invention:
The present invention generally relates to a system and method which pertains to an electronic interface between humans and computers, the said system employing an eye-tracking mechanism along with a multimodality of input devices including the computer mouse, touch screen, eye-tracking device, surface electromyography, and an access soft switch.
Background of the invention
Over 20 million people worldwide suffer from speech and motor disorders annually; they face many difficulties in communicating with other people in an intelligible way. These disabilities may include upper limb paralysis, muscular dystrophy, spinal cord injuries, cerebral palsy, and Parkinson's disease, which may impact their quality of life and employability. The disability ratios are generally higher for developing countries. More particularly, population surveys showed that disabled people constitute approximately 40-80 million (about 5%) of the total population (1.02 billion) of India. Among the five types of disabilities on which data had been collected in the 2001 Census, a major portion of this population is affected by mobility impairment (27.9%) and speech impairment (7.5%). In the fast-growing technology era, where computing based gadgets such as PCs and laptops are becoming an integral part of our daily lives, physical disabilities prevent people from using basic equipment (e.g. a standard keyboard, touch screen, and mouse setup). Thus, accessing information and communication technology (ICT) applications to interact with the outside world becomes a challenging task for disabled people. Therefore, computer-based augmentative and alternative communication (AAC) systems are developed to assist these people.
The traditional input devices, such as the mouse and keyboard, are not suitable for speech and motion-impaired users to communicate. In particular, these users have limited ability for fine motor control and therefore may not be able to use a mouse and normal keyboards for typing and interacting with a computer system. Hence, efficient interaction with a computer as a means of communication and/or control becomes a great challenge for them, e.g. when accessing ICT technologies for web document typing, message sending, entertainment, and operating other AAC devices. The use of alternative devices and methods developed specially in accordance with their needs can help these users overcome these difficulties. However, the development of new assistive technologies depends on various factors, such as the type of disability, the equipment cost, parameter adaptation approaches, and the system usability. Moreover, the advancement of existing devices must cater to the needs of special user categories, e.g. brain-computer interfaces (BCI) can be implemented for locked-in patients. Therefore, the invention aims to provide an approach for adapting the parameters over time in gaze-based interaction and to validate this approach on a virtual scanning keyboard devised for the Hindi language, which has a rather complex structure. The invention aims to incorporate the adaptive methods and multimodal facilities with the Hindi based eye gaze directed virtual keyboard application to overcome various confounding factors of conventional existing virtual keyboards.
Several studies have been carried out relating to this invention, including several models and approaches of Human-Computer Interface (HCI) and Brain-Computer Interface (BCI) to control or communicate over assistive technology. Further, this invention proposes a new gaze-based system and control methods for both synchronous and asynchronous operations during Human-Computer Interaction (HCI). Further, the invention also describes the development of an eye gaze tracking based multimodal virtual keyboard system which can be used to provide input to the virtual keyboard through various input devices like a computer mouse, touch screen, eye-tracking device, surface electromyography and an access soft switch. Particularly, it takes into account challenges related to managing a complex structure and a large set of characters in the language, especially the Hindi language. The invention further provides a design and experimental procedure for adapting various parameters, such as the dwell time during eye gaze, and the use of different modes of a multimodal system for selection of the identified category from the menu-driven display.
Disabled people who are not completely locked-in may still partially be able to use their body parts and gaze to communicate and control assistive devices. The gaze-based control of a wheelchair has been implemented successfully [Purwanto et al., 2009, Meena et al., 2015a, Matsumoto et al., 2001, Meena et al., 2016a, Meena et al., 2017a], which has shown its strong potential as an input modality for assistive technology. Moreover, to account for the substantial lack of mobility in severely disabled users, alternative input devices can be implemented as access switches for usage in virtual keyboard based AAC systems.
These input devices, such as access switches, require minimal motor control and are available in a wide variety to be used by any active body part of the user, i.e. hand, foot, mouth or head. Further, electromyography (EMG) signals can also be used as input devices in virtual keyboard based AAC systems. These access switches/EMG devices can be used with eye-tracking devices wherein the gaze is used to search for the desired element on the screen. The user may then activate the access switch/EMG device to select the desired element. The switch activation varies according to the type of the switch, e.g. a hand-held switch is activated by a press, while an eye-operated switch is activated by eye blinking.
Currently, a wide range of eye-tracking devices is available in the market, offering varied functionalities, price ranges, and precision. Some devices require high precision to measure the eye characteristics, which in turn requires the use of expensive eye-trackers.
In eye-tracking, broadly two devices have been used to measure eye movements. First, a wearable-camera-based device, wherein a high resolution image for calculating the gaze point can be obtained from the wearable camera at a close distance. However, because the camera equipment must be worn, the user may experience discomfort during eye-tracking interactions (Lee et al., 2007a; Jacob 1995). Second, a remote-camera-based device, wherein the position of the gaze is captured through non-contacting fixed cameras at a relatively far distance without any additional equipment or support. In this case, because the image resolution for the eye is relatively low, pupil tremors cause severe vibrations of the calculated gaze point. Furthermore, time-varying characteristics of the remote-camera-based method can lead to a low accuracy and the need for frequent calibration (Jacob 1995; Katarzyna et al., 2014).
Similar to electroencephalography (EEG)-based BCI, the gaze-based control can be accessed in eye-tracking based HCI in both synchronous (cue-paced) and asynchronous (self-paced) modes (Nicolas-Alonso et al., 2012). In synchronous mode, a user action such as a click event is performed after a fixed interval (trial period), whereas in asynchronous mode the click events are performed through a dwell time. In synchronous mode, an item is selected when the user focuses on the target item during a pre-defined trial of a particular duration, and at the end of the trial the target item is selected if it has the maximum duration of focus compared to the estimated duration of focus on other items. In such a case, the user has to spend the maximum amount of time on the desired item. In asynchronous mode, an item is selected when the user focuses on this item (target) for a predefined continuous time period. These two methods effectively reflect user intention; however, they are time-consuming when there are many selections to be made (Wolpaw et al., 2002; Huckauf et al., 2011).
In order to have an inexpensive and non-invasive device, a remote camera-based eye-tracking device can be implemented. These devices can be used to design a low-cost AAC system with minimal physical contact with the device. The virtual keyboard based AAC systems have been designed based on various keyboard approaches, i.e. the Dvorak, FITALY, OPTI, Cirrin, Lewis, Hooke's, Chubon, Metropolis, and ATOMIK layouts. However, most of the HCI based applications have been developed for the Latin script, wherein the eye-tracker devices were used to provide computer inputs to select the target items on the visual display unit (e.g. in the virtual keyboard). The non-invasive EEG-based BCIs are also active in the field of virtual keyboard development to enable communication for severely disabled people.
To carry out text composition in the Hindi language (i.e. the official language of India (490 million speakers)), which has a complex structure and a large set of characters, most of the above mentioned keyboards suffer from a lower text entry rate, lesser user friendliness, error prone text entry, complexity in design, larger layout, and large design space exploration. In particular, the Hindi language has more than twice the number of characters used in the English language. There have been several attempts to optimize Hindi language virtual keyboards. These approaches included a separate inflexion panel and a visual display unit (virtual keyboard) with a dynamic inflexion window. These keyboards were implemented with mouse and switch input modalities, and thus their congested key structure was not suitable for gaze-based control. Therefore, some special arrangements are required in designing the layouts so that gaze constraints are taken into account.
The composition of text in the Hindi language involves a large alphabet, matras (i.e. diacritics), halants (i.e. killer strokes) and other complex characters. Hence typing in the Hindi language using a QWERTY keyboard is not an easy task, as significant training is required to compose the text. In previous studies, the QWERTY keyboard has been optimized to design virtual keyboard applications in the Bengali and Meitei languages. However, only one study has shown a head mounted gaze controlled text entry interface in the Hindi language, and the usability of that system is not very satisfactory.
A key concern in eye-tracker based interfaces is to quantify the intention, which may be confounded by pseudo interpretations. This issue is aggravated by involuntary eye movements, which lead to false item selections (the Midas-touch problem). Therefore, it is highly challenging to control a QWERTY keyboard by eye movement tracking. Multimodal and hybrid interfaces have been utilized to counter this issue. However, there has been no attempt to make a gaze based virtual keyboard (visual display unit) which uses a multimodal access facility and which can also overcome the issue of Midas touch.
The usability of AAC systems with gaze-based access controls is currently impaired by the difficulty of finding optimal parameters, such as the dwell time, as they can depend on both the user and the current state of the user (e.g. fatigue, knowledge of the system). In addition, the changes of attention, the degree of fatigue, and the users' head motion while controlling the application represent obstacles for efficient gaze-based access controls as they can lead to a low accuracy. These continuous variations can be overcome by recalibrating the system, but this can be time consuming and may not be user friendly. Another solution to overcome this effect is to make the system adaptive over time, based on its performance, by considering key features of the application (i.e. by incorporating the corrections).
The previously mentioned issues related to the high number of commands that can be accessed, the Midas touch problem, and the requirement of adapting parameters need to be taken into account to design an intelligent user interface meeting constraints of portable eye-tracking systems.
This invention intends to address the issues of adaptation (over time) of the dwell time in asynchronous mode and of the trial period in synchronous mode for gaze based virtual keyboards, and the incorporation of a multimodal access facility wherein the search of a target item in the visual display unit (virtual keyboard) is done by gaze detection and the selection can happen via the use of a dwell time, soft-switch, or gesture detection using surface electromyography (sEMG) in asynchronous mode; and the search and selection may be performed with the eye-tracker in synchronous mode.
Brief description of the accompanying drawings.
Fig. 1 represents the layout of the Visual Display Unit.
Fig. 2 represents the position of the commands and the tree structure for letter selection.
Fig. 3 represents the different input devices for providing input signals.
Fig. 4 represents the multimodal system with multimodal input devices.
Fig. 5 represents the gesture controlled arm band with different arm gestures.
Fig. 6 represents the layout of the system incorporating the devices.
Detailed description of the invention with reference to the accompanying drawings:
A gaze based control can be accessed in two different modes. The eye-tracking can be used for both search and selection purposes with synchronous (i.e. cue-paced) and asynchronous (i.e. self-paced) modes. First, the asynchronous mode offers a natural mode of interaction without waiting for an external cue. The command selection is managed through the dwell time concept. During this mode, the users focus their attention by fixating the target item for a specific period of time (i.e., the dwell time in seconds), which results in the selection of that particular item. Second, the synchronous mode of interaction is mainly based on an external cue. This mode can be used to avoid artifacts such as involuntary eye movements of users, as the command is selected at the end of the trial duration/trial period. During this mode, the users focus their attention by fixating an item during a single trial of a particular length (i.e. the trial length (in seconds)), and the item is selected at the end of the trial based on the maximum duration of focus.
Let the total number of commands that are available at any time in the system be denoted by M (here M = 10). Each command C_i is defined by the coordinates corresponding to the center of its box (x_c^i, y_c^i), where i ∈ {1..M}. We denote the gaze coordinates at time t by (x_t, y_t); then the distance d_t^i between a command box and the current gaze position is defined by its Euclidean distance as:

d_t^i = sqrt((x_t − x_c^i)^2 + (y_t − y_c^i)^2)
We denote the selected command at time t by select_t, where 1 ≤ select_t ≤ M.
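For illustration only, the pointing rule defined above can be sketched in Python as follows; the command names and box centre coordinates are hypothetical values rather than the layout of Fig 1, and the helper function name is an assumption.

```python
import math

# Hypothetical command-box centres (x_c, y_c) for M = 10 commands, in pixels.
COMMAND_CENTERS = {
    "c1": (100, 80),  "c2": (300, 80),  "c3": (500, 80),  "c4": (700, 80),
    "c5": (900, 80),  "c6": (100, 600), "c7": (300, 600), "c8": (500, 600),
    "c9": (700, 600), "c10": (900, 600),
}

def pointed_command(gaze_x: float, gaze_y: float) -> str:
    """Return the command whose box centre is closest (Euclidean distance)
    to the current gaze position (x_t, y_t)."""
    distances = {
        name: math.hypot(gaze_x - cx, gaze_y - cy)
        for name, (cx, cy) in COMMAND_CENTERS.items()
    }
    return min(distances, key=distances.get)

# Example: a gaze sample near the top-left of the screen points to c1.
print(pointed_command(120, 95))  # -> "c1"
```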
For the asynchronous and synchronous modes, we define the dwell time and the trial period as Δt0 and Δt1, respectively. Δt0 represents the minimum time that is required to select a command, i.e. when a subject continuously keeps his/her gaze on a command. In synchronous mode, Δt1 represents the time after which a command can be selected based on where the user was looking during the trial period.
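The two selection rules can likewise be sketched as below, assuming a stream of gaze samples already mapped to commands with the helper above; the 30 Hz sample period follows the eye-tracker described later, while the function and variable names are illustrative.

```python
from collections import Counter

def select_async(samples, dwell_time_s, sample_period_s=1 / 30):
    """Asynchronous mode: a command is selected once the gaze has stayed on
    the same command continuously for the dwell time Δt0."""
    needed = int(dwell_time_s / sample_period_s)
    run_cmd, run_len = None, 0
    for cmd in samples:
        run_cmd, run_len = (cmd, run_len + 1) if cmd == run_cmd else (cmd, 1)
        if run_len >= needed:
            return cmd
    return None  # no selection yet

def select_sync(samples):
    """Synchronous mode: at the end of a trial of length Δt1, the command
    fixated for the longest cumulative duration is selected."""
    counts = Counter(samples)
    return counts.most_common(1)[0][0] if counts else None
```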
However, both synchronous and asynchronous modes can suffer from a low text entry rate and depend on user-specific characteristics when predefined time parameters are used. Therefore, adaptation over time is essential for designing a more natural mode of interaction.
i) Eye-tracker with adaptive dwell time in asynchronous mode
For the adaptive dwell time in asynchronous mode, we consider two conditions where Δt0 can change between Δmin0 and Δmax0, where Δmin0 and Δmax0 correspond to 1 s and 5 s, respectively. Initially, Δt0 is set to 2000 ms.
In the first condition, if the number of commands, Ncor, corresponding to a "delete" or "undo" represents more than half of the commands in the history of Nh commands (i.e. 2·Ncor ≥ Nh), then it is assumed that there exist some difficulties for the user, and the dwell time has to be increased. In the second condition, if the average time between two consecutive commands during the last Nh commands is close to the dwell time, then it is assumed that the current dwell time acts as a bottleneck and it can be reduced. If we denote the variable that contains the difference of time between two consecutive commands by Δtc, in which Δtc(k) corresponds to the time interval between commands k and k − 1, the current average of Δtc over the past Nh commands can be defined by:

avg(Δtc)(k) = (1 / Nh) · Σ_{j = k − Nh + 1}^{k} Δtc(j)
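A minimal sketch of how these two adaptation rules might be applied after every command is given below; the 0.25 s step size, the history length, and the bookkeeping are assumptions, since the text only fixes the 1 s to 5 s range and the 2000 ms starting value.

```python
def update_dwell_time(dwell_s, history, step_s=0.25,
                      min_s=1.0, max_s=5.0, n_h=5):
    """Adapt the dwell time Δt0 after each command.

    history: list of (command_name, seconds_since_previous_command) tuples,
    newest last. Commands named 'delete' or 'undo' count as corrections.
    """
    recent = history[-n_h:]
    if not recent:
        return dwell_s
    n_cor = sum(1 for cmd, _ in recent if cmd in ("delete", "undo"))
    avg_gap = sum(gap for _, gap in recent) / len(recent)

    if 2 * n_cor >= len(recent):
        # Condition 1: corrections dominate the history -> the user is
        # struggling, so slow the interface down.
        dwell_s += step_s
    elif avg_gap - dwell_s < step_s:
        # Condition 2: commands arrive almost as fast as the dwell time
        # allows -> the dwell time is the bottleneck, so speed up.
        dwell_s -= step_s
    return min(max(dwell_s, min_s), max_s)
```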
ii) Eye-tracker with adaptive trial period in synchronous mode
With the adaptive trial period (i.e. trial duration Δt1) in synchronous mode, we consider three conditions where Δt1 can change between Δmin1 and Δmax1. The values of Δmin1 and Δmax1 correspond to 1 s and 5 s, respectively.
Initially, Δt1 is set to 2000 ms. In the first condition, we define by P(select_s)_k the average probability to detect a command in the k-th trial by considering the last Nh previous trials. If this probability is high, then it indicates that the commands are selected in a reliable manner and the trial period can be decreased.
Figure imgf000009_0002
The second condition deals with trials with no command selection. In this case, it is assumed that if a command was not selected during the interval Δt1, it means that Δt1 was too short to allow the user to select an item. In such a case, the trial period is increased when the number of rejected commands is Nr in the history of the last Nh commands (Nr ≤ Nh). In the third condition, if the number of commands, Ncor, corresponding to a "delete" or "undo" represents more than half of the commands in the history of Nh commands, then we assume that there exist some difficulties for the user, and the trial period has to be increased.
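By analogy, the three trial-period rules could be sketched as follows; the probability threshold, step size, and history handling are illustrative assumptions rather than values given in the text.

```python
def update_trial_period(trial_s, selection_probs, n_rejected, n_corrections,
                        n_h=5, p_high=0.8, step_s=0.25,
                        min_s=1.0, max_s=5.0):
    """Adapt the trial period Δt1 in synchronous mode.

    selection_probs: per-trial command-detection probabilities over the last
    n_h trials; n_rejected: trials in that history with no selection;
    n_corrections: 'delete'/'undo' commands in that history.
    """
    avg_p = sum(selection_probs) / max(len(selection_probs), 1)

    if avg_p >= p_high:
        trial_s -= step_s     # Condition 1: reliable selections -> shorten.
    elif n_rejected > 0:
        trial_s += step_s     # Condition 2: missed selections -> lengthen.
    elif 2 * n_corrections >= n_h:
        trial_s += step_s     # Condition 3: many corrections -> lengthen.
    return min(max(trial_s, min_s), max_s)
```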
The developed graphical user interface (GUI) consists of two main components, which are shown in Fig 1. The first component is the visual display unit, wherein a total of ten commands are shown and the command which is currently being pointed to is highlighted in a different colour. The second component is an output text display where the user can see the typed text in real-time. The position and tree structure of the ten commands (i.e. c1 to c10) are depicted in Fig 2. An alphabetical organization with a script specific arrangement layout is developed, as the alphabetic arrangement is easier to learn and remember, especially for a complex structured language (Bhattacharya et al., 2013a). The size of each rectangular command button is approximately 14% of the GUI window. All command buttons are placed on the periphery of the visual display screen while the output text box is placed at the center of the screen, as shown in Fig 1. The GUI of the virtual keyboard is based on a multi-level menu selection method comprised of ten commands at each level. The tree-based structure of the GUI provides the ability to type 45 Hindi language letters, 17 different matras (i.e. diacritics) and halants (i.e. killer strokes), 14 punctuation marks and special characters, and 10 numbers (from 0 to 9). Moreover, other functionalities such as delete, delete all, new line, space and go back commands for corrections are also included.
The first level of the GUI consists of 10 command boxes; each represents a set of language characters (i.e. 10 characters). The selection of a particular character requires the user to follow a two-step task. In the first step, the user has to select the particular command box (i.e. at the first level of the GUI) where the desired character is located. The successful selection of a command box shifts the GUI to the second level, where the ten commands on the screen are assigned to the ten characters which belong to the command box selected at the previous level. In the second step, the user can see the desired character and finally select it for writing to the text-box. After the selection of a particular character at the second level, the GUI goes back to the initial stage (i.e. the first level) to start further iterations. The placement and size of the command boxes are identical at both levels of the GUI. The system is also designed with the inclusion of multiple modalities and extra command features to write all the Hindi language letters, including the half letter script, more punctuation marks, and special characters. The system adapts the parameters over time for both synchronous and asynchronous modes to improve the speed and accuracy of the system.
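The two-step, two-level selection can be represented by a small tree structure, sketched below with placeholder character groups rather than the actual Devanagari assignment of Fig 2; the class and group names are hypothetical.

```python
# Level 1: ten command boxes, each mapping to ten level-2 items.
# The groups shown here are placeholders, not the real Devanagari layout.
KEYBOARD_TREE = {f"c{i}": [f"char_{i}_{j}" for j in range(10)]
                 for i in range(1, 11)}

class TwoLevelKeyboard:
    def __init__(self, tree):
        self.tree = tree
        self.current_group = None   # None means we are at level 1
        self.text = ""

    def select(self, command):
        if self.current_group is None:
            self.current_group = self.tree[command]   # step 1: pick a group
        else:
            index = int(command[1:]) - 1              # step 2: pick a character
            self.text += self.current_group[index]
            self.current_group = None                 # back to level 1

kb = TwoLevelKeyboard(KEYBOARD_TREE)
kb.select("c3")   # first level: choose the third group
kb.select("c5")   # second level: type the fifth character of that group
print(kb.text)    # -> "char_3_4"
```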
While using the gaze based eye-tracking system on a virtual keyboard, it is also required that the user be provided efficient feedback that the intended command box/character has been selected, so that mistakes are avoided and there is an increase in efficiency. Hence, a visual feedback is provided to the user by a change in the color of the button border while looking at it. Initially, the colour of the button border is silver (RGB: 192,192,192). When the user pays attention to a particular button for a duration of time t, the color of the border changes linearly in relation to the dwell time Δt0 or the trial period (i.e. trial duration) Δt1, and the border becomes greener with time. The RGB color is defined as (R=v, G=255, B=v), where v = 255 × (Δt0 − t)/Δt0. The visual feedback allows the user to continuously adjust and adapt his/her gaze to the intended region on the screen. An audio feedback is also provided to the user through an acoustic beep after the successful execution of each command. This sound makes the users proactive so that they can easily prepare for the next character. Moreover, to improve the system performance by using minimal eye movements, the last five used characters are also displayed in the GUI at the bottom of each command box. This helps the user to see the previously written character without moving the eyes to the output display box. The eye-tracking device acquires the gaze data.
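The border-colour feedback reduces to a single linear interpolation; the sketch below assumes the dwell-time case Δt0 (the trial period Δt1 is handled identically), with the silver resting colour applying before fixation begins.

```python
def border_color(elapsed_s: float, dwell_s: float):
    """Border colour while the gaze rests on a button: the green channel is
    fixed at 255 while red and blue fade with the remaining fixation time,
    i.e. v = 255 * (Δt0 - t) / Δt0, so the border turns green as selection
    approaches."""
    remaining = max(dwell_s - elapsed_s, 0.0)
    v = int(255 * remaining / dwell_s)
    return (v, 255, v)   # (R, G, B)

print(border_color(0.0, 2.0))   # (255, 255, 255) - fixation just started
print(border_color(1.0, 2.0))   # (127, 255, 127) - halfway to the dwell time
print(border_color(2.0, 2.0))   # (0, 255, 0)     - selection about to fire
```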
In order to assess the adaptability and usability of the system, twenty-four healthy volunteers (5 females) in the age range of 21-32 years (27.05 ± 2.96) participated in this study (see Table 1). Fifteen participants performed the experiments with vision correction. No participant had prior experience of using an eye-tracker, soft-switch and/or sEMG with the application.
Three different input devices were used in this study (see Fig 3). First, a portable eye-tracker has been used for pursuing the eye gaze of the participants. Second, gesture recognition was obtained with the Myo armband for recording sEMG. This non-invasive device includes a 9 degree-of-freedom (DoF) Inertial Measurement Unit (IMU) and 8 dry sEMG sensors. The Myo can be slipped directly onto the arm to read sEMG signals with no preparation needed for the participant (no shaving of hair or skin cleaning). Third, a soft-switch is used as a single-input device.
The eye-tracking device records data at a sampling rate of 30 Hz. It involves binocular infrared illumination with a spatial resolution of 0.1 (root mean square (RMS)), and records the x and y coordinates of gaze and the pupil diameter for both eyes in mm.
The Myo armband provides sEMG signals with a sampling frequency of 200 Hz per channel. Electrode placement was set empirically in relation to the size of the participant's forearm because the Myo armband's minimum circumference size is about 20 cm. An additional short calibration was performed for each participant with the Myo (about 1 min).
The soft-switch was used as a single-input device to select a command on the visual display unit. The distance of the participant from the computer screen was about 80 cm. The vertical and horizontal visual angles were measured at approximately 21 and 36 degrees, respectively.
To start the procedure, the typing task in the experimental protocol involves a predefined sentence with 29 characters from the Devanagari script and 9 numbers, given as
Figure imgf000011_0001
The transliteration of the task sentence in English is "Kabtak Jabtak Abhyaasa Karate Raho. 44 - 4455 - 771" and the direct translation in English is "Till When Until Keep Practicing. 44-4455-771". The complete task involves 76 commands in one repetition if performed without committing any error. This predefined sentence was formed with a particular combination of words in order to obtain an equiprobable distribution of commands for each of the ten items in the GUI. Thus, the adopted arrangement provides unbiased involvement of command boxes and eye-gaze distribution over the GUI of the virtual keyboard. There were five different combinations of the input modalities (see Fig 4), which provided twenty different conditions of experimental design.
First, the search and selection of the target item were performed by the eyes and a normal computer mouse, respectively (see Fig 4 (A)). Second, the search of the target item was performed by the eyes and the participant used the touch screen to finally select the item (see Fig 4 (B)). Third, the eye-tracker along with the soft-switch was used in a hybrid mode wherein the user focused the eye-gaze to point to the target item, and the selection happened via the soft-switch (see Fig 4 (C)). Fourth, the eye-tracker was used in combination with five different sEMG-based hand gestures, wherein eye-gaze was used for the search purpose and each gesture acted as an input modality to select the item (see Fig 4 (D)). This combination of input modalities covered five different experimental conditions. Fifth, the eye-tracker was used for both search and selection purposes in synchronous and asynchronous modes (see Fig 4 (E)). During the asynchronous mode, the participants focused the eye-gaze at the target item for a specific period of time (i.e. the dwell time, in seconds), which resulted in the selection of that particular item. During the synchronous mode, the participants focused the eye-gaze at an item (target or non-target) during a single trial of a particular length (i.e. the trial length, in seconds), and at the end of the trial the item was selected based on the maximum duration of the focus. We implemented the asynchronous and synchronous modes with five different dwell time and trial period values, respectively, resulting in ten different experimental conditions. In addition, there were two more experimental conditions which incorporated the asynchronous and synchronous modes with adaptive dwell time and adaptive trial period, respectively (see Fig 4 (E)).
In one condition, a soft-switch and five sEMG-based hand gestures (i.e. fist, wave left, wave right, fingers spread, and double tap) along with the eye-tracker were used. In this case, the eye-tracker was used in a hybrid mode, wherein the user gazes at the target item, and the selection happens via the switch/sEMG signals. The input devices are used once the user receives the visual feedback, i.e. when the colour of the gazed item begins to change.
In another condition, the eye-tracker is used only in an asynchronous mode with different dwell times (i.e. dwell time = 1 s, 1.5 s, 2 s, 2.5 s, 3 s, and adaptive dwell time), where the item is determined through gazing, and the item selection is made by the dwell time/adaptive dwell time.
In yet another condition, the eye-tracker is used only in a synchronous mode for pointing to and selection of the items, where the pointing to items is done by the gaze and the selection of the items is made by five different trial periods (i.e. 1 s, 1.5 s, 2 s, 2.5 s, 3 s) and one with an adaptive trial period. These different trial periods were considered to find the optimal trial period for designing an eye-tracking based synchronous system.
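A compact sketch of these two gaze-only selection rules is given below; it assumes gaze samples have already been mapped to the item under the gaze (that mapping, and the helper names, are illustrative assumptions rather than the system's actual implementation):

```python
from collections import Counter

def asynchronous_select(gaze_stream, dwell_time, sample_period):
    """Select the first item fixated continuously for `dwell_time` seconds.

    `gaze_stream` yields the item id under the gaze for each sample,
    or None when the gaze is off all items.
    """
    needed = int(dwell_time / sample_period)
    current, count = None, 0
    for item in gaze_stream:
        if item is not None and item == current:
            count += 1
            if count >= needed:
                return item
        else:
            current, count = item, (1 if item is not None else 0)
    return None

def synchronous_select(gaze_stream, trial_length, sample_period):
    """At the end of a fixed-length trial, select the item gazed at the longest."""
    n_samples = int(trial_length / sample_period)
    dwell = Counter()
    for i, item in enumerate(gaze_stream):
        if i >= n_samples:
            break
        if item is not None:
            dwell[item] += 1
    return dwell.most_common(1)[0][0] if dwell else None
```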
In another condition, different input devices are used together with the eye-tracker. A soft-switch is used with the eye-tracking device, wherein the addition of the soft-switch helps overcome the Midas touch problem (i.e. selecting irrelevant, non-targeted commands) of HCI, as the user only needs to point to the target item through the eye-tracker, and the selection happens via the soft-switch. In other words, the searching of the items on the computer screen is done by the user's gaze pointing to the items and the selection is made by the soft-switch device. In this case, the soft-switch was pressed by the user's dominant hand. The colour-based visual feedback is provided to the user during the searching of an item. The visual feedback allows the user to continuously adjust and adapt his/her gaze to the intended region on the screen. Once the item is selected, auditory feedback is also given to the user as an acoustic beep.
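A simplified event loop for this gaze-plus-switch mode might look as follows; `gazed_item()` and `switch_pressed()` are hypothetical stand-ins for the eye-tracker and soft-switch drivers, so this is a sketch rather than the actual implementation (debouncing and feedback rendering are omitted):

```python
import time

def gaze_plus_switch_loop(gazed_item, switch_pressed, on_select, poll_hz=30):
    """Point with the gaze, confirm with the soft-switch.

    Nothing is selected by gaze alone, which is how this hybrid mode avoids
    the Midas touch problem: the switch press is the explicit confirmation.
    """
    period = 1.0 / poll_hz
    while True:
        item = gazed_item()               # item currently under the gaze, or None
        if item is not None and switch_pressed():
            on_select(item)               # e.g. type the character and play the beep
        time.sleep(period)
```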
Further, in the hybrid mode, the sEMG hand gestures can be combined with an eye-tracker to provide extra input modalities to the users. The eye-tracker is used to point to a command on the screen. Then, the command is selected through a hand gesture by using predefined functions from the Myo SDK. Five conditions were evaluated related to gesture control with the Myo: fist (hand close), wave left (wrist flexion), wave right (wrist extension), fingers spread (hand open), and double tap (see Fig 5). The colour-based visual feedback is provided to the user during the searching of an item. Once the item is selected, the auditory feedback is provided to the user as an acoustic beep. Thus the hybrid system helps to overcome the Midas touch problem of a gaze-controlled HCI system.
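For illustration, the gesture-confirmation step could be expressed as below; the gesture label strings and the calling convention are placeholders, not the Myo SDK's actual identifiers (each experimental condition uses one of the five gestures as its confirm signal):

```python
# Placeholder gesture labels; the real identifiers come from the Myo SDK.
CONFIRM_GESTURES = {"fist", "wave_left", "wave_right", "fingers_spread", "double_tap"}

def handle_gesture(detected, active_gesture, gazed_item, on_select):
    """Confirm the currently gazed item when the condition's chosen gesture occurs."""
    if (detected == active_gesture
            and detected in CONFIRM_GESTURES
            and gazed_item is not None):
        on_select(gazed_item)
```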
From the above description, it can be seen that an adaptive gaze-based multimodal virtual keyboard/visual display unit can handle even a language as complex as Hindi, which has 45 letters, 17 diacritics and killer strokes, 14 punctuation marks and special characters, and 10 numbers. The virtual keyboard/visual display unit is capable of being operated using a portable non-invasive eye-tracker, an sEMG-based hand gesture recognition device, and/or a soft-switch. The invention intends to resolve the issues of HCI pertaining to the virtual keyboard while using adaptive parameters and low-cost input devices.
With the use of this invention, in the case of both adaptive synchronous and asynchronous modes, the speed and accuracy of the system can be increased for the virtual keyboard of any language. Secondly, the user can make use of eye-gaze detection, gesture recognition, and a single-input switch, either alone as a single modality or in combination as a multimodal device, for a very complex structured language like Hindi. Further, with the use of time-adaptive methods along with different input devices like a touch screen, soft-switch, and gesture recognition devices, wherein users can employ any of these according to their comfort and/or need, the performance of virtual keyboards can be greatly increased in terms of accuracy and speed. The speed and accuracy of the system have been optimized by optimizing the size of the command boxes and the distance between the boxes, and by taking into consideration involuntary head and body movement.
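The time-adaptive behaviour (stated more formally in claims 13 and 14 below) can be sketched as a simple update rule; the 0.1 s step, the bounds, the history format, and the reading of "more delete/undo commands than the remaining commands in the history" are interpretive assumptions rather than the disclosed parameters:

```python
def adapt_dwell_time(dwell_time, history, step=0.1, min_dt=0.5, max_dt=3.0):
    """Adjust the dwell time from the recent command history.

    `history` is assumed to be a list of (command, timestamp) pairs.
    Lengthen the dwell time when error corrections (delete/undo) dominate
    the history; shorten it when the user already issues commands about as
    fast as the current dwell time allows.
    """
    corrections = sum(1 for cmd, _ in history if cmd in ("delete", "undo"))
    others = len(history) - corrections
    if corrections > others:
        return min(dwell_time + step, max_dt)

    times = [t for _, t in history]
    gaps = [b - a for a, b in zip(times, times[1:])]
    if gaps and sum(gaps) / len(gaps) <= dwell_time + step:
        return max(dwell_time - step, min_dt)
    return dwell_time
```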
This invention provides for the adaptation over time of gaze-based access controls for an eye-tracking based multimodal virtual keyboard specifically created for a language with a very complex structure like the Hindi language. The system is based on a tree selection method and requires two consecutive steps for enabling a command. First, the user has to point to the item that must be selected. A red pointer on the visual display can be moved to the chosen location and visual feedback is provided on the chosen location. Second, the user has to approve the location of the pointer in order to select the corresponding item. Once an item has been selected by the user, the system provides audio feedback for the corresponding targeted item. The use of the adaptive techniques in the invention can increase the accuracy of the eye-tracker, which may otherwise limit the number of commands that can be accessed at any moment, as the calibration data should be updated when the user changes his/her head and body position over time. With or without adaptive methods, this approach can be applied to any other language. The proposed system can be directly used by Marathi/Konkani language users (70 million speakers) by including one additional letter (i.e.
[the additional Devanagari letter, reproduced as an image in the original filing (imgf000014_0001)]).
Therefore, the present research findings have potential application for a large user population (560 million).
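To make the tree selection concrete, a minimal two-level menu sketch is given below; the character groupings are placeholders, since the actual Devanagari layout is defined by the GUI figures of the filing:

```python
# Minimal sketch of two-level tree selection: the first confirmed selection
# opens a group (command box), the second types a character from that group.
# The groups below are illustrative placeholders, not the keyboard's layout.
MENU_TREE = {
    "group_1": ["a", "b", "c"],
    "group_2": ["d", "e", "f"],
    # ... up to ten command boxes in the described GUI
}

class TreeKeyboard:
    def __init__(self, tree):
        self.tree = tree
        self.level = "root"        # either the root menu or an open group
        self.typed = []

    def select(self, item):
        """Apply one confirmed selection (gaze + dwell/switch/gesture)."""
        if self.level == "root":
            self.level = item                  # open the chosen command box
            return self.tree[item]             # items to display next
        self.typed.append(item)                # type the chosen character
        self.level = "root"                    # return to the root menu
        return list(self.tree)
```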

Claims

We Claim:
1. A system for augmentative and alternative communication for people with severe speech and motor disabilities comprising a user interface having multimodal inputs, the system comprising:
a. input means (1, 2, 3, 4) for receiving the input communication from the user;
b. processing means (7) for processing the input received;
c. display means (5) to display the processed input from the processing means;
d. auditory means (6) to confirm the selection based on the input means;
wherein each of the input devices, either alone or in combination with the other input devices, is used to display the desired command on the display device.
2. A system as claimed in claim 1, wherein the input means is a Gaze based eye tracking device which can track the movement of the eyes and is adapted to differentiate between the focused and involuntary movement of the eye during the process of gazing.
3. A system as claimed in claim 2, wherein the device is a non-invasive remote camera based eye tracking device.
4. A system as claimed in claim 1, wherein the input means is a mouse/keyboard or a touch screen device, which can provide the input to the system.
5. A system as claimed in claim 1, wherein the input device can be a soft-switch which, when pressed in combination with the eye gaze input, selects the gazed input.
6. A system as claimed in any of the above claims wherein the input device can be an armband /sEMG based gesture recognition device which can sense the gestures of the hand as an input to the device.
7. A system as claimed in any of the preceding claims wherein the processor gets the input from the multimodal input devices and generates response on the visual display units.
8. A system as claimed in any of the preceding claims wherein the input received from the gaze based eye tracking device is analysed by the processor for various dwell time of the eye and output is generated on the visual display unit.
9. A system as claimed in any of the preceding claims wherein the visual display unit/ virtual keyboard is a menu driven device.
10. A system as claimed in any of the preceding claims wherein the characters on the visual display unit / virtual keyboard are based on a tree selection method.
11. A system as claimed in the above claim wherein the selection of a command requires at least two consecutive steps, wherein the characters are first locked by the help of the Gaze based tracking device and then the command is selected by means of input devices in synchronous mode.
12. A system as claimed in any of the preceding claims wherein the character is locked on the visual display unit with the input from Gaze based eye tracking device and the command is then selected with the help of the Gaze dwell time in asynchronous mode.
13. A system as claimed in any of the preceding claims wherein the dwell time of the system is increased if the number of "delete" or "undo" commands is more than the number of commands in the command history.
14. A system as claimed in any of the preceding claims wherein the dwell time of the system is decreased if the average time between two consecutive commands in the command history is equal or close to the dwell time.
15. A system as claimed in the above claim wherein the last five used characters are displayed in the visual display unit at the bottom of each command box, helping the user to see the previously written characters without significantly shifting their gaze from the desired command box to the output display box.
16. A system as claimed in the above claim wherein the system offers different modalities, which can be selected in relation to the preference of the user.
PCT/IN2018/000034 2018-02-27 2018-06-29 A system for augmentative and alternative communication for people with severe speech and motor disabilities Ceased WO2019167052A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201811007268 2018-02-27
IN201811007268 2018-02-27

Publications (1)

Publication Number Publication Date
WO2019167052A1 true WO2019167052A1 (en) 2019-09-06

Family

ID=67805293

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2018/000034 Ceased WO2019167052A1 (en) 2018-02-27 2018-06-29 A system for augmentative and alternative communication for people with severe speech and motor disabilities

Country Status (1)

Country Link
WO (1) WO2019167052A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3945402A1 (en) * 2020-07-29 2022-02-02 Tata Consultancy Services Limited Method and device providing multimodal input mechanism
JP2024016984A (en) * 2022-07-27 2024-02-08 学校法人光産業創成大学院大学 Gaze input device and program for gaze input device
WO2024235373A1 (en) * 2023-05-16 2024-11-21 Bayerische Motoren Werke Aktiengesellschaft Control of a motor vehicle

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040143430A1 (en) * 2002-10-15 2004-07-22 Said Joe P. Universal processing system and methods for production of outputs accessible by people with disabilities
WO2017222997A1 (en) * 2016-06-20 2017-12-28 Magic Leap, Inc. Augmented reality display system for evaluation and modification of neurological conditions, including visual processing and perception conditions

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18907718

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18907718

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09/06/2021)
