
WO2019167052A1 - A system for augmentative and alternative communication for people with severe speech and motor disabilities - Google Patents


Info

Publication number
WO2019167052A1
Authority
WO
WIPO (PCT)
Prior art keywords
input
gaze
eye
command
user
Prior art date
Legal status
Ceased
Application number
PCT/IN2018/000034
Other languages
French (fr)
Inventor
Yogesh Kumar Meena
Hubert Cecotti
Kongfatt Wong-Lin
Girijesh PRASAD
Current Assignee
Individual
Original Assignee
Individual
Priority date
Filing date
Publication date
Application filed by Individual
Publication of WO2019167052A1
Legal status: Ceased

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/014Hand-worn input/output arrangements, e.g. data gloves
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/015Input arrangements based on nervous system activity detection, e.g. brain waves [EEG] detection, electromyograms [EMG] detection, electrodermal response detection
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233Character input methods
    • G06F3/0236Character input methods using selection techniques to select from displayed items
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09BEDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B21/00Teaching, or communicating with, the blind, deaf or mute
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/038Indexing scheme relating to G06F3/038
    • G06F2203/0381Multimodal input, i.e. interface arrangements enabling the user to issue commands by simultaneous use of input devices of different nature, e.g. voice plus gesture on digitizer

Definitions

  • the sEMG hand gestures can be combined with an eye-tracker to provide extra input modalities to the users.
  • the eye-tracker is used to point to a command on the screen.
  • the command is selected through a hand gesture by using predefined functions from the Myo SDK.
  • Five conditions were evaluated related to gesture control with the Myo: fist (hand close), wave left (wrist flexion), wave right (wrist extension), finger spread (hand open), and double tap (see Fig 5).
  • the color-based visual feedback is provided to the user during the searching of an item. Once the item is selected, the auditory feedback is provided to the user as an acoustic beep.
  • the hybrid system helps to overcome the Midas touch problem of the gaze controlled HCI system.
  • an adaptive gaze-based multimodal virtual keyboard / visual display unit can even handle a language as complex as the Hindi language which has 45 letters, 17 diacritics and killer strokes, 14 punctuation marks and special characters, and 10 numbers.
  • the virtual keyboard/visual display unit is capable of being operated using a portable non-invasive eye-tracker, an sEMG based hand gesture recognition device, and/or a soft-switch.
  • the invention intends to resolve the issues of HCI pertaining to the virtual keyboard while using adaptive parameters and low-cost input devices.
  • the speed and accuracy of the system can be increased for the virtual keyboard of any language.
  • the user can make use of eye-gaze detection, gesture recognition, and a single input switch, either alone as a single modality or in combination as multimodality device, for a very complex structured language like Hindi.
  • the performance of virtual keyboards can be greatly increased in terms of accuracy and speed.
  • the speed and accuracy of the system have been optimized by optimizing the size of the command boxes and the distance between the boxes, and by taking into consideration involuntary head and body movement.
  • This invention provides for the adaptation over time of gaze-based access controls for an eyetracking based multimodal virtual keyboard specifically created for a language with a very complex structure like the Hindi language.
  • the system is based on a tree selection method and it requires two consecutive steps for enabling a command. First, the user has to point to the item that must be selected. A red pointer on the visual display can be moved to the chosen location and a visual feedback is provided on the chosen location. Second, the user has to approve the location of the pointer in order to select the corresponding item. Once an item has been selected by the user, the system provides an audio feedback for the corresponding targeted item.
  • the use of the adaptive techniques in the invention can increase the accuracy of the eye-tracker, which may limit the number of commands that can be accessed at any moment as the calibration data should be updated when the user changes his/her head and body position over time. With or without adaptive methods this approach can be applied to any other language.
  • the proposed system can be directly used for Marathi/Konkani language users (70 million speakers) by including one additional letter. Therefore, the present research findings have potential application for a large user population (560 million).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Biomedical Technology (AREA)
  • Dermatology (AREA)
  • Neurology (AREA)
  • Neurosurgery (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present invention relates to a system and method for an augmentative and alternative communication (AAC) aid for people with severe speech and motor disabilities. The usability of such AAC systems is currently limited due to the lack of adaptive and user-centered approaches, leading to low accuracy and the need for frequent recalibration. The present invention relates to a system and method for the adaptation (over time) of the dwell time in asynchronous mode and of the fixed interval (trial period) in synchronous mode using gaze based virtual keyboards. The present invention also relates to the use of a gaze based virtual keyboard system which is designed for a structurally complex language and optimized for multimodality involving several portable, non-invasive, and low-cost input devices including a computer mouse; a touch screen; an eye-tracker or eye-tracking device; surface electromyography; and an access soft-switch. This invention is based on a menu-driven selection approach with 10 commands providing access to type 88 different characters of the Hindi language along with delete, clear-all, and go-back commands for corrections. The current invention relates to a first gaze based virtual keyboard system with multimodal input from various input devices, which aims to work for the Hindi language and which can be extended to various other complex language systems.

Description

Title of the invention:
A SYSTEM FOR AUGMENTATIVE AND ALTERNATIVE COMMUNICATION FOR PEOPLE WITH SEVERE SPEECH AND MOTOR DISABILITIES
Field of Invention:
The present invention generally relates to a system and method which pertains to an electronic interface between humans and computers, the said system employing an eye-tracking mechanism along with a multimodality of input devices including the computer mouse, touch screen, eye-tracking device, surface electromyography, and an access soft switch.
Background of the invention
Over 20 million people worldwide suffer from speech and motor disorders annually; they face many difficulties in communicating with other people in an intelligible way. These disabilities may include upper limb paralysis, muscular dystrophy, spinal cord injuries, cerebral palsy, and Parkinson's disease, which may impact their quality of life and employability. The disability ratios are generally higher for developing countries. More particularly, population surveys showed that disabled people constitute approximately 40-80 million (about 5%) of the total population (1.02 billion) of India. Among the five types of disabilities on which data had been collected in the 2001 Census, a major portion of this population is affected by mobility impairment (27.9%) and speech impairment (7.5%). In the fast-growing technology era, where computing based gadgets such as PCs and laptops are becoming an integral part of our daily lives, physical disabilities prevent people from using basic equipment (e.g. a standard keyboard, touch screen, and mouse setup). Thus, accessing information and communication technology (ICT) applications to interact with the outside world becomes a challenging task for disabled people. Therefore, computer-based augmentative and alternative communication (AAC) systems are developed to assist these people.
The traditional input devices, such as the mouse and keyboard, are not suitable for speech and motion-impaired users to communicate. In particular, these users have limited ability for fine motor control and therefore may not be able to use a mouse and normal keyboards for typing and interacting with a computer system. Hence, efficient interaction with a computer as a means of communication and/or control becomes a great challenge for them, e.g. when accessing ICT technologies for web document typing, message sending, entertainment, and operating other AAC devices. The use of alternative devices and methods developed specially in accordance with their needs can help these users overcome these difficulties. However, the development of new assistive technologies depends on various factors, such as the type of disability, the equipment cost, parameter adaptation approaches, and the system usability. Moreover, the advancement of existing devices must cater to the needs of special user categories, e.g. brain-computer interfaces (BCI) can be implemented for locked-in patients. Therefore, the invention aims to provide an approach for adapting the parameters over time in gaze-based interaction and to validate this approach on a virtual scanning keyboard devised for the Hindi language, which has a rather complex structure. The invention aims to incorporate the adaptive methods and multimodal facilities with the Hindi based eye gaze directed virtual keyboard application to overcome various confounding factors of conventional existing virtual keyboards.
Several studies have been carried out relating to this invention, including several models and approaches of Human-Computer Interface (HCI) and Brain-Computer Interface (BCI) to control or communicate over assistive technology. Further, this invention proposes a new gaze-based system and control methods for both synchronous and asynchronous operations during Human-Computer Interaction (HCI). Further, the invention also describes the development of an eye gaze tracking based multimodal virtual keyboard system which can be used to provide input to the virtual keyboard through various input devices like a computer mouse, touch screen, eye-tracking device, surface electromyography and an access soft switch. Particularly, it takes into account challenges related to managing a complex structure and a large set of characters in the language, especially the Hindi language. The invention further provides a design and experimental procedure for adapting various parameters, such as the dwell time during eye gaze, and the use of different modes of a multimodal system for selection of the identified category from the menu-driven display.
Disabled people who are not completely locked-in may still partially be able to use their body parts and gaze to communicate and control assistive devices. The gaze-based control of a wheelchair has been implemented successfully [Purwanto et al., 2009, Meena et al., 2015a, Matsumoto et al., 2001, Meena et al., 2016a, Meena et al., 2017a], which has shown its strong potential as an input modality for assistive technology. Moreover, to account for the substantial lack of mobility in severely disabled users, alternative input devices can be implemented as access switches for usage in virtual keyboard based AAC systems.
These input devices, such as access switches, require minimal motor control and are available in a wide variety to be used by any active body part of the user, i.e. hand, foot, mouth or head. Further, electromyography (EMG) signals can also be used as input devices in virtual keyboard based AAC systems. These access switches/EMG devices can be used with eye-tracking devices wherein the gaze is used to search for the desired element on the screen. The user may then activate the access switch/EMG device to select the desired element. The switch activation varies according to the type of the switch, e.g. a hand-held switch is activated by a press, while an eye-operated switch is activated by eye blinking.
Currently, a wide range of eye-tracking devices is available in the market, offering varied functionalities, price ranges, and precision. Some devices require high precision to measure the eye characteristics, which in turn requires the use of expensive eye-trackers.
In eye-tracking, broadly two devices have been used to measure eye movements. First, a wearable-camera-based device, wherein a high resolution image for calculating the gaze point can be obtained from the wearable camera at a close distance. However, because the camera equipment must be worn, the user may experience discomfort during eye-tracking interactions (Lee et al., 2007a; Jacob 1995). Second, a remote-camera-based device, wherein the position of the gaze is captured through non-contacting fixed cameras at a relatively far distance without any additional equipment or support. In this case, because the image resolution for the eye is relatively low, pupil tremors cause severe vibrations of the calculated gaze point. Furthermore, time-varying characteristics of the remote-camera-based method can lead to a low accuracy and the need for frequent calibration (Jacob 1995; Katarzyna et al., 2014).
Similar to electroencephalography (EEG)-based BCI, the gaze-based control can be accessed in eye-tracking based HCI in both synchronous (cue-paced) and asynchronous (self-paced) modes (Nicolas-Alonso et al., 2012). In synchronous mode, a user action such as a click event is performed after a fixed interval (trial period), whereas in asynchronous mode the click events are performed through a dwell time. In synchronous mode, an item is selected when the user focuses on the target item during a pre-defined trial of a particular duration, and at the end of the trial the target item is selected if it has the maximum duration of focus compared to the estimated duration of focus on other items. In such a case, the user has to spend the maximum amount of time on the desired item. In asynchronous mode, an item is selected when the user focuses on this item (target) for a predefined continuous time period. These two methods effectively reflect user intention; however, they are time-consuming when there are many selections to be made (Wolpaw et al., 2002; Huckauf et al., 2011).
In order to have an inexpensive and non-invasive device, a remote camera-based eye-tracking device can be implemented. These devices can be used to design a low-cost AAC system with minimal physical contact with the device. The virtual keyboard based AAC systems have been designed based on various keyboard approaches, i.e. the Dvorak, FITALY, OPTI, Cirrin, Lewis, Hooke's, Chubon, Metropolis, and ATOMIK layouts. However, most of the HCI based applications have been developed for the Latin script, wherein the eye-tracker devices were used to provide computer inputs to select the target items on the visual display unit (e.g. in the virtual keyboard). The non-invasive EEG-based BCIs are also active in the field of virtual keyboard development to enable communication for severely disabled people.
To carry out text composition in the Hindi language (i.e. the official language of India (490 million speakers)), which has a complex structure and a large set of characters, most of the above mentioned keyboards suffer from a lower text entry rate, lesser user friendliness, error prone text entry, complexity in design, larger layout, and large design space exploration. In particular, the Hindi language has more than twice the number of characters used in the English language. There have been several attempts to optimize Hindi language virtual keyboards. These approaches included a separate inflexion panel and a visual display unit (virtual keyboard) with a dynamic inflexion window. These keyboards were implemented with mouse and switch input modalities, and thus their congested key structure was not suitable for gaze-based control. Therefore, some special arrangements are required in designing the layouts so that gaze constraints are taken into account.
The composition of text in the Hindi language involves a large alphabet, matras (i.e. diacritics), halants (i.e. killer strokes) and other complex characters. Hence typing in the Hindi language using a QWERTY keyboard is not an easy task, as significant training is required to compose the text. In previous studies, the QWERTY keyboard has been optimized to design virtual keyboard applications in the Bengali and Meitei languages. However, only one study has shown a head mounted gaze controlled text entry interface in the Hindi language, and the usability of that system is not very satisfactory.
A key concern in eye-tracker based interfaces is to quantify the intention, which may be confounded by pseudo interpretations. This issue is aggravated by involuntary eye movements, which lead to false item selections (the Midas-touch problem). Therefore, it is highly challenging to control a QWERTY keyboard by eye movement tracking. Multimodal and hybrid interfaces have been utilized to counter this issue. However, there has been no attempt to make a gaze based virtual keyboard (visual display unit) which uses a multimodal access facility and which can also overcome the issue of Midas touch.
The usability of AAC systems with gaze-based access controls is currently impaired by the difficulty of finding optimal parameters, such as the dwell time, as they can depend on both the user and the current state of the user (e.g. fatigue, knowledge of the system). In addition, the changes of attention, the degree of fatigue, and the users' head motion while controlling the application represent obstacles for efficient gaze-based access controls as they can lead to a low accuracy. These continuous variations can be overcome by recalibrating the system, but this can be time consuming and may not be user friendly. Another solution to overcome this effect is to make the system adaptive over time, based on its performance, by considering key features of the application (i.e. by incorporating the corrections).
The previously mentioned issues related to the high number of commands that can be accessed, the Midas touch problem, and the requirement of adapting parameters need to be taken into account to design an intelligent user interface meeting constraints of portable eye-tracking systems.
This invention intends to address the issues of adaptation (over time) of the dwell time in asynchronous mode and of the trial period in synchronous mode for gaze based virtual keyboards, and the incorporation of a multimodal access facility wherein the search of a target item in the visual display unit (virtual keyboard) is done by gaze detection and the selection can happen via the use of a dwell time, soft-switch, or gesture detection using surface electromyography (sEMG) in asynchronous mode; and the search and selection may be performed with the eye-tracker in synchronous mode.
Brief description of the accompanying drawings.
Fig. 1 represents the layout of the Visual Display Unit.
Fig. 2 represents the position of the commands and the tree structure for letter selection.
Fig. 3 represents the different input devices for providing input signals.
Fig. 4 represents the multimodal system with multimodal input devices.
Fig. 5 represents the gesture controlled arm band with different arm gestures.
Fig. 6 represents the layout of the system incorporating the devices.
Detailed description of the invention with reference to the accompanying drawings:
A gaze based control can be accessed in two different modes. The eye-tracking can be used for both search and selection purposes with synchronous (i.e. cue-paced) and asynchronous (i.e. self-paced) modes. First, the asynchronous mode offers a natural mode of interaction without waiting for an external cue. The command selection is managed through the dwell time concept. During this mode, the users focus their attention by fixating the target item for a specific period of time (i.e., the dwell time in seconds), which results in the selection of that particular item. Second, the synchronous mode of interaction is mainly based on an external cue. This mode can be used to avoid artifacts such as involuntary eye movements of users, as the command is selected at the end of the trial duration/trial period. During this mode, the users focus their attention by fixating an item during a single trial of a particular length (i.e. the trial length (in seconds)), and the item is selected at the end of the trial based on the maximum duration of focus.
Let the total number of commands that are available at any time in the system be denoted by M (here M = 10). Each command C_i is defined by the coordinates corresponding to the center of its box (x_c^i, y_c^i), where i ∈ {1..M}. We denote the gaze coordinates at time t by (x_t, y_t); then the distance d_t^i between a command box and the current gaze position is defined by its Euclidean distance as:

d_t^i = sqrt((x_t − x_c^i)^2 + (y_t − y_c^i)^2)
We denote the selected command at time t by select_t, where 1 ≤ select_t ≤ M.
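For illustration only, the pointing rule defined above can be sketched in Python as follows; the command names and box centre coordinates are hypothetical values rather than the layout of Fig 1, and the helper function name is an assumption.

```python
import math

# Hypothetical command-box centres (x_c, y_c) for M = 10 commands, in pixels.
COMMAND_CENTERS = {
    "c1": (100, 80),  "c2": (300, 80),  "c3": (500, 80),  "c4": (700, 80),
    "c5": (900, 80),  "c6": (100, 600), "c7": (300, 600), "c8": (500, 600),
    "c9": (700, 600), "c10": (900, 600),
}

def pointed_command(gaze_x: float, gaze_y: float) -> str:
    """Return the command whose box centre is closest (Euclidean distance)
    to the current gaze position (x_t, y_t)."""
    distances = {
        name: math.hypot(gaze_x - cx, gaze_y - cy)
        for name, (cx, cy) in COMMAND_CENTERS.items()
    }
    return min(distances, key=distances.get)

# Example: a gaze sample near the top-left of the screen points to c1.
print(pointed_command(120, 95))  # -> "c1"
```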
For the asynchronous and synchronous modes, we define the dwell time and the trial period as Δt0 and Δt1, respectively. Δt0 represents the minimum time that is required to select a command, i.e. when a subject continuously keeps his/her gaze on a command. In synchronous mode, Δt1 represents the time after which a command can be selected based on where the user was looking during the trial period.
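The two selection rules can likewise be sketched as below, assuming a stream of gaze samples already mapped to commands with the helper above; the 30 Hz sample period follows the eye-tracker described later, while the function and variable names are illustrative.

```python
from collections import Counter

def select_async(samples, dwell_time_s, sample_period_s=1 / 30):
    """Asynchronous mode: a command is selected once the gaze has stayed on
    the same command continuously for the dwell time Δt0."""
    needed = int(dwell_time_s / sample_period_s)
    run_cmd, run_len = None, 0
    for cmd in samples:
        run_cmd, run_len = (cmd, run_len + 1) if cmd == run_cmd else (cmd, 1)
        if run_len >= needed:
            return cmd
    return None  # no selection yet

def select_sync(samples):
    """Synchronous mode: at the end of a trial of length Δt1, the command
    fixated for the longest cumulative duration is selected."""
    counts = Counter(samples)
    return counts.most_common(1)[0][0] if counts else None
```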
However, both synchronous and asynchronous modes can suffer from a low text entry rate and depend on user-specific characteristics when predefined time parameters are used. Therefore, adaptation over time is essential for designing a more natural mode of interaction.
i) Eye-tracker with adaptive dwell time in asynchronous mode
For the adaptive dwell time in asynchronous mode, we consider two conditions where Δt0 can change between Δmin0 and Δmax0, where Δmin0 and Δmax0 correspond to 1 s and 5 s, respectively. Initially, Δt0 is set to 2000 ms.
In the first condition, if the number of commands, Ncor, corresponding to a "delete" or "undo" represents more than half of the commands in the history of Nh commands (i.e. 2·Ncor ≥ Nh), then it is assumed that there exist some difficulties for the user, and the dwell time has to be increased. In the second condition, if the average time between two consecutive commands during the last Nh commands is close to the dwell time, then it is assumed that the current dwell time acts as a bottleneck and it can be reduced. If we denote the variable that contains the difference of time between two consecutive commands by Δtc, in which Δtc(k) corresponds to the time interval between commands k and k − 1, the current average of Δtc over the past Nh commands can be defined by:

avg(Δtc)(k) = (1 / Nh) · Σ_{j = k − Nh + 1}^{k} Δtc(j)
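A minimal sketch of how these two adaptation rules might be applied after every command is given below; the 0.25 s step size, the history length, and the bookkeeping are assumptions, since the text only fixes the 1 s to 5 s range and the 2000 ms starting value.

```python
def update_dwell_time(dwell_s, history, step_s=0.25,
                      min_s=1.0, max_s=5.0, n_h=5):
    """Adapt the dwell time Δt0 after each command.

    history: list of (command_name, seconds_since_previous_command) tuples,
    newest last. Commands named 'delete' or 'undo' count as corrections.
    """
    recent = history[-n_h:]
    if not recent:
        return dwell_s
    n_cor = sum(1 for cmd, _ in recent if cmd in ("delete", "undo"))
    avg_gap = sum(gap for _, gap in recent) / len(recent)

    if 2 * n_cor >= len(recent):
        # Condition 1: corrections dominate the history -> the user is
        # struggling, so slow the interface down.
        dwell_s += step_s
    elif avg_gap - dwell_s < step_s:
        # Condition 2: commands arrive almost as fast as the dwell time
        # allows -> the dwell time is the bottleneck, so speed up.
        dwell_s -= step_s
    return min(max(dwell_s, min_s), max_s)
```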
ii) Eye-tracker with adaptive trial period in synchronous mode
With the adaptive trial period (i.e. trial duration Δt1) in synchronous mode, we consider three conditions where Δt1 can change between Δmin1 and Δmax1. The values of Δmin1 and Δmax1 correspond to 1 s and 5 s, respectively.
Initially, Δt1 is set to 2000 ms. In the first condition, we define by P(select_s)_k the average probability to detect a command in the k-th trial by considering the last Nh previous trials. If this probability is high, then it indicates that the commands are selected in a reliable manner and the trial period can be decreased.
Figure imgf000009_0002
The second condition deals with trials with no command selection. In this case, it is assumed that if a command was not selected during the interval Δt1, it means that Δt1 was too short to allow the user to select an item. In such a case, the trial period is increased when the number of rejected commands is Nr in the history of the last Nh commands (Nr ≤ Nh). In the third condition, if the number of commands, Ncor, corresponding to a "delete" or "undo" represents more than half of the commands in the history of Nh commands, then we assume that there exist some difficulties for the user, and the trial period has to be increased.
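By analogy, the three trial-period rules could be sketched as follows; the probability threshold, step size, and history handling are illustrative assumptions rather than values given in the text.

```python
def update_trial_period(trial_s, selection_probs, n_rejected, n_corrections,
                        n_h=5, p_high=0.8, step_s=0.25,
                        min_s=1.0, max_s=5.0):
    """Adapt the trial period Δt1 in synchronous mode.

    selection_probs: per-trial command-detection probabilities over the last
    n_h trials; n_rejected: trials in that history with no selection;
    n_corrections: 'delete'/'undo' commands in that history.
    """
    avg_p = sum(selection_probs) / max(len(selection_probs), 1)

    if avg_p >= p_high:
        trial_s -= step_s     # Condition 1: reliable selections -> shorten.
    elif n_rejected > 0:
        trial_s += step_s     # Condition 2: missed selections -> lengthen.
    elif 2 * n_corrections >= n_h:
        trial_s += step_s     # Condition 3: many corrections -> lengthen.
    return min(max(trial_s, min_s), max_s)
```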
The developed graphical user interface (GUI) consists of two main components, which are shown in Fig 1. The first component is the visual display unit, wherein a total of ten commands are shown and the command which is currently being pointed to is highlighted in a different colour. The second component is an output text display where the user can see the typed text in real-time. The position and tree structure of the ten commands (i.e. c1 to c10) are depicted in Fig 2. An alphabetical organization with a script specific arrangement layout is developed, as the alphabetic arrangement is easier to learn and remember, especially for a complex structured language (Bhattacharya et al., 2013a). The size of each rectangular command button is approximately 14% of the GUI window. All command buttons are placed on the periphery of the visual display screen while the output text box is placed at the center of the screen, as shown in Fig 1. The GUI of the virtual keyboard is based on a multi-level menu selection method comprised of ten commands at each level. The tree-based structure of the GUI provides the ability to type 45 Hindi language letters, 17 different matras (i.e. diacritics) and halants (i.e. killer strokes), 14 punctuation marks and special characters, and 10 numbers (from 0 to 9). Moreover, other functionalities such as delete, delete all, new line, space and go back commands for corrections are also included.
The first level of the GUI consists of 10 command boxes; each represents a set of language characters (i.e. 10 characters). The selection of a particular character requires the user to follow a two-step task. In the first step, the user has to select the particular command box (i.e. at the first level of the GUI) where the desired character is located. The successful selection of a command box shifts the GUI to the second level, where the ten commands on the screen are assigned to the ten characters which belong to the command box selected at the previous level. In the second step, the user can see the desired character and finally select it for writing to the text-box. After the selection of a particular character at the second level, the GUI goes back to the initial stage (i.e. the first level) to start further iterations. The placement and size of the command boxes are identical at both levels of the GUI. The system is also designed with the inclusion of multiple modalities and extra command features to write all the Hindi language letters, including the half letter script, more punctuation marks, and special characters. The system adapts the parameters over time for both synchronous and asynchronous modes to improve the speed and accuracy of the system.
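The two-step, two-level selection can be represented by a small tree structure, sketched below with placeholder character groups rather than the actual Devanagari assignment of Fig 2; the class and group names are hypothetical.

```python
# Level 1: ten command boxes, each mapping to ten level-2 items.
# The groups shown here are placeholders, not the real Devanagari layout.
KEYBOARD_TREE = {f"c{i}": [f"char_{i}_{j}" for j in range(10)]
                 for i in range(1, 11)}

class TwoLevelKeyboard:
    def __init__(self, tree):
        self.tree = tree
        self.current_group = None   # None means we are at level 1
        self.text = ""

    def select(self, command):
        if self.current_group is None:
            self.current_group = self.tree[command]   # step 1: pick a group
        else:
            index = int(command[1:]) - 1              # step 2: pick a character
            self.text += self.current_group[index]
            self.current_group = None                 # back to level 1

kb = TwoLevelKeyboard(KEYBOARD_TREE)
kb.select("c3")   # first level: choose the third group
kb.select("c5")   # second level: type the fifth character of that group
print(kb.text)    # -> "char_3_4"
```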
While using the gaze based eye-tracking system on a virtual keyboard, it is also required that the user be provided efficient feedback that the intended command box/character has been selected, so that mistakes are avoided and there is an increase in efficiency. Hence, a visual feedback is provided to the user by a change in the color of the button border while looking at it. Initially, the colour of the button border is silver (RGB: 192,192,192). When the user pays attention to a particular button for a duration of time t, the color of the border changes linearly in relation to the dwell time Δt0 or the trial period (i.e. trial duration) Δt1, and the border becomes greener with time. The RGB color is defined as (R=v, G=255, B=v), where v = 255 × (Δt0 − t)/Δt0. The visual feedback allows the user to continuously adjust and adapt his/her gaze to the intended region on the screen. An audio feedback is also provided to the user through an acoustic beep after the successful execution of each command. This sound makes the users proactive so that they can easily prepare for the next character. Moreover, to improve the system performance by using minimal eye movements, the last five used characters are also displayed in the GUI at the bottom of each command box. This helps the user to see the previously written character without moving the eyes to the output display box. The eye-tracking device acquires the gaze data.
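The border-colour feedback reduces to a single linear interpolation; the sketch below assumes the dwell-time case Δt0 (the trial period Δt1 is handled identically), with the silver resting colour applying before fixation begins.

```python
def border_color(elapsed_s: float, dwell_s: float):
    """Border colour while the gaze rests on a button: the green channel is
    fixed at 255 while red and blue fade with the remaining fixation time,
    i.e. v = 255 * (Δt0 - t) / Δt0, so the border turns green as selection
    approaches."""
    remaining = max(dwell_s - elapsed_s, 0.0)
    v = int(255 * remaining / dwell_s)
    return (v, 255, v)   # (R, G, B)

print(border_color(0.0, 2.0))   # (255, 255, 255) - fixation just started
print(border_color(1.0, 2.0))   # (127, 255, 127) - halfway to the dwell time
print(border_color(2.0, 2.0))   # (0, 255, 0)     - selection about to fire
```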
In order to assess the adaptability and usability of the system, twenty-four healthy volunteers (5 females) in the age range of 21-32 years (27.05 ± 2.96) participated in this study (see Table 1). Fifteen participants performed the experiments with vision correction. No participant had prior experience of using an eye-tracker, soft-switch and/or sEMG with the application.
Three different input devices were used in this study (see Fig 3). First, a portable eye-tracker has been used for pursuing the eye gaze of the participants. Second, gesture recognition was obtained with the Myo armband for recording sEMG. This non-invasive device includes a 9 degree-of-freedom (DoF) Inertial Measurement Unit (IMU) and 8 dry sEMG sensors. The Myo can be slipped directly onto the arm to read sEMG signals with no preparation needed for the participant (no shaving of hair or skin cleaning). Third, a soft-switch is used as a single-input device.
The eye-tracking device records data at a sampling rate of 30 Hz. It involves binocular infrared illumination with a spatial resolution of 0.1 (root mean square (RMS)), and records the x and y coordinates of gaze and the pupil diameter for both eyes in mm.
The Myo armband provides sEMG signals with a sampling frequency of 200 Hz per channel. Electrode placement was set empirically in relation to the size of the participant's forearm because the Myo armband's minimum circumference size is about 20 cm. An additional short calibration was performed for each participant with the Myo (about 1 min).
The soft-switch was used as a single-input device to select a command on the visual display unit. The distance of the participant from the computer screen was about 80 cm. The vertical and horizontal visual angles were measured at approximately 21 and 36 degrees, respectively.
To start the procedure, the typing task in the experimental protocol involves a predefined sentence with 29 characters from the Devanagari script and 9 numbers, given as
Figure imgf000011_0001
The transliteration of the task sentence in English is "Kabtak Jabtak Abhyaasa Karate Raho. 44 - 4455 - 771" and the direct translation in English is "Till When Until Keep Practicing. 44-4455-771". The complete task involves 76 commands in one repetition if performed without committing any error. This predefined sentence was formed with a particular combination of words in order to obtain an equiprobable distribution of commands for each of the ten items in the GUI. Thus, the adopted arrangement provides unbiased involvement of command boxes and eye-gaze distribution over the GUI of the virtual keyboard. There were five different combinations of the input modalities (see Fig 4), which provided twenty different conditions of experimental design.
First, the search and selection of the target item were performed by the eyes and a normal computer mouse, respectively (see Fig 4 (A)). Second, the search of the target item was performed by the eyes and the participant used the touch screen to finally select the item (see Fig 4 (B)). Third, the eye-tracker along with the soft-switch was used in a hybrid mode wherein the user focused the eye-gaze to point to the target item, and the selection happened via the soft-switch (see Fig 4 (C)). Fourth, the eye-tracker was used in combination with five different sEMG-based hand gestures, wherein eye-gaze was used for the search purpose and each gesture acted as an input modality to select the item (see Fig 4 (D)). This combination of input modalities covered five different experimental conditions. Fifth, the eye-tracker was used for both search and selection purposes in synchronous and asynchronous modes (see Fig 4 (E)). During the asynchronous mode, the participants focused the eye-gaze at the target item for a specific period of time (i.e. the dwell time, in seconds), which resulted in the selection of that particular item. During the synchronous mode, the participants focused the eye-gaze at an item (target or non-target) during a single trial of a particular length (i.e. the trial length, in seconds), and at the end of the trial the item was selected based on the maximum duration of the focus. We implemented the asynchronous and synchronous modes with five different dwell time and trial period values, respectively, resulting in ten different experimental conditions. In addition, there were two more experimental conditions which incorporated the asynchronous and synchronous modes with adaptive dwell time and adaptive trial period, respectively (see Fig 4 (E)).
In one condition, a soft-switch and five sEMG-based hand gestures (i.e. fist, wave left, wave right, fingers spread, and double tap) along with the eye-tracker were used. In this case, the eye-tracker was used in a hybrid mode, wherein the user gazes at the target item, and the selection happens via the switch/sEMG signals. The input devices are used once the user receives the visual feedback, i.e. when the colour of the gazed item begins to change.
In another condition, the eye-tracker is used only in an asynchronous mode with different dwell times (i.e. dwell time = 1 s, 1.5 s, 2 s, 2.5 s, 3 s, and adaptive dwell time), where the item is determined through gazing, and the item selection is made by the dwell time/adaptive dwell time.
In yet another condition, the eye-tracker is used only in a synchronous mode for pointing to and selection of the items, where the pointing to items is done by the gaze and the selection of the items is made by five different trial periods (i.e. 1 s, 1.5 s, 2 s, 2.5 s, 3 s) and one with an adaptive trial period. These different trial periods were considered to find the optimal trial period for designing an eye-tracking based synchronous system.
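A compact sketch of these two gaze-only selection rules is given below; it assumes gaze samples have already been mapped to the item under the gaze (that mapping, and the helper names, are illustrative assumptions rather than the system's actual implementation):

```python
from collections import Counter

def asynchronous_select(gaze_stream, dwell_time, sample_period):
    """Select the first item fixated continuously for `dwell_time` seconds.

    `gaze_stream` yields the item id under the gaze for each sample,
    or None when the gaze is off all items.
    """
    needed = int(dwell_time / sample_period)
    current, count = None, 0
    for item in gaze_stream:
        if item is not None and item == current:
            count += 1
            if count >= needed:
                return item
        else:
            current, count = item, (1 if item is not None else 0)
    return None

def synchronous_select(gaze_stream, trial_length, sample_period):
    """At the end of a fixed-length trial, select the item gazed at the longest."""
    n_samples = int(trial_length / sample_period)
    dwell = Counter()
    for i, item in enumerate(gaze_stream):
        if i >= n_samples:
            break
        if item is not None:
            dwell[item] += 1
    return dwell.most_common(1)[0][0] if dwell else None
```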
In another condition, different input devices are used together with the eye-tracker. A soft-switch is used with the eye-tracking device, wherein the addition of the soft-switch helps overcome the Midas touch problem (i.e. selecting irrelevant, non-targeted commands) of HCI, as the user only needs to point to the target item through the eye-tracker, and the selection happens via the soft-switch. In other words, the searching of the items on the computer screen is done by the user's gaze pointing to the items and the selection is made by the soft-switch device. In this case, the soft-switch was pressed by the user's dominant hand. The colour-based visual feedback is provided to the user during the searching of an item. The visual feedback allows the user to continuously adjust and adapt his/her gaze to the intended region on the screen. Once the item is selected, auditory feedback is also given to the user as an acoustic beep.
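A simplified event loop for this gaze-plus-switch mode might look as follows; `gazed_item()` and `switch_pressed()` are hypothetical stand-ins for the eye-tracker and soft-switch drivers, so this is a sketch rather than the actual implementation (debouncing and feedback rendering are omitted):

```python
import time

def gaze_plus_switch_loop(gazed_item, switch_pressed, on_select, poll_hz=30):
    """Point with the gaze, confirm with the soft-switch.

    Nothing is selected by gaze alone, which is how this hybrid mode avoids
    the Midas touch problem: the switch press is the explicit confirmation.
    """
    period = 1.0 / poll_hz
    while True:
        item = gazed_item()               # item currently under the gaze, or None
        if item is not None and switch_pressed():
            on_select(item)               # e.g. type the character and play the beep
        time.sleep(period)
```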
Further, in the hybrid mode, the sEMG hand gestures can be combined with an eye-tracker to provide extra input modalities to the users. The eye-tracker is used to point to a command on the screen. Then, the command is selected through a hand gesture by using predefined functions from the Myo SDK. Five conditions were evaluated related to gesture control with the Myo: fist (hand close), wave left (wrist flexion), wave right (wrist extension), fingers spread (hand open), and double tap (see Fig 5). The colour-based visual feedback is provided to the user during the searching of an item. Once the item is selected, the auditory feedback is provided to the user as an acoustic beep. Thus the hybrid system helps to overcome the Midas touch problem of a gaze-controlled HCI system.
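For illustration, the gesture-confirmation step could be expressed as below; the gesture label strings and the calling convention are placeholders, not the Myo SDK's actual identifiers (each experimental condition uses one of the five gestures as its confirm signal):

```python
# Placeholder gesture labels; the real identifiers come from the Myo SDK.
CONFIRM_GESTURES = {"fist", "wave_left", "wave_right", "fingers_spread", "double_tap"}

def handle_gesture(detected, active_gesture, gazed_item, on_select):
    """Confirm the currently gazed item when the condition's chosen gesture occurs."""
    if (detected == active_gesture
            and detected in CONFIRM_GESTURES
            and gazed_item is not None):
        on_select(gazed_item)
```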
From the above description, it can be seen that an adaptive gaze-based multimodal virtual keyboard/visual display unit can handle even a language as complex as Hindi, which has 45 letters, 17 diacritics and killer strokes, 14 punctuation marks and special characters, and 10 numbers. The virtual keyboard/visual display unit is capable of being operated using a portable non-invasive eye-tracker, an sEMG-based hand gesture recognition device, and/or a soft-switch. The invention intends to resolve the issues of HCI pertaining to the virtual keyboard while using adaptive parameters and low-cost input devices.
With the use of this invention, in the case of both adaptive synchronous and asynchronous modes, the speed and accuracy of the system can be increased for the virtual keyboard of any language. Secondly, the user can make use of eye-gaze detection, gesture recognition, and a single-input switch, either alone as a single modality or in combination as a multimodal device, for a very complex structured language like Hindi. Further, with the use of time-adaptive methods along with different input devices like a touch screen, soft-switch, and gesture recognition devices, wherein users can employ any of these according to their comfort and/or need, the performance of virtual keyboards can be greatly increased in terms of accuracy and speed. The speed and accuracy of the system have been optimized by optimizing the size of the command boxes and the distance between the boxes, and by taking into consideration involuntary head and body movement.
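The time-adaptive behaviour (stated more formally in claims 13 and 14 below) can be sketched as a simple update rule; the 0.1 s step, the bounds, the history format, and the reading of "more delete/undo commands than the remaining commands in the history" are interpretive assumptions rather than the disclosed parameters:

```python
def adapt_dwell_time(dwell_time, history, step=0.1, min_dt=0.5, max_dt=3.0):
    """Adjust the dwell time from the recent command history.

    `history` is assumed to be a list of (command, timestamp) pairs.
    Lengthen the dwell time when error corrections (delete/undo) dominate
    the history; shorten it when the user already issues commands about as
    fast as the current dwell time allows.
    """
    corrections = sum(1 for cmd, _ in history if cmd in ("delete", "undo"))
    others = len(history) - corrections
    if corrections > others:
        return min(dwell_time + step, max_dt)

    times = [t for _, t in history]
    gaps = [b - a for a, b in zip(times, times[1:])]
    if gaps and sum(gaps) / len(gaps) <= dwell_time + step:
        return max(dwell_time - step, min_dt)
    return dwell_time
```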
This invention provides for the adaptation over time of gaze-based access controls for an eye-tracking based multimodal virtual keyboard specifically created for a language with a very complex structure like the Hindi language. The system is based on a tree selection method and requires two consecutive steps for enabling a command. First, the user has to point to the item that must be selected. A red pointer on the visual display can be moved to the chosen location and visual feedback is provided on the chosen location. Second, the user has to approve the location of the pointer in order to select the corresponding item. Once an item has been selected by the user, the system provides audio feedback for the corresponding targeted item. The use of the adaptive techniques in the invention can increase the accuracy of the eye-tracker, which may otherwise limit the number of commands that can be accessed at any moment, as the calibration data should be updated when the user changes his/her head and body position over time. With or without adaptive methods, this approach can be applied to any other language. The proposed system can be directly used by Marathi/Konkani language users (70 million speakers) by including one additional letter (i.e.
[the additional Devanagari letter, reproduced as an image in the original filing (imgf000014_0001)]).
Therefore, the present research findings have potential application for a large user population (560 million).
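To make the tree selection concrete, a minimal two-level menu sketch is given below; the character groupings are placeholders, since the actual Devanagari layout is defined by the GUI figures of the filing:

```python
# Minimal sketch of two-level tree selection: the first confirmed selection
# opens a group (command box), the second types a character from that group.
# The groups below are illustrative placeholders, not the keyboard's layout.
MENU_TREE = {
    "group_1": ["a", "b", "c"],
    "group_2": ["d", "e", "f"],
    # ... up to ten command boxes in the described GUI
}

class TreeKeyboard:
    def __init__(self, tree):
        self.tree = tree
        self.level = "root"        # either the root menu or an open group
        self.typed = []

    def select(self, item):
        """Apply one confirmed selection (gaze + dwell/switch/gesture)."""
        if self.level == "root":
            self.level = item                  # open the chosen command box
            return self.tree[item]             # items to display next
        self.typed.append(item)                # type the chosen character
        self.level = "root"                    # return to the root menu
        return list(self.tree)
```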

Claims

We Claim:
1. A system for augmentative and alternative communication for people with severe speech and motor disabilities comprising a user interface having multimodal inputs, the system comprising:
a. input means (1, 2, 3, 4) for receiving the input communication from the user;
b. processing means (7) for processing the input received;
c. display means (5) to display the processed input from the processing means;
d. auditory means (6) to confirm the selection based on the input means;
wherein each of the input devices, either alone or in combination with the other input devices, is used to display the desired command on the display device.
2. A system as claimed in claim 1, wherein the input means is a Gaze based eye tracking device which can track the movement of the eyes and is adapted to differentiate between the focused and involuntary movement of the eye during the process of gazing.
3. A system as claimed in claim 2, wherein the device is a non-invasive remote camera based eye tracking device.
4. A system as claimed in claim 1, wherein the input means is a mouse/keyboard or a touch screen device, which can provide the input to the system.
5. A system as claimed in claim 1, wherein the input device can be a soft-switch which, when pressed in combination with the eye gaze input, selects the gazed input.
6. A system as claimed in any of the above claims wherein the input device can be an armband /sEMG based gesture recognition device which can sense the gestures of the hand as an input to the device.
7. A system as claimed in any of the preceding claims wherein the processor gets the input from the multimodal input devices and generates response on the visual display units.
8. A system as claimed in any of the preceding claims wherein the input received from the gaze based eye tracking device is analysed by the processor for various dwell time of the eye and output is generated on the visual display unit.
9. A system as claimed in any of the preceding claims wherein the visual display unit/ virtual keyboard is a menu driven device.
10. A system as claimed in any of the preceding claims wherein the characters on the visual display unit / virtual keyboard are based on a tree selection method.
11. A system as claimed in the above claim wherein the selection of a command requires at least two consecutive steps, wherein the characters are first locked by the help of the Gaze based tracking device and then the command is selected by means of input devices in synchronous mode.
12. A system as claimed in any of the preceding claims wherein the character is locked on the visual display unit with the input from Gaze based eye tracking device and the command is then selected with the help of the Gaze dwell time in asynchronous mode.
13. A system as claimed in any of the preceding claims wherein the dwell time of the system is increased if the number of "delete" or "undo" commands is more than the number of commands in the command history.
14. A system as claimed in any of the preceding claims wherein the dwell time of the system is decreased if the average time between two consecutive commands in the command history is equal or close to the dwell time.
15. A system as claimed in the above claim wherein the last five used characters are displayed in the visual display unit at the bottom of each command box, helping the user to see the previously written characters without significantly shifting their gaze from the desired command box to the output display box.
16. A system as claimed in the above claim wherein the system offers different modalities, which can be selected in relation to the preference of the user.
PCT/IN2018/000034 2018-02-27 2018-06-29 A system for augmentative and alternative communication for people with severe speech and motor disabilities Ceased WO2019167052A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201811007268 2018-02-27
IN201811007268 2018-02-27

Publications (1)

Publication Number Publication Date
WO2019167052A1 true WO2019167052A1 (en) 2019-09-06

Family

ID=67805293

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2018/000034 Ceased WO2019167052A1 (en) 2018-02-27 2018-06-29 A system for augmentative and alternative communication for people with severe speech and motor disabilities

Country Status (1)

Country Link
WO (1) WO2019167052A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3945402A1 (en) * 2020-07-29 2022-02-02 Tata Consultancy Services Limited Method and device providing multimodal input mechanism
JP2024016984A (en) * 2022-07-27 2024-02-08 学校法人光産業創成大学院大学 Gaze input device and program for gaze input device
WO2024235373A1 (en) * 2023-05-16 2024-11-21 Bayerische Motoren Werke Aktiengesellschaft Control of a motor vehicle

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040143430A1 (en) * 2002-10-15 2004-07-22 Said Joe P. Universal processing system and methods for production of outputs accessible by people with disabilities
WO2017222997A1 (en) * 2016-06-20 2017-12-28 Magic Leap, Inc. Augmented reality display system for evaluation and modification of neurological conditions, including visual processing and perception conditions

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18907718

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18907718

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09/06/2021)
