AN INTERACTION SYSTEM
Technical Field
The present invention relates to an interaction system for receiving input from a user in a display coordinate system and a related method.
Background
A user may interact with a display in various ways, e.g. by utilizing touch-sensitive functionality thereof to provide input. A presenter may manipulate different GUI elements or display objects located at different parts of the display, or highlight parts of a presentation, in addition to the typical writing and drawing of text and figures on the display. As such interaction systems become more widespread and are used in a wider range of situations, even in presentation and conference events for large audiences and meetings where large displays are used, or in classrooms, it however becomes difficult for participating users to take advantage of the capabilities of the interaction system. The advantages of the interaction system may be diminished by cumbersome organization and management of several users’ engagement with the display. It would thus be advantageous to provide an interaction system which is more practical and feasible to use in a wider range of applications where the number of participating users may vary significantly.
Summary
It is an objective of the invention to at least partly overcome one or more of the above-identified limitations of the prior art.
One objective is to provide an interaction system with facilitated user input.
Another objective is to provide an interaction system which is easier to use when the number of participating users varies over a wide range.
One or more of these objectives, and other objectives that may appear from the description below, are at least partly achieved by means of an
interaction system and a related method according to the independent claims, embodiments thereof being defined by the dependent claims.
According to a first aspect an interaction system is provided comprising a positioning unit configured to receive spatial data of the position of a user in front of an interaction display, a processing unit in communication with the positioning unit and being configured to detect a feature of a user’s body, hand and/or an input object held by the user based on the spatial data, when the user presents said feature in front of the interaction display, determine, upon detecting said feature, user coordinates (x,y) of the user’s hand and/or the input object, based on the spatial data, generate a mapping region as a range of input coordinates (xi,yi), wherein the mapping region defines an interaction space in a region (A) in front of the user and the interaction display, map the mapping region to a display coordinate system (xd,yd) of the interaction display, determine, upon detecting said feature, an associated onset input coordinate (x’i,y’i) of the user’s hand and/or input object in the mapping region, map the onset input coordinate (x’i,y’i) to the display coordinate system (xd,yd), and generate a control signal to display an output (203) associated with the onset input coordinate (x’i,y’i) at a display coordinate (x’d,y’d) in the display coordinate system.
According to a second aspect a method in an interaction system is provided comprising receiving at a positioning unit spatial data of the position of a user in front of an interaction display, detecting a feature of a user’s body, hand and/or an input object held by the user based on the spatial data, when the user presents said feature in front of the interaction display, determining, upon detecting said feature, user coordinates (x,y) of the user’s hand and/or the input object, based on the spatial data, generating a mapping region as a range of input coordinates (xi,yi), wherein the mapping region defines an interaction space in a region (A) in front of the user, mapping the mapping region to a display coordinate system (xd,yd) of the interaction display, determining, upon detecting said feature, an associated onset input coordinate (x’i,y’i) of the user’s hand and/or input object in the mapping region, mapping the onset input
coordinate (x’i,y’i) to the display coordinate system (xd,yd), and generating a control signal to display an output associated with the onset input coordinate (x’i,y’i) at a display coordinate (x’d,y’d) in the display coordinate system.
According to a third aspect a computer program product is provided comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps of the method according to the second aspect.
Further examples of the invention are defined in the dependent claims, wherein features for the first aspect may be implemented for the second and subsequent aspects, and vice versa.
Some examples of the disclosure provide for an interaction system with a facilitated user input to a display.
Some examples of the disclosure provide an interaction system which is easier to use for a large number of participating users.
Some examples of the disclosure provide an interaction system which is easier to use for a wider range of applications.
Some examples of the disclosure provide for an intuitive input to a display of an interaction system.
Some examples of the disclosure provide for a facilitated input of several users simultaneously to a display of an interaction system.
Some examples of the disclosure provide for increased input flexibility to an interaction system.
Some examples of the disclosure provide for interaction with a display from a distance.
Some examples of the disclosure provide for reducing false user inputs to an interaction system.
Some examples of the disclosure provide for an interaction system which accepts several modes of user input, such as both non-contact input and touch input to a display.
It should be emphasized that the term “comprises/comprising” when used in this specification is taken to specify the presence of stated features, integers,
steps or components but does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof.
Brief Description of the Drawings
These and other aspects, features and advantages of which examples of the invention are capable will be apparent and elucidated from the following description of examples of the present invention, reference being made to the accompanying schematic drawings, in which:
Fig. 1 shows an interaction system according to an example of the disclosure;
Figs. 2a-g show an example of a user providing input to an interaction system according to an example of the disclosure;
Fig. 3 shows an interaction system according to an example of the disclosure; and
Fig. 4 is a flowchart of a method in an interaction system according to an example of the disclosure.
Detailed Description
Specific examples of the invention will now be described with reference to the accompanying drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these examples are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. The terminology used in the detailed description of the examples illustrated in the accompanying drawings is not intended to be limiting of the invention. In the drawings, like numbers refer to like elements.
Fig. 1 is a schematic illustration of an interaction system 100 comprising a positioning unit 101 configured to receive spatial data of the position of a user in front of an interaction display 201. The interaction display 201 may be a digital whiteboard or other display accepting touch input, as well as a non-touch display in some examples. The user may be positioned at a distance from the
interaction display 201, such as at a distance longer than an arm’s length to the interaction display 201, in particular when several users are participating in using the interaction system 100, such as in a classroom or in other large presentation venues. Fig. 3 is a schematic illustration of several users positioned at different distances (l1, l2) from the interaction display 201. The spatial data may comprise image data of the user, being captured by an image sensor device 105 as described further below. The interaction system 100 comprises a processing unit 102 in communication with the positioning unit 101. The processing unit 102 is configured to communicate with the interaction display 201, and to generate a control signal to the interaction display 201 for display of the user’s input as described below. It should be understood that the control signal may also control or manipulate a GUI displayed on the interaction display 201. The processing unit 102 and the positioning unit 101 may be connected to an existing interaction display 201 for providing additional modes of input to such existing interaction display 201. The interaction system 100 may comprise the interaction display 201 in some examples.
The processing unit 102 is configured to detect a feature of a user’s hand, such as a finger or the hand itself, based on the spatial data, when the user presents said feature in front of the interaction display 201. The user may form shapes with the hand and fingers which are captured in the spatial data and recognized by the processing unit 102 based on comparisons with stored reference models of such shapes, i.e. reference features. The processing unit 102 may be configured to detect the directionality of such features, e.g. in which direction the user points a finger. In other examples, the processing unit 102 is configured to detect a feature of other parts of the user’s body, such as a user’s face, eyes, wrist, forearm, elbow, upper arm, and/or shoulder. Alternatively, or in combination, the processing unit 102 is configured to detect a feature of an input object 202 held by the user based on the spatial data, when the user presents the input object 202, such as a stylus, in front of the interaction display
201. The feature may comprise a particular shape or color of the input object
202.
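For illustration only, the comparison against stored reference features could be sketched as a nearest-match test on a shape descriptor extracted from the spatial data. The Python sketch below is a simplified assumption: the descriptor values, feature names, similarity measure and threshold are all hypothetical and not part of the claimed implementation.

```python
# Minimal sketch (all values hypothetical): match a detected shape descriptor
# against stored reference features using cosine similarity.
import numpy as np

REFERENCE_FEATURES = {
    "index_finger": np.array([0.9, 0.1, 0.2]),
    "open_palm":    np.array([0.2, 0.8, 0.7]),
    "stylus":       np.array([0.1, 0.1, 0.9]),
}

def match_feature(descriptor, threshold=0.9):
    """Return the best-matching reference feature name, or None if nothing is similar enough."""
    d = np.asarray(descriptor, dtype=float)
    best_name, best_score = None, threshold
    for name, ref in REFERENCE_FEATURES.items():
        score = float(d @ ref / (np.linalg.norm(d) * np.linalg.norm(ref)))
        if score >= best_score:
            best_name, best_score = name, score
    return best_name

print(match_feature([0.88, 0.12, 0.25]))   # -> "index_finger"
```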
The processing unit 102 is configured to determine, upon detecting the aforementioned feature, user coordinates (x,y) of the user’s hand and/or the input object 202, based on the spatial data. It should be understood that the user coordinates (x,y) may be determined for any of the above mentioned features, such as for a user’s finger, or the tip of a stylus, and that the disclosure refers to user coordinates (x,y) of the user’s hand and/or the input object 202 for brevity. Figs. 2a-g are schematic illustrations of an example where a user provides input to the interaction system 100, where each caption (i.e. each of the illustrated squares surrounding the user’s position as indicated in Figs. 2a-g) may represent at least part of image data received as spatial data by the positioning unit 101. Figs. 2a-b show the user positioned in any general pose for passive participation when input is not desired (Fig. 2a) followed by raising of the hand, index finger or the presentation of an input object 202 such as a stylus (Fig. 2b) as the user intends to provide input to the interaction system 100. The processing unit 102 may be configured to detect the hand, index finger or the input object 202 as the aforementioned feature and the associated user coordinates (x,y) in the coordinate system of the captured image data, as schematically illustrated in Fig. 2b.
The processing unit 102 is configured to generate a mapping region 103 as a range of input coordinates (xi,yi), as schematically shown in the example of Fig. 2c. The mapping region 103 may be chosen with a predetermined size relative to the size of the user in the captured image data, such as a predetermined ratio relative to the user’s size. For example, it may be assumed that the user’s head has a certain size, corresponding to an average reference size. The number of pixels in the image data corresponding to e.g. the width or height of the user’s head may thus be determined, and the size of the mapping region 103 may be determined as a multiple of said number of pixels. In one example the mapping region 103 has a width of 40 cm and a height of 30 cm, but it is conceivable that different sizes may be set depending on the application. Advantageously, the user should be able to comfortably reach across the full extent of the mapping region 103, from corner to corner, while the mapping region 103 still has a size
which facilitates distinct and intuitive interaction. The size of the mapping region 103 may also be determined based on an estimated distance between the user and the interaction display 201 as described further below. The mapping region 103 may be centered at the presented feature, e.g. at the detected fingertip or at the tip of the stylus 202, as shown in the example of Fig. 2c. It is conceivable however that the position of the mapping region 103 may have any defined relationship to the user’s presented feature. E.g. the mapping region 103 may be offset in a direction towards the user’s head, so that the user may reach across the full extent of the mapping region 103 while keeping the hand closer to the head. In a further example the mapping region 103 may be generated at a defined position relative to another body part of the user. E.g. the mapping region 103 may extend from the position of the user’s shoulder, similar to what is seen in the illustration of Fig. 2c. This also provides for a comfortable reach for the user with the hand or input object 202 across the extent of the mapping region 103. Upon detecting said feature, the processing unit 102 may be triggered to generate and position the mapping region 103 so that the user’s hand or input object 202 is positioned inside the mapping region 103, e.g. as seen in the example in Fig. 1. However, in other examples, such as when the mapping region 103 is positioned based on the location of the user’s shoulder, it is conceivable that the user’s hand or input object 202 may initially be positioned outside the mapping region 103.
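As a purely illustrative sketch of this sizing rule, the following Python example scales the 40 cm by 30 cm mapping region mentioned above using the apparent head width in the image; the assumed average head width and all function names are hypothetical.

```python
# Minimal sketch (assumed values): derive the mapping-region size in image pixels
# from the apparent head width, using the head as a reference object of known size.
REFERENCE_HEAD_WIDTH_CM = 15.0        # assumed average head width
MAPPING_REGION_CM = (40.0, 30.0)      # width x height, as in the example above

def mapping_region_size_px(head_width_px: float) -> tuple[float, float]:
    """Convert the desired physical region size to pixels at the user's distance."""
    px_per_cm = head_width_px / REFERENCE_HEAD_WIDTH_CM
    return MAPPING_REGION_CM[0] * px_per_cm, MAPPING_REGION_CM[1] * px_per_cm

# e.g. a head 60 px wide gives a mapping region of about 160 x 120 px
print(mapping_region_size_px(60.0))
```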
The processing unit 102 may thus be configured to translate the user coordinates (x,y) of the user’s hand and/or input object 202 to the coordinate system (xi,yi) of the mapping region 103, i.e. to the input coordinates (xi,yi) as referred to above and illustrated in Fig. 2c. The mapping region 103 accordingly defines an interaction space 104 in a region (A) in front of the user (Fig. 1) where the user’s input coordinates (xi,yi) are determined.
The processing unit 102 is configured to map the mapping region 103 to a display coordinate system (xd,yd) of the interaction display 201. Hence, each input coordinate (xi,yi) has a corresponding coordinate (xd,yd) in the display coordinate system (xd,yd). The mapping region 103 is advantageously mapped
to the full size of the display area of the interaction display 201 so that the user may interact with the entire interaction display 201 by navigating in the mapping region 103. The processing unit 102 is configured to determine, upon detecting the aforementioned feature, an associated onset input coordinate (x’i,y’i) of the user’s hand and/or input object 202 in the mapping region 103. The onset input coordinate (x’i,y’i) is the first set of coordinates detected for the user’s input in the input coordinate system (xi,yi) of the mapping region 103. I.e. as the user intends to provide input to the interaction system 100 and presents a feature such as the index finger, which is recognized by the processing unit 102, the processing unit 102 is triggered to determine the onset input coordinate (x’i,y’i) of the index finger in the mapping region 103, as schematically illustrated in the example of Fig. 2d. In some examples it may not be necessary to determine the exact position of the user’s index finger, and it may be sufficient to determine the position of the user’s hand in the input coordinate system (xi,yi). Although examples of the disclosure describe determining positions with reference to x- and y-coordinates, it should be understood that corresponding positioning and mapping may be done based on a three-dimensional x-, y-, z-coordinate system, e.g. in case the spatial data is captured with 3D information such as by a stereo imaging system. Depth information, with respect to the position of the user, may in such case be utilized for the positioning of an appropriate mapping region 103 in relation to the user, and the user’s input in 3D may be projected to the display coordinate system.
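For illustration, a linear mapping from the mapping region 103 to the display coordinate system (xd,yd) could look as in the Python sketch below; the region layout, clamping behavior and example values are assumptions made for this example only.

```python
# Minimal sketch: map an input coordinate (xi, yi) inside the mapping region
# linearly onto the full display area.
def to_display(xi, yi, region, display_size):
    """region = (x0, y0, width, height) of the mapping region in image coordinates;
    display_size = (width, height) of the interaction display in display pixels."""
    x0, y0, rw, rh = region
    dw, dh = display_size
    u = (xi - x0) / rw                      # normalise to [0, 1] within the region
    v = (yi - y0) / rh
    u = min(max(u, 0.0), 1.0)               # clamp so the output stays on the display
    v = min(max(v, 0.0), 1.0)
    return u * dw, v * dh

# The onset input coordinate at the centre of the region maps to the display centre.
print(to_display(480, 360, region=(400, 300, 160, 120), display_size=(1920, 1080)))
```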
The processing unit 102 may thus be configured to detect said feature by identifying a shape of the user’s hand, such as a fingertip, and/or of the input object 202, which triggers the generation of the mapping region 103 around the feature and the determining of the associated input coordinate (x’i,y’i). In some examples the processing unit is configured to detect said feature by identifying a gesturing movement of the user’s hand and/or of the input object 202. A particular shape and/or color of the input object 202 may be tracked to determine its movement. Thus, the user may trigger the generation of the
mapping region 103 and the determining of the input coordinate (x’i,y’i) by displaying a predetermined gesture in front of the interaction display 201.
In some examples the processing unit 102 may be configured to detect a combination of a plurality of the aforementioned features presented by the user, so that the mapping region 103 is generated once said combination matches a predetermined user command, i.e. when the user wants to trigger user input. Requiring such a combination of features to be detected provides for reducing the occurrence of false triggering of user input. This is particularly advantageous in presentation settings, i.e. for non-gaming applications, where users typically jump in and out of input mode over the course of the presentation, whenever a user wishes to provide a contribution to the presentation. Prior art techniques typically generate too many false triggers of user input to be usable in such applications, or in any other application which may involve continuous switching between input and passive mode, in particular when multiple users participate. Stable and intuitive triggering is key to making the product usable in non-gaming scenarios. Furthermore, prior art systems that do full analysis on all frames to reduce the occurrence of false events introduce a long latency. The key to a good user experience is that the user quickly gets feedback on where he or she points in the display coordinate system and can adapt accordingly.
In one example, the processing unit 102 may need to detect the user’s face or eyes as well as the user’s hand or input object 202 in order to generate the mapping region 103. The user would thus need to look towards the interaction display 201, if the spatial data is captured from the viewpoint of the interaction display 201, in combination with presenting the hand or input object 202 in order to trigger user input. Detecting the aforementioned feature should thus be construed as encompassing any combination of such features. The processing unit 102 may thus be configured to generate the mapping region 103 and generate the onset input coordinate (x’i,y’i) upon detecting a predetermined combination of features.
In one example, the user input may be triggered based on detecting a combination of features comprising at least a first feature of the user’s body and
a second feature of an input object. E.g. it may be necessary to detect the user’s hand holding a stylus within a certain radius of the user’s head or shoulder, in order to trigger user input.
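A minimal sketch of such a combination trigger is given below, assuming that the positions of the shoulder and a held stylus have already been detected; the radius and detection names are illustrative assumptions.

```python
# Minimal sketch: trigger user input only when a body feature and an input-object
# feature are detected together within a given radius of each other.
import math

def combination_trigger(detections, radius_px=120.0):
    """detections: dict mapping feature name -> (x, y) image position; absent if not detected."""
    shoulder = detections.get("shoulder")
    stylus = detections.get("stylus")
    if shoulder is None or stylus is None:
        return False
    return math.hypot(stylus[0] - shoulder[0], stylus[1] - shoulder[1]) <= radius_px

print(combination_trigger({"shoulder": (300, 250), "stylus": (360, 200)}))   # True
```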
The user input may be triggered based on detecting a combination of features in a defined sequence. The sequence may involve a defined sequence of actions. For example, it may be necessary to detect a sequence of movements, such as moving from a non-triggering pose to a triggering pose over a defined time interval. A quick or distinct movement to a certain pose can for example set apart a triggering action from other movements.
The user input may be triggered based on detecting a combination of features when arranged in a defined spatial relationship. For example, a shoulder, elbow, and wrist of a skeleton model of the user may trigger user input if placed in a defined spatial relationship. The spatial relationship may be defined with respect to a particular viewpoint, such as the viewpoint from an image sensor device 105 positioned on the interaction display 201. E.g. a user pointing towards the image sensor device 105 with the whole arm, so that the wrist and elbow are within a small region around the shoulder from the viewpoint of the image sensor device 105, may trigger user input. A GUI element displayed close to the image sensor device 105 may thus encourage the user to point towards that GUI element, and thereby towards the image sensor device 105, to trigger user input.
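As an illustration of such a spatial-relationship test, the Python sketch below checks whether the wrist and elbow keypoints of a skeleton model project within a small radius around the shoulder, as seen from the image sensor device; the keypoint format and radius are assumptions.

```python
# Minimal sketch (assumed keypoints): the arm is taken to point towards the camera
# when the elbow and wrist project close to the shoulder in the image.
import math

def arm_points_at_camera(shoulder, elbow, wrist, radius_px=30.0):
    """Each keypoint is an (x, y) image position from the sensor's viewpoint."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    return dist(elbow, shoulder) < radius_px and dist(wrist, shoulder) < radius_px

print(arm_points_at_camera(shoulder=(320, 200), elbow=(330, 205), wrist=(325, 210)))   # True
```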
The processing unit 102 may be configured to detect an audio-visual combination of said feature and a user generated sound so that the mapping region 103 is generated once said audio-visual combination of features matches a predetermined user command to trigger user input in the display coordinate system (xd,yd). For example, the user may generate the sound verbally, or by snapping fingers, or use an object such as a clicking pen. This provides for further reducing the occurrence of false triggering of user input.
Triggering of user input may be further improved by a machine learning algorithm. Data collection and training of the algorithm could be done using an auto-labeling scheme to classify user actions connected to the desire to provide user input. Audio-visual data may be collected of the participating users. The algorithm or model may be continuously populated with data to further improve its ability to pick up on cues for triggering of the user input, based on the behavior of the users. The learned model can accept image data and/or skeletal data as input and yield a classification of one or more persons’ intentions. E.g., a pointing gesture could be interpreted as a system trigger intention.
The processing unit 102 is configured to map the onset input coordinate (x’i,y’i) to the display coordinate system (xd,yd), and generate a control signal to display an output 203 associated with the onset input coordinate (x’i,y’i) at a display coordinate (x’d,y’d) in the display coordinate system (xd,yd). Fig. 2e shows an example where the input coordinate (x’i,y’i) in the center of the mapping region 103 is mapped to the display coordinate (x’d,y’d) at the corresponding center of the interaction display 201. An output 203, such as a marker or cursor, may be displayed at the display coordinate (x’d,y’d). The user may continue to move the hand in the generated mapping region 103 and the positioning unit 101 receives the spatial data, such as image data, of the user’s movement. The processing unit 102 is configured to determine the input coordinates (x’i,y’i) of the user’s movement, e.g. of the position of the user’s hand, index finger or input object 202, based on the image data, and map the movement to the corresponding display coordinates (x’d,y’d), as exemplified in Figs. 2f-g. The processing unit 102 is configured to generate a control signal to display the associated output 203 of the movement. The user may accordingly move the marker or cursor across the interaction display 201 by gesturing with the hand or input object 202 in the mapping region 103.
The positioning unit 101 may accordingly be configured to continuously receive said spatial data to track a motion of the user’s hand and/or input object 202 in the mapping region 103 over a duration of time. The processing unit 102 may be configured to determine associated input coordinates (x’i,y’i) for said motion and generate a control signal to display an output 203 of the motion in the display coordinate system (xd,yd). The processing unit 102 may be configured to determine a velocity and/or an acceleration of the user’s hand and/or input object 202 for said duration of time to generate a control signal to
display an output 203 of said motion having a corresponding velocity and/or acceleration.
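A minimal sketch of estimating the velocity and acceleration from the tracked input coordinates by finite differences follows; the sample format and timestamps are illustrative assumptions.

```python
# Minimal sketch: estimate velocity and acceleration of the tracked hand or input
# object from the three most recent input-coordinate samples.
def motion_state(samples):
    """samples: list of (t, xi, yi) tuples ordered in time; needs at least three samples."""
    (t0, x0, y0), (t1, x1, y1), (t2, x2, y2) = samples[-3:]
    v_prev = ((x1 - x0) / (t1 - t0), (y1 - y0) / (t1 - t0))
    v_curr = ((x2 - x1) / (t2 - t1), (y2 - y1) / (t2 - t1))
    a = ((v_curr[0] - v_prev[0]) / (t2 - t1), (v_curr[1] - v_prev[1]) / (t2 - t1))
    return v_curr, a

velocity, acceleration = motion_state([(0.0, 10, 10), (0.1, 12, 10), (0.2, 16, 10)])
print(velocity, acceleration)   # (40.0, 0.0) px/s and (200.0, 0.0) px/s^2
```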
The mapping region 103 may have a fixed position, for example in relation to the x-y coordinate system in Fig. 2f, so that the user can move the hand across the mapping region 103 and generate different input coordinates (x’i,y’i). It is also conceivable that the mapping region 103 may be fixed in relation to other parts of the user, such as the user’s head or shoulder, which would allow the user to navigate in the mapping region by the relative movement between the user’s hand and the head or shoulder, while the mapping region 103 simultaneously follows the position of the user’s head or shoulder.
The processing unit 102 may be configured to generate a control signal to stop the display of the output 203 in the display coordinate system (xd,yd) as the user removes the detected feature, e.g. the index finger or input object 202, from the mapping region 103. The user may thus lower the hand (see e.g. Fig. 2a) to stop the generation of input to the interaction display 201.
The processing unit 102 may be configured to detect different trigger actions defined to stop the user input. For example, a quick motion or an atypical user pose which deviates from the normal use case when presenting or participating in a conference setting may trigger such a stop of the user input. E.g., pointing towards the display but rotating the body so that it is facing away from the display may trigger exiting of the user input mode.
The interaction system 100 thus allows a user to remotely generate input to an interaction display 201. This is advantageous when several users participate in the presentation of information on the interaction display 201 or otherwise control the input to such a display 201, such as a digital whiteboard, in a classroom or any other situation involving participants positioned at varying distances to the interaction display 201. A plurality of users may interact with the display 201 by gesturing in a respective mapping region 103 generated for each user when the respective user triggers the detection of a feature, such as an index finger or input object 202, and the associated input coordinates (x’i,y’i).
The interaction system 100 thus provides for facilitated user input, in particular for a large number of participating users.
The interaction system 100 may comprise an image sensor device 105 configured to capture image data of the user and communicate the image data to the positioning unit 101. The spatial data thus comprises the image data. The positioning unit 101 may be configured to detect said feature and/or input object 202 based on the captured image data. In other examples the spatial data may be provided by a distance sensor, e.g. a time-of-flight (TOF) sensor, such as a sensor comprising a LIDAR, for determining the position of the user’s hand or input object 202. Such sensor may comprise a scanning LIDAR, with a scanning collimated laser, or a flash LIDAR where the entire field of view is illuminated with a wide diverging laser beam in a single pulse. The TOF sensor may also be based on LED illumination, e.g., pulsed LED light.
The image sensor device 105, or any of the sensors mentioned above, may be arranged on the interaction display 201, as schematically illustrated in Fig. 1. The user may thus be captured with a field-of-view from the position of the interaction display 201. It is conceivable however that the image sensor device 105 may be freely movable relative to the interaction display 201. For example, a laptop webcam, or a conference camera system positioned at a distance from the interaction display 201, can provide for detecting the hand/fingertip/gesture to generate the mapping region 103 around this feature and map to the display coordinates (x’d,y’d) as described above, even without knowing the position of the laptop relative to the interaction display 201. The mapping region 103 may be essentially parallel with the interaction display 201 in some examples. It is however conceivable that the user may face an image sensor device 105 at an angle to the interaction display 201, e.g. when sitting in front of a webcam of a laptop at a long side of a table, while the interaction display 201 is positioned at the short side of the table.
The processing unit 102 may be configured to determine a distance (l1, l2) between the user and the interaction display 201 and to determine a size of the mapping region 103 based on the distance (l1, l2). Fig. 3 shows an example
where users are placed at a first distance (l1) and a second distance (l2) from the interaction display 201, where the spatial data is provided by a sensor such as an image sensor device 105 arranged at the interaction display 201. A first mapping region 103a is generated for the user at l2, and a second mapping region 103b is generated for the user at l1, as described above. The first distance (l1) is closer to the interaction display 201 than the second distance (l2). The user at l2 will thus appear smaller than the user at l1 from the field-of-view of the image sensor device 105. The processing unit 102 may scale the mapping regions 103a, 103b accordingly, so that the relative size between the first mapping region 103a and the user at l2 is the same as the relative size between the second mapping region 103b and the user at l1. Both users will thus experience an interaction space 104a, 104b of the same size, and the user experience for generating input to the interaction display 201 will be independent of the user’s distance to the same.
The processing unit 102 may be configured to determine the distance (l1, l2) based on comparing a size of a feature of the user, such as the size of a user’s head, in the spatial data with a stored reference size of a corresponding reference feature. Hence, a user’s head may be assumed to have a standard size, such as a reference width or height, which is then utilized to estimate the distance (l1, l2) and scale the size of the mapping region 103. In some examples the processing unit 102 may be configured to determine the distance (l1, l2) based on spatial data from a depth sensor, such as a stereo imaging sensor, in order to scale the mapping region 103 as described above.
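For illustration, distance estimation from a stored reference head size and the corresponding scaling of the mapping region could be sketched as follows; the pinhole-camera assumption, focal length and reference sizes are hypothetical values chosen for the example.

```python
# Minimal sketch (assumed pinhole model): estimate the user's distance from the
# apparent head height, then scale the mapping region so that all users get an
# interaction space of the same physical size.
REFERENCE_HEAD_HEIGHT_CM = 23.0    # assumed average head height
FOCAL_LENGTH_PX = 1000.0           # assumed camera focal length in pixels

def estimate_distance_cm(head_height_px: float) -> float:
    return FOCAL_LENGTH_PX * REFERENCE_HEAD_HEIGHT_CM / head_height_px

def scale_region(base_region_px, head_height_px, reference_head_height_px=100.0):
    """Scale the mapping region in the image so its physical size stays constant."""
    s = head_height_px / reference_head_height_px
    return base_region_px[0] * s, base_region_px[1] * s

print(estimate_distance_cm(115.0))       # = 200.0 cm
print(scale_region((160, 120), 50.0))    # half-size region for a user further away
```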
The processing unit 102 may be configured to continuously adjust the size and/or position of the mapping region 103 as the user moves to varying distances (l1, l2) from the interaction display 201. This allows the user to experience a consistency in the size of the mapping region 103, i.e. of the interaction space 104, as the user may move around in the room in which the interaction display 201 is located.
The output 203 may be a marker tracking the onset input coordinate (x’i, y’i) mapped to the display coordinate system (xd,yd). This allows the user to
highlight parts of the content shown on the interaction display 201. The user may also provide input commands via the displayed marker. E.g. the user may steer the marker to a particular part of the GUI, which may generate an input command to the interaction display 201. It is also conceivable that the user may make a further gesture, such as displaying the full palm of the hand, upon which the processing unit 102 is configured to detect such gesture as an input command to actively select a particular display element of the GUI over which the marker is currently positioned.
Different control layers of the GUI may be accessed by the user depending on combinations or sequences of the features detected. A single user may trigger multiple controls of the interaction system 100. The interaction system 100 can detect if two or more of the input controls are associated with the same user and grant additional functionality for that user, e.g. possibility to zoom, highlight along the line between the pointers, etc. The functionality may be configurable by the GUI. Furthermore, it is conceivable that a dedicated command or action can be enabled to control the interaction system 100 if multiple users trigger user input simultaneously and e.g. point at the same GUI element.
The output 203 may in other examples be a drawing brush or eraser, e.g. for a digital whiteboard, tracking the movement of the user’s hand or stylus 202. The generated control signal may in some examples not generate a visible output in the display coordinate system (xd,yd). E.g. the determined input coordinate (xi,yi) may be utilized by the processing unit 102 to keep track of the user’s position in the mapping region 103. The user may for example present a feature/gesture for temporary interruption of the visual output, but the position of the user may still be traced. As the user has moved e.g. the hand to a new position in the mapping region 103, the input and associated visual output may be resumed, as the new input coordinate (xi,yi) is tracked by the processing unit 102. The user may thus get a sense of continuity in the movement and location of the input even when intermittent interruptions are made.
The processing unit 102 may be configured to associate the user with the output 203 based on the position and/or size of the mapping region 103 generated for said user. A unique identifier may be assigned to the user based on the position and/or size of the mapping region 103. Hence, the output 203 could be linked to a particular user which is identified via the mapping region 103, since the mapping regions 103 will have different positions or sizes depending on where the users are located in front of the interaction display 201. Each user may thus be assigned a unique identifier based on the mapping region 103 and have an associated unique control signal for the output 203. Several users may thus be conveniently managed for simultaneous input to the interaction display 201. The generated mapping region 103 may itself be linked to a particular user. Besides associating the mapping region 103 with the user via the characteristics of the mapping region 103, such as its position or relative size, it is also conceivable that features of the unique user are utilized to connect the mapping region 103 to such unique user. This can be advantageous if the user moves around during the interaction session. For example, the features characterizing a particular user may be identified in image data of the user, captured by the image sensor device 105, and may comprise facial features, or relative dimensions between parts in a skeleton model established of the user. The processing unit 102 may thus be configured to associate the mapping region 103 with a unique user. The interaction system 100 may showcase which users are currently using the system 100 by displaying an image of the users’ faces and an assigned color. The images of the users can be compared with a reference database of people to classify the user.
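A minimal sketch of deriving a simple identifier from the mapping region is shown below; the identifier format is purely illustrative.

```python
# Minimal sketch: derive a per-user identifier from the position and size of the
# generated mapping region, so that each user's output can be kept separate.
def region_identifier(region):
    """region = (x0, y0, width, height) in image coordinates."""
    x0, y0, w, h = region
    return f"user_{int(x0)}_{int(y0)}_{int(w)}x{int(h)}"

print(region_identifier((400, 300, 160, 120)))   # "user_400_300_160x120"
```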
The interaction display 201 may be a touch display for simultaneous touch interaction. The interaction system 100 thus provides for extending the capabilities of the touch display to accept an additional mode of user input, with the advantageous benefits as described above.
Fig. 4 illustrates a flow chart of a method 300 in an interaction system 100. The order in which the steps of the method 300 are described and illustrated
should not be construed as limiting and it is conceivable that the steps can be performed in varying order. The method 300 comprises receiving 301 at a positioning unit 101 spatial data of the position of a user in front of an interaction display 201, and detecting 302 a feature of a user’s hand and/or an input object 202 held by the user based on the spatial data, when the user presents said feature in front of the interaction display 201. The method 300 comprises determining 303, upon detecting said feature, user coordinates (x,y) of the user’s hand and/or the input object 202, based on the spatial data. The method 300 comprises generating 304 a mapping region 103 as a range of input coordinates (xi,yi) around said user coordinates of the user’s hand and/or input object, wherein the mapping region defines an interaction space 104 in a region (A) in front of the user. The method 300 comprises mapping 305 the mapping region 103 to a display coordinate system (xd,yd) of the interaction display 201. The method 300 comprises determining 306, upon detecting said feature, an associated onset input coordinate (x’i,y’i) of the user’s hand and/or input object in the mapping region 103. The method 300 comprises mapping 307 the onset input coordinate (x’i,y’i) to the display coordinate system (xd,yd), and generating 308 a control signal to display an output 203 associated with the onset input coordinate (x’i,y’i) at a display coordinate (x’d,y’d) in the display coordinate system. The method 300 thus provides for the advantageous benefits as described above in relation to the interaction system 100 and Figs. 1-3. The method 300 provides a facilitated user input to a display, in particular when a large number of participating users interact with such a display.
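As a self-contained illustration of the flow of the method 300, the Python sketch below chains the steps 302-308 for a single detected feature; the constants, sizing rule and function names are assumptions and do not represent the claimed implementation.

```python
# Minimal, self-contained sketch of the method 300 flow (all names and values
# are illustrative assumptions).
def run_method_300(feature_xy, head_width_px, display_size=(1920, 1080)):
    """feature_xy: detected (x, y) image position of the hand/stylus, or None."""
    if feature_xy is None:
        return None                                   # 302: no feature presented
    x, y = feature_xy                                 # 303: user coordinates (x, y)
    px_per_cm = head_width_px / 15.0                  # assumed average head width
    rw, rh = 40.0 * px_per_cm, 30.0 * px_per_cm       # 304: generate mapping region
    region = (x - rw / 2, y - rh / 2, rw, rh)         # centred at the feature
    u = (x - region[0]) / rw                          # 305-307: map onset coordinate
    v = (y - region[1]) / rh
    xd, yd = u * display_size[0], v * display_size[1]
    return {"mapping_region": region, "display_coordinate": (xd, yd)}   # 308: output

print(run_method_300(feature_xy=(480, 360), head_width_px=60.0))
```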
In one embodiment, a snapping feature is provided to snap the output 203 to a GUI element within a certain distance of the output 203. This feature can be configured to snap to different types of user interface elements, such as checkboxes, radio buttons, and drop-down menus, and can be configured to snap to buttons within a certain distance, angle, percentage of total distance, or area around the output 203. Additionally, the feature can be configured to ignore certain types of user interface elements or areas of the screen.
In a first example, the snapping feature can be configured to snap to user interface elements within 50 pixels of the output 203. In a second example, the snapping feature can be configured to snap to user interface elements within a range of e.g. a 45-degree, 90-degree, 180-degree, or 270-degree angle of the output 203. In a third example, the snapping feature can be configured to snap to user interface elements within 10%, 20%, 30%, 40%, or 50% of the output 203’s total distance from the button. In a fourth example, the snapping feature can be configured to snap the output 203 to user interface elements within a range of e.g. a 5 pixel, 10 pixel, 15 pixel, 20 pixel, 25 pixel, or 30 pixel radius of the output 203. In a fifth example, the snapping feature can be configured to snap to user interface elements within a range of e.g. 5 cm, 10 cm, or 20 cm of distance on the physical display from the output 203.
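A minimal sketch of the distance-based variant of the snapping feature follows, assuming that GUI element positions are available in display coordinates; the element format, radius and type filter are illustrative assumptions.

```python
# Minimal sketch: snap the output 203 to the nearest user interface element within
# a configurable radius, ignoring element types configured as excluded.
def snap(output_xy, elements, radius_px=50.0, ignore_types=()):
    """elements: list of dicts with 'type' and 'center' (xd, yd) in display coordinates."""
    best, best_d2 = None, radius_px ** 2
    for el in elements:
        if el["type"] in ignore_types:
            continue
        dx = el["center"][0] - output_xy[0]
        dy = el["center"][1] - output_xy[1]
        d2 = dx * dx + dy * dy
        if d2 <= best_d2:
            best, best_d2 = el, d2
    return best["center"] if best else output_xy

elements = [{"type": "checkbox", "center": (960, 540)}, {"type": "menu", "center": (100, 100)}]
print(snap((930, 530), elements))   # snaps to the checkbox at (960, 540)
```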
A computer program product is provided comprising instructions which, when the program is executed by a computer, cause the computer to carry out the steps of the method 300.
The present invention has been described above with reference to specific examples. However, other examples than the above described are equally possible within the scope of the invention. The different features and steps of the invention may be combined in other combinations than those described. The scope of the invention is only limited by the appended patent claims.
More generally, those skilled in the art will readily appreciate that all parameters, dimensions, materials, and configurations described herein are meant to be exemplary and that the actual parameters, dimensions, materials, and/or configurations will depend upon the specific application or applications for which the teachings of the present invention is/are used.