
HK1113950B - Visual code system for camera-equipped mobile devices and applications thereof - Google Patents

Info

Publication number
HK1113950B
HK1113950B
Authority
HK
Hong Kong
Prior art keywords
visual code
decoding
image
visual
code
Prior art date
Application number
HK08103155.7A
Other languages
Chinese (zh)
Other versions
HK1113950A1 (en)
Inventor
Michael Rohs
Beat Gfeller
Original Assignee
Scanbuy Inc.
Michael Rohs
Beat Gfeller
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Scanbuy Inc., Michael Rohs, and Beat Gfeller
Priority claimed from PCT/US2005/019178 external-priority patent/WO2006112866A2/en
Publication of HK1113950A1 publication Critical patent/HK1113950A1/en
Publication of HK1113950B publication Critical patent/HK1113950B/en

Description

Visual code system for camera-equipped mobile devices and applications thereof
Cross-Reference to Related Applications
This application claims priority to U.S. patent application No. 11/105,150, filed April 13, 2005, which is based on provisional application No. 60/563,669, filed April 19, 2004; the disclosures of both are incorporated herein by reference in their entirety, and their priority is claimed.
Technical Field
The present invention relates to the field of decoding machine readable codes with camera equipped mobile devices. More specifically, the present invention discloses a dedicated visual code that can be decoded and utilized by virtually any camera-equipped mobile device.
Background
For more than thirty years, bar codes (barcodes) have been used to identify and price objects. More typically, bar codes are used in retail sales to identify items of merchandise. For example, a gallon of milk may contain a bar code that, when scanned, informs the cashier of the price of the milk.
More recently, barcodes have gained new uses as computers and barcode scanners have become more portable. The circuitry required to scan a conventional one-dimensional bar code can now be housed in a device as small as a conventional key fob. Thus, many mobile telephones, personal digital assistants ("PDAs") and pagers can be retrofitted with, or can be connected to, laser-based scanning devices. This allows the mobile device to function as a scanner capable of storing hundreds of thousands of scanned bar codes. Mobile devices with attached scanners open up new opportunities in wireless electronic commerce. Some companies have developed software and hardware that allow users to scan any bar code and link that bar code to media information (e.g., web site, product description, price, etc.) about the scanned product. These programs provide a connection between the physical world and the online world that previously did not exist.
However, mobile devices with attached scanners have some drawbacks that limit their adoption in the mobile market. First, there are few mobile devices produced for the general public that contain integrated laser-based scanners. Thus, for a user to obtain scanning capabilities on a mobile device, he/she must purchase additional equipment. The additional scanning apparatus also increases the size and weight of the mobile device, thereby reducing its mobility.
Currently, many cell phones and mobile devices have built-in cameras. Several factors drive the explosion of affordable digital camera applications and their inclusion in mobile devices. One of the most important factors is the recent availability of inexpensive image sensors based on CMOS technology. The cameras on these devices provide a means of capturing barcode information that was previously only available through laser-based scanners. Decoding barcode images with a digital camera contained within a mobile device presents several difficult problems. These problems far exceed the challenges presented in commercially available bar code readers.
Thus, there is a clear need for a visual code system that can be decoded on a camera-equipped mobile device. The visual code system should be capable of decoding from a wide range of angles and distances for different applications.
Disclosure of Invention
The visual code system of the present invention enables a camera-equipped mobile device to function as a sensor and decoder for a dedicated two-dimensional visual code. The visual code may be printed on a paper document, displayed on an electronic screen, or attached to a physical object and serves as a key to access information and functions related to the object.
Enhancing mobile devices in this manner is promising because they are in constant contact with their users and can thus be used in many everyday situations. They provide continuous wireless connectivity, and models with integrated cameras are becoming more and more common. Being able to detect objects in the direct vicinity of the user enhances the role of the mobile phone in mobile commerce, education, and gaming situations. In addition, it provides a natural way of interacting with real-world objects and makes data entry more convenient.
Visual code systems provide a basis for textual or pictorial information superimposed on a camera image in near real-time, thereby providing the user with a sense of "augmented reality". This makes possible a variety of different applications in situations where information is closely connected to a physical object.
One example is the maintenance of devices or equipment in the field. For example, various parts of a device may be associated with different visual codes. By decoding the different visual codes, pictorial information aligned with the items in the image can be superimposed on the camera image. Additional possible applications include displaying maintenance instructions, ordering spare parts, checking maintenance schedules, and so on. Other closely related applications include: conveniently ordering goods from a mail order catalog; entering a reference number in an online banking application; linking up-to-date online content to a newspaper; supporting mobile marketing campaigns integrated with TV commercials, such as "vote now" or "buy now" advertisements; booking tickets for various events; and simplifying interaction with the mobile device.
Key features of the visual code system of the present invention are a built-in, orientation-independent coordinate system and the determination of rotation angle, tilt, and distance, which enable a large number of applications such as item selection and form input. The recognition algorithm accurately determines the coordinates of a target point with respect to the coordinate system defined by the visual code. The coordinates are independent of the orientation, tilt, or rotation of the visual code in the camera image. This allows an application to associate each point (and thus arbitrarily shaped areas) of the viewfinder image with information and with specific operations. In this way, a single visual code may be associated with a plurality of such regions (e.g., individual cells of a table, or individual elements and regions of an image).
The recognition algorithm provides the rotation angle of the visual code in the camera image as an additional input parameter. This can be exploited to simplify interaction with the mobile device: different information is displayed, without any keyboard interaction, simply by changing the orientation of the mobile device. In the same way, horizontal and vertical tilt are used as additional input parameters (i.e., different information is associated with different tilt angles). The mapping between the image coordinate system and the code coordinate system also serves to remove projective distortion ("warping") from the camera image, thereby allowing form elements to be automatically captured: the various areas of a printed form are unwarped and thereby prepared for further processing, e.g., transmission to an online server.
Interaction Scenarios
When mobile phones or PDAs are enhanced to act as sensors for real-world objects, new kinds of interaction become possible. By providing a natural way of "picking up" data in daily life, the separation between the physical world and the virtual world around the user is reduced. Information is attached to physical entities, thereby situating and grounding it in real-world context. Because mobile phones are ubiquitous, readily available devices in constant contact with their users, they are in a good position to contribute to this vision. For usability in mobile settings, the short interaction time required to recognize a code is crucial: it takes only a few seconds and little effort from the user.
There are many application scenarios that utilize visual codes. In a mail order catalog, items may be selected for convenient ordering by scanning a visual code. In online banking applications, it is often necessary to manually transcribe long reference codes or transaction numbers from printed forms; visual codes and mobile phones can replace the expensive dedicated devices that perform this task. In newspapers, visual codes can be used to retrieve background information related to articles and advertisements, to store or rate newspaper articles, or to update information that quickly goes out of date, such as stock quotes and weather forecasts. Using the code coordinate system, a printed image map with predefined sensitive areas associated with online content may be defined. Thus, a single code may be associated with multiple items: for example, various areas on a map may link to weather forecasts for those areas, and each row of a printed table of stock quotes may link to a chart displayed on the phone.
In the emerging area of mobile marketing, visual codes can be used to simplify SMS campaigns and SMS lotteries. In one advertising campaign, Coca-Cola printed code numbers on the labels of 160 million beverage bottles. Customers sent in the numbers by SMS to receive logos and ringtones and to participate in a lottery drawing. 5.9 million SMS messages were sent, corresponding to a response rate of 3.7%. Using visual codes would make the process more user-friendly and could attract more participants to such an event, which is of great commercial interest.
The proposed two-dimensional code is also suitable for display on electronic screens such as computer monitors, TV screens, wall displays in public places, and even the small screens of handheld devices. By superimposing a visual code on the screen image, questionnaires, surveys, and polls on TV can be implemented (e.g., TV shows in which viewers select the best candidate). The music company Polydor uses SMS as a mobile response channel in TV commercials for CDs: viewers can pre-order CDs by SMS before the official release date, which immediately boosts sales. Using visual codes, the response rate can be improved substantially. Camera-equipped mobile phones are not only able to retrieve object IDs, but can also be used for more complex forms of interaction. The code recognition system also provides the orientation of the phone, the amount of rotation, and the coordinates of the center of the image.
It is therefore an object of the present invention to provide a visual code system that utilizes a direction independent coordinate system defined by a visual code and an associated recognition algorithm that determines the coordinates of a target point in a camera image relative to the coordinate system.
It is another object of the present invention to provide a visual code system that associates multiple regions of an image with a single code, wherein each region is linked to different online information and operations associated with the different regions.
It is a further object of the present invention to provide a visual code system comprising a rotation determining feature, vertical and horizontal tilt determining features, and a distance determining feature, which allow different rotation, tilt, and distance values to be associated with different information.
It is another object of the present invention to use the visual code system of the present invention for item selection, for displaying information in textual or pictorial form (e.g., on the screen of the mobile device or on a nearby electronic display), in auditory form (e.g., through the speaker or headset of the mobile device), or in tactile form, and for triggering different operations.
It is a further object of the present invention to provide applications in which textual or pictorial feedback is superimposed on the camera image, correctly aligned ("registered") with the items present in the image, giving the user an "augmented reality" feeling.
It is another object of the present invention to provide an application in which a visual code system is used in conjunction with a printed form to facilitate data entry.
These and other objects of the present invention will become more apparent upon reference to the following detailed description and the accompanying drawings.
Drawings
FIG. 1 illustrates an example of a visual code and its novel features.
Fig. 2 shows a flow chart of the steps used by a camera-equipped mobile device to decode a visual code.
Fig. 3 shows a flow chart of the steps used by the code location step of fig. 2.
Fig. 4 shows an example of a printed document containing a visual code.
FIG. 5 illustrates another example of a printed document containing a visual code.
FIG. 6 illustrates an example form for entry of a calendar event.
Detailed Description
A detailed description of the preferred embodiment (and some alternatives) of the invention is given below. It should be apparent, however, to one skilled in the art that the described embodiments may be modified in form and detail to optimize for various different situations.
Referring first to FIG. 1, an example of a visual code 101 for use with the present invention is shown. The visual code 101 is composed of a large guide bar 103 and a small guide bar 105 for determining the position and orientation of the visual code 101, three corner stones 107, 109, and 111 for detecting distortion, and a data area containing the actual code bits. The displayed version of the visual code 101 has a capacity of 83 bits, but its design is generic and suitable for extension to a larger number of bits. The visual code 101 can be reliably located even in small, tilted, low-quality images.
The visual code 101 may be printed using standard ink or infrared ink. If the visual code 101 is printed using infrared ink, the digital camera must be able to detect infrared light. Infrared ink has the advantage that the visual code 101 is invisible to the naked eye. Thus, if the algorithm of the present invention runs constantly in the background whenever the digital camera of the mobile device is in use, decoding of the visual code 101 can be used to bring up certain information automatically.
These three corner stones 107, 109, and 111 define the coordinate system introduced by the code. The corner stone 107 is located at the origin, position (0, 0). Similarly, the corner stone 109 is at position (10, 0), while the corner stone 111 is at position (0, 10). In this coordinate system, the center of the large guide bar 103 is located at position (10, 5), and the center of the small guide bar 105 is located at position (8, 10). The code requires a blank margin of one code element (1 cell in the code coordinate system) around it. Multiple visual codes may be printed next to one another, for example arranged in a grid; a blank space of a single code element (1 cell in the code coordinate system) is required between adjacent codes.
Referring next to fig. 2, a flow chart is shown illustrating the steps used by a camera-equipped mobile device to decode the visual code 101. First, in step S201, an image captured by a digital camera is corrected for radial distortion. Images obtained by digital cameras typically exhibit some amount of barrel distortion. However, with newer phone cameras, the problem of radial lens distortion is less pronounced, especially if the code occupies only a small area of the image. The image is corrected by moving each pixel radially towards the centre of the image. The displacement depends only on the distance of the pixel from the center of the image and not on the particular image to be corrected. The following correction functions yield sufficient results (w and h are the pixel width and pixel height of the image, respectively, r is the distance from the image center, and k is a constant reflecting the radial distortion of the lens):
x_undistorted = (x_distorted − w/2)·(1 + k·r²) + w/2
y_undistorted = (y_distorted − h/2)·(1 + k·r²) + h/2
For the Nokia 7650 integrated camera, an acceptable value of k = 6 × 10⁻⁷ was found through manual testing. For the newer Nokia 6600, 7610, and 6630, no correction is required. Even for the Nokia 7650, code recognition can be performed without correction.
Step S201 is optional, as a single visual code typically covers only a small area of the camera image. In this case, the radial lens distortion is not significant. However, for some applications, such as those involving the precise selection of items in a camera image, this step may improve accuracy.
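As an illustrative sketch of step S201 (the function and parameter names are chosen for illustration, not taken from the patent), the correction formula can be applied per pixel coordinate as follows:

```python
def undistort(x, y, w, h, k):
    """Radial distortion correction: shift the pixel radially relative to
    the image center (w/2, h/2) by the factor (1 + k*r^2)."""
    cx, cy = w / 2.0, h / 2.0
    r2 = (x - cx) ** 2 + (y - cy) ** 2   # squared distance from the center
    return ((x - cx) * (1 + k * r2) + cx,
            (y - cy) * (1 + k * r2) + cy)
```

With k = 0 the transform is the identity, and the center pixel is always a fixed point, which makes the function easy to sanity-check.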
Next, in step S203, because the three color channels of the image generally differ in sharpness and contrast, the image is converted to grayscale and adaptively thresholded. The algorithm uses the formula gray = (red + green)/2, instead of the more common form gray = (red + green + blue)/3 or the ITU luminance formula Y = 0.2126·red + 0.7152·green + 0.0722·blue. The formula is computationally efficient and produces a sufficient starting point for threshold selection.
An adaptive method is employed to produce a black-and-white version of the grayscale image, because the brightness of the camera image is not constant and the printed code may be unevenly illuminated. The algorithm uses a (weighted) moving average of the gray values while traversing the image in a serpentine fashion. The average g_s(n) is updated according to:

g_s(n) = g_s(n−1)·(1 − 1/s) + p_n

where p_n represents the gray value of the current pixel and s the width of the moving average. g_s is given the initial value g_s(0) = c·s/2, where c is the maximum possible gray value. The color of the thresholded pixel t(n) is then selected according to:

t(n) = black if p_n < (g_s(n)/s)·(100 − t)/100, white otherwise

Recommended values are t = 15 and s = w/8, where w is the width of the image.
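A minimal sketch of the serpentine moving-average thresholding described above (assuming the update and threshold rules as reconstructed; `gray` is a list of pixel rows):

```python
def adaptive_threshold(gray, t=15, c=255):
    """Binarize a grayscale image with a moving average of gray values,
    traversing the rows in a serpentine (alternating) order."""
    h, w = len(gray), len(gray[0])
    s = w // 8                        # recommended moving-average width s = w/8
    g = c * s / 2.0                   # initial average value g_s(0) = c*s/2
    out = [[255] * w for _ in range(h)]
    for row in range(h):
        cols = range(w) if row % 2 == 0 else range(w - 1, -1, -1)
        for col in cols:
            p = gray[row][col]
            g = g * (1 - 1.0 / s) + p             # update the moving average
            if p < (g / s) * (100 - t) / 100.0:   # darker than local average
                out[row][col] = 0                 # black pixel
    return out
```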
After the image has been binarized in step S203, the algorithm attempts to identify and label the different regions of the image in step S205. This means finding regions of adjacent black pixels, counting them, and assigning a number to each region. The algorithm used is a well-known two-stage method. In the first stage, the image is traversed line by line, assigning provisional labels to the regions found. In this process, it may happen that two regions with different labels actually belong to the same region; in this case, the equivalence of the two temporary labels is stored in a table. The second stage resolves the equivalences by merging the corresponding regions and assigning a final label to each region.
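The two-stage labeling method can be sketched with a union-find table for the label equivalences (a 4-connectivity simplification for illustration; `img` uses 0 for black):

```python
def label_regions(img):
    """Two-pass connected-component labeling of black pixels (value 0):
    pass 1 assigns provisional labels and records equivalences,
    pass 2 resolves them to a final label per region."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    parent = {}                       # union-find over provisional labels

    def find(a):
        while parent[a] != a:
            a = parent[a]
        return a

    next_label = 1
    for y in range(h):
        for x in range(w):
            if img[y][x] != 0:
                continue
            left = labels[y][x - 1] if x > 0 else 0
            up = labels[y - 1][x] if y > 0 else 0
            if left == 0 and up == 0:           # new region
                parent[next_label] = next_label
                labels[y][x] = next_label
                next_label += 1
            elif left and up and left != up:    # two labels meet: store equivalence
                labels[y][x] = min(left, up)
                parent[find(max(left, up))] = find(min(left, up))
            else:
                labels[y][x] = left or up
    for y in range(h):                          # second pass: final labels
        for x in range(w):
            if labels[y][x]:
                labels[y][x] = find(labels[y][x])
    return labels
```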
Next, in step S207, the algorithm calculates the shape and orientation of the different regions. To identify candidates for the guide bars among the found regions, the concept of second-order moments is used. The second-order moments of a region consisting of the pixel set R, with center of gravity (x̄, ȳ), are defined as:

m_xx = (1/|R|) Σ(x,y)∈R (x − x̄)²
m_xy = (1/|R|) Σ(x,y)∈R (x − x̄)(y − ȳ)
m_yy = (1/|R|) Σ(x,y)∈R (y − ȳ)²

From these moments, an ellipse E = {(x, y) | dx² + 2exy + fy² ≦ 1} having the same major and minor axes as the region can be defined by setting:

d = m_yy / (4(m_xx·m_yy − m_xy²))
e = −m_xy / (4(m_xx·m_yy − m_xy²))
f = m_xx / (4(m_xx·m_yy − m_xy²))

The ratio of the lengths of the major and minor axes is a good measure of the "eccentricity" of the region: an ideal circle or square has a ratio equal to 1, while a line segment has a ratio of approximately zero. This is very useful for identifying regions with a bar-like shape. Further, the direction vector of the major axis of the region is given by the eigenvector associated with the larger eigenvalue of the moment matrix ((m_xx, m_xy), (m_xy, m_yy)).
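The moment-based eccentricity test can be sketched as follows (expressing the axis ratio through the eigenvalues of the moment matrix, which is an assumption consistent with the ellipse definition above):

```python
import math

def axis_ratio(pixels):
    """Ratio of minor to major axis of the ellipse with the same
    second-order moments as the region: near 1 for circles and squares,
    near 0 for bar-shaped regions."""
    n = len(pixels)
    cx = sum(x for x, y in pixels) / n
    cy = sum(y for x, y in pixels) / n
    mxx = sum((x - cx) ** 2 for x, y in pixels) / n
    mxy = sum((x - cx) * (y - cy) for x, y in pixels) / n
    myy = sum((y - cy) ** 2 for x, y in pixels) / n
    # eigenvalues of the moment matrix are proportional to squared axis lengths
    tr = mxx + myy
    disc = math.sqrt((mxx - myy) ** 2 + 4 * mxy ** 2)
    lam_max = (tr + disc) / 2
    lam_min = (tr - disc) / 2
    return math.sqrt(lam_min / lam_max) if lam_max > 0 else 1.0
```

For a guide bar this ratio falls well below the 0.05 limit used later, while corner stones stay close to 1.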
once the shape and orientation of the different regions are identified, the algorithm performs the steps of locating and evaluating the visual code in step S209. The code in the image is located by finding the guide bar candidates and finding the corresponding corner stone. Guide bar candidates are found by selecting only those regions with sufficiently small axial ratios (e.g., below 0.05). For each of these candidates, the size and direction of the area are used to evaluate the expected locations of the small guide bar 105 and the three corner stones 107, 109, and 111. It is then checked whether these features are actually present at the expected locations. The found corner stone candidates are accepted if their axial ratio exceeds a certain limit (theoretically, they should have a ratio equal to 1, as described above). If there are no code features (the corner stones 107, 109, and 111, and the small guide bar 105) for a certain guide bar candidate, the particular candidate is discarded and the next is checked.
Referring now to FIG. 3, a flowchart is shown illustrating the steps used by the code location step S209 of FIG. 2. First, in step 303, the algorithm computes a projective mapping from code coordinates to image coordinates (and an inverse mapping from image coordinates to code coordinates). Once a guide bar candidate with all code features is found, the algorithm assumes that this is indeed a code in the image and starts reading the encoded bits. To do this, the bit positions in the image must be determined, i.e., a function is needed that maps each code bit to an image pixel. Because the code elements are coplanar, there is a unique homography (projective transformation matrix) between the code plane and the image plane. Once four corresponding points are known, the projective mapping can be computed. In this algorithm, these corresponding points are the centers of the three corner stones 107, 109, and 111, plus the center of the small guide bar 105. This results in a code coordinate system with its origin in the upper left corner of the code.
The code coordinates of the corner stone 107 at the upper left, the corner stone 109 at the upper right, the small guide bar 105, and the corner stone 111 at the lower left are defined as (0, 0), (10, 0), (8, 10), and (0, 10), respectively. Suppose these elements are located at image coordinates (x_i, y_i), i ∈ {0, ..., 3}. Code coordinates (u, v), u, v ∈ {0, ..., 10}, are mapped to image coordinates (x, y) by

x = (a·u' + b·v' + c) / (g·u' + h·v' + 1)
y = (d·u' + e·v' + f) / (g·u' + h·v' + 1)

with normalized code coordinates u' = u/10, v' = v/10. The parameters a through h are calculated from the four reference points (x_i, y_i), i ∈ {0, ..., 3}, as follows:

Δx1 = x1 − x2   Δy1 = y1 − y2   Δx2 = x3 − x2   Δy2 = y3 − y2
Σx = 0.8·x0 − 0.8·x1 + x2 − x3
Σy = 0.8·y0 − 0.8·y1 + y2 − y3
g = (Σx·Δy2 − Δx2·Σy) / (0.8·(Δx1·Δy2 − Δx2·Δy1))
h = (Δx1·Σy − Σx·Δy1) / (Δx1·Δy2 − Δx2·Δy1)
a = x1 − x0 + g·x1   d = y1 − y0 + g·y1
b = x3 − x0 + h·x3   e = y3 − y0 + h·y3
c = x0   f = y0
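The parameter computation can be sketched as follows (the g and h equations are reconstructed in the style of the listed Δ and Σ terms; the 0.8 factor accounts for the guide bar sitting at u = 8 rather than 10, and the helper name is illustrative):

```python
def code_to_image_map(pts):
    """Projective mapping from code coordinates (u, v in 0..10) to image
    coordinates, fixed by four reference points.
    pts = image positions of the code points (0,0), (10,0), (8,10), (0,10),
    i.e. the three corner stones plus the small guide bar, in that order."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = pts
    dx1, dy1 = x1 - x2, y1 - y2
    dx2, dy2 = x3 - x2, y3 - y2
    sx = 0.8 * x0 - 0.8 * x1 + x2 - x3
    sy = 0.8 * y0 - 0.8 * y1 + y2 - y3
    den = dx1 * dy2 - dx2 * dy1
    g = (sx * dy2 - dx2 * sy) / (0.8 * den)
    h = (dx1 * sy - sx * dy1) / den
    a, d = x1 - x0 + g * x1, y1 - y0 + g * y1
    b, e = x3 - x0 + h * x3, y3 - y0 + h * y3
    c, f = x0, y0

    def to_image(u, v):
        u, v = u / 10.0, v / 10.0      # normalize code coords to unit range
        w = g * u + h * v + 1
        return (a * u + b * v + c) / w, (d * u + e * v + f) / w

    return to_image
```

The returned closure reproduces the four reference correspondences exactly, which provides a convenient self-check of the solve.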
For applications that select items visible in the image, the inverse of the above mapping is important: given pixel coordinates, it yields the corresponding coordinates in the coordinate system spanned by the code. The image coordinates (x, y) are mapped to normalized code coordinates (u', v'), with u = 10·u' and v = 10·v', according to

u' = (A·x + B·y + C) / (G·x + H·y + I)
v' = (D·x + E·y + F) / (G·x + H·y + I)

where

A = e − f·h   D = f·g − d   G = d·h − e·g
B = c·h − b   E = a − c·g   H = b·g − a·h
C = b·f − c·e   F = c·d − a·f   I = a·e − b·d
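A sketch of the inverse mapping using the coefficients A through I, which form the adjugate of the forward transform (helper name illustrative):

```python
def image_to_code(coeffs, x, y):
    """Inverse projective mapping from an image pixel (x, y) back to code
    coordinates (u, v).  coeffs = (a, b, c, d, e, f, g, h) are the
    parameters of the forward code-to-image mapping."""
    a, b, c, d, e, f, g, h = coeffs
    A, B, C = e - f * h, c * h - b, b * f - c * e
    D, E, F = f * g - d, a - c * g, c * d - a * f
    G, H, I = d * h - e * g, b * g - a * h, a * e - b * d
    w = G * x + H * y + I
    u = (A * x + B * y + C) / w
    v = (D * x + E * y + F) / w
    return 10 * u, 10 * v              # back to the 0..10 code range
```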
Next, in step 305, the algorithm calculates the rotation angle of the visual code in the image. It takes a value between 0° and 359°; a code having the same orientation as the image has a rotation angle of 0°. The rotation is determined by mapping the points (0, 0) and (100, 0) from the code coordinate system to the image coordinate system, resulting in image points (a_x, a_y) and (b_x, b_y). The rotation angle is then determined from the direction of the vector (b_x − a_x, b_y − a_y). However, several special cases have to be considered.
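One way to fold the special cases (vertical baseline, the four quadrants) into a single expression is the two-argument arctangent; this is a sketch, not necessarily the patent's exact formulation:

```python
import math

def rotation_angle(ax, ay, bx, by):
    """Rotation of the code in the image, from the image positions
    (ax, ay) and (bx, by) of two points on the code's horizontal axis.
    atan2 handles vertical baselines and all quadrants; 0 means the
    code has the same orientation as the image."""
    angle = math.degrees(math.atan2(by - ay, bx - ax))
    return angle % 360        # fold into the range 0..359.999
```

Note that with the image y-axis pointing down, the sign convention of the returned angle depends on the device's coordinate conventions.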
After calculating the rotation angle of the visual code, the algorithm next determines the horizontal and vertical tilt in step 307. The term "tilting" denotes the amount by which the image plane is tilted with respect to the code plane. "Horizontal tilt" is the amount by which the image plane is tilted with respect to the horizontal axis of the visual code; similarly, "vertical tilt" is the amount by which the image plane is tilted with respect to the vertical axis of the visual code. A tilt value of 1 indicates no tilt, a value less than 1 indicates tilt to the left/top, and a value greater than 1 indicates tilt to the right/bottom. The tilt parameters are calculated as follows: four image points at a constant distance h (the image height) from the image center point, along the axes of the code coordinate system, are mapped to the corresponding code coordinates, and their distances from the center point are calculated. The ratios of these distances determine the tilt parameters t_x and t_y. They are independent of the size of the code in the image. If a constant r, which depends on the camera parameters, is known, the tilt angles t_x^α and t_y^α can be determined from these ratios; r can be obtained experimentally.

i = image coordinates of the image center point
c = code coordinates(i)
x = image coordinates(c + (1, 0)) − i
y = image coordinates(c + (0, 1)) − i
u = x/|x|
v = y/|y|
l = |code coordinates(i − h·u) − c|
r = |code coordinates(i + h·u) − c|
t = |code coordinates(i − h·v) − c|
b = |code coordinates(i + h·v) − c|
t_x = l/r
t_y = t/b
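The tilt computation can be sketched directly from the pseudocode above, given the two mapping functions (a sketch; function names are illustrative):

```python
def tilt_parameters(to_code, to_image, cx, cy, h):
    """Tilt ratios t_x, t_y: sample four image points at distance h from
    the image center along the code's axes, map them to code coordinates,
    and compare their distances to the code coordinates of the center.
    to_code(x, y) -> (u, v) and to_image(u, v) -> (x, y)."""
    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

    c = to_code(cx, cy)                 # code coordinates of the image center
    # image-space directions of the code's u and v axes at the center
    px, py = to_image(c[0] + 1, c[1]), to_image(c[0], c[1] + 1)
    x, y = (px[0] - cx, px[1] - cy), (py[0] - cx, py[1] - cy)
    nx, ny = dist(x, (0, 0)), dist(y, (0, 0))
    u, v = (x[0] / nx, x[1] / nx), (y[0] / ny, y[1] / ny)
    l = dist(to_code(cx - h * u[0], cy - h * u[1]), c)
    r = dist(to_code(cx + h * u[0], cy + h * u[1]), c)
    t = dist(to_code(cx - h * v[0], cy - h * v[1]), c)
    b = dist(to_code(cx + h * v[0], cy + h * v[1]), c)
    return l / r, t / b                 # (t_x, t_y)
```

For an untilted, axis-aligned code both ratios are exactly 1.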
Next, in step S308, the algorithm calculates the distance of the code from the camera. If the true code size s_real (the distance between the centers of the upper-left and upper-right corner stones of the printed code) and the focal length f of the camera are known, the pinhole model allows the metric distance from the camera to the untilted visual code to be calculated from s_image (the pixel distance between the corner stone centers in the camera image) and w_image (the pixel width of the image). Since in general s_real and f are unknown, and the code distance is used for interaction purposes rather than as an exact measurement, the distance is instead defined in terms of the size of the visual code in the image. For the farthest distance at which the code can still be recognized in viewfinder mode, we set d_camera,code = 100; for conventional devices, this is the case when s_image = 25 pixels, i.e., 15.625% of an image width of 160 pixels. If s_real and f are known, the metric distance can still be calculated from d_camera,code.
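A sketch of the normalized distance measure, assuming the calibration point stated above (s_image = 25 pixels at w_image = 160 corresponds to distance 100; the exact normalization constant is an assumption):

```python
def code_distance(s_image, w_image):
    """Relative camera-to-code distance: inversely proportional to the
    pixel size of the code, scaled so that corner stones 25 pixels apart
    in a 160-pixel-wide image give a distance of 100."""
    return 100.0 * (25.0 / s_image) * (w_image / 160.0)
```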
It should be apparent to those skilled in the art that steps 303, 305, 307, 308, and 309 may be performed in any order, as each step is separate and outputs different parameters or results.
Once the positions of the guide bars 103 and 105 and the corner stones 107, 109, and 111 are identified and the appropriate projective mapping has been calculated (from code coordinates to image coordinates), the algorithm reads the code bits in step 309 by simply testing the appropriate pixel of the black-and-white image for each code coordinate (u, v), u, v ∈ {0, ..., 10}, with (x, y) = image coordinates((u, v)).
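Bit reading can be sketched as sampling the thresholded image at each cell of the 11×11 code grid; the exact set of cells reserved for the corner stones and guide bars is an assumption made for illustration and does not reproduce the patent's 83-bit layout:

```python
def read_code_bits(img, to_image):
    """Sample the black-and-white image at each cell of the 11x11 grid,
    skipping positions assumed to be occupied by the fixed code features.
    to_image maps code coords to pixel coords; img[y][x] == 0 is black."""
    reserved = {(0, 0), (10, 0), (0, 10),                  # corner stones
                (8, 10),                                   # small guide bar
                (10, 3), (10, 4), (10, 5), (10, 6), (10, 7)}  # large guide bar
    bits = []
    for v in range(11):
        for u in range(11):
            if (u, v) in reserved:
                continue
            x, y = to_image(u, v)
            bits.append(1 if img[int(round(y))][int(round(x))] == 0 else 0)
    return bits
```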
Optionally, in step S311, error detection may be performed. The (83, 76, 3) linear code used encodes a 76-bit ID into an 83-bit codeword. The resulting minimum Hamming distance of three should be sufficient to detect erroneous codes with high probability. Alternatively, other error detection or error correction codes, such as Reed-Solomon codes, may be used to improve reliability and robustness against bit errors, at the expense of somewhat higher processing time and a smaller number of bits actually available for data storage. For example, 15 of the original 83 bits may be used for a Reed-Solomon error correction code, leaving 68 bits for data storage.
Application of algorithms
A. Determination of the angle of rotation and the inclination
In addition to the code coordinates of the image center, the recognition algorithm provides the orientation of the mobile device relative to the code. Specifically, the algorithm provides the rotation of the code in the image, the viewing position (from left, right, top, or bottom), the amount by which the image plane is tilted relative to the code plane, and the distance of the code from the camera. The projective mapping (planar homography) between the image plane and the code plane computed by the recognition algorithm allows any given point in image coordinates to be mapped onto the corresponding point in code coordinates, and vice versa.
B. Item selection
The code coordinate system and orientation determination features enable a large number of interesting applications involving the selection of items on printed documents. One example is selecting an article displayed on a printed mail order catalog. A single code suffices as long as the code coordinates of the item are known. This may be accomplished by content creation software that generates an "image map" of the code and its associated sensitive areas. Thus, an article on the mail order catalog may be scanned, stored in the mobile device, and current information about the article may be retrieved. The order may then be placed using a wireless communication channel provided by the mobile device.
For example, FIG. 4 shows an example of a printed document containing a visual code. The areas on the map and the entries in the table are individually mapped to different URLs and are thereby associated with online content.
In newspapers, visual codes can link articles, advertisements, or quickly outdated information such as weather forecasts or stock quotes to online background information. By using the code coordinate system, a single code located near several items of interest suffices. By calculating the vertical and horizontal distances from the visual code elements, the exact target area, such as a local weather forecast, can be determined through the projective mapping. In the same way, the target row on a table of stock quotes can be determined, and thus the required information obtained. FIG. 4 shows a cut-out from a newspaper page containing a map with current weather data and a table containing snow depths for various regions. The dotted lines drawn on the newspaper page represent sensitive areas that are each linked to online content. Such a layout may be created with suitable content creation software. As a prototype, a drawing tool was developed that allows these regions to be drawn onto an image and an associated URL to be specified for each region. The tool calculates the coordinates of these regions in the coordinate system of the code present in the image and stores this data as an XML file. Multiple URLs may be assigned to each region by taking into account other parameters provided by the code system, such as the degree of rotation and the degree of tilt. As shown in FIG. 4, a single code is sufficient to select the map regions and table entries, respectively. By rotating the mobile device, different aspects of the presented information can be selected: in the example, the vertical orientation shows the snow depth of the selected area, while a slight rotation shows the current temperature.
As another example, visual codes may be used in mobile marketing to automatically enter a coupon number into a mobile device. One example is the SMS campaign of a beverage producer: the beverage bottle carries a number (represented by a visual code) that can be sent in as an SMS to participate in a lottery drawing. On TV, visual codes can be used to interactively enter answers to multiple-choice questions in quizzes and surveys, or to select candidates in a TV show. The code coordinates and the orientation of the device may serve as additional input parameters (in addition to the code value), for example to answer a multiple-choice question.
An additional possible application involves simplifying interaction with the mobile device itself. For example, a telephone number may be encoded as a visual code and printed on a business card or in a phone book. Depending on the orientation of the code in the image, the device either immediately calls the encoded telephone number, stores the virtual business card in the device's memory, or (for mobile phone numbers) launches the device's SMS editing application. Parts of the device's user interface, such as deeply nested menus, may be externalized by associating them with visual codes and printing them on paper or displaying them on a large screen.
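The orientation-dependent behavior described above can be sketched as a simple dispatch on the rotation angle of the code in the image. The angle thresholds and action names below are assumptions for illustration, not taken from the patent.

```python
# Illustrative sketch: choosing a device action from the code orientation.
# Thresholds and action names are assumptions, not from the patent.

def action_for_orientation(angle_deg, is_mobile_number):
    """Pick the device action based on the rotation of the code in the image."""
    angle = angle_deg % 360
    if angle < 30 or angle > 330:
        return "call"            # roughly upright: dial immediately
    if 60 <= angle <= 120:
        return "store_vcard"     # quarter turn: save the business card
    if is_mobile_number and 240 <= angle <= 300:
        return "compose_sms"     # opposite quarter turn: open SMS editor
    return "show_menu"           # ambiguous orientation: ask the user
```

A real implementation would feed this the rotation angle produced by the recognition algorithm and fall back to a menu whenever the orientation is ambiguous.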
Another possible application of the present invention is to add online content and operations to printed documents, such as advertising leaflets, magazines, etc. An example online operation is the sale of event tickets, theater tickets, or movie tickets via a mobile phone.
Fig. 5 shows a page of a printed television magazine provided with embedded visual codes. By aiming the camera at a printed visual code, the user can display a plot summary, view the cast, read or submit movie ratings, launch a movie trailer, or order tickets directly via the mobile phone.
C. Form input
Visual codes can also be combined with printed forms to simplify form entry, with the mobile device providing a wireless communication channel for conveniently and inexpensively sending back the entered data. The fields of such forms may be check boxes, sliders, and canvas fields for free-form input of text or drawings. Using the frame around the form as a marker, the individual form elements can be precisely located. The projection mapping of the recognition algorithm allows parts of the image, as well as the entire image, to be "unwarped".
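The projection mapping used for this unwarping can be sketched as a standard four-point homography estimate (the direct linear transform), computed from the located corners of the form frame. This is a minimal numpy illustration of the underlying mathematics, not the patent's actual recognition algorithm.

```python
# Minimal sketch of estimating a projective mapping ("homography") from
# four point correspondences, e.g. the located corners of the form frame.
import numpy as np

def homography(src, dst):
    """Estimate the 3x3 projective mapping taking src[i] -> dst[i] (4 pairs)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # The homography is the null vector of A (smallest singular vector).
    _, _, vt = np.linalg.svd(np.array(A, dtype=float))
    return vt[-1].reshape(3, 3)

def project(H, x, y):
    """Apply the projective mapping to a single point."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w
```

Applying the inverse mapping to every pixel of a form element resamples it into an undistorted, axis-aligned image, which is what "unwarping" amounts to.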
FIG. 6 illustrates an example form for entering calendar events. The recognition algorithm first recognizes the code and locates the edges of the form frame. The check boxes located in the form are then scanned for check marks. Finally, the canvas field is "unwarped", i.e., its perspective distortion is removed, as shown in the lower part of Fig. 6, and stored.
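Scanning a located check box can be sketched as counting dark pixels inside its (already unwarped) bounding box in the binarized image. The fill-ratio threshold below is an assumption; the patent does not state one.

```python
# Minimal sketch of check-box scanning on a binarized image
# (0 = black pixel, 1 = white pixel), assuming the form's projective
# distortion has already been removed. The 15% fill threshold is an
# assumption, not taken from the patent.

def is_checked(image, x0, y0, x1, y1, min_fill=0.15):
    """Report whether the box region contains enough dark pixels to count as marked."""
    total = dark = 0
    for row in image[y0:y1]:
        for pixel in row[x0:x1]:
            total += 1
            dark += (pixel == 0)
    return total > 0 and dark / total >= min_fill
```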
To define a form, a markup language is used that describes the structure of the form, including the locations and kinds of the input elements it contains. The code is first used as a key to retrieve the form description from an online server. After interpreting the retrieved markup, the user input can be classified according to field type and sent back to the server.
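The patent does not specify the markup language, but a form description of this kind might look like the following hypothetical XML, parsed here into the field kinds and positions the recognition step needs. All element and attribute names are invented for illustration.

```python
# Hypothetical form-description markup and parser; the element and
# attribute names are invented -- the patent does not specify them.
import xml.etree.ElementTree as ET

FORM_XML = """
<form key="calendar-event">
  <checkbox name="all_day"  x="2" y="3" w="1" h="1"/>
  <checkbox name="reminder" x="2" y="5" w="1" h="1"/>
  <canvas   name="notes"    x="1" y="7" w="8" h="4"/>
</form>
"""

def parse_form(xml_text):
    """Return (form key, list of input elements with their kind and position)."""
    root = ET.fromstring(xml_text)
    fields = [
        {"kind": el.tag, "name": el.get("name"),
         "box": tuple(int(el.get(k)) for k in ("x", "y", "w", "h"))}
        for el in root
    ]
    return root.get("key"), fields
```

The decoded code value would serve as the `key` used to fetch such a description from the server before classifying and returning the user's input.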

Claims (21)

1. A method for decoding a visual code, comprising:
capturing a visual code image containing at least one visual code using a mobile device equipped with a digital camera;
performing grayscale conversion on the visual code image;
selecting a threshold value for the visual code image;
identifying a region of interest in the visual code image;
calculating a direction and shape of the identified region of interest within the visual code;
locating the at least one visual code within the visual code image;
calculating a first projection mapping of the at least one visual code from a visual code image coordinate system to a visual code coordinate system;
calculating a second projection mapping of the at least one visual code from the visual code coordinate system to the visual code image coordinate system;
calculating a rotation angle of the at least one visual code positioned in the visual code image;
calculating a horizontal tilt value and a vertical tilt value of the at least one visual code;
calculating a distance of the at least one visual code from the mobile device;
reading coded bits in the at least one visual code to obtain at least one decoded value;
retrieving online content based on at least one of the at least one decoded value, the first projection mapping, the second projection mapping, the rotation angle, the horizontal tilt value, the vertical tilt value, and the distance; and
displaying the retrieved online content on the mobile device.
2. A method for decoding a visual code according to claim 1, further comprising:
correcting the visual code image for radial lens distortion prior to grayscale conversion of the visual code image.
3. A method for decoding a visual code according to claim 1, further comprising:
error detection is performed on the at least one visual code.
4. A method for decoding a visual code according to claim 1, wherein said at least one visual code comprises three corner stones, one small guide bar and one large guide bar.
5. A method for decoding a visual code according to claim 4, wherein an individual visual code is located by identifying said corner stones, said small guide bar and said large guide bar.
6. A method for decoding a visual code according to claim 1, wherein said at least one visual code comprises three corner stones, a small guide bar and a large guide bar, and wherein directions of said small guide bar and said large guide bar are used when calculating a rotation angle of said at least one visual code, calculating a distance of said at least one visual code, and calculating a tilt value of said at least one visual code.
7. A method for decoding a visual code according to claim 1, wherein grayscale conversion of the visual code image is performed by calculating the value of each grayscale pixel as the average of the values of the red and green pixels.
8. A method for decoding a visual code according to claim 7, wherein said grayscale visual code image is converted to a black-and-white image using a weighted moving average algorithm.
9. A method for decoding a visual code according to claim 4, wherein said small guide bar and said large guide bar are identified by locating a region in said visual code image having an axial ratio within a predetermined range.
10. A method for decoding a visual code according to claim 4, wherein said three corner stones are positioned at the positions of (0, 0), (10, 0) and (0, 10), and wherein the center of said large guide bar is positioned at the (8, 10) position in the visual code coordinate system.
11. A method for decoding a visual code according to claim 1, wherein an option is selected from a menu using at least one from the group consisting of a rotation angle, a horizontal tilt value, a vertical tilt value and a distance of said at least one visual code.
12. A method for decoding a visual code according to claim 4, wherein a plurality of visual codes is arranged in a grid to store data across the plurality of visual codes, the method further comprising identifying each of the visual codes in the grid.
13. A method for decoding a visual code according to claim 1, wherein said decoding is performed in real time.
14. A method for decoding a visual code according to claim 1, further comprising selecting an entry from a form by changing the orientation of said mobile device relative to said at least one visual code.
15. A method for decoding a visual code according to claim 1, wherein the retrieved online content is at least one of a storyline, a video clip, an audio file, and a movie review.
16. A method for decoding a visual code according to claim 1, wherein said at least one visual code is printed using infrared ink, and wherein said digital camera photographs said at least one visual code as at least one infrared visual code.
17. A method for decoding a visual code according to claim 1, wherein a target point of the image displayed on the screen of the camera, expressed in the visual code coordinate system, is used as a parameter for selecting an entry from a database.
18. A method for decoding a visual code according to claim 13, further comprising overlaying a drawing on an image displayed on a screen of a camera, wherein said overlaid drawing is geometrically aligned with an object in said image when said mobile device is moved and the orientation of said at least one visual code is changed.
19. A method for decoding a visual code according to claim 1, wherein aiming said digital camera at said at least one visual code is aided by a cross-hair displayed on a screen of said mobile device.
20. A method for decoding a visual code according to claim 19, wherein said screen of said mobile device is a touch screen, and wherein the position of said cross-hair is specified by designating a target point using a stylus.
21. A method for decoding a visual code according to claim 3, wherein said error detection is performed using Reed-Solomon error detection.
HK08103155.7A 2005-04-13 2005-05-28 Visual code system for camera-equipped mobile devices and applications thereof HK1113950B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US10517005A 2005-04-13 2005-04-13
US11/105,170 2005-04-13
PCT/US2005/019178 WO2006112866A2 (en) 2005-04-13 2005-05-28 Visual code system for camera-equipped mobile devices and applications thereof

Publications (2)

Publication Number Publication Date
HK1113950A1 HK1113950A1 (en) 2008-10-17
HK1113950B true HK1113950B (en) 2013-04-19

