INDICATING POSITIONAL INFORMATION IN A VIDEO SEQUENCE
The present invention relates to a method and a system in connection with television and video production for registering and storing the positions of one or more natural objects and subsequently employing these stored positions in order to generate synthetic or composite video sequences where such positions are represented alphanumerically or graphically.
In broadcasts where there is a desire to compare events which take place at different times as if they were occurring simultaneously, such as, for example, in various sports arrangements, the traditional method has involved comparing aggregate times or showing several small pictures within one television picture. Attempts have also been made recently to generate a composite picture where a section from one sequence is inserted into another. This generally involves comparing two competitors who have started at different times, but whose passing of a reference point is to be shown as if the runners had started simultaneously. Other examples are sports such as Alpine skiing events, where it is required to show how one competitor is placed relative to a previous competitor, or events such as the high jump, pole vaulting or ski jumping, where it is desirable to show a visual comparison between two or more competitors.
EP-A-0 252 215 discloses a method for showing two subsequent events in display surfaces located beside each other. The method is particularly intended for displaying sports events such as, e.g., skiing and skating competitions and jumping events. The method states in particular that the events which are displayed simultaneously are synchronised in time or displayed in parallel in space.
Recently, timing systems have been presented where this method is further developed by combining a section of a stored video sequence with a current video sequence instead of showing it in parallel with it. Methods and systems for composite pictures or video sequences are known from, amongst others, US patent 4,602,286 and US patent 5,099,331.
US patent 5,264,933 discloses a method for altering video images by inserting pictures, text or the like in a manner which makes it seem as if the inserted picture is a part of the original picture independently of panning and
zooming of the camera. The method makes it possible to alter advertising panels in a sports arena in order to adapt them to suit a given public.
These previously known techniques allow video pictures to be manipulated in various ways in order to increase or alter the amount of information contained in the picture. What is described with regard to parallel display of events taking place at different times, however, is limited by the fact that it is necessary to have video sequences which must be, or at least must be capable of being made almost identical with respect to camera angles, panning and zooming. The indicated method where additional information can be adapted to such conditions is limited to the ability to display additional information which is generated in advance, and not information representing events which are taking place, for example, while a sports arrangement is unfolding.
The present invention, however, is based on the realisation that on the basis of knowledge of the position of an object which is to be incorporated in a video image, one is no longer dependent on the object in question being filmed with corresponding camera angles, panning and zooming. Instead, a synthetic object can be generated which is placed inside the video image concerned. The positioning of the object in the picture can be calculated by means of the position of the natural object which the synthetic object is to represent, in relation to the camera's position and in relation to camera angle and zoom. Thus it is the object of the invention to provide a method and a system for inserting in a video sequence a representation of one or more objects based on stored information concerning these objects' positions.
For the record it should be pointed out that the terms video and video production in this context refer both to live broadcasts for television and other productions for television and video.
It has further been realised that by means of this knowledge of positions it is also possible to calculate distances in time or space, and these distances can also be displayed alphanumerically or graphically on a screen instead of or in addition to a representation of the respective objects.
From FR-B-2.726.370 a method is known for finding the positions of the players and the ball on a football pitch. Both players and ball are equipped with transmitters, and receivers are located around the pitch for determining said positions. The registered positions are then used for registering errors
such as offside, and for analysing the game. Similar methods are described in WO 93/01867, WO 95/10337 and WO 95/08816. None of these publications indicates any registration and use of position data in such a manner that events which take place simultaneously can be compared or used for generating a composite or synthetic video sequence.
In the present invention the position is determined of all the objects which will subsequently be represented in a video sequence. These objects may, for example, be competitors in a sports arrangement, or other objects whose position it will be possible to use for subsequent generation of a display in a video sequence. In the following description the term natural object will be employed for such objects and the direct representation thereof in video sequences, while the term synthetic object will be employed for an object which is displayed in a video sequence and which represents in a suitable manner registered position data for the natural objects.
In a first embodiment all the objects are equipped with a transmitter. The transmitters may be active radio transmitters based on battery operation, or they may be active or passive transceivers. In a preferred embodiment transponders without batteries are employed, preferably implemented by means of acoustic surface wave technology (SAW technology).
The number of position detectors which are necessary for determining the position of the objects which are equipped with such transmitters or transponders is dependent on the nature of the area in which the objects are located. If the area is substantially flat and relatively limited, two detectors should suffice in order to achieve an unambiguous position determination. If, however, the area is three-dimensional, i.e. if there are three coordinates x,y,z which are to be found, or the extent of the area is relatively large in relation to the detectors' positions, it may be necessary to provide up to four position detectors which are not located in the same plane in order to find the position of a given natural object. A restriction in the number of position detectors to fewer than is strictly necessary for achieving an unambiguous mathematical solution when calculating the object's position assumes that alternative solutions can be ruled out since they are not meaningful, for example because they are located under the ground, outside the arena or the like. Additional position detectors will be necessary if the extent of the area is such that not every position in the area is within the range of all the
position detectors, or the area is divided into several subareas, such as, for example, several passing points along a ski run.
In an alternative embodiment of the invention the objects' positions are calculated on the basis of their position in a video image when this video image is filmed by a camera with a known position, known camera angle (tilt and panning) and known zoom angle (coverage angle). In this alternative embodiment the distance of the objects from the camera cannot be determined, but only the direction from the camera to the object, and the stored position, or the direction, can therefore only be used for generating display of a synthetic object representing the object, in video sequences filmed by the same camera. If, however, the objects are filmed with at least two cameras, thus making it possible to determine a direction from these cameras' positions to the objects' positions, it will also be possible by means of this embodiment to calculate the objects' actual positions. These embodiments, however, require the respective objects to be capable of being identified in a video image by means of image processing and, for example, pattern recognition, but they do not require the deployment of separate position detectors.
On the basis of the calculations mentioned above, according to the invention sequences of data are registered representing the positions of one or more natural objects over a given period of time. On the basis of the relations between these stored positions, the position of the camera which has filmed the video sequence in which the synthetic object is to be inserted, and this camera's direction and use of zoom, it is possible to calculate the synthetic object's position in the video image. In this description the necessary calculations will be explained by means of vectors, but it should be understood that the use of equivalent mathematical methods such as, for example, matrices is also covered by the invention.
On the basis of the stored positions, it will also be possible to calculate other values than positions in a display surface. These values may be represented alphanumerically or graphically in a video sequence.
Since the stored positions represent samples of positions, preferably at fixed time intervals, and are often connected to a timer system, a time reference will exist linked to each individual stored position. This means that it is
possible to find an absolute or relative point of time for each individual stored position, or conversely it is possible to find a stored position for any point of time within the period for which an object's position is stored, with the accuracy permitted by the sampling intervals. In the same way a point of time will naturally be defined for all natural objects which are represented in real time.
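The look-up described above, i.e. finding a stored position for any point of time within the sampling period, can be sketched as a simple interpolation between adjacent samples. The following Python sketch is purely illustrative; the function and data-structure names are assumptions and not part of the disclosed system:

```python
from bisect import bisect_left

def position_at(samples, t):
    """Estimate an object's position at an arbitrary time t by
    linear interpolation of stored (time, (x, y, z)) samples.
    `samples` is assumed to be sorted by time."""
    times = [s[0] for s in samples]
    i = bisect_left(times, t)
    if i == 0:
        return samples[0][1]          # before the first sample
    if i == len(samples):
        return samples[-1][1]         # after the last sample
    t0, p0 = samples[i - 1]
    t1, p1 = samples[i]
    f = (t - t0) / (t1 - t0)          # fraction of the sampling interval
    return tuple(a + f * (b - a) for a, b in zip(p0, p1))
```

The accuracy of such an estimate is, as noted above, limited by the sampling intervals.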
The invention will now be described in more detail in the form of embodiments, with reference to the attached drawings, in which
fig. 1 is a principle drawing illustrating the layout of a system according to the invention;
figs. 2a-d illustrate vector representation of positions relative to a display surface;
fig. 3 illustrates a video camera which can be employed when implementing the invention;
fig. 4 illustrates a transponder for use in position finding in a possible embodiment of the invention;
fig. 5 illustrates an alternative method for position finding according to the present invention;
fig. 6 illustrates examples of video images generated by means of the present invention.
Figure 1 is a principle drawing of a system according to the invention. In this example three detectors 1, 2, 3 are located around an arena 5. Use is also made of two cameras 7, 8 which are located along one longitudinal side of the arena. In the arena there is located a natural object 10 whose position is to be registered over a given period. These registered positions can subsequently be used to generate synthetic objects representing the position of said natural object in video sequences which are filmed by one of the two cameras 7, 8. The natural object may be an athlete, a football or any other object. The three detectors which are employed in this example are capable of detecting the distance to the object 10. The registered distance is transmitted via a data bus 11, or by means of another known per se form of data transmission, to a device which by means of the registered distances and
the respective detectors' known positions, calculates the object's position in the form of coordinates x1, y1, z1 in a defined coordinate system. It will be convenient to define an orthogonal coordinate system with the result that the arena is located substantially in the x,y plane, while the z-axis is perpendicular to this plane.
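The calculation of the object's coordinates from the detectors' known positions and the measured distances is a classical trilateration problem. The following is a minimal illustrative sketch, not the disclosed implementation; it linearises the sphere equations by subtracting the first one and solves the result in a least-squares sense:

```python
import numpy as np

def trilaterate(detectors, distances):
    """Estimate the object position from three or more detector
    positions and the distances measured to the object.
    Subtracting the first sphere equation |x - p_i|^2 = d_i^2
    yields a linear system in x, solved by least squares."""
    p = np.asarray(detectors, dtype=float)
    d = np.asarray(distances, dtype=float)
    A = 2.0 * (p[1:] - p[0])
    b = (d[0] ** 2 - d[1:] ** 2) + np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

With only two or three detectors the system is underdetermined in three dimensions, which corresponds to the ambiguous solutions discussed above that must be ruled out on physical grounds.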
As with the detectors 1, 2, 3, the cameras 7, 8 will also have positions which are unambiguously defined in said coordinate system. It will thus be possible to describe the position of an object in the arena relative to one of the cameras in the form of a vector OV1, OV2 with length corresponding to the distance between camera and object and direction from the camera to the object. These vectors will be referred to as object vectors. Furthermore, the cameras will be equipped with devices for registering camera angles and zoom angles. These too can be expressed as vectors. These vectors may have fixed or arbitrary length, and both will begin at the point which is defined as the camera's position. The camera vector KV1, KV2 represents the camera's direction and will have a direction perpendicularly into the picture plane, while the zoom vector ZV1, ZV2 is defined so as to define the outer edge of the video picture. The camera vector's length may express the degree of zoom. Thus the zoom vector may, for example, be expressed as the sum of the camera vector and a vector which is perpendicular to the camera vector.
By means of camera vector, zoom vector and object vector it will be possible to establish whether a given position in the coordinate system lies inside or outside the video picture from a camera, and possibly where in the video picture said position will be.
Figure 2 illustrates these relationships in further detail. Since the video picture will normally be rectangular, a single zoom vector will not define the entire picture's outer edge. In order to be able to determine whether the object vector OV is located between the zoom vector ZV and the camera vector KV, it is therefore necessary to find a zoom vector ZV which is located in the same plane as object vector OV and camera vector KV. The plane in question will be known since both the object vector OV and the camera vector KV are known, and it will then be possible to determine the zoom vector ZV on the basis of the camera's characteristics. Figure 2a illustrates these vectors in relation to a picture field I, while figure 2b illustrates the same vectors projected down on to the xy plane. Figures 2c and
2d illustrate a corresponding representation of a situation where the object vector OV is not located between the camera vector KV and the zoom vector ZV.
If the object vector OV is located between the camera vector KV and the zoom vector ZV, as is the case in figures 2a and 2b, the position of the object concerned in the picture field I will be at a point on the straight line from the centre of the picture field to the picture's outer edge where it is intersected by the zoom vector, and the position on this line can be found by means of the angles between the respective vectors. If the zoom vector ZV is located between the camera vector KV and the object vector, as illustrated in figures 2c and 2d, however, the object's position will be located outside the picture field I. However, it will still be possible to calculate a position in the picture plane, and it will be possible to employ this position to find a direction out of the picture which can form the basis for an indication of the object's position.
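The determination of where the object lies along the line from the picture centre towards the edge can be sketched from the angles between the vectors. The sketch below assumes an idealised perspective (pinhole) camera, in which the projected offset scales with the tangent of the angle; the function names are illustrative only:

```python
import math

def angle(u, v):
    """Angle in radians between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))

def in_picture_fraction(kv, zv, ov):
    """Fraction of the distance from the picture centre to the outer
    edge at which the object appears, measured along the line towards
    the point where the zoom vector intersects the edge.  A value
    greater than 1.0 means the object vector lies outside the zoom
    vector, i.e. outside the picture field (cf. figs. 2c and 2d)."""
    return math.tan(angle(kv, ov)) / math.tan(angle(kv, zv))
```

In the situation of figures 2c and 2d the same calculation still yields a direction out of the picture, which can form the basis for an indication of the object's position.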
A more detailed description of this technique can be found in the applicant's Norwegian patent NO 303.310.
Figure 3 illustrates a video camera 7, 8 for use in implementing the invention. A schematic illustration is given here of how camera vector KV and zoom vector ZV are produced. The incident light will naturally be refracted so that it is focused on the video detector CCD. The zoom vector, on the other hand, will begin at a point which will be located in the optical plane 16 and which may be located in front of or behind this detector, depending on the use of zoom and focusing. In order to simplify the subsequent calculations, however, it is an acceptable approximation to assume that the zoom vector always begins at the same point, and that this point is common for zoom vector, camera vector and the camera's position. The camera 7, 8 will comprise a camera head 17 with angle scales for determining the camera's direction (panning and tilt), together with means for registering values for zoom and focusing (not illustrated). These registered values will be intercepted by a unit 20 which is adapted to transmit them via a data bus 11a or another suitable transmission medium, thus enabling them to be received and used by the production equipment in connection with the further steps in the invention. Similarly, the registered video signal will be transmitted from the video camera to the production equipment via a suitable transmission path 11b. These two transmission paths 11a, 11b can be realised
in a number of ways, and they can be physically separated or form the same physical connection. In a preferred embodiment it will be natural to separate video signals and data signals, at least on two logically separate channels, in which case it will be natural for all the registered data to be transmitted on a data network, while the video transmission is performed via a video network.
As already described, the object or objects whose positions are to be detected may be equipped with transmitters or with transponders. A transponder will preferably be used which utilises acoustic surface wave technology (SAW technology). A transponder of this kind is illustrated in figure 4. The transponder comprises a substrate 20 which, e.g., may be composed of a crystal such as lithium niobate which has a surface pattern of metal composed of transducers, reflectors, etc. A polling pulse from a position detector (1, 2, 3, fig. 1) is received via the antenna (not illustrated) and passed to a transducer 21, which is illustrated here in the form of a so-called interdigital transducer. The received electromagnetic energy in the polling pulse is converted in the transducer 21 into an acoustic surface wave which moves along the substrate's surface. At a certain distance from the transducer 21 there are placed reflectors 22. When the acoustic surface wave hits the reflectors, reflections are created which move back towards the transducer 21. The transducer will convert the reflected waves to electromagnetic pulses which form the response signal which is transmitted via the transponder's antenna. At the ends of the transponder, surface wave absorbers 23 may be provided to prevent undesirable reflections.
The number and location of the reflectors 22 will be able to ensure that each transponder transmits a unique return signal. Thus it will be possible, for example, to equip each of the participants in a sports arrangement with his/her unique transponder. When a detector receives a return signal after first having transmitted a polling pulse, it will be possible to determine the distance to the transponder which has returned the return signal from the time which elapses from when the polling pulse is transmitted until the reply signal is received, taking into account any delay in the transponder, while the transponder's identity can be established on the basis of the characteristics of the return signal.
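The distance calculation just described amounts to halving the round-trip propagation time, corrected for the transponder's internal acoustic delay. A minimal illustrative sketch, assuming radio propagation at the speed of light and a known, fixed transponder delay:

```python
C = 299_792_458.0  # propagation speed of the radio pulses, m/s

def transponder_distance(round_trip_s, transponder_delay_s):
    """Distance from detector to transponder, given the measured
    time from transmission of the polling pulse to reception of the
    reply, corrected for the transponder's internal delay.  The
    remaining time covers the distance twice (out and back)."""
    return C * (round_trip_s - transponder_delay_s) / 2.0
```

In practice the internal delay is determined by the distance between the transducer 21 and the reflectors 22 and is therefore known for each transponder type.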
In the event of approximately simultaneous detection of several response signals, in order to obtain unambiguous detection it may be expedient to use
special detection techniques, e.g. based on correlation between detected, coded response pulse sequences and prestored code sequences for each individual transponder. Another possibility is to use polling signals on different frequencies and corresponding frequency-tuned transponders. Finally, transponders may be employed with different delays, with the result that no transponders which are located in the area in question will produce return signals which collide in time.
Figure 5 illustrates an alternative method for determining the position of a natural object, where figure 5a illustrates a camera K1 which is filming a natural object 10, while figure 5b illustrates a video image I filmed by this camera. By means of this alternative, a direction vector RV is determined from the camera K1 to the natural object 10. In contrast to the object vector found in the embodiment of the invention described above, this direction vector will only have unit length (the vector's length therefore does not express the distance to the object). In the same way as described with reference to figure 3, the camera K1 will be equipped with sensors in the camera head 17 which detect with very high accuracy the direction in which the camera is pointing. This direction will indicate the camera vector KV. In the camera the degree of zoom employed is also registered. By means of image processing techniques, such as pattern recognition, the object is identified whose position is to be determined, and a position is established for this in the display surface. Based on how powerful a zoom is being employed and the direction of the camera vector KV, it is possible to determine a direction for the zoom vector ZV which goes from the camera's position through the point on the display surface's outer edge where it is intersected by a straight line from the picture surface's centre through the location of the natural object in the display surface I. The direction vector RV will thereby also be determined. This direction vector will define a straight line through the camera's position and the position of the natural object. The position of the natural object will thus not be unambiguously defined, but it will always be possible to calculate the position of this object in any picture filmed by the same camera at the same position, as long as camera vector and zoom use (coverage angle) are known for this camera.
The position can be found by finding the zoom vector ZV for the camera in the same way, such that this vector is located in the same plane as the direction vector RV and the camera vector KV. If the direction vector RV is located between the
zoom vector ZV and the camera vector KV, it will be possible to find the position in the picture by means of the angles between the respective vectors.
If two or more cameras are used to film the same object simultaneously, it should be possible correspondingly to find two or more direction vectors, as illustrated in figure 5c. Provided that measurements and calculations are performed with satisfactory accuracy, these vectors will define lines which intersect each other in the position of the natural object. Based on the positions of the respective cameras, it then becomes possible to register the natural object's position absolutely in a coordinate system. It will of course be possible to supplement these data with registered data for focusing of the respective cameras. A focusing registration of this kind, however, will be an extremely imprecise form of distance measurement.
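Recovering the natural object's absolute position from two or more direction vectors amounts to finding the point closest to the lines they define (exact intersection cannot be expected once measurement noise is present). The following is an illustrative least-squares sketch, not the disclosed implementation:

```python
import numpy as np

def triangulate(cam_positions, directions):
    """Point closest, in the least-squares sense, to a set of lines,
    each given by a camera position and a direction vector towards
    the object.  Each line contributes a projection onto the plane
    perpendicular to its direction; summing these constraints gives
    a 3x3 linear system for the object position."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for p, d in zip(np.asarray(cam_positions, float),
                    np.asarray(directions, float)):
        d = d / np.linalg.norm(d)
        M = np.eye(3) - np.outer(d, d)   # projector orthogonal to the line
        A += M
        b += M @ p
    return np.linalg.solve(A, b)
```

With parallel direction vectors the system becomes singular, which corresponds to the geometrically degenerate case of cameras looking along the same line.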
Independently of which method is used for registering the positions of the objects concerned, or the directions to them if only one camera and no distance detectors are employed, according to the invention these positions or directions will be stored in a register together with a time reference. These stored data will represent the positions of the respective objects, either in the form of coordinates in a coordinate system, or in the form of directions from a defined position, preferably the position of a camera. Referring again to figure 1, these positions will preferably be stored in a system 13 which comprises at least a memory for data representing these positions, and also a computing unit for performing the calculations described in this description together with video production equipment. By means of this equipment it will be possible to generate a synthetic picture 14 which can be mixed with or displayed alternately with a regular video picture, thus forming a composite video picture or video sequence 15. Various possible synthetic pictures and arrangements of this kind will be described below, with reference to figures 6a - 6e.
By means of the registered positions, it will be possible, for example, to settle doubtful situations in connection with sport and athletics, such as offside decisions in football, clarification of controversial cases in connection with goal scoring and the like. A system designed according to the invention will thereby be capable of solving such tasks in a manner which per se is previously known. Furthermore, the system will comprise equipment
which enables a synthetic representation of natural objects to be generated, based on the detected position of this object. In this context the term natural object will be employed to indicate an actual physical object or the display of such an object in a video sequence, while the term synthetic object will indicate a generated symbol indicating the position of a natural object when the natural object itself is not shown in the video sequence or is only shown in a manner which is difficult for an observer to detect.
The detected position of an object can be used for generating a synthetic object in a video sequence in real time, i.e. for indicating the natural object's position when it cannot be easily seen on a video or television screen. It may, for example, involve a skier who is hidden behind a cluster of trees, an Alpine skier whose position requires to be shown in a general view of an entire downhill run, or an ice hockey puck or golf ball which on account of its size and speed is difficult to follow with the eyes. Such a utilisation of the system according to the present invention will correspond to that which is stated in the applicant's previous Norwegian patent no. 303.310.
In addition, the present invention offers a number of new possibilities which are achieved by means of the stored data for the positions of the natural objects. By means thereof, for example, the position of an object at a given time can be indicated in a video sequence which was filmed at another time, as illustrated in figure 6a. This may be employed, for example, by inserting in a television picture, which shows an athlete 30 as he/she is passing an intermediate station, a synthetic object 31 showing the position of a second athlete who passed the same intermediate station at an earlier time. The synthetic object's 31 position in the picture surface is calculated as indicated above. It will then be natural to show the position of this second athlete when he/she had been in action for the same amount of time as the athlete who is in the process of passing the intermediate station. In other words, this means that if the two athletes are skiers who have started at exactly five minute intervals, it will be the position of the skier who started first as it was five minutes ago, which is shown in the form of a synthetic object 31 in the television picture. The synthetic object's position in the display surface is calculated, for example, as indicated in the description of figures 2a and 2b.
In the same way it will be possible to show the position of a skater from a previous pair as this position was after exactly the same length of time in the
course of the race as for the pair in question who are in action. The public will thereby be able to obtain a direct and visual comparison of how the race is developing for one skater compared to that of a skater who has already completed his race. If the position which is to be indicated by means of the synthetic object is located outside the display surface, it will be possible, as illustrated in figure 6b, to indicate the position by letting the synthetic object be an arrow 32 or other kind of cursor pointing out of the video picture towards the position concerned. The direction of this arrow is found by finding the straight line between the centre of the picture and the object's position in the picture plane, as indicated above under the description of figures 2c and 2d.
The actual synthetic object may assume any form whatever. The synthetic object will preferably not be a close copy of the natural object it represents, since this may make it difficult for viewers to distinguish between the representation of natural and synthetic objects on the television screen. On the contrary, in order to make the synthetic object easier to see, it will be desirable to employ known per se picture processing technology to analyse the colours in the part of the picture where the synthetic object is to be placed in order to ensure that the synthetic object has a colour which is in sharp contrast to the background.
We refer now to figure 6c. By means of the stored positions for the object represented by the synthetic object and the real-time registered position of the natural object 30 which is shown in the video sequence concerned and which is compared with the synthetic object 33, it will be possible to calculate the distance between these two positions. This distance may either be calculated as the straight line between the two points in the defined coordinate system representing the positions in question, or the distance can be calculated along a defined track between these points. This track will, for example, correspond to the course a skier must follow in order to cover the distance between the two positions. It will then be possible to display the calculated distance together with the synthetic object 33, either as part of the synthetic object, or in a separate area on the video or television screen where the video sequence concerned is shown.
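The distance along a defined track, as opposed to the straight line between the two points, can be sketched by representing the track as a polyline of waypoints and summing the segment lengths between the waypoints nearest to the two positions. This is a coarse illustrative sketch of the idea; the names and the polyline representation are assumptions:

```python
import math

def along_track_distance(track, p, q):
    """Approximate distance between positions p and q measured along
    a defined track, represented here as a list of waypoints: find
    the waypoint nearest each position and sum the lengths of the
    track segments between them."""
    def nearest(pos):
        return min(range(len(track)), key=lambda i: math.dist(track[i], pos))
    i, j = sorted((nearest(p), nearest(q)))
    return sum(math.dist(track[k], track[k + 1]) for k in range(i, j))
```

A finer estimate would project the positions onto the track segments themselves, but the waypoint approximation suffices to illustrate the principle.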
Where there is a well-defined and meaningful connection between position and time, as will be the case in various kinds of races such as ski races, Alpine events and speed skating, but not for events such as ski jumping, pole vaulting and ball games, it will also be possible to calculate a distance in time between the natural and the synthetic object. This distance can be defined in a number of ways. A result of having an absolute or relative time defined for each stored position is that an aggregate time can be calculated for a competitor for each such stored position. An alternative is therefore to calculate the difference between the individual aggregate time for the two objects at approximately the same position. In other words, the difference is calculated between the current value of the time for the competitor concerned (the natural object's aggregate time for the position it is in in real time) and the corresponding aggregate time for the competitor represented by the synthetic object when it was in approximately the same position. In the same way as for the distance in space, it will be possible to display the distance in time together with the synthetic object.
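The aggregate-time comparison at approximately the same position can be sketched as follows; the sketch uses the nearest stored position as the point of comparison, and all names are illustrative:

```python
def time_gap_at_position(stored, current_pos, current_time):
    """Difference between the current competitor's aggregate time and
    the stored competitor's aggregate time at approximately the same
    position.  `stored` is a list of (aggregate_time, (x, y, z))
    samples for the competitor represented by the synthetic object;
    a positive result means the current competitor is behind."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    nearest_time, _ = min(stored, key=lambda s: dist2(s[1], current_pos))
    return current_time - nearest_time
```

With dense sampling the nearest stored position will lie close to the current position, so the error introduced by the approximation is bounded by the sampling interval.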
If the position which is to be displayed in the form of a synthetic object is located outside the display surface for the video sequence concerned, it is even more appropriate to illustrate the distance between the two positions in the form of an indication of time and/or distance. Furthermore, it is possible to generate a graphic illustration of this distance, for example in the form of a line or a column. By this means it is also possible to generate a comparison of more than two objects, e.g. a plurality of competitors in a race. A possible example of such a comparison is illustrated in figure 6d, where on the basis of the position of a competitor 30 a bar chart 34 is generated illustrating the distance to other competitors. The distance may be defined as distance in time when they were in the same position, or the distance in space when they had a corresponding time. The illustrated distance may also be the actual distance at the relevant time back to the other competitors if real time positions exist for them, or possibly the distance back in time to when the competitor concerned, if he/she is leading the race, was in the respective positions in which the other competitors are at the moment.
Figure 6e illustrates an example of a graphic representation 35 of a development over time, for example of a long distance race in track and field or a cross-country skiing race, generated by means of the present invention. The distance between two or more competitors can be calculated as described
above for a desired number of positions or times in the race, in principle for each individual stored position, and this may be illustrated, for example in the form of curves. These curves may be normalised with regard to one competitor or a pre-issued schedule represented by a straight line, and the competitors' distance to this competitor or this schedule will be illustrated as curves located above or below this straight line. The example in figure 6e illustrates time differences at various positions in the race, but it will of course also be possible to show distance differences at various times during the race. Other information which can be calculated by means of the stored relationships between position and time will include speed, average speed over a given interval, change in own speed at different periods, for example early and late in a race, etc. In principle, the possibilities of compiling statistics are limited only by the producer's wishes and imagination. It will be possible to calculate all of these statistics and values very rapidly since the software necessary for performing the calculations will already exist in the production equipment. It will therefore be possible to display the results of such calculations while the sports arrangement is in progress, in principle in real time.
In conclusion, it is of course also possible to generate completely synthetic video sequences which are only based on the positions which are stored for the respective natural objects. This could be suitable after the conclusion of a race for showing the development of the race as it would appear if all the competitors started at the same time and not at given time intervals, or it may be desirable to generate synthetic pictures of how a situation would have appeared from a position where there is no camera. This is something which is often desirable, for example, in connection with analysing the scoring of a goal in football.