RU2526049C2

RU2526049C2 - Method and apparatus for detecting game incidents in field sports in video sequences

Info

Publication number: RU2526049C2
Application number: RU2012109119/07A
Authority: RU
Inventors: Kseniya Yuryevna Petrova; Sergey Mikhailovich Sedunov; Mikhail Nikolaevich Rychagov
Original assignee: Samsung Electronics Co., Ltd.
Priority date: 2012-03-12
Filing date: 2012-03-12
Publication date: 2014-08-20
Also published as: KR20130105320A; RU2012109119A; KR102014443B1; KR20130105270A

Abstract

FIELD: physics, video. SUBSTANCE: the invention relates to video processing techniques, specifically to automating the classification of video content by genre and content in real time. The result is achieved by: calculating the brightness and saturation of each frame pixel described by values in the red, green and blue colour channels; calculating the absolute value of the brightness gradient for each frame pixel; performing colour classification of each frame pixel; calculating statistics based on the results of colour classification for the whole frame; calculating statistics based on the results of colour classification for the green frame regions; determining whether the frame is a game incident in field sports based exclusively on characteristics of the current frame; determining whether the current frame belongs to the same scene as the previous frame; using the detection result obtained for the previous frame as the refined detection result if the current frame belongs to the same scene as the previous frame of the video sequence; or using the detection result obtained for the current frame, based exclusively on characteristics of the current frame, as the refined detection result if the current frame does not belong to the same scene as the previous frame of the video sequence. EFFECT: automatic adjustment of image settings. 10 cl, 29 dwg

Description

The present invention relates to video processing technologies, namely to automating the classification of video content by genre and content in real time in order to select optimal parameters for a display device.

Various approaches to automatically determining the genre of a video sequence are known from the prior art. Theoretical aspects of work in this area are set out in a number of publications, such as N. Watcharapinchai, S. Aramvith, S. Siddhichai, S. Marukatat, A discriminant approach to sports video classification, Proc. of ISCIT '07, International Symposium on Communications and Information Technologies, pp. 557-561, 2007 [1]; J. Park, S. Han, Y. An, Heuristic Features for Color Correlogram for Image Retrieval, Proc. of ICCSA '08, International Conference on Computational Sciences and Its Applications, pp. 9-13, 2008 [2]; G. Gomez, E. Morales, Automatic feature construction and a simple rule induction algorithm for skin detection, Proceedings of Workshop on Machine Learning in Computer Vision, pp. 31-38, 2002 [3].

Practical developments in automatic genre detection for video sequences are reflected in a number of patent documents.

US patent 7831112 [4] proposes a method for dividing a video stream into time segments according to the exclamations of the audience present at a sporting event. That invention relates to determining the genre of multimedia content and consists of a feature detector, which computes predefined features of the multimedia content, and a classification unit.

US patent application 20070113248 [5] describes a method and apparatus for determining the genre of multimedia content. The apparatus includes a feature detector and a genre detector, and it uses both features extracted from the audio stream alone and features extracted from the video stream.

The closest to the claimed invention is the solution known from US patent application 20080186413 [6], which describes a video processing pipeline in a television receiver that is configured according to the genre of the programme being watched. Its authors propose using different settings for contrast, detail enhancement level, spatio-temporal noise reduction, gamma correction, overdrive and backlight depending on the genre of the displayed video. The prototype solution considers the following genres as examples: sports, music, studio broadcasts, cinema films, television films and cartoons. The classification unit in the prototype [6] consists of a histogram generator, a maximum gray level detector, a middle gray level detector, a minimum gray level detector and an average brightness detector.

When choosing the best approach to automatic genre detection for video content, the following aspects of the problem must be taken into account.

If the genre is determined solely from features extracted from the audio stream, the result cannot be obtained with single-frame accuracy, which is an important requirement when a genre detection algorithm is used to control the settings of a display device.

When there is fast motion within the video frame, the histogram can change so sharply that neighboring frames of the same video sequence receive different classification results.

Video is usually subjected to various kinds of preprocessing, such as contrast enhancement, that can alter the histogram. As a result, the same frames of a video sequence can receive different classification results depending on the preprocessing applied. Video may also be compressed with different algorithms and different quality settings, which can likewise affect image characteristics significantly.

Genre detection methods based on dynamic features (such as inter-frame brightness variation) or on statistical analysis of low-level features within a scene can produce a classification result only after the last frame of that scene has been received. For adaptive control of display device settings, an algorithm of this type would introduce a large delay that also depends on the particular scene, which is unacceptable for use in a television receiver.

Thus, genre detection methods that use features extracted from the audio stream cannot be used in applications that manage collections of multimedia content in which audio and video data are stored separately.

Artificial intelligence methods require the algorithm to be retrained whenever new misclassified samples appear and are added to the training set.

The problem addressed by the claimed invention is to develop an improved method for detecting, in real time, episodes of a video sequence that show game sports, in order to adjust the image settings of a television receiver automatically. Such a method should be free of the main drawbacks of the solutions known from the prior art.

The technical result is achieved by developing a method for detecting game episodes in field sports in real time, which comprises the following operations:

- calculating the brightness and saturation of each pixel of the frame, the pixel being described by values in the red, green and blue color channels;

- calculating the absolute value of the brightness gradient for each pixel of the frame;

- classifying each pixel of the frame by color;

- calculating statistics from the results of the color classification for the entire frame;

- calculating statistics from the results of the color classification for the green areas of the frame;

- determining whether the frame is a game episode in field sports based solely on the characteristics of the current frame;

- determining whether the current frame belongs to the same scene as the previous frame;

- using, as the refined detection result, the detection result obtained for the previous frame if the current frame belongs to the same scene as the previous frame of the video sequence; or

- using, as the refined detection result, the detection result obtained for the current frame based solely on the characteristics of the current frame if the current frame does not belong to the same scene as the previous frame of the video sequence.
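The last three operations describe a simple temporal refinement. A minimal Python sketch of that logic follows. The function name `refine_detections` and its list-based interface are illustrative assumptions, as is the choice to propagate the already refined result of the previous frame (rather than its single-frame result), which keeps the output constant within a scene.

```python
from typing import Iterable, List


def refine_detections(per_frame: Iterable[bool],
                      same_scene: Iterable[bool]) -> List[bool]:
    """Propagate the detection result within a scene.

    per_frame[i]:  single-frame result for frame i (current-frame features only).
    same_scene[i]: True if frame i belongs to the same scene as frame i - 1
                   (the value for the first frame is ignored).
    """
    refined: List[bool] = []
    for i, (static_result, same) in enumerate(zip(per_frame, same_scene)):
        if i > 0 and same:
            # Same scene: reuse the result of the previous frame.
            refined.append(refined[-1])
        else:
            # First frame or scene change: trust the single-frame result.
            refined.append(static_result)
    return refined


static = [True, False, True, False, False, True]
same = [False, True, True, False, True, True]
print(refine_detections(static, same))
# [True, True, True, False, False, False]
```

In this sketch the decision for a scene is fixed by its first frame, so a single misclassified frame in the middle of a scene cannot flip the display settings.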

The claimed method yields a classification result for every frame of the video sequence, and the result stays constant throughout a scene. The method runs in real time: the delay equals the time needed to receive one frame, which is 40 to 67 ms. The method is robust both to fast motion in the frame and to preprocessing (for example, gamma correction or local contrast enhancement). In addition, the classifier is a set of simple, human-understandable rules, so when misclassified samples appear the existing classifier does not need to be retrained; adding a new rule is sufficient. The classifier is a superposition of one-dimensional and two-dimensional threshold functions, a linear classifier and logical functions, and it uses very simple statistical features that allow a hardware implementation built from shift blocks, adders and comparators.
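To illustrate why such features suit shift blocks, adders and comparators, the sketch below tests a pixel proportion against a power-of-two threshold without any division, and evaluates a linear rule whose weights are powers of two. The function names, the power-of-two thresholds and the example numbers are assumptions made for illustration; the source does not give the actual thresholds or weights.

```python
def exceeds_fraction(count: int, total: int, shift: int) -> bool:
    """True if count / total > 1 / 2**shift, using one shift and one comparison."""
    return (count << shift) > total


def linear_rule(x: int, y: int, shift_x: int, shift_y: int, bias: int) -> bool:
    """Linear classifier with power-of-two weights: 2**shift_x * x + 2**shift_y * y > bias."""
    return (x << shift_x) + (y << shift_y) > bias


total_pixels = 1920 * 1080  # hypothetical frame size

# Is the proportion of green pixels above 1/8 (12.5%)?
print(exceeds_fraction(300_000, total_pixels, 3))  # True  (about 14.5%)
print(exceeds_fraction(200_000, total_pixels, 3))  # False (about 9.6%)
```

Because only integer counts are compared, no per-frame normalization of the statistics is needed in this scheme.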

Compared with existing methods, the claimed invention has six main distinguishing features:

- detection is performed for each frame of the video sequence individually, while the detection result remains temporally smooth within a scene;

- detection is based on features that a human can understand empirically;

- four new types of color detectors are proposed, which identify colors in a way similar to human perception: yellow, green, white, and bright and saturated color;

- four types of low-level statistical features are proposed for use in classification: the mean brightness gradient in green areas, the compactness of the histogram of the green color channel in green areas, the mean brightness in green areas, and the mean value of the blue color channel in green areas;

- the classifier has the form of a directed acyclic graph whose nodes contain one-dimensional and two-dimensional threshold functions, linear classifiers and logical functions;

- a new type of scene change detector is proposed, based on the k-means segmentation algorithm.

For a better understanding of the claimed invention, a detailed description with the corresponding drawings is given below.

Fig. 1. Flowchart of the method for detecting game episodes.

Fig. 2. Examples of game episodes in field sports with different proportions of green pixels.

Fig. 3. Saturation of green pixels as an indicator of a field game.

Fig. 4. The image from Fig. 3, view 3.3, in which the mean values of the red, green and blue color channels in the green areas are set equal to the mean values of those channels in the green areas of Fig. 3, view 3.4.

Fig. 5. The image from Fig. 3, view 3.4, in which the mean value of the blue color channel in the green areas is set equal to the mean value of that channel in the green areas of Fig. 3, view 3.3.

Fig. 6. Histogram of the green color channel in the green areas of the images in Fig. 3, view 3.3, and Fig. 3, view 3.4; brightness gradient in the green areas of the images in Fig. 3, view 3.3, and Fig. 3, view 3.4.

Fig. 7. A field game episode with a zero proportion of green pixels.

Fig. 8. Additional classification by the proportion of green pixels and the saturation of green pixels.

Fig. 9. An indoor scene that is not a field game episode: view 9.1 shows an example frame; view 9.2 shows the feature plot and classification result for a scene of 300 consecutive frames.

Fig. 10. An outdoor scene that is not a field game episode: view 10.1 shows an example frame; view 10.2 shows the feature plot and classification result for a scene of 350 consecutive frames.

Fig. 11. A scene that is not a field game episode but contains a large proportion of green pixels: view 11.1 shows an example frame; view 11.2 shows the feature plot and classification result for a scene of 300 consecutive frames.

Fig. 12. A field game episode containing a long shot: view 12.1 shows an example frame; view 12.2 shows the feature plot and classification result for a scene of 350 consecutive frames.

Fig. 13. A field game episode containing a close-up: view 13.1 shows an example frame; view 13.2 shows the feature plot and classification result for a scene of 350 consecutive frames.

Fig. 14. Game episodes alternating with non-game episodes: view 14.1 shows an example frame that is not a game episode; view 14.2 shows the feature plot and classification result for a video sequence of 850 consecutive frames (3 scenes).

Fig. 15. Block diagram of the detector of field game episodes.

Fig. 16. Block diagram of the low-level feature detector.

Fig. 17. Implementation of the pixel color classifier.

Fig. 18. Region corresponding to white in the RGB color space.

Fig. 19. Example of white pixel detection results.

Fig. 20. Region corresponding to skin tones in the RGB color space.

Fig. 21. Region corresponding to yellow in the RGB color space.

Fig. 22. Example of yellow pixel detection results.

Fig. 23. Region corresponding to green in the RGB color space.

Fig. 24. Example of green pixel detection results.

Fig. 25. Measure of histogram compactness.

Fig. 26. Block diagram of the feature analyzer.

Fig. 27. Block diagram of the scene change detector.

Fig. 28. Block diagram of the dynamic game episode detector.

Fig. 29. Possible application in television.

The method for detecting game episodes of field sports consists of the following steps, shown in Fig. 1:

- for each pixel, described by values in the red, green and blue color channels, the brightness and saturation are calculated (step 101);

- the absolute value of the brightness gradient is calculated (step 102);

- each pixel is classified by color (step 103), namely:

  - is the pixel white?

  - is the pixel bright and saturated?

  - is the pixel yellow?

  - is the pixel green?

  - is the pixel color a skin tone?

- color statistics are calculated for the entire frame (step 104):

  - the proportion of white pixels in the frame;

  - the proportion of bright and saturated pixels in the frame;

  - the proportion of green pixels in the frame;

  - the proportion of skin-tone pixels in the frame;

- color statistics are calculated for the green areas (step 105); namely, the following characteristics are calculated over all pixels classified as green in step 103:

  - mean brightness;

  - mean saturation;

  - mean value of the blue color channel;

  - mean absolute value of the brightness gradient;

- for the pixels classified as green, the histogram of the green color channel is calculated (step 106);

- to determine whether the current frame is a game episode of a field game, the following rules are applied (step 107):

  - if the proportion of green pixels is zero or very low, the detection result is negative;

  - if the mean brightness of the green pixels is very low, the detection result is negative;

  - if the mean saturation of the green pixels is very low, the detection result is negative;

  - if the mean saturation of the green pixels is low and the mean brightness of the green pixels is low, the detection result is negative;

  - if the mean saturation of the green pixels is medium and the histogram of the green channel in the green areas is wide, the detection result is negative;

  - if the mean saturation of the green pixels is very high and the histogram of the green channel in the green areas is very narrow, the detection result is negative;

  - if the mean saturation of the green pixels is low or high and the mean absolute value of the brightness gradient in the green areas is very low, the detection result is negative;

  - if the proportion of green pixels is medium or small, the proportion of bright and saturated pixels is medium or high, and the proportion of skin-tone pixels is above zero and is small or medium, the detection result is positive;

  - if the proportion of green pixels is large and the proportion of bright and saturated pixels is small but greater than zero, the detection result is positive;

- to determine whether the current frame belongs to the same scene as the previous one, scene change detection is performed (step 108);

- if no scene change has occurred, the detection result is set equal to the detection result for the previous frame (step 109).
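The rules of step 107 are stated qualitatively ("very low", "medium", "wide"). The sketch below shows one way they could be written as threshold tests over the frame statistics of steps 104 to 106. Every numeric threshold, the normalization of all features to the range 0..1, the `FrameFeatures` container, the precedence of negative rules over positive ones, and the negative default for frames matched by no rule are assumptions made for illustration; none of them is taken from the source.

```python
from dataclasses import dataclass


@dataclass
class FrameFeatures:
    green_ratio: float       # proportion of green pixels in the frame
    bright_sat_ratio: float  # proportion of bright and saturated pixels
    skin_ratio: float        # proportion of skin-tone pixels
    green_brightness: float  # mean brightness of green pixels
    green_saturation: float  # mean saturation of green pixels
    green_gradient: float    # mean absolute brightness gradient in green areas
    green_hist_width: float  # width of the green-channel histogram in green areas


# Hypothetical placeholder thresholds, not values from the patent.
GREEN_VERY_LOW, GREEN_LARGE = 0.02, 0.50
BRIGHT_VERY_LOW, BRIGHT_LOW = 0.10, 0.30
SAT_VERY_LOW, SAT_LOW, SAT_HIGH, SAT_VERY_HIGH = 0.10, 0.30, 0.60, 0.90
HIST_VERY_NARROW, HIST_WIDE = 0.05, 0.50
GRAD_VERY_LOW = 0.01
BRIGHT_SAT_SMALL = 0.10
SKIN_MEDIUM_MAX = 0.30


def is_field_game_frame(f: FrameFeatures) -> bool:
    """Single-frame decision sketched after the rules of step 107."""
    sat = f.green_saturation
    sat_low = sat < SAT_LOW
    sat_medium = SAT_LOW <= sat < SAT_HIGH
    sat_high = sat >= SAT_HIGH

    # Negative rules, checked first (the ordering is an assumption).
    negative = (
        f.green_ratio <= GREEN_VERY_LOW
        or f.green_brightness < BRIGHT_VERY_LOW
        or sat < SAT_VERY_LOW
        or (sat_low and f.green_brightness < BRIGHT_LOW)
        or (sat_medium and f.green_hist_width > HIST_WIDE)
        or (sat >= SAT_VERY_HIGH and f.green_hist_width < HIST_VERY_NARROW)
        or ((sat_low or sat_high) and f.green_gradient < GRAD_VERY_LOW)
    )
    if negative:
        return False

    green_large = f.green_ratio >= GREEN_LARGE
    # Small or medium green area, many vivid pixels, some skin-tone pixels.
    positive_small_green = (
        not green_large
        and f.bright_sat_ratio >= BRIGHT_SAT_SMALL
        and 0.0 < f.skin_ratio <= SKIN_MEDIUM_MAX
    )
    # Large green area with a small but non-zero share of vivid pixels.
    positive_large_green = green_large and 0.0 < f.bright_sat_ratio < BRIGHT_SAT_SMALL

    # Frames matched by no rule are treated as negative (an assumption).
    return positive_small_green or positive_large_green


long_shot = FrameFeatures(0.60, 0.05, 0.01, 0.50, 0.45, 0.10, 0.20)
print(is_field_game_frame(long_shot))  # True
```

Each rule is an independent boolean term, so handling a newly found misclassified case amounts to adding one more term rather than retraining anything.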

The details of the claimed method are explained below.

The principle of detecting game episodes of field games rests on the observation that the proportion of green pixels in such frames is relatively high compared with other types of scenes.

The detector built on this method can recognize game episodes even when this proportion varies widely (Fig. 2: view 2.1, 15%; view 2.2, 24%; view 2.3, 46%; view 2.4, 67%; view 2.5, 82%; view 2.6, 97%).

It is assumed that in game episodes of field sports the saturation of green, $S_{GR}$, is usually relatively high (Fig. 3, view 3.1: $S_{GR} = 93\%$), whereas in other types of scenes it may be lower (Fig. 3, view 3.2: $S_{GR} = 15\%$). In some cases, however, owing to scene lighting conditions or properties of the compression algorithm used, the saturation of green pixels can be quite low (Fig. 3, view 3.3: $S_{GR} = 31\%$). Sometimes it can even be lower than in scenes of another type (Fig. 3, view 3.4: $S_{GR} = 35\%$).

A human observer easily notices the difference between the shades of green in these two images. This can be verified by normalizing the two images by the mean values of the blue, green, and red channels in the green regions (Fig. 4). The blue color component evidently plays the key role: replacing its mean value with the mean value of the blue component in the green regions of the football field gives almost the same color (Fig. 5). Thus, to distinguish these two scenes, it makes sense to use the mean value of the blue channel in the green regions of the image. The mean brightness is used in the present invention as a classification feature to single out very dark scenes and classify them as non-game scenes, because sports competitions are usually held in well-lit places.

Fig. 6 (views 6.1 and 6.2) shows histograms of the green color channel for the green regions of the images in Fig. 3 (views 3.3 and 3.4, respectively). These figures show that a frame with a more compact histogram is more likely to be a game episode than a frame whose histogram is spread out. The proportion of pixels near the histogram maximum is proposed as the measure of compactness. If the histogram is spread out, the frame can be considered not to be a game episode.

Fig. 6 (views 6.3 and 6.4) shows the gradient of the brightness channel in the green regions for the frames in Fig. 3 (views 3.3 and 3.4, respectively). Very low gradient values may correspond to frames that are not game episodes.

If the mean brightness of the green pixels is very low, it can be concluded that the frame is not a game episode.

Evidently, in some game episodes the proportion of green pixels may be zero (Fig. 7), but even a human would hardly classify such frames correctly without reading the accompanying text (which may be absent). A frame with a zero proportion of green pixels can therefore be considered not to be a game episode.

An effective classifier should treat separately frames with different proportions of green pixels and different mean saturation of those pixels (Fig. 8). Frames with a very low proportion of green pixels and low saturation of green pixels are classified as not being a game episode.

In each cell of the table in Fig. 8, additional rules apply depending on the expected type of scene. If the number of green pixels is very large, a long shot is assumed. In that case, if the number of bright and saturated pixels (which may correspond to the players' uniforms) is low or medium and the number of white pixels is small but greater than zero, the frame is classified as a game episode.

If the number of green pixels is small or medium, a close-up is assumed. In that case, if the number of bright and saturated pixels is medium or large, the number of white pixels is medium, and the number of human-skin-tone pixels is greater than zero and small or medium, the frame is classified as a game episode.

If no scene change has occurred and the previous frame was classified as a game episode, the current frame is also classified as a game episode.

Examples of frames and plots of the most important features, together with the classification results for game episodes, non-game episodes, and mixed frame sequences, are shown in Figs. 9-14. Fig. 9 shows the results for a sequence of frames shot indoors that is not a game episode. Fig. 10 shows the feature plot and detection results for a sequence of frames shot outdoors that is not a game episode. Fig. 11 shows the feature plot and detection results for a sequence of frames that is not a game episode but contains a large proportion of green pixels. Fig. 12 shows the feature plot and detection results for a game episode containing a long shot. Fig. 13 shows the feature plot and detection results for a game episode containing a close-up. Fig. 14 shows the feature plot and detection results for a video sequence consisting of two game episodes (as in Fig. 12 (view 12.1) and Fig. 13 (view 13.1)) with one non-game episode between them (as in Fig. 14 (view 14.3)).

The block diagram in Fig. 15 shows the structure of the video content analysis system. The input video stream 1500 is stored in a frame buffer 1501, from which frames, represented as sequences of pixels in the RGB color space, are fed to a low-level feature calculator 1503 and a scene change detector 1505. The output of the low-level feature calculator 1503 is connected to the input of the feature analyzer 1504, which processes the low-level features and makes a preliminary decision on whether the current frame belongs to a game episode of a field sport. The scene change detector 1505 analyzes the video stream and decides whether the current frame belongs to the same scene as the previous one. The outputs of the feature analyzer 1504 and the scene change detector 1505 are connected to the inputs of the dynamic content type detector 1506, which refines the preliminary decision of the feature analyzer. In particular, the dynamic content type detector propagates the decision obtained for the previous frame if no scene change has occurred.
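The propagation rule of the dynamic content type detector can be illustrated with a minimal Python sketch. The function name and the two input sequences are hypothetical; the sketch assumes that a preliminary per-frame decision and a scene-change flag are already available for every frame, and that the first frame is treated as the start of a scene.

```python
from typing import Iterable, List


def refine_detections(per_frame: Iterable[bool],
                      scene_change: Iterable[bool]) -> List[bool]:
    """Propagate the previous frame's result while the scene is unchanged.

    per_frame[i]    -- preliminary decision for frame i (hypothetical input)
    scene_change[i] -- True if frame i starts a new scene (hypothetical input)
    """
    refined: List[bool] = []
    for index, (decision, changed) in enumerate(zip(per_frame, scene_change)):
        if index == 0 or changed:
            # First frame or new scene: use the decision based on this frame only.
            refined.append(decision)
        else:
            # Same scene: reuse the result obtained for the previous frame.
            refined.append(refined[-1])
    return refined
```

With this rule a scene keeps the label assigned to its first frame, so isolated per-frame misclassifications inside a scene do not toggle the output.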

A block diagram of the low-level feature calculator is shown in Fig. 16. A sequence 1600 of pixels in the RGB color space is fed to the input of block 1601, which computes the saturation S for each pixel, to the input of block 1602, which computes the brightness Y for each pixel, and to the first input of the color classifier 1603. The pixel saturation value from block 1601 is fed to the second input of the color classifier 1603. The brightness value from block 1602 is fed to the third input of the color classifier 1603 and to the input of the gradient calculation block 1604. The output of the color classifier 1603 and the output of the gradient calculation block 1604 are connected to the first and second inputs of the statistical analysis block 1605, respectively. The saturation S is computed in block 1601 for each pixel of the frame from its values in the color channels R, G, and B by the formula

$S = \begin{cases} 1 - \frac{M0}{M1}, & M1 > 0 \\ 0, & \text{otherwise} \end{cases}$

where M0 is the minimum of the color channel values, $M0 = \min(R, G, B)$, and M1 is the maximum of the color channel values, $M1 = \max(R, G, B)$.
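As a minimal sketch of the saturation rule, the following Python function takes one pixel's channel values. It assumes non-negative channel values (for example 8-bit, 0 to 255) and assumes that the non-zero branch applies whenever the maximum M1 is positive, since a maximum of non-negative values cannot be negative. The function name is illustrative only.

```python
def saturation(r: int, g: int, b: int) -> float:
    """Saturation S = 1 - M0/M1 for M1 > 0, otherwise 0 (assumed condition)."""
    m0 = min(r, g, b)  # M0: smallest channel value
    m1 = max(r, g, b)  # M1: largest channel value
    if m1 > 0:
        return 1.0 - m0 / m1
    return 0.0  # black pixel: saturation defined as zero
```

Gray pixels (R = G = B) therefore have zero saturation, and pixels with at least one zero channel and one non-zero channel have saturation 1.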

The brightness Y is computed in block 1602 for each pixel of the frame from its values in the color channels R, G, and B by the formula

$Y = \frac{306 R}{1024} + \frac{601 G}{1024} + \frac{58 B}{512}$
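The brightness formula can be sketched in Python as follows. The sketch uses floating-point division; whether the original blocks use integer or fixed-point arithmetic is not stated, so this is an assumption, as is the function name.

```python
def brightness(r: int, g: int, b: int) -> float:
    """Brightness Y = 306R/1024 + 601G/1024 + 58B/512 (floating point assumed)."""
    return (306 * r) / 1024 + (601 * g) / 1024 + (58 * b) / 512
```

The three weights are approximately 0.299, 0.587, and 0.113, so a white pixel (255, 255, 255) maps to a value slightly below 255.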

An example implementation of the color classifier 1603 is shown in Fig. 17. The three inputs of this block receive the values in the color channels R, G, and B (1701), the saturation (1702), and the brightness (1703). The white pixel detector 1704 analyzes the R, G, and B values of each pixel and computes a logical value indicating that the pixel would appear white to a human. The bright and saturated pixel detector 1705 analyzes the R, G, and B values of each pixel and computes a logical value indicating that the pixel would appear bright and saturated to a human. The human skin tone pixel detector 1706 analyzes the R, G, and B values of each pixel and computes a logical value indicating that the pixel would appear to a human to be similar in color to human skin. The yellow pixel detector 1707 analyzes the R, G, and B values of each pixel together with its saturation (1702) and brightness (1703) and computes a logical value indicating that the pixel would appear yellow to a human. The green pixel detector 1708 analyzes the R, G, and B values of each pixel together with its brightness (1703) and the result of yellow pixel detection, and computes a logical value indicating that the pixel would appear green to a human. The multiplexer 1709 combines the outputs of the white pixel detector 1704, the bright and saturated pixel detector 1705, the human skin tone pixel detector 1706, and the green pixel detector 1708 for all pixels and presents them as four channels.

The white pixel detector computes the expression $W = (S_{RGB} > 384) \land (M1 - M0 < 30)$, where $S_{RGB}$ is the sum of the color channel values, $S_{RGB} = R + G + B$. The region described by this formula in the RGB color space is shown in Fig. 18. Examples of applying this detector are shown in Fig. 19. The bright and saturated pixel detector 1705 computes the expression $B_S = (M1 > 150) \land (M1 - M0 \geq M1/2)$. The human skin tone pixel detector 1706 computes the expression

$S_k = (G \neq 0) \land B \leq G + \frac{G}{2} \land S_{RGB} > \frac{267 R}{2^{7}} \land B \leq \frac{83 \cdot S_{RGB}}{2^{8}} \land G \leq \frac{83 \cdot S_{RGB}}{2^{8}}$

.

The region described by this formula in the RGB color space is shown in Fig. 20.
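The white, bright-and-saturated, and skin-tone detectors can be sketched in Python as follows. The function names are illustrative. The sketch assumes integer channel values and treats the fractions in the formulas as exact ratios, rewriting each comparison in cross-multiplied form to avoid division; an implementation using truncating integer division could differ at boundary values.

```python
def is_white(r: int, g: int, b: int) -> bool:
    """W = (S_RGB > 384) and (M1 - M0 < 30)."""
    s_rgb = r + g + b
    return s_rgb > 384 and (max(r, g, b) - min(r, g, b)) < 30


def is_bright_saturated(r: int, g: int, b: int) -> bool:
    """B_S = (M1 > 150) and (M1 - M0 >= M1 / 2)."""
    m0, m1 = min(r, g, b), max(r, g, b)
    return m1 > 150 and 2 * (m1 - m0) >= m1


def is_skin_tone(r: int, g: int, b: int) -> bool:
    """S_k as given in the text, with fractions cleared by cross-multiplication."""
    s_rgb = r + g + b
    return (
        g != 0
        and 2 * b <= 3 * g           # B <= G + G/2
        and 128 * s_rgb > 267 * r    # S_RGB > 267 R / 2^7
        and 256 * b <= 83 * s_rgb    # B <= 83 S_RGB / 2^8
        and 256 * g <= 83 * s_rgb    # G <= 83 S_RGB / 2^8
    )
```

For example, the pixel (200, 150, 120) satisfies all five skin-tone conditions, while pure red and pure green do not.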

The yellow pixel detector 1707 computes the expression $Y_e = B < G \land B < R \land 9 \cdot (M1_{RG} - M0_{RG}) < M0_{RG} - B \land S > 0.2 \land Y > 110$, where $M1_{RG}$ is the larger of the values in the green channel G and the red channel R, and $M0_{RG}$ is the smaller of these two values. The region described by this formula in the RGB color space is shown in Fig. 21. Examples of applying this detector are shown in Fig. 22.
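A minimal Python sketch of the yellow pixel test follows. To keep the example self-contained, the saturation and brightness are recomputed inside the function from the formulas given earlier in the description, with floating-point arithmetic and a positive-maximum condition for the saturation assumed; the function name is illustrative.

```python
def is_yellow(r: int, g: int, b: int) -> bool:
    """Y_e: blue is the smallest channel, R and G are close, S > 0.2, Y > 110."""
    m0, m1 = min(r, g, b), max(r, g, b)
    s = 1.0 - m0 / m1 if m1 > 0 else 0.0                     # saturation S (assumed form)
    y = (306 * r) / 1024 + (601 * g) / 1024 + (58 * b) / 512  # brightness Y
    m0_rg, m1_rg = min(r, g), max(r, g)                      # M0_RG and M1_RG
    return (
        b < g
        and b < r
        and 9 * (m1_rg - m0_rg) < m0_rg - b
        and s > 0.2
        and y > 110
    )
```

The third condition requires the red-green difference to be less than one ninth of the gap between the smaller of the two and blue, which rejects orange pixels such as (255, 200, 0).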

The green pixel detector 1708 computes the expression

$G_{r} = G > M1_{RB} \land S_{RGB} > 80 \land (R + B < \frac{3}{2} G \lor R + B < 255 \lor R - B < 35) \land Y > 80 \land Y_{e}$

where $M1_{RB}$ is the larger of the values in the red channel R and the blue channel B. The region described by this formula in the RGB color space is shown in Fig. 23. A comparison of the detection results of a simple green pixel detector and of the proposed detector is shown in Fig. 24.

The gradient calculation block 1604 computes the brightness derivative $D_Y$ by convolving the brightness channel Y with the linear kernel

$K_{grad} = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 & -1 \end{bmatrix}$

.
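The convolution with $K_{grad}$ can be sketched for a single row of brightness values. Several details are assumptions, not statements of the source: the kernel is taken to be centred on its fourth tap, applied along the horizontal direction, and the row is extended at its borders by repeating the edge sample. Under these assumptions the result at position x is Y(x) - Y(x - 3).

```python
from typing import List, Sequence

K_GRAD = [0, 0, 0, 1, 0, 0, -1]  # kernel from the text; centre tap assumed at index 3


def brightness_gradient_row(row: Sequence[float]) -> List[float]:
    """Convolve one row of brightness values with K_GRAD (edge samples repeated)."""
    n = len(row)
    half = len(K_GRAD) // 2
    result: List[float] = []
    for x in range(n):
        acc = 0.0
        for k, weight in enumerate(K_GRAD):
            if weight == 0:
                continue
            # Convolution flips the kernel: tap k reads the sample at x - (k - half).
            src = min(max(x - (k - half), 0), n - 1)
            acc += weight * row[src]
        result.append(acc)
    return result
```

Only the absolute value of this derivative is used later (in the mean gradient magnitude over green pixels), so the sign convention of the convolution does not affect that feature.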

The statistical analysis block 1605 computes the following low-level features for each frame: the normalized number of green pixels $F_1$, the normalized number of human-skin-color pixels $F_2$, the mean brightness of all pixels $F_3$, the mean gradient magnitude for green pixels $F_4$, the normalized number of bright and saturated pixels $F_5$, the mean saturation of green pixels $F_6$, the normalized number of white pixels $F_7$, the mean brightness of green pixels $F_8$, the mean value of the blue channel for green pixels $F_9$, and the compactness of the histogram of the green color channel for green pixels $F_{10}$.

The normalized number of green pixels is computed according to the expression

$F_{1} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (G_{r} (i, j))}{w \cdot h}$

, where w is the frame width in pixels, h is the frame height in pixels, i and j are the pixel coordinates, and $\delta$ is a function that converts a logical value to a real value by the formula

$\delta (x) = \begin{cases} 0, & \neg x \\ 1, & x \end{cases}$

.

The normalized number of human-skin-color pixels is computed according to the expression

$F_{2} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (S_{k} (i, j))}{w \cdot h}$

.

The mean brightness of all pixels is computed according to the expression

$F_{3} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} Y (i, j)}{w \cdot h}$

.

The mean gradient magnitude for green pixels is computed according to the expression

$F_{4} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} | D_{Y} (i, j) | \cdot \delta (G_{r} (i, j))}{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (G_{r} (i, j))}$

.

The normalized number of bright and saturated pixels is computed according to the expression

$F_{5} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (B_{S} (i, j))}{w \cdot h}$

.

The mean saturation of green pixels is computed according to the expression

$F_{6} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} S (i, j) \cdot \delta (G_{r} (i, j))}{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (G_{r} (i, j))}$

.

The normalized number of white pixels is computed according to the expression

$F_{7} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (W (i, j))}{w \cdot h}$

.

The mean brightness of green pixels is computed according to the expression

$F_{8} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} Y (i, j) \cdot \delta (G_{r} (i, j))}{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (G_{r} (i, j))}$

.

The mean value of the blue channel for green pixels is computed according to the expression

$F_{9} = \frac{\sum_{i = 1 \dots w, j = 1 \dots h} B (i, j) \cdot \delta (G_{r} (i, j))}{\sum_{i = 1 \dots w, j = 1 \dots h} \delta (G_{r} (i, j))}$

.
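The features $F_1$ to $F_9$ share two computational forms: a fraction of flagged pixels over the whole frame, and a mean of a per-pixel quantity restricted to green pixels. The following Python sketch illustrates both. It assumes the frame has been flattened into equal-length sequences of per-pixel values and flags; the helper names are hypothetical, and returning 0 when no pixel is green is an assumption, since the text does not specify that case.

```python
from typing import Sequence


def normalized_count(mask: Sequence[bool]) -> float:
    """Fraction of pixels whose flag is set (the form of F1, F2, F5, F7)."""
    return sum(1 for flag in mask if flag) / len(mask)


def masked_mean(values: Sequence[float], mask: Sequence[bool]) -> float:
    """Mean of values over flagged pixels (the form of F4, F6, F8, F9)."""
    count = sum(1 for flag in mask if flag)
    if count == 0:
        return 0.0  # assumption: the text leaves the no-green-pixel case open
    return sum(v for v, flag in zip(values, mask) if flag) / count
```

For instance, `masked_mean(blue_values, green_mask)` would give the mean blue channel over green pixels, and `masked_mean(abs_gradient, green_mask)` the mean gradient magnitude over green pixels, where the argument names stand for hypothetical flattened arrays.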

The compactness of the histogram of the green color channel for green pixels, $F_{10}$, is computed in the following three steps:

1) A histogram $H_{YGr}$ is built for the values of the green color channel G of the pixels classified by color as green (Fig. 25);

2) The histogram width D is computed as the difference between the rightmost (MHY1) and the leftmost (MHY0) nonzero histogram elements;

3) The feature $F_{10}$ is computed as the proportion of histogram elements lying no farther than one eighth of the histogram width from the histogram maximum:

$F_{10} = \frac{\sum_{i = P_{8} - D / 8}^{P_{8} + D / 8} H_{Y G r} (i)}{\sum_{i = 0}^{255} H_{Y G r} (i)}$

.
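The three steps above can be sketched in Python as follows. The sketch assumes 8-bit green channel values, takes $P_8$ to be the position of the histogram maximum (the first one if there are ties), uses integer division for D / 8, and clamps the summation window to the range 0 to 255. These choices, the function name, and the zero result for an empty input are assumptions.

```python
from typing import Sequence


def histogram_compactness(green_values: Sequence[int]) -> float:
    """F10: share of green-pixel G values within D/8 of the histogram peak."""
    hist = [0] * 256
    for value in green_values:      # step 1: histogram of G over green pixels
        hist[value] += 1
    total = sum(hist)
    if total == 0:
        return 0.0                  # assumption: no green pixels gives zero compactness
    nonzero = [i for i, count in enumerate(hist) if count > 0]
    width = nonzero[-1] - nonzero[0]  # step 2: D = MHY1 - MHY0
    peak = hist.index(max(hist))      # assumed meaning of P_8
    radius = width // 8               # integer division assumed
    low = max(peak - radius, 0)
    high = min(peak + radius, 255)
    return sum(hist[low:high + 1]) / total  # step 3
```

A frame whose green pixels all share one G value gives a compactness of 1, while a histogram spread uniformly over all 256 levels gives 0.125 under these assumptions.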

The output of the statistical analysis block 1605 is a 10-component feature vector $(F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_8, F_9, F_{10})$.

A block diagram of the feature analyzer 1504 is shown in Fig. 26. Feature 2601, "mean saturation of green pixels" $F_6$, and feature 2602, "normalized number of green pixels" $F_1$, are fed to the first and second inputs of the two-dimensional thresholding block 2620, which computes the vector of logical values

$(y_{11}, y_{12}, y_{13}, y_{14}, y_{21}, y_{22}, y_{23}, y_{24}, y_{31}, y_{32}, y_{33}, y_{34}, y_{41}, y_{42}, y_{43}, y_{44})$,

where each value $y_{ij}$ is computed by the formula

$y_{i j} = F_{1} \geq T_{i - 1}^{1} \land F_{1} \leq T_{i}^{1} \land F_{6} \geq T_{j - 1}^{2} \land F_{6} \leq T_{j}^{2}$

, where $T_{0}^{1}$, $T_{1}^{1}$, $T_{2}^{1}$, $T_{3}^{1}$, $T_{4}^{1}$, $T_{0}^{2}$, $T_{1}^{2}$, $T_{2}^{2}$, $T_{3}^{2}$, $T_{4}^{2}$ are predefined threshold values satisfying the conditions $T_{0}^{1} = 0$, $T_{4}^{1} = 1$, $T_{0}^{2} = 0$, $T_{4}^{2} = 1$,

$T_{0}^{1} < T_{1}^{1} < T_{2}^{1} < T_{3}^{1} < T_{4}^{1}$

,

$T_{0}^{2} < T_{1}^{2} < T_{2}^{2} < T_{3}^{2} < T_{4}^{2}$

.
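The two-dimensional thresholding can be sketched in Python as a grid of interval tests. The function name and the numeric thresholds below are hypothetical; the source only requires the thresholds in each set to be strictly increasing and to start at 0, and the sketch takes the last threshold of each set to be 1.

```python
from typing import List, Sequence

# Hypothetical threshold sets: the source gives no numeric values.
T1_EXAMPLE = [0.0, 0.1, 0.3, 0.6, 1.0]  # thresholds applied to F1
T2_EXAMPLE = [0.0, 0.2, 0.4, 0.7, 1.0]  # thresholds applied to F6


def threshold_grid(a: float, b: float,
                   ta: Sequence[float], tb: Sequence[float]) -> List[List[bool]]:
    """grid[i-1][j-1] is True when ta[i-1] <= a <= ta[i] and tb[j-1] <= b <= tb[j]."""
    return [
        [ta[i - 1] <= a <= ta[i] and tb[j - 1] <= b <= tb[j]
         for j in range(1, len(tb))]
        for i in range(1, len(ta))
    ]
```

Calling `threshold_grid(f1, f6, T1_EXAMPLE, T2_EXAMPLE)` yields a 4 x 4 grid in which normally exactly one cell is set. Because both interval bounds are inclusive in the formula, a value lying exactly on an interior threshold sets two adjacent cells.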

Feature 2602, "normalized number of green pixels", is also fed to the input of the thresholding block 2611, which computes the logical value $N_1 = F_1 < T^{3}$, where $T^{3}$ is a predefined threshold value satisfying the condition $0 < T^{3} < 1$.

Feature 2603, "normalized number of human-skin-color pixels", is fed to the input of the thresholding block 2612, which computes the logical value $N_2 = F_2 < T^{4}$, where $T^{4}$ is a predefined threshold value satisfying the condition $0 < T^{4} < 1$.

Feature 2604, "mean gradient magnitude for green pixels", is fed to the input of the thresholding block 2613, which computes the logical value $N_3 = F_4 < T^{5}$, where $T^{5}$ is a predefined threshold value satisfying the condition $0 < T^{5} < 1$.

Feature 2605, "normalized number of white pixels", is fed to the input of the thresholding block 2614, which computes the logical value $N_4 = F_7 > T^{6}$, where $T^{6}$ is a predefined threshold value satisfying the condition $0 < T^{6} < 1$.

Feature 2603, "normalized number of human-skin-colored pixels", and feature 2607, "normalized number of bright and saturated pixels", are fed to the first and second inputs of two-dimensional threshold transform block 2621, which computes a vector of logical values (z₁₁, z₁₂, z₂₁, z₂₂), where each value z_ij is computed by the formula

$z_{ij} = F_{2} \geq T_{i-1}^{7} \land F_{2} \leq T_{i}^{7} \land F_{5} \geq T_{j-1}^{8} \land F_{5} \leq T_{j}^{8}$,

where $T_{0}^{7}$, $T_{1}^{7}$, $T_{2}^{7}$, $T_{0}^{8}$, $T_{1}^{8}$, $T_{2}^{8}$ are predefined threshold values satisfying the conditions $T_{0}^{7} = 0$, $T_{2}^{7} = 1$, $T_{0}^{8} = 0$, $T_{2}^{8} = 1$, $T_{0}^{7} < T_{1}^{7} < T_{2}^{7}$, $T_{0}^{8} < T_{1}^{8} < T_{2}^{8}$.
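The two-dimensional threshold transform of block 2621 can be illustrated with a minimal sketch. The function name `two_dim_threshold` and the inner threshold values used in the example are hypothetical; only the end points 0 and 1 and the form of the comparison come from the description.

```python
def two_dim_threshold(f2, f5, t7, t8):
    """Return {(i, j): z_ij} for i, j in {1, 2}.

    t7 and t8 are triples (T_0, T_1, T_2) with T_0 = 0, T_2 = 1
    and T_0 < T_1 < T_2, as required by the description.
    """
    z = {}
    for i in (1, 2):
        for j in (1, 2):
            # z_ij is true when F2 lies in [T7_{i-1}, T7_i]
            # and F5 lies in [T8_{j-1}, T8_j].
            z[(i, j)] = (t7[i - 1] <= f2 <= t7[i]) and (t8[j - 1] <= f5 <= t8[j])
    return z


# Hypothetical inner thresholds T7_1 = 0.1 and T8_1 = 0.3.
z = two_dim_threshold(0.05, 0.40, t7=(0.0, 0.1, 1.0), t8=(0.0, 0.3, 1.0))
```

Because both comparisons are inclusive, a feature value that falls exactly on an inner threshold sets two cells at once; for any other value exactly one of the four z_ij is true.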

Feature 2606, "mean brightness" F₃, and feature 2607, "normalized number of bright and saturated pixels" F₅, are fed to the first and second inputs of linear classification block 2622, which computes the logical value Q₁ = K₁·F₃ + K₂·F₅ + B > 0, where K₁, K₂ and B are predefined constants.
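As an illustration only, the linear classification of block 2622 reduces to a single comparison. The constants K₁, K₂ and B in this sketch are hypothetical; the description states only that they are predefined.

```python
def linear_classifier(f3, f5, k1, k2, b):
    """Q1 = (K1*F3 + K2*F5 + B > 0): a linear decision boundary in the (F3, F5) plane."""
    return k1 * f3 + k2 * f5 + b > 0


# Hypothetical constants, chosen only to show both outcomes.
q1_a = linear_classifier(0.5, 0.2, k1=1.0, k2=-2.0, b=0.1)
q1_b = linear_classifier(0.2, 0.5, k1=1.0, k2=-2.0, b=0.1)
```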

Feature 2607, "normalized number of bright and saturated pixels" F₅, is fed to the input of threshold transform block 2615, which computes the logical value Q₂ = F₅ > T⁹, where T⁹ is a predefined threshold value satisfying the condition 0 < T⁹ < 1.

Feature 2608, "mean brightness of green pixels", is fed to the input of threshold transform block 2616, which computes the logical value P₁ = F₈ > T¹⁰, where T¹⁰ is a predefined threshold value satisfying the condition 0 < T¹⁰ < 1.

Feature 2608, "mean brightness of green pixels", is also fed to the input of threshold transform block 2617, which computes the logical value P₂ = F₈ > T¹¹, where T¹¹ is a predefined threshold value satisfying the conditions 0 < T¹¹ < 1 and T¹¹ ≠ T¹⁰.

Feature 2609, "mean value of the blue channel for green pixels", is fed to the input of threshold transform block 2618, which computes the logical value P₃ = F₉ < T¹², where T¹² is a predefined threshold value satisfying the condition 0 < T¹² < 1.

Feature 2610, "compactness of the green color channel histogram for green pixels", is fed to the input of threshold transform block 2619, which computes the logical value P₄ = F₁₀ < T¹³, where T¹³ is a predefined threshold value satisfying the condition 0 < T¹³ < 1.
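The one-dimensional threshold transforms of blocks 2612 to 2619 can be sketched together as follows. The dictionaries `F` and `T` are indexed by the feature and threshold numbers used in the description; all numeric values are hypothetical, and N₁ (block 2611) is omitted because it is defined outside this passage.

```python
def threshold_flags(F, T):
    """Logical outputs of threshold transform blocks 2612-2619."""
    return {
        "N2": F[2] < T[4],    # block 2612: skin-colored pixels
        "N3": F[4] < T[5],    # block 2613: mean gradient of green pixels
        "N4": F[7] > T[6],    # block 2614: white pixels
        "Q2": F[5] > T[9],    # block 2615: bright and saturated pixels
        "P1": F[8] > T[10],   # block 2616: green brightness, first threshold
        "P2": F[8] > T[11],   # block 2617: green brightness, second threshold
        "P3": F[9] < T[12],   # block 2618: blue channel of green pixels
        "P4": F[10] < T[13],  # block 2619: green histogram compactness
    }


# Hypothetical feature values and thresholds, all within (0, 1).
F = {2: 0.02, 4: 0.10, 5: 0.30, 7: 0.01, 8: 0.45, 9: 0.20, 10: 0.15}
T = {4: 0.05, 5: 0.05, 6: 0.20, 9: 0.25, 10: 0.30, 11: 0.70, 12: 0.35, 13: 0.40}
flags = threshold_flags(F, T)
```

Since P₁ and P₂ compare the same feature F₈ against two different thresholds, the combination P₁ ∧ ¬P₂ selects a band of green brightness when T¹⁰ < T¹¹, as in these example values.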

The outputs of threshold transform blocks 2611, 2612, 2613, 2614, 2615, 2616, 2617, 2618 and 2619, together with the outputs of two-dimensional threshold transform blocks 2620 and 2621 and the output of linear classification block 2622, are connected to the inputs of logical analysis block 2623, which computes the result of detecting a game episode of a field sport for the current frame based solely on the pixels of that frame.

The logical analysis block implements the logical formula

$R = (\neg V_{1}) \land P_{1} \land P_{4} \land V_{2} \land (\neg P_{2} \land P_{3})$,

where V₁ = N₁ ∨ N₂ ∨ N₃ ∨ N₄ ∨ z₁₁ and

$V_{2} = (y_{22} \land Q_{2}) \lor y_{23} \lor (y_{24} \land Q_{1}) \lor y_{32} \lor y_{33} \lor y_{34} \lor y_{42} \lor y_{43} \lor y_{44}$.
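The formula of logical analysis block 2623 can be written out directly. In this minimal sketch the function name and argument layout are hypothetical; the values y_ij are the outputs of two-dimensional threshold transform block 2620, which is defined outside this passage, and are passed in as a dictionary in which missing cells are treated as false.

```python
def detect_frame(n_flags, z11, y, q1, q2, p1, p2, p3, p4):
    """R = (not V1) and P1 and P4 and V2 and (not P2 and P3)."""
    # V1 = N1 or N2 or N3 or N4 or z11
    v1 = any(n_flags) or z11
    # V2 = (y22 and Q2) or y23 or (y24 and Q1) or y32 or ... or y44
    v2 = (
        (y.get((2, 2), False) and q2)
        or y.get((2, 3), False)
        or (y.get((2, 4), False) and q1)
        or any(y.get((i, j), False) for i in (3, 4) for j in (2, 3, 4))
    )
    return bool((not v1) and p1 and p4 and v2 and (not p2) and p3)


# Example with hypothetical block outputs.
r_field = detect_frame([False] * 4, False, {(3, 3): True},
                       q1=False, q2=False, p1=True, p2=False, p3=True, p4=True)
r_veto = detect_frame([True, False, False, False], False, {(3, 3): True},
                      q1=False, q2=False, p1=True, p2=False, p3=True, p4=True)
```

Any single true term in V₁ vetoes the detection, while V₂ only requires one of its listed cells to be set.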

A block diagram of scene change detector 1505 is shown in FIG. 27. The values 2701 in the R, G and B color channels of the current frame, together with the segmentation cluster centers from the previous frame taken from the output of delay block 2703, are fed to the input of clustering block 2702. The segmentation cluster centers are described by a matrix of N_K by 3 elements, where N_K is the number of clusters used in segmentation:

$K_{C} = \begin{pmatrix} R_{1}^{C} & R_{2}^{C} & R_{3}^{C} & \dots & R_{N_{K}}^{C} \\ G_{1}^{C} & G_{2}^{C} & G_{3}^{C} & \dots & G_{N_{K}}^{C} \\ B_{1}^{C} & B_{2}^{C} & B_{3}^{C} & \dots & B_{N_{K}}^{C} \end{pmatrix}$

Clustering block 2702 performs one iteration of the K-means algorithm, i.e. each pixel P(i, j) with coordinates i, j, described by three color-channel values P(i, j) = (R(i, j) G(i, j) B(i, j)), is assigned to the cluster

$K(i, j) = \arg \min_{k = 1 \dots N_{K}} \| P(i, j) - C_{k} \|$,

where $C_{k} = (R_{k}^{C} \; G_{k}^{C} \; B_{k}^{C})$ is the center of the k-th cluster and ‖·‖ is some vector norm. The total clustering error is computed as

$E = \sum_{i = 1}^{w} \sum_{j = 1}^{h} \| P(i, j) - C_{K(i, j)} \|$.

The updated cluster centers are computed by the formula

$\tilde{K}_{C} = \begin{pmatrix} \tilde{R}_{1}^{C} & \tilde{R}_{2}^{C} & \tilde{R}_{3}^{C} & \dots & \tilde{R}_{N_{K}}^{C} \\ \tilde{G}_{1}^{C} & \tilde{G}_{2}^{C} & \tilde{G}_{3}^{C} & \dots & \tilde{G}_{N_{K}}^{C} \\ \tilde{B}_{1}^{C} & \tilde{B}_{2}^{C} & \tilde{B}_{3}^{C} & \dots & \tilde{B}_{N_{K}}^{C} \end{pmatrix}$,

where

$\tilde{R}_{k}^{C} = \frac{\sum_{i = 1}^{w} \sum_{j = 1}^{h} \delta(K(i, j) = k) \cdot R(i, j)}{\sum_{i = 1}^{w} \sum_{j = 1}^{h} \delta(K(i, j) = k)}$,

$\tilde{G}_{k}^{C} = \frac{\sum_{i = 1}^{w} \sum_{j = 1}^{h} \delta(K(i, j) = k) \cdot G(i, j)}{\sum_{i = 1}^{w} \sum_{j = 1}^{h} \delta(K(i, j) = k)}$,

$\tilde{B}_{k}^{C} = \frac{\sum_{i = 1}^{w} \sum_{j = 1}^{h} \delta(K(i, j) = k) \cdot B(i, j)}{\sum_{i = 1}^{w} \sum_{j = 1}^{h} \delta(K(i, j) = k)}$.
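A single K-means iteration of the kind performed by clustering block 2702 can be sketched in plain Python. Two points here are assumptions rather than part of the description: the Euclidean norm is used for ‖·‖ (the description allows any vector norm), and a cluster that receives no pixels keeps its previous center, since the update formula is undefined in that case. The function name `kmeans_step` is hypothetical.

```python
import math


def kmeans_step(pixels, centers):
    """One K-means iteration over a flat list of (R, G, B) pixels.

    Returns the updated centers and the total clustering error E,
    measured against the centers supplied on input.
    """
    sums = [[0.0, 0.0, 0.0] for _ in centers]
    counts = [0] * len(centers)
    error = 0.0
    for p in pixels:
        # Assign the pixel to the nearest center (Euclidean norm assumed).
        dists = [math.dist(p, c) for c in centers]
        k = min(range(len(centers)), key=dists.__getitem__)
        error += dists[k]
        counts[k] += 1
        for ch in range(3):
            sums[k][ch] += p[ch]
    new_centers = []
    for k, old in enumerate(centers):
        if counts[k]:
            # Mean of the pixels assigned to cluster k, per channel.
            new_centers.append(tuple(s / counts[k] for s in sums[k]))
        else:
            # Assumption: an empty cluster keeps its old center.
            new_centers.append(tuple(old))
    return new_centers, error


pixels = [(0, 0, 0), (2, 0, 0), (10, 10, 10), (12, 10, 10)]
new_centers, total_error = kmeans_step(pixels, [(1, 0, 0), (11, 10, 10)])
```

Running only one iteration per frame, seeded with the previous frame's centers, keeps the cost per frame fixed; within a stable scene the centers change little, so the error E stays close to its previous value.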

The updated cluster centers from the output of clustering block 2702 are fed to the input of delay block 2703. The total clustering error from the output of clustering block 2702 is fed to the input of delay block 2705. The total clustering error from the output of clustering block 2702 and the output of delay block 2705 are fed to the input of block 2706, which computes the minimum of these two values, and to the input of block 2707, which computes their maximum. The output of block 2706 is fed to the input of gain block 2708, which multiplies its input by a predefined constant greater than one. The output of block 2708 and the output of block 2707 are connected to the first and second inputs of subtraction block 2709, whose output is connected to the input of threshold transform block 2710. The output of threshold transform block 2710 is a logical value that is true if the current frame does not belong to the same scene as the previous frame.
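The decision chain of blocks 2706 to 2710 can be sketched as a comparison of the current and previous clustering errors. The description does not fix the sign convention of the subtraction or the direction of the final threshold, so this sketch assumes that a scene change is declared when the larger error exceeds the gained smaller error by more than a threshold. The gain of 1.5 and the threshold of 0 are hypothetical values.

```python
def scene_changed(err_cur, err_prev, gain=1.5, threshold=0.0):
    """Return True if the two clustering errors differ by more than a factor `gain`.

    Assumed convention: max(E) - gain * min(E) > threshold signals a scene change.
    """
    lo = min(err_cur, err_prev)   # block 2706
    hi = max(err_cur, err_prev)   # block 2707
    boosted = gain * lo           # block 2708, gain constant > 1
    return hi - boosted > threshold  # blocks 2709 and 2710


same_scene = scene_changed(100.0, 105.0)
new_scene = scene_changed(400.0, 100.0)
```

Using the minimum and maximum makes the test symmetric: a sharp rise and a sharp drop in clustering error are treated alike.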

A block diagram of dynamic content type detector 1506 is shown in FIG. 28. The output of statistical analysis block 1605 is connected to the first data input of switch 2802 and to the first input of logical disjunction block 2803; the output of scene change detector 2801 is connected to the control input of switch 2802. The output of logical disjunction block 2803 is connected to the second data input of switch 2802. The switch 2802 passes the signal from the first data input to its output if the control input is set to logical one, and the signal from the second data input if the control input is set to logical zero. The output of switch 2802 is connected to the input of delay line 2804. The output of delay line 2804 is connected to the second input of logical disjunction block 2803 and constitutes the output of dynamic content type detector 1506.
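The behavior of the switch, disjunction block and delay line can be sketched as a small state machine. The class name and the initial delay-line value of false are assumptions, and for brevity `step` returns the value just written to the delay line, which the circuit described above presents at its output on the following frame.

```python
class DynamicContentDetector:
    """Sketch of switch 2802, disjunction block 2803 and delay line 2804."""

    def __init__(self, initial=False):
        self.state = initial  # value held in delay line 2804 (assumed initial value)

    def step(self, frame_result, scene_change):
        # Block 2803: OR of the per-frame result and the delayed output.
        merged = frame_result or self.state
        # Switch 2802: on a scene change take the per-frame result,
        # otherwise take the disjunction.
        self.state = frame_result if scene_change else merged
        return self.state


det = DynamicContentDetector()
frames = [(True, True), (False, False), (False, True), (True, False), (False, False)]
history = [det.step(result, change) for result, change in frames]
```

Within a scene a positive detection is latched until the next scene change, so isolated per-frame misses do not toggle the output.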

One possible application of the present invention in a television receiver is shown in FIG. 29. The video signal 2900 is received by receiver 2901 and stored in frame buffer 2902. The output of the receiver is connected to device 2905 for detecting game episodes in field sports in video sequences. The output of frame buffer 2902 is connected to video image enhancement block 2903, which processes the video image, in particular by noise suppression, contrast enhancement, sharpening and the like. The output of device 2905 is connected to the input of adaptation block 2906, which passes the video processing algorithm settings required for the given type of image to video image enhancement block 2903. The output of video image enhancement block 2903, consisting of the processed video stream, is fed to display device 2904.

The invention can also be used for automatic indexing of archives of video sequences.

Claims

1. A method for detecting game episodes in field sports in real time, which consists in performing the following operations:
- calculate the brightness and saturation of each pixel in the frame described by the values in the red, green and blue color channels;
- calculate the absolute value of the brightness gradient for each pixel of the frame;
- perform classification by color of each pixel of the frame;
- calculate statistics according to the results of the classification by color for the entire frame;
- calculate statistics according to the results of the classification by color for the green areas of the frame;
- determine whether a given frame is a game episode in field sports based solely on the characteristics of the current frame;
- determine whether the current frame belongs to the same scene as the previous frame;
- use the detection result obtained for the previous frame as an updated detection result if the current frame belongs to the same scene as the previous frame of the video sequence, or
- use the detection result obtained for the current frame based solely on the characteristics of the current frame, as an updated detection result if the current frame does not belong to the same scene as the previous frame of the video sequence.

2. The method according to claim 1, characterized in that the procedure for performing the classification of pixels by color consists of the following operations:
- determine whether the pixel is white;
- determine whether the pixel is bright and saturated;
- determine whether the pixel is yellow;
- determine whether the pixel is green;
- determine whether the pixel is similar in color to human skin.

3. The method according to claim 1, characterized in that the procedure for calculating statistics according to the results of color classification for the entire frame consists in performing the following operations:
- calculate the proportion of white pixels in the frame;
- calculate the proportion of bright and saturated pixels in the frame;
- calculate the proportion of green pixels in the frame;
- calculate the proportion of pixels similar in color to human skin in the frame.

4. The method according to claim 1, characterized in that the procedure for calculating statistics according to the results of color classification for green areas of the frame consists in performing the following operations:
- calculate the average brightness in the green areas of the frame;
- calculate the average saturation in the green areas of the frame;
- calculate the average value in the blue color channel in the green areas of the frame;
- calculate the average absolute value of the brightness gradient in the green areas of the frame;
- calculate the histogram of the green color channel in the green areas of the frame.

5. A device for detecting game episodes in field sports in real time, consisting of a frame buffer, a low-level feature detector, a scene change detector, a feature analysis unit and a dynamic content type detector, where the frame buffer is connected to the low-level feature detector, to the input of a delay line and to the first input of the scene change detector; the output of the low-level feature detector is connected to the feature analysis unit; the output of the delay line is connected to the second input of the scene change detector; the feature analysis unit is connected to the first input of the dynamic content type detector; the output of the scene change detector is connected to the second input of the dynamic content type detector; and the output of the dynamic content type detector is the output of the device.

6. The device according to claim 5, characterized in that the scene change detector is configured to determine whether the current frame belongs to the same scene as the previous frame.

7. The device according to claim 5, characterized in that the dynamic content type detector is configured to transmit, as an updated result, the detection result obtained for the previous frame if the current frame belongs to the same scene as the previous frame, and the detection result obtained for the current frame based solely on the characteristics of the current frame if the current frame is the first frame of the video sequence or if the current frame does not belong to the same scene as the previous frame.

8. A low-level feature detector in a device for detecting game episodes in field sports in real time, consisting of a pixel conversion unit, which is configured to calculate saturation and brightness for each pixel and is coupled to a pixel color classifier configured to determine the pixel color in accordance with how it is perceived by a human; and a gradient calculation unit, which is configured to calculate a gradient in the luminance channel and whose output is connected, along with the output of the pixel color classifier, to the input of a statistical analysis unit.

9. The low-level feature detector of claim 8, characterized in that it is configured to classify the pixel color by computing a vector consisting of the following logical values:
- whether the pixel is white;
- whether the pixel is bright and saturated;
- whether the pixel is yellow;
- whether the pixel is green;
- whether the pixel is similar in color to human skin.

10. The low level feature detector of claim 8, characterized in that the statistical analysis unit contained in the detector is configured to calculate a vector consisting of the following frame characteristics:
- the normalized number of green pixels in the frame;
- the normalized number of bright and saturated pixels in the frame;
- the normalized number of pixels similar in color to human skin in a frame;
- the average brightness of the green areas of the frame;
- average saturation of green areas of the frame;
- the average value of the blue color channel in the green areas of the frame;
- the average absolute value of the brightness gradient in the green areas of the frame;
- compactness of the histogram of the green color channel in the green areas of the frame.