KR102780326B1 - Method for Generating Sports Highlight Videos

Info

Publication number: KR102780326B1
Application number: KR1020240092237A
Authority: KR
Inventors: 김현진
Original assignee: 주식회사 메타버즈
Priority date: 2024-07-12
Filing date: 2024-07-12
Publication date: 2025-03-12
Anticipated expiration: 2044-07-12

Abstract

A method for generating a sports highlight video according to the present invention comprises: step (a), in which a video preprocessing unit receives an original sports video from which a highlight video is to be generated, together with commentary text for the original sports video; step (b), in which the video preprocessing unit divides the entire length of the original sports video into a plurality of unit chunks according to an arbitrarily set criterion; step (c), in which the video preprocessing unit derives volume information of the commentary voice and semantic information of the commentary text included in the plurality of unit chunks; step (d), in which a video generation unit extracts a game event occurrence point based on the volume information of the commentary voice and the semantic information of the plurality of unit chunks; step (e), in which the video generation unit determines the length of a main event video by setting a start point and an end point before and after the game event occurrence point, which serves as the midpoint; and step (f), in which the video generation unit generates a sports highlight video including the main event video.

Description

{Method for Generating Sports Highlight Videos}

The present invention relates to a method for generating a sports highlight video and, more particularly, to a method capable of generating a sports highlight video automatically, accurately, and quickly without human intervention.

As public interest in sports has grown in recent years, demand has increased for sports highlight videos that let viewers check the important moments of a game quickly and easily. Such videos matter not only for the viewing convenience of fans but also to sports analysts, coaches, players, and other stakeholders.

In the past, it was common to record the entire game and edit the highlights manually. However, this approach requires considerable time and editor effort, and carries a high risk of missing important moments.

Manual editing also admits subjective judgment, so the highlights that viewers want may not match the highlights actually provided.

In addition, when an editor's subjectivity enters the manual editing process, highlights favorable to a specific team or player may be produced, which can give rise to disputes over fairness.

Furthermore, the quality of a sports highlight video can vary widely with the editor's fatigue and experience, which undermines the consistency of highlight videos.

Therefore, a method for solving these problems is required.

Korean Laid-open Patent Publication No. 10-2018-0118936

The present invention has been devised to solve the problems of the prior art described above, and its object is to provide a method for automatically generating consistent and accurate sports highlight videos based on an original sports video and commentary text.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the description below.

To achieve the above object, the method for generating a sports highlight video of the present invention comprises: step (a), in which a video preprocessing unit receives an original sports video from which a highlight video is to be generated, together with commentary text for the original sports video; step (b), in which the video preprocessing unit divides the entire length of the original sports video into a plurality of unit chunks according to an arbitrarily set criterion; step (c), in which the video preprocessing unit derives volume information of the commentary voice and semantic information of the commentary text included in the plurality of unit chunks; step (d), in which a video generation unit extracts a game event occurrence point based on the volume information of the commentary voice and the semantic information of the plurality of unit chunks; step (e), in which the video generation unit determines the length of a main event video by setting a start point and an end point before and after the game event occurrence point, which serves as the midpoint; and step (f), in which the video generation unit generates a sports highlight video including the main event video.

Here, step (b) may include: step (b-1), in which the video preprocessing unit sets, as a temporary chunk, the span from the point where an utterance in the commentary voice of the original sports video begins to the point where that utterance ends; step (b-2), in which the video preprocessing unit analyzes whether the temporary chunk set in step (b-1) contains a point where the speaker of the commentary voice changes; step (b-3), in which, if step (b-2) determines that such a point exists, the video preprocessing unit divides the temporary chunk into a plurality of unit chunks at that point; and step (b-4), in which the video preprocessing unit sets each blank portion between the temporary chunks set in step (b-1) as a unit chunk.
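Steps (b-1) to (b-4) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Utterance` and `UnitChunk` types, the `speaker_changes` field, and the function name are hypothetical, utterances are assumed not to overlap, and treating silence before the first utterance and after the last one as unit chunks is an assumption (the text only mentions blanks between temporary chunks).

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Utterance:
    """Hypothetical temporary chunk (b-1): one continuous stretch of commentary."""
    start: float                               # seconds
    end: float                                 # seconds
    speaker_changes: Tuple[float, ...] = ()    # times where the voice changes (b-2)


@dataclass(frozen=True)
class UnitChunk:
    start: float
    end: float
    has_speech: bool


def split_into_unit_chunks(utterances: Sequence[Utterance],
                           video_length: float) -> List[UnitChunk]:
    chunks: List[UnitChunk] = []
    cursor = 0.0
    for utt in sorted(utterances, key=lambda u: u.start):
        # (b-4) the blank portion before this utterance becomes its own unit chunk
        if utt.start > cursor:
            chunks.append(UnitChunk(cursor, utt.start, False))
        # (b-3) cut the temporary chunk at every speaker change inside it
        inner = [t for t in sorted(utt.speaker_changes) if utt.start < t < utt.end]
        cuts = [utt.start, *inner, utt.end]
        for a, b in zip(cuts, cuts[1:]):
            chunks.append(UnitChunk(a, b, True))
        cursor = max(cursor, utt.end)
    # Assumption: trailing silence is also kept as a unit chunk
    if cursor < video_length:
        chunks.append(UnitChunk(cursor, video_length, False))
    return chunks
```

With two utterances, one of which contains a speaker change, the result alternates silent and spoken unit chunks that together cover the whole video.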

Step (c) may include step (c-1), in which the video preprocessing unit sets a volume level for the commentary voice included in each unit chunk, and step (c-2), in which the video preprocessing unit interprets the meaning of the commentary text included in each unit chunk.

Step (d) may include: step (d-1), in which the video generation unit extracts the unit chunks whose commentary text denotes an event related to the sport in question; step (d-2), in which the video generation unit determines whether the volume level of the commentary voice in each unit chunk extracted in step (d-1) is equal to or greater than a preset reference value; and step (d-3), in which the video generation unit determines a unit chunk whose volume level is equal to or greater than the preset reference value in step (d-2) to be the game event occurrence point.
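The two-stage filter of steps (d-1) to (d-3) might look like the sketch below. The `AnalyzedChunk` type, its fields, the normalized loudness scale, and the default threshold of 0.7 are all hypothetical; the patent only requires a semantic match followed by a comparison against a preset reference value.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AnalyzedChunk:
    """Hypothetical output of step (c) for one unit chunk."""
    start: float
    end: float
    event_keywords: Tuple[str, ...]   # event terms found in the commentary text
    loudness: float                   # assumed normalized to the range 0..1


def find_event_chunks(chunks: Sequence[AnalyzedChunk],
                      loudness_threshold: float = 0.7) -> List[AnalyzedChunk]:
    # (d-1) keep only chunks whose commentary text refers to a game event
    candidates = [c for c in chunks if c.event_keywords]
    # (d-2), (d-3) of those, keep chunks at or above the preset reference value
    return [c for c in candidates if c.loudness >= loudness_threshold]
```

Applying the semantic filter first means the loudness comparison runs only on chunks that already mention an event, so a loud chunk without event wording and a quiet chunk that merely mentions an event are both rejected.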

Step (e) may include: step (e-1), in which the video generation unit sets a first preceding point and a first succeeding point, each at a preset time interval from the unit chunk set as the game event occurrence point; step (e-2), in which the video generation unit determines whether commentary voice is present at the first preceding point and at the first succeeding point; step (e-3a), in which, if step (e-2) determines that no commentary voice is present at the first preceding point, the video generation unit includes in the main event video the span from the first preceding point to the start point of the unit chunk set as the game event occurrence point; and step (e-4a), in which, if step (e-2) determines that no commentary voice is present at the first succeeding point, the video generation unit includes in the main event video the span from the end point of the unit chunk set as the game event occurrence point to the first succeeding point.

Step (e) may further include step (e-3b), in which, if step (e-2) determines that commentary voice is present at the first preceding point, the video generation unit moves the first preceding point to an earlier point according to a preset criterion, and step (e-3c), in which steps (e-2) and (e-3b) are repeated n times until step (e-3a) is reached.

Likewise, step (e) may further include step (e-4b), in which, if step (e-2) determines that commentary voice is present at the first succeeding point, the video generation unit moves the first succeeding point to a later point according to a preset criterion, and step (e-4c), in which steps (e-2) and (e-4b) are repeated n times until step (e-4a) is reached.
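The boundary search of step (e) can be illustrated with a short sketch. It assumes commentary is available as a list of `(start, end)` intervals in seconds; the function names, the 5-second interval, the 1-second shift, and the iteration cap are hypothetical choices, and clamping to the bounds of the video is an added safeguard not described in the text.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def is_speech_at(t: float, speech: Sequence[Interval]) -> bool:
    """(e-2) True if any commentary interval covers time t."""
    return any(start <= t < end for start, end in speech)


def move_until_silent(t: float, step: float, speech: Sequence[Interval],
                      lower: float, upper: float, max_moves: int = 1000) -> float:
    """(e-3b)/(e-4b) repeated per (e-3c)/(e-4c): shift t while commentary is active."""
    for _ in range(max_moves):
        t = min(max(t, lower), upper)
        if t in (lower, upper) or not is_speech_at(t, speech):
            break
        t += step
    return min(max(t, lower), upper)


def main_event_bounds(event_chunk: Interval, speech: Sequence[Interval],
                      video_length: float, interval: float = 5.0,
                      step: float = 1.0) -> Interval:
    ev_start, ev_end = event_chunk
    # (e-1) first preceding / succeeding points at a preset interval from the chunk
    start = move_until_silent(ev_start - interval, -step, speech, 0.0, video_length)
    end = move_until_silent(ev_end + interval, +step, speech, 0.0, video_length)
    # (e-3a)/(e-4a) the main event video spans from the silent point before
    # the event chunk to the silent point after it
    return start, end
```

Searching for a silent point on each side keeps the cut from landing in the middle of a commentator's sentence, which appears to be the motivation for the repeated shifting.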

Meanwhile, between step (e) and step (f), step (add) may additionally be performed, in which the video generation unit determines the length of at least one of a pre-event video played before the main event video and a post-event video played after the main event video.

Here, step (add) may include: step (add-1), in which the video generation unit sets a second preceding point and a second succeeding point, each at a preset time interval from the main event video; step (add-2), in which the video generation unit determines whether commentary voice is present at the second preceding point and at the second succeeding point; step (add-3a), in which, if step (add-2) determines that no commentary voice is present at the second preceding point, the video generation unit determines the span from the second preceding point to the start point of the main event video to be the pre-event video; and step (add-4a), in which, if step (add-2) determines that no commentary voice is present at the second succeeding point, the video generation unit determines the span from the end point of the main event video to the second succeeding point to be the post-event video.

Step (add) may further include step (add-3b), in which, if step (add-2) determines that commentary voice is present at the second preceding point, the video generation unit moves the second preceding point to an earlier point according to a preset criterion, and step (add-3c), in which steps (add-2) and (add-3b) are repeated n times until step (add-3a) is reached.

Step (add) may also further include step (add-4b), in which, if step (add-2) determines that commentary voice is present at the second succeeding point, the video generation unit moves the second succeeding point to a later point according to a preset criterion, and step (add-4c), in which steps (add-2) and (add-4b) are repeated n times until step (add-4a) is reached.
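Step (add) follows the same silence-seeking pattern, now measured from the main event video rather than from the event chunk. The sketch below is illustrative only: `first_silent_point`, `pre_and_post_event`, the 10-second interval, and the 2-second shift are hypothetical, and returning `None` for an empty segment is an assumption.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def first_silent_point(t: float, step: float, speech: Sequence[Interval],
                       lower: float, upper: float) -> float:
    """(add-2) with (add-3b)/(add-4b): walk from t until no commentary is active."""
    while lower < t < upper and any(s <= t < e for s, e in speech):
        t += step
    return min(max(t, lower), upper)


def pre_and_post_event(main: Interval, speech: Sequence[Interval],
                       video_length: float, interval: float = 10.0,
                       step: float = 2.0
                       ) -> Tuple[Optional[Interval], Optional[Interval]]:
    main_start, main_end = main
    # (add-1) second preceding / succeeding points relative to the main event video
    second_preceding = first_silent_point(main_start - interval, -step,
                                          speech, 0.0, video_length)
    second_succeeding = first_silent_point(main_end + interval, +step,
                                           speech, 0.0, video_length)
    # (add-3a)/(add-4a) the spans adjoining the main event video
    pre = (second_preceding, main_start) if second_preceding < main_start else None
    post = (main_end, second_succeeding) if second_succeeding > main_end else None
    return pre, post
```

Either segment can be dropped independently, which matches the wording that at least one of the pre-event and post-event videos is determined.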

In addition, step (f) may generate the sports highlight video so that it includes the pre-event video and the post-event video together with the main event video.
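One possible way to assemble the final timeline in step (f) is sketched below. The dictionary layout and the function name are hypothetical, and merging overlapping or touching segments from different events is an assumption added for illustration; the patent does not specify how segments from multiple events are combined.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def assemble_highlight(events: Sequence[Dict[str, Optional[Interval]]]
                       ) -> List[Interval]:
    """Collect pre-event, main, and post-event segments into one ordered cut list."""
    segments: List[Interval] = []
    for event in events:
        for key in ("pre", "main", "post"):
            segment = event.get(key)
            if segment is not None:
                segments.append(segment)
    segments.sort()
    merged: List[Interval] = []
    for start, end in segments:
        if merged and start <= merged[-1][1]:
            # Assumption: overlapping or touching segments are fused into one cut
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

The resulting list of `(start, end)` pairs could then be handed to whatever video tool performs the actual cutting and concatenation.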

The present invention may further include step (ex1), in which a memory unit stores the data generated in steps (a) to (f) in a database.

After step (f), step (ex2) may additionally be performed, in which a data verification unit verifies whether the sports highlight video corresponds to the game event occurrence points of the sport in question.

The method for generating a sports highlight video of the present invention, which solves the problems described above, has the advantage of extracting important game events quickly and automatically based on the original sports video and the commentary text.

This minimizes the subjective judgment that can enter a manual editing process and yields more objective and consistent highlight videos. Because game event occurrence points are detected accurately, important moments that viewers might otherwise miss can be emphasized effectively.

In addition, because the present invention judges the importance of a game event using both the volume information and the semantic information of the commentary, it can identify important moments more accurately than other approaches and can be applied generally to a wide range of sports.

Furthermore, the present invention allows the length and composition of a sports highlight video to be set flexibly, which makes it possible to meet the varied needs of viewers and to provide more personalized sports highlight videos.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the description of the claims.

FIG. 1 is a diagram conceptually illustrating the structure of a sports highlight video generation device for performing a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 2 is a diagram showing the overall process of a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 3 is a diagram showing the detailed process of step (b) in a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 4 is a diagram showing the detailed process of step (c) in a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 5 is a diagram showing the detailed process of step (d) in a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 6 is a diagram showing the detailed process of step (e) in a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 7 is a diagram schematically showing the structure of a sports highlight video in a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 8 is a diagram showing the detailed process of step (add) in a method for generating a sports highlight video according to one embodiment of the present invention.
FIG. 9 is a diagram showing the detailed process of step (ex2) in a method for generating a sports highlight video according to one embodiment of the present invention.

In this specification, when a component (or region, layer, portion, etc.) is described as being "on," "connected to," or "coupled to" another component, it may be directly disposed on, connected to, or coupled to the other component, or a third component may be disposed between them.

Identical reference numerals refer to identical components. In the drawings, the thicknesses, proportions, and dimensions of components are exaggerated for effective explanation of the technical content.

"And/or" includes every combination of one or more of the associated items that can be defined.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component. Singular expressions include plural expressions unless the context clearly indicates otherwise.

Terms such as "below," "on the lower side," "above," and "on the upper side" are used to describe the relationships between the components shown in the drawings. These terms are relative and are described with reference to the directions indicated in the drawings.

Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the same meaning as commonly understood by those skilled in the art to which the present invention belongs. In addition, terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art, and should not be interpreted in an idealized or overly formal sense unless explicitly so defined herein.

Terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood as not excluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

When this specification states that a first component operates or executes on a second component, it should be understood that the first component operates or executes in the environment in which the second component operates or executes, or through direct or indirect interaction with the second component.

When a component, device, or system is described as including a component made up of a program or software, it should be understood, even without explicit mention, that the component, device, or system includes the hardware (for example, memory or a CPU) or other programs or software (for example, an operating system or the drivers needed to run the hardware) required for that program or software to execute or operate.

Unless otherwise specified regarding its implementation, a component should be understood as implementable in software, in hardware, or in any combination of software and hardware.

The terminology used in this specification is for describing embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural form unless the wording specifically states otherwise. As used in the specification, "comprises" and/or "comprising" do not exclude the presence or addition of one or more components other than those mentioned.

In this specification, terms such as "unit" and "device" may refer to a functional and structural combination of hardware and the software that is driven by, or that drives, that hardware. For example, the hardware may be a data processing device including a CPU or another processor. The software driven by the hardware may refer to a running process, an object, an executable, a thread of execution, a program, and the like.

A person of ordinary skill in the technical field of the present invention will readily infer that these terms may denote a logical unit of given code and the hardware resources on which that code is executed, and do not necessarily denote physically connected code or a single type of hardware.

Hereinafter, the specific technical content to be implemented in the present invention is described in detail with reference to the accompanying drawings.

The method for generating a sports highlight video according to the present invention can be run by the processors of various servers and terminals, and can provide visible information through a graphical user interface rendered on a video output device such as a display module.

In particular, the method for generating a sports highlight video according to the present invention can be implemented as a program and installed on a server, terminal, or the like by means of a removable disk or a communication network, and it enables the server, terminal, or the like to operate as various functional means.

That is, in the present invention, information processing by software is concretely realized using hardware.

Hereinafter, a method for generating a sports highlight video according to one embodiment of the present invention is described with reference to the accompanying drawings.

FIG. 1 is a diagram conceptually illustrating the structure of a sports highlight video generation device for performing a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 1, a sports highlight video generation device according to one embodiment of the present invention may include a video preprocessing unit (10), a video generation unit (20), a memory unit (30), and a data verification unit (40).

The video preprocessing unit (10) is the component that preprocesses the original sports video so that a sports highlight video can be generated, and the video generation unit (20) generates the sports highlight video based on the original sports video preprocessed by the video preprocessing unit (10).

The memory unit (30) is provided to store all data produced in the course of generating a sports highlight video according to the present invention, and may include a database for this purpose.

The data verification unit (40) is the component that verifies whether the sports highlight video generated by the video preprocessing unit (10) and the video generation unit (20) corresponds to the game event occurrence points of the sport in question.

The components illustrated in FIG. 1 are shown as functionally and logically separable. A person of ordinary skill in the technical field of the present invention will readily infer that this does not necessarily mean that each component is a separate physical device or is written as separate code.

FIG. 2 is a diagram showing the overall process of a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 2, a method for generating a sports highlight video according to one embodiment of the present invention includes steps (a) to (f), and may further include step (add), step (ex1), and step (ex2).

Step (a) is the step in which the video preprocessing unit (10) receives an original sports video and the commentary text for that video in order to generate a highlight video.

Here, the original sports video is a recording of the entire game. The sport shown in it is not limited to any particular sport; the method can of course be applied to any sport.

The original sports video also contains the commentary voice spoken by the commentators of the sport in question.

The commentary text is the commentary voice spoken by the commentators during the game, converted into text. It may be generated automatically, for example by speech recognition (Speech to Text, STT), or it may instead be produced manually.

The original sports video and the commentary text serve as the source material for generating the sports highlight video. The steps below use them to identify important moments and to generate the sports highlight video from those moments.

Step (b) is a step in which the video preprocessing unit (10) divides the entire length of the original sports video into a plurality of unit chunks according to an arbitrarily set criterion.

The criterion for dividing the video into unit chunks may, for example, be fixed by a preset, or be set in consideration of the commentator's speech pattern. For example, the video may be divided at regular time intervals, or chunks may be set on the basis of the commentator's utterance units.

In addition, each unit chunk can be analyzed and processed independently, so a video divided into a plurality of unit chunks can be processed efficiently using parallel processing techniques.

Step (c) is a step in which the video preprocessing unit (10) derives volume information of the commentary voice and semantic information of the commentary text included in the plurality of unit chunks.

The volume information of the commentary voice indicates how loud the commentator's voice is in each unit chunk, and it is an important indicator for identifying the point at which an event occurs in the sport.

The semantic information of the commentary text is obtained by analyzing what the commentator says, and it provides important information for identifying specific events or important moments. Such semantic information can be extracted using natural language processing technology, and may include information such as the keywords and sentence structure of the commentary text.

Step (d) is a step in which the video generation unit (20) extracts the point of occurrence of a game event based on the volume information and semantic information of the commentary voice included in the plurality of unit chunks.

As described above, if the volume of the commentary voice is high, or the commentary text contains content related to a specific event, the unit chunk in question may be judged to be a game event occurrence point.

For example, a part where the commentator's volume is high is likely to be an important moment in the game, so this step can use volume as a criterion for identifying the point where an important event occurs.

In addition, the game event occurrence point may be determined based on keywords related to a specific event, or by considering the keywords and the volume information together.

The keywords can be set freely by an administrator (e.g., goal, assist, hit, home run, the name of a specific player), and the administrator may also, as the occasion requires, set new keywords through a manual search.

The game event occurrence point can thus be identified in various ways, and this process is described in more detail later.

Step (e) is a step in which the video generation unit (20) determines the length of the main event video by taking the game event occurrence point as the midpoint and setting a start point before it and an end point after it.

The main event video is the core part of the sports highlight video, and its start point and end point can be set flexibly so that the context of the game is sufficiently conveyed. This is described later.

Step (f) is a step in which the video generation unit (20) generates a sports highlight video including the main event video.

Through this process, objective and consistent highlight videos are produced, excluding the subjective judgment involved in manual editing.

Each of the steps described above is explained in more detail below.

FIG. 3 is a diagram showing the detailed process of step (b) in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 3, step (b) may include steps (b-1) to (b-4).

Step (b-1) is a step in which the video preprocessing unit (10) sets, as a temporary chunk, the span from the point where any given utterance in the commentary voice included in the original sports video begins to the point where that utterance ends.

The reason for doing so is to prevent the commentary voice from being cut and split in the middle. If a single utterance is split in the middle, it is difficult to accurately grasp the intent, overall mood, and volume of the commentary voice.

Step (b-2) is a step in which the video preprocessing unit (10) analyzes whether the temporary chunk set in step (b-1) contains a point at which the speaking voice changes.

When there are multiple commentators, another commentator may start speaking immediately after one commentator finishes, with no gap in between. This step exists so that such a portion can be identified and split.

For example, this step may use voice frequency analysis, speaker distinction (Voice Activity Detection, VAD), pitch analysis, and similar techniques.

Step (b-3) is a step in which, if step (b-2) determines that the temporary chunk contains a point at which the speaking voice changes, the video preprocessing unit (10) divides the temporary chunk into a plurality of unit chunks at that point.

Step (b-4) is a step in which the video preprocessing unit (10) sets each of the silent portions lying between the temporary chunks set in step (b-1) as a unit chunk of its own.

Through the above process, the entire original sports video can be partitioned into a plurality of unit chunks separated by commentator utterance, and a plurality of unit chunks located between the unit chunks that contain commentator utterances.
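As an illustration only, steps (b-1) to (b-4) might be sketched as follows. The sketch assumes that utterance boundaries and speaker labels have already been obtained by some upstream tool and are given as sorted, non-overlapping `(start, end, speaker)` tuples in seconds. The names `Chunk` and `split_into_unit_chunks` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Chunk:
    start: float
    end: float
    speaker: Optional[str]  # None marks a silent gap between utterances


def split_into_unit_chunks(
    segments: List[Tuple[float, float, str]],
    video_length: float,
    tol: float = 1e-6,
) -> List[Chunk]:
    """Partition [0, video_length] into utterance chunks and gap chunks.

    segments: assumed sorted, non-overlapping (start, end, speaker) tuples.
    """
    chunks: List[Chunk] = []
    cursor = 0.0
    for start, end, speaker in segments:
        if start - cursor > tol:
            # (b-4): silence between utterances becomes its own unit chunk
            chunks.append(Chunk(cursor, start, None))
        prev = chunks[-1] if chunks else None
        if prev is not None and prev.speaker == speaker and abs(prev.end - start) <= tol:
            # (b-1): the same speaker continues without a gap, keep one chunk
            chunks[-1] = Chunk(prev.start, end, speaker)
        else:
            # (b-2)/(b-3): a new utterance or a change of voice starts a new chunk
            chunks.append(Chunk(start, end, speaker))
        cursor = end
    if video_length - cursor > tol:
        chunks.append(Chunk(cursor, video_length, None))
    return chunks


if __name__ == "__main__":
    demo = [(2.0, 5.0, "A"), (5.0, 7.0, "B"), (10.0, 12.0, "A")]
    for chunk in split_into_unit_chunks(demo, video_length=15.0):
        print(chunk)
```

In the demo input, commentator B starts speaking the moment commentator A stops, so the two back-to-back utterances are kept as separate unit chunks, and the silent stretches before, between, and after the utterances become unit chunks as well.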

FIG. 4 is a diagram showing the detailed process of step (c) in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 4, step (c) may include steps (c-1) and (c-2).

As described above, step (c) is a step in which the video preprocessing unit (10) derives volume information of the commentary voice and semantic information of the commentary text included in the plurality of unit chunks. First, in step (c-1), the video preprocessing unit (10) sets a loudness level for the commentary voice included in each unit chunk. The loudness level can be set in various ways.

For example, the video preprocessing unit (10) may set the loudness level of each unit chunk to the average value within that chunk. This is because the commentator's voice can vary widely even within a single unit chunk.
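One possible reading of the per-chunk average is sketched below, under assumptions that are not in the specification: the audio is available as mono floating-point samples in the range [-1, 1], loudness is measured as RMS level in dBFS over short frames, and the chunk value is the mean of the frame levels. The function names and the 50 ms frame length are hypothetical.

```python
import math
from typing import Sequence


def frame_level_db(frame: Sequence[float], floor_db: float = -100.0) -> float:
    """RMS level of one frame in dBFS, clamped to floor_db for silence."""
    if not frame:
        return floor_db
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    if rms <= 0.0:
        return floor_db
    return max(floor_db, 20.0 * math.log10(rms))


def chunk_loudness(
    samples: Sequence[float],
    sample_rate: int,
    start: float,
    end: float,
    frame_sec: float = 0.05,
) -> float:
    """Average of the frame levels between start and end (in seconds)."""
    lo, hi = int(start * sample_rate), int(end * sample_rate)
    frame_len = max(1, int(frame_sec * sample_rate))
    levels = [frame_level_db(samples[i:i + frame_len]) for i in range(lo, hi, frame_len)]
    return sum(levels) / len(levels) if levels else -100.0


if __name__ == "__main__":
    # One second of loud signal followed by one second of quiet signal
    audio = [0.5] * 1000 + [0.05] * 1000
    print(chunk_loudness(audio, 1000, 0.0, 1.0))
    print(chunk_loudness(audio, 1000, 1.0, 2.0))
```

Averaging frame levels rather than taking a single peak keeps one brief shout from dominating a long, otherwise calm chunk, which matches the stated motivation that the voice varies within a chunk.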

Next, step (c-2) is a step in which the video preprocessing unit (10) interprets the meaning of the commentary text included in each unit chunk.

This step analyzes what the commentator says in order to provide important information for identifying specific events or important moments. As mentioned above, the semantic information can be interpreted using natural language processing technology.

FIG. 5 is a diagram showing the detailed process of step (d) in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 5, step (d) may include steps (d-1) to (d-3).

Step (d-1) is a step in which the video generation unit (20) extracts the unit chunks whose commentary text refers to an event related to the sport.

Since the semantic information of each unit chunk was analyzed in step (c-2), this step selects, from among all unit chunks, those whose commentary text refers to an event related to the sport.

For example, if the sport is soccer, unit chunks carrying meanings such as 'goal', 'score', or 'goes in' may be extracted, and if the sport is baseball, unit chunks carrying meanings such as 'home run', 'hit', or 'strikeout' may be extracted. These can of course be preset in various ways depending on the sport.
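A minimal sketch of step (d-1) follows. It substitutes plain whole-word keyword matching for the natural language processing that the specification mentions, so it is a simplification rather than the described method. The keyword table reuses the examples from the text in English, and `EVENT_KEYWORDS` and `extract_event_chunks` are hypothetical names.

```python
import re
from typing import Dict, List

# Example keywords taken from the text; a real deployment would preset its own
EVENT_KEYWORDS: Dict[str, List[str]] = {
    "soccer": ["goal", "score", "goes in"],
    "baseball": ["home run", "hit", "strikeout"],
}


def extract_event_chunks(chunk_texts: List[str], sport: str) -> List[int]:
    """Return the indices of unit chunks whose commentary mentions an event keyword."""
    patterns = [
        # \b limits matches to whole words, so "hit" does not match "white"
        re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        for keyword in EVENT_KEYWORDS[sport]
    ]
    return [
        index
        for index, text in enumerate(chunk_texts)
        if any(pattern.search(text) for pattern in patterns)
    ]


if __name__ == "__main__":
    texts = ["He passes it wide", "What a GOAL by the striker!", "", "The white shirts press"]
    print(extract_event_chunks(texts, "soccer"))
```

Whole-word matching misses inflected forms such as "scores", which is one reason a semantic analysis of the kind the text describes would be preferable in practice.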

Step (d-2) is a step in which the video generation unit (20) determines whether the loudness level of the commentary voice included in each unit chunk extracted in step (d-1) is equal to or higher than a preset reference value.

In this step, only the unit chunks selected in step (d-1) are examined, and for each of them it is determined whether the loudness level of the commentary voice is equal to or higher than the reference value.

The reference value for the loudness level can be set in various ways.

For example, since the commentator's ordinary speaking volume tends to be the volume that appears most frequently over the course of a game, the reference value may be set to the loudness level that appears most frequently across all unit chunks.

Alternatively, the reference value may be the average loudness level obtained by aggregating the loudness levels of all unit chunks set in step (c-1).

Step (d-3) is a step in which the video generation unit (20) determines each unit chunk found in step (d-2) to have a loudness level equal to or higher than the preset reference value to be a game event occurrence point.

In other words, this step judges the parts where the commentator's volume is high relative to the reference value to be important moments in the game, and treats them as game event occurrence points. This is because the commentator's volume tends to rise when a game event occurs.
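The two reference values mentioned for step (d-2), the most frequent loudness level and the average loudness level, and the selection of step (d-3) might be sketched as follows. The binning used to find the most frequent level (1 dB bins by default) is an assumption, since the specification does not say how levels are grouped, and the function names are hypothetical.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List


def reference_loudness(levels: List[float], method: str = "mean", bin_width: float = 1.0) -> float:
    """Reference value computed from the loudness levels of all unit chunks."""
    if method == "mean":
        return mean(levels)
    if method == "mode":
        # Group levels into bins so that nearly equal values count as the same level
        bins = Counter(round(level / bin_width) for level in levels)
        most_common_bin, _count = bins.most_common(1)[0]
        return most_common_bin * bin_width
    raise ValueError(f"unknown method: {method}")


def select_event_chunks(candidates: List[int], loudness: Dict[int, float], method: str = "mean") -> List[int]:
    """Keep the keyword-matched chunks whose loudness is at or above the reference."""
    reference = reference_loudness(list(loudness.values()), method)
    return [index for index in candidates if loudness[index] >= reference]


if __name__ == "__main__":
    loudness_by_chunk = {0: -30.0, 1: -30.2, 2: -12.0, 3: -29.8, 4: -25.0}
    print(select_event_chunks([2, 3, 4], loudness_by_chunk, method="mean"))
    print(select_event_chunks([2, 3, 4], loudness_by_chunk, method="mode"))
```

With these sample values the mean is pulled upward by the single loud chunk, so it is the stricter of the two references, while the most frequent level sits at the commentator's ordinary volume and admits more chunks.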

FIG. 6 is a diagram showing the detailed process of step (e) in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 6, step (e) may include steps (e-1) to (e-4), and steps (e-3) and (e-4) may each be further divided into sub-steps.

Step (e) is the process of determining the main event video, which is the core part of the sports highlight video. It sets the start point and end point of the main event video so that the context of the game is sufficiently conveyed.

Step (e-1) is a step in which the video generation unit (20) sets a first preceding point and a first succeeding point, each at a preset time interval from the unit chunk set as the game event occurrence point.

This ensures that the sports highlight video includes both the scene that led up to the game event and the outcome that followed it, relative to the moment the game event occurred.

Step (e-2) is a step in which the video generation unit (20) determines whether the commentary voice is present at the first preceding point and at the first succeeding point set in step (e-1).

This step prevents the commentary voice from being cut off in the sports highlight video, so that the video plays smoothly. Depending on the result of this determination, the process proceeds to steps (e-3a) to (e-3c) or to steps (e-4a) to (e-4c) below.

Step (e-3a) is a step in which, if step (e-2) determines that no commentary voice is present at the first preceding point, the video generation unit (20) includes in the main event video the span from the first preceding point to the start point of the unit chunk set as the game event occurrence point.

That is, if no commentary voice is present at the first preceding point, that point is set as the start point of the main event video.

Step (e-3b) is a step in which, if step (e-2) determines that the commentary voice is present at the first preceding point, the video generation unit (20) moves the first preceding point to an earlier point according to a preset criterion.

That is, if the commentary voice is present at the first preceding point, the first preceding point can be reset to a point earlier than the current one. When the first preceding point is reset, the adjustment may use a time interval shorter than the interval used for the initially set first preceding point.

Step (e-3c) is a step in which, if step (e-3b) has been performed, steps (e-2) and (e-3b) are repeated up to n times until step (e-3a) is reached.

As a result, a point at which no commentary voice is present can ultimately be set as the start point of the main event video.

Step (e-4a) is a step in which, if step (e-2) determines that no commentary voice is present at the first succeeding point, the video generation unit (20) includes in the main event video the span from the end point of the unit chunk set as the game event occurrence point to the first succeeding point.

That is, if no commentary voice is present at the first succeeding point, that point is set as the end point of the main event video.

Step (e-4b) is a step in which, if step (e-2) determines that the commentary voice is present at the first succeeding point, the video generation unit (20) moves the first succeeding point to a later point according to a preset criterion.

That is, if the commentary voice is present at the first succeeding point, the first succeeding point can be reset to a point later than the current one. When the first succeeding point is reset, the adjustment may use a time interval shorter than the interval used for the initially set first succeeding point.

Step (e-4c) is a step in which steps (e-2) and (e-4b) are repeated up to n times until step (e-4a) is reached. This allows an optimal succeeding point to be set.

As a result, a point at which no commentary voice is present can ultimately be set as the end point of the main event video.

However, in steps (e-3a) to (e-3c) or steps (e-4a) to (e-4c), there may be cases where, even after steps (e-3c) and (e-4c) have been repeated multiple times, the commentary voice is still present at every point that was selected as the first preceding point or the first succeeding point.

In such a case, n is fixed in advance to a preset value. Once the number of repetitions reaches n, the candidate with the lowest commentary volume among all first preceding points (or first succeeding points) selected up to that time may be determined as the start point (or, respectively, the end point) of the main event video.
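The boundary search of step (e), including the fallback when every candidate point still contains speech, could be sketched as follows. The callbacks `is_speech` and `loudness`, the default offsets, and the clamping to the video bounds are assumptions made for illustration, and `find_boundary` is a hypothetical name.

```python
from typing import Callable, List


def find_boundary(
    anchor: float,
    direction: int,
    is_speech: Callable[[float], bool],
    loudness: Callable[[float], float],
    first_offset: float = 5.0,
    step: float = 1.0,
    max_retries: int = 5,
    lower: float = 0.0,
    upper: float = float("inf"),
) -> float:
    """Search for a cut point at which no commentary is being spoken.

    direction=-1 searches before the anchor (start point of the clip),
    direction=+1 searches after the anchor (end point of the clip).
    """
    def clamp(t: float) -> float:
        return min(max(t, lower), upper)

    # (e-1): first candidate at the preset interval from the event chunk
    t = clamp(anchor + direction * first_offset)
    tried: List[float] = []
    for _ in range(max_retries + 1):
        # (e-2): is the commentator speaking at this candidate?
        if not is_speech(t):
            return t  # (e-3a) / (e-4a): silent point found
        tried.append(t)
        # (e-3b) / (e-4b): move further out by a shorter interval
        t = clamp(t + direction * step)
    # Retry budget n exhausted: fall back to the quietest candidate tried
    return min(tried, key=loudness)


if __name__ == "__main__":
    speech_intervals = [(3.0, 9.5)]

    def speaking(t: float) -> bool:
        return any(start <= t < end for start, end in speech_intervals)

    # Event chunk assumed to span 10.0 s to 14.0 s
    print(find_boundary(10.0, -1, speaking, lambda t: 0.0))
    print(find_boundary(14.0, +1, speaking, lambda t: 0.0))
```

The same routine would serve for the second preceding and second succeeding points of the (add) step by passing the main event video's start and end as the anchor.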

Through step (e) above, the main event video of the sports highlight video can be determined. In the present embodiment, however, the sports highlight video may further include a pre-event video and a post-event video in addition to the main event video.

FIG. 7 is a diagram schematically showing the structure of a sports highlight video in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 7, the pre-event video is placed in front of the main event video so that it plays first. Its purpose is to include in the sports highlight video the lead-up to the main event video, so that the cause of the main event can be grasped more easily.

The post-event video is placed after the main event video so that it plays later. Its purpose is to include in the sports highlight video the situations that follow the main event video, so that the atmosphere after the main event can be grasped easily.

To this end, between step (e) described above and step (f) described below, a step (add) may further be performed in which the video generation unit (20) determines the length of at least one of the pre-event video, which plays before the main event video, and the post-event video, which plays after the main event video.

FIG. 8 is a diagram showing the detailed process of step (add) in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 8, step (add) may include steps (add-1) to (add-4), and steps (add-3) and (add-4) may each be further divided into sub-steps.

Step (add-1) is a step in which the video generation unit (20) sets a second preceding point and a second succeeding point, each at a preset time interval from the main event video.

Step (add-2) is a step in which the video generation unit (20) determines whether the commentary voice is present at the second preceding point and at the second succeeding point set in step (add-1).

Like step (e-2) described above, this step prevents the commentary voice from being cut off in the sports highlight video, so that the video plays smoothly. Depending on the result of this determination, the process proceeds to steps (add-3a) to (add-3c) or to steps (add-4a) to (add-4c) below.

Step (add-3a) is a step in which, if step (add-2) determines that no commentary voice is present at the second preceding point, the video generation unit (20) determines the span from the second preceding point to the start point of the main event video to be the pre-event video.

That is, if no commentary voice is present at the second preceding point, that point is set as the start point of the pre-event video.

Step (add-3b) is a step in which, if step (add-2) determines that the commentary voice is present at the second preceding point, the video generation unit (20) moves the second preceding point to an earlier point according to a preset criterion.

That is, if the commentary voice is present at the second preceding point, the second preceding point can be reset to a point earlier than the current one. When the second preceding point is reset, the adjustment may use a time interval shorter than the interval used for the initially set second preceding point.

Step (add-3c) is a step in which, if step (add-3b) has been performed, steps (add-2) and (add-3b) are repeated up to n times until step (add-3a) is reached.

As a result, a point at which no commentary voice is present can ultimately be set as the start point of the pre-event video.

Step (add-4a) is a step in which, if step (add-2) determines that no commentary voice is present at the second succeeding point, the video generation unit (20) determines the span from the end point of the main event video to the second succeeding point to be the post-event video.

That is, if no commentary voice is present at the second succeeding point, that point is set as the end point of the post-event video.

Step (add-4b) is a step in which, if step (add-2) determines that the commentary voice is present at the second succeeding point, the video generation unit (20) moves the second succeeding point to a later point according to a preset criterion.

That is, if the commentary voice is present at the second succeeding point, the second succeeding point can be reset to a point later than the current one. When the second succeeding point is reset, the adjustment may use a time interval shorter than the interval used for the initially set second succeeding point.

Step (add-4c) is a step in which steps (add-2) and (add-4b) are repeated up to n times until step (add-4a) is reached.

As a result, a point at which no commentary voice is present can ultimately be set as the end point of the post-event video.

Through steps (e) and (add) as described above, the total length of the sports highlight video can be determined.

Next, step (f) is a step in which the video generation unit (20) generates a sports highlight video including the main event video.

As described above, in this step the sports highlight video is generated so as to include the pre-event video and the post-event video together with the main event video.
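Because the pre-event video ends where the main event video starts and the post-event video starts where the main event video ends, the three parts form one contiguous span of the original video. The sketch below computes that span and builds a cutting command. Using the ffmpeg command-line tool is an assumption of this example and is not named in the specification; the helper names and file names are hypothetical, and the command is only constructed, not executed.

```python
from typing import List, Tuple


def highlight_span(pre_start: float, main_start: float, main_end: float, post_end: float) -> Tuple[float, float]:
    """Overall span covered by the pre-event, main event, and post-event videos."""
    if not (pre_start <= main_start < main_end <= post_end):
        raise ValueError("segment boundaries are out of order")
    return pre_start, post_end


def ffmpeg_cut_command(src: str, dst: str, start: float, end: float) -> List[str]:
    """Argument list for cutting [start, end] out of src without re-encoding."""
    return [
        "ffmpeg",
        "-i", src,
        "-ss", f"{start:.3f}",  # start time in seconds
        "-to", f"{end:.3f}",    # end time in seconds
        "-c", "copy",           # stream copy: fast, but cuts snap to keyframes
        dst,
    ]


if __name__ == "__main__":
    start, end = highlight_span(12.0, 20.0, 31.5, 40.0)
    print(" ".join(ffmpeg_cut_command("match.mp4", "highlight_001.mp4", start, end)))
```

If frame-accurate cuts matter more than speed, the stream copy option would be replaced with a re-encode, at the cost of processing time.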

Meanwhile, the present invention may further include step (ex1), and step (ex2), which is performed after step (f).

Step (ex1) is a process in which the memory unit (30) stores the plurality of data generated in steps (a) to (f) in a database, so that all data that may arise in any process of the present invention can be retained and managed.

Step (ex2) is a process in which the data verification unit (40) verifies whether the sports highlight video corresponds to a game event occurrence point of the sport.

This process may be performed to verify that the sports highlight video generated by all the processes described above actually corresponds to a point at which an event occurred in the game.

FIG. 9 is a diagram showing the detailed process of step (ex2) in a method for generating a sports highlight video according to one embodiment of the present invention.

As illustrated in FIG. 9, step (ex2) may include the following detailed processes.

Step (ex2-1) is a process in which the data verification unit (40) locates, in the original sports video, the time point corresponding to the sports highlight video.

In this process, the generated sports highlight video is compared against the original sports video to determine at what time point the event occurred.

Step (ex2-2) is a process in which the data verification unit (40) detects, in the original sports video, the scoreboard appearing within the frame at time points before and after the time point corresponding to the sports highlight video under verification. This process can be carried out using various video analysis techniques.

Here, the scoreboard is not limited to the physical scoreboard at the venue that appears within the frame. It may include anything capable of displaying the current score of the game, such as a graphic scoreboard overlaid on the broadcast picture.

Step (ex2-3) is a process in which the data verification unit (40) determines whether the score has changed between the scoreboards detected before and after the time point corresponding to the sports highlight video.

In this process, the change in score may be determined from the scoreboard information alone, but the method is not limited to this. A comprehensive determination that also considers the volume information and tone of the commentary voice, the semantic information of the commentary text, and the like may be applied.

(ex2-4)단계는, 데이터검증부(40)가 상기 (ex2-3)단계의 판단 결과 스코어의 변화가 검출된 경우, 해당 스포츠 하이라이트 영상을 정상 영상으로 판단하는 과정이다.Step (ex2-4) is a process in which the data verification unit (40) determines the corresponding sports highlight video as a normal video if a change in the score is detected as a result of the judgment in step (ex2-3).

Step (ex2-5) is a process in which, when no change in score is detected as a result of the determination in step (ex2-3), the data verification unit (40) determines the sports highlight video to be an abnormal video.

For example, assuming the sport is soccer, a sports highlight video may have been generated for the time point at which a goal was scored, yet the call may subsequently have been overturned due to offside, a foul, or the like.

Therefore, in step (ex2-5), when no change in score has occurred, the data verification unit (40) determines that the call for the event was overturned, and accordingly determines that the sports highlight video is an abnormal video that does not correspond to an actual event.

However, a sports highlight video determined to be abnormal in this way is not completely discarded. It may be treated as a temporarily stored video and stored with a marking indicating lower importance than a normal video.

Thereafter, temporarily stored videos may be used together with normal videos in content that requires a larger number of sports highlight videos.
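
The decision logic of steps (ex2-3) to (ex2-5), together with the lower-importance marking of abnormal videos, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the data verification unit (40): it assumes the scoreboard detection of step (ex2-2) has already produced a score reading before and after the highlight, and it considers scoreboard information only. The names `verify_highlight` and `VerifiedHighlight`, the `(home, away)` score tuple, and the numeric importance values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical score representation: (home, away).
Score = Tuple[int, int]


@dataclass
class VerifiedHighlight:
    highlight_id: str
    status: str        # "normal" or "abnormal"
    importance: float  # abnormal clips are kept, but marked as less important


def verify_highlight(highlight_id: str,
                     score_before: Score,
                     score_after: Score,
                     low_importance: float = 0.5) -> VerifiedHighlight:
    """Classify a highlight from the scores read before and after it."""
    # (ex2-3) determine whether the score changed across the highlight
    score_changed = score_before != score_after
    if score_changed:
        # (ex2-4) a score change means the event actually counted
        return VerifiedHighlight(highlight_id, "normal", 1.0)
    # (ex2-5) no change: the call is presumed overturned; keep the clip
    # as a temporarily stored video with lower importance (assumed value)
    return VerifiedHighlight(highlight_id, "abnormal", low_importance)


if __name__ == "__main__":
    print(verify_highlight("goal-17min", (0, 0), (1, 0)))
    print(verify_highlight("goal-52min", (1, 0), (1, 0)))
```

Keeping abnormal clips with a reduced importance value, rather than deleting them, mirrors the temporary-storage behavior described above and lets downstream content draw on them when more highlights are needed.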

Preferred embodiments of the present invention have been described above. It will be apparent to those skilled in the art that, in addition to the embodiments described above, the present invention may be embodied in other specific forms without departing from its spirit or scope. Therefore, the above-described embodiments are to be regarded as illustrative rather than restrictive, and accordingly the present invention is not limited to the above description but may be modified within the scope of the appended claims and their equivalents.

10: Video preprocessing unit
20: Video generation unit
30: Memory unit
40: Data verification unit

Claims

A method for generating a sports highlight video, the method comprising:
(a) a step in which a video preprocessing unit receives an original sports video for generating a highlight video and commentary text of the original sports video;
(b) a step in which the video preprocessing unit divides the entire length of the original sports video into a plurality of unit chunks according to an arbitrarily set criterion;
(c) a step in which the video preprocessing unit derives volume information of the commentary voice and semantic information of the commentary text included in the plurality of unit chunks;
(d) a step in which a video generation unit extracts a game event occurrence point based on the volume information and semantic information of the commentary voice included in the plurality of unit chunks;
(e) a step in which the video generation unit determines the length of a main event video by setting a start point and an end point before and after the game event occurrence point, taken as a midpoint; and
(f) a step in which the video generation unit generates a sports highlight video including the main event video,
wherein step (e) comprises:
(e-1) a step in which the video generation unit sets a first preceding time point and a first succeeding time point at a preset time interval from the unit chunk set as the game event occurrence point;
(e-2) a step in which the video generation unit determines whether commentary voice occurs at the first preceding time point and the first succeeding time point;
(e-3a) a step in which, when the video generation unit determines in step (e-2) that no commentary voice occurs at the first preceding time point, the span from the first preceding time point to the start point of the unit chunk set as the game event occurrence point is included in the main event video; and
(e-4a) a step in which, when the video generation unit determines in step (e-2) that no commentary voice occurs at the first succeeding time point, the span from the end point of the unit chunk set as the game event occurrence point to the first succeeding time point is included in the main event video,
such that the cause of the game event, appearing from the first preceding time point to the start point of the unit chunk set as the game event occurrence point, and the result of the game event, appearing from the end point of that unit chunk to the first succeeding time point, are included in the main event video together with the unit chunk set as the game event occurrence point.

The method of claim 1, wherein step (b) comprises:
(b-1) a step in which the video preprocessing unit sets, as a temporary chunk, the span from the point at which any utterance of the commentary voice included in the original sports video starts to the point at which the utterance ends;
(b-2) a step in which the video preprocessing unit analyzes whether the temporary chunk set in step (b-1) contains a point at which the voice of the commentary changes;
(b-3) a step in which, when the video preprocessing unit determines in step (b-2) that the temporary chunk contains a point at which the voice of the commentary changes, the temporary chunk is divided into a plurality of unit chunks at that point; and
(b-4) a step in which the video preprocessing unit sets each blank portion between the temporary chunks set in step (b-1) as a unit chunk.
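
Steps (b-1) to (b-4) can be illustrated with the sketch below. It is only an illustration under stated assumptions and not the claimed implementation: utterance boundaries and voice-change timestamps are assumed to be supplied by an upstream analysis that is not shown, utterances are assumed sorted and non-overlapping, and only blank portions between temporary chunks (not before the first or after the last) become unit chunks. The function name `split_into_unit_chunks` and the `(start_sec, end_sec)` span representation are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical span representation: (start_sec, end_sec).
Span = Tuple[float, float]


def split_into_unit_chunks(utterances: Sequence[Span],
                           voice_changes: Sequence[float]) -> List[Span]:
    """Return unit chunks in time order.

    utterances    -- temporary chunks, one per utterance (b-1);
                     assumed sorted and non-overlapping
    voice_changes -- timestamps at which the commentary voice changes (b-2);
                     assumed to be given by an upstream analysis
    """
    chunks: List[Span] = []
    prev_end = None
    for start, end in utterances:
        # (b-4) a blank portion between two temporary chunks is a unit chunk
        if prev_end is not None and start > prev_end:
            chunks.append((prev_end, start))
        # (b-3) divide the temporary chunk at each voice change strictly inside it
        cuts = sorted(t for t in voice_changes if start < t < end)
        edges = [start, *cuts, end]
        chunks.extend(zip(edges[:-1], edges[1:]))
        prev_end = end
    return chunks


if __name__ == "__main__":
    # Two utterances with a 2-second gap; the voice changes at t=2.0.
    print(split_into_unit_chunks([(0.0, 4.0), (6.0, 10.0)], [2.0]))
```

A temporary chunk without any internal voice change passes through unchanged as a single unit chunk.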

The method of claim 1, wherein step (c) comprises:
(c-1) a step in which the video preprocessing unit sets a volume level for the commentary voice included in each unit chunk; and
(c-2) a step in which the video preprocessing unit interprets the meaning of the commentary text included in each unit chunk.

The method of claim 3, wherein step (d) comprises:
(d-1) a step in which the video generation unit extracts a unit chunk containing commentary text that signifies an event related to the sport;
(d-2) a step in which the video generation unit determines whether the volume level of the commentary voice included in the unit chunk extracted in step (d-1) is higher than a preset reference value; and
(d-3) a step in which the video generation unit determines a unit chunk whose volume level is found in step (d-2) to be higher than the preset reference value to be the game event occurrence point.
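
The two-stage filter of steps (d-1) to (d-3) can be sketched as follows. This is a hedged illustration only: simple keyword matching stands in for the semantic interpretation of the commentary text, which the claims do not specify, and the `UnitChunk` fields, the function name `find_event_chunks`, the keyword set, and the volume scale are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class UnitChunk:
    start: float   # seconds
    end: float     # seconds
    text: str      # commentary text for this chunk
    volume: float  # volume level of the commentary voice (assumed 0..1 scale)


def find_event_chunks(chunks: Iterable[UnitChunk],
                      event_keywords: Set[str],
                      volume_threshold: float) -> List[UnitChunk]:
    """Return the unit chunks treated as game event occurrence points."""
    events: List[UnitChunk] = []
    for chunk in chunks:
        text = chunk.text.lower()
        # (d-1) keep chunks whose text refers to an event of the sport;
        # keyword matching is an assumed stand-in for semantic analysis
        if not any(keyword in text for keyword in event_keywords):
            continue
        # (d-2)/(d-3) keep only chunks louder than the preset reference value
        if chunk.volume > volume_threshold:
            events.append(chunk)
    return events


if __name__ == "__main__":
    sample = [
        UnitChunk(10.0, 12.0, "What a GOAL!", 0.9),
        UnitChunk(30.0, 33.0, "Goal kick, taken calmly.", 0.3),
        UnitChunk(50.0, 52.0, "Lovely weather tonight!", 0.95),
    ]
    for hit in find_event_chunks(sample, {"goal"}, volume_threshold=0.7):
        print(hit.start, hit.end, hit.text)
```

Requiring both conditions filters out quiet mentions of event words (the goal kick) as well as loud but irrelevant commentary (the weather remark).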

(Deleted)

The method of claim 1, wherein step (e) further comprises:
(e-3b) a step in which, when the video generation unit determines in step (e-2) that commentary voice occurs at the first preceding time point, the first preceding time point is reset to an earlier time point according to a preset criterion; and
(e-3c) a step of repeating steps (e-2) and (e-3b) n times until step (e-3a) is reached.

The method of claim 1, wherein step (e) further comprises:
(e-4b) a step in which, when the video generation unit determines in step (e-2) that commentary voice occurs at the first succeeding time point, the first succeeding time point is reset to a later time point according to a preset criterion; and
(e-4c) a step of repeating steps (e-2) and (e-4b) n times until step (e-4a) is reached.
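
The boundary search of steps (e-1) to (e-4c) can be sketched as below. This is a minimal illustration under stated assumptions rather than the claimed implementation: commentary presence is supplied as a callable `has_commentary(t)`, the "preset criterion" for moving a boundary is assumed to be a fixed step in seconds, and the search stops after `max_retries` moves even if commentary is still present. The function name `main_event_span` and all parameter names and default values are hypothetical.

```python
from typing import Callable, Tuple


def main_event_span(chunk_start: float,
                    chunk_end: float,
                    has_commentary: Callable[[float], bool],
                    interval: float = 3.0,
                    step: float = 1.0,
                    max_retries: int = 10,
                    video_end: float = float("inf")) -> Tuple[float, float]:
    """Return (start, end) of the main event video around an event chunk."""
    # (e-1) first preceding / succeeding time points at a preset interval
    start = max(0.0, chunk_start - interval)
    end = min(video_end, chunk_end + interval)

    # (e-2), (e-3b), (e-3c): move the start earlier while commentary is heard
    for _ in range(max_retries):
        if not has_commentary(start) or start <= 0.0:
            break  # (e-3a) silent point found, or beginning of video reached
        start = max(0.0, start - step)

    # (e-2), (e-4b), (e-4c): move the end later while commentary is heard
    for _ in range(max_retries):
        if not has_commentary(end) or end >= video_end:
            break  # (e-4a) silent point found, or end of video reached
        end = min(video_end, end + step)

    return start, end


if __name__ == "__main__":
    # Hypothetical speech spans surrounding an event chunk at 10.0-12.0 s.
    speech = [(5.0, 8.5), (10.0, 12.0), (14.5, 16.0)]

    def talking(t: float) -> bool:
        return any(s <= t < e for s, e in speech)

    print(main_event_span(10.0, 12.0, talking))
```

Moving each boundary until it lands in silence avoids cutting a clip in the middle of an utterance, so the lead-up and aftermath of the event are kept whole.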

The method of claim 1, wherein, between steps (e) and (f), the video generation unit further performs an (add) step of determining the length of at least one of a pre-event video played before the main event video and a post-event video played after the main event video.

The method of claim 8, wherein the (add) step comprises:
(add-1) a step in which the video generation unit sets a second preceding time point and a second succeeding time point at a preset time interval from the main event video;
(add-2) a step in which the video generation unit determines whether commentary voice occurs at the second preceding time point and the second succeeding time point;
(add-3a) a step in which, when the video generation unit determines in step (add-2) that no commentary voice occurs at the second preceding time point, the span from the second preceding time point to the start point of the main event video is determined to be the pre-event video; and
(add-4a) a step in which, when the video generation unit determines in step (add-2) that no commentary voice occurs at the second succeeding time point, the span from the end point of the main event video to the second succeeding time point is determined to be the post-event video.

The method of claim 9, wherein the (add) step further comprises:
(add-3b) a step in which, when the video generation unit determines in step (add-2) that commentary voice occurs at the second preceding time point, the second preceding time point is reset to an earlier time point according to a preset criterion; and
(add-3c) a step of repeating steps (add-2) and (add-3b) n times until step (add-3a) is reached.

The method of claim 9, wherein the (add) step further comprises:
(add-4b) a step in which, when the video generation unit determines in step (add-2) that commentary voice occurs at the second succeeding time point, the second succeeding time point is reset to a later time point according to a preset criterion; and
(add-4c) a step of repeating steps (add-2) and (add-4b) n times until step (add-4a) is reached.

The method of claim 8, wherein, in step (f), the sports highlight video is generated so as to include the pre-event video and the post-event video together with the main event video.

The method of claim 1, further comprising (ex1) a step in which a memory unit stores, in a database, a plurality of data generated in steps (a) to (f).

The method of claim 1, further comprising, after step (f), (ex2) a step in which a data verification unit verifies whether the sports highlight video matches the point at which a game event of the sport occurred.