
TWI894591B - Sound ray configuration method, system and computer program product - Google Patents

Sound ray configuration method, system and computer program product

Info

Publication number
TWI894591B
Authority
TW
Taiwan
Prior art keywords
voice
data
character data
character
sound ray
Prior art date
Application number
TW112128777A
Other languages
Chinese (zh)
Other versions
TW202507709A (en)
Inventor
梁家瑞
Original Assignee
聯經數位股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 聯經數位股份有限公司 filed Critical 聯經數位股份有限公司
Priority to TW112128777A priority Critical patent/TWI894591B/en
Publication of TW202507709A publication Critical patent/TW202507709A/en
Application granted granted Critical
Publication of TWI894591B publication Critical patent/TWI894591B/en

Links

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

A voice configuration method implemented by a voice configuration system includes: (A) obtaining a plurality of character data entries and a plurality of voice groups, each voice group including a plurality of voice data entries respectively corresponding to a plurality of specific voices, the voice data entries of each voice group including an initial voice data entry and one or more derived voice data entries generated at least by performing audio frequency range adjustment on the initial voice data entry; and (B) for each character data entry, selecting from the voice groups a matching voice group that corresponds to the same character type as the character data entry, and setting one of the voice data entries of the matching voice group as a paired voice data entry corresponding to the character data entry.

Description

Sound ray configuration method, system and computer program product

The present invention relates to a voice configuration method, and more particularly to a voice configuration method suitable for assigning appropriate voices to multiple characters. The present invention also relates to a voice configuration system suitable for assigning appropriate voices to multiple characters, and to a computer program product.

In dubbing work, the lines of different characters are usually interpreted with different voices, and besides dubbing by human performers, dubbing with computer-synthesized speech (i.e., virtual voices) has become another feasible approach.

Existing computer speech synthesis software often provides a variety of virtual voices to choose from. However, to make the dubbing result sound more natural, voices whose accent or timbre is too jarring should be avoided when dubbing a character; under this consideration, the suitable voice options that speech synthesis software can provide may become quite limited.

Therefore, when many characters require dubbing but suitable voice options are limited, how to make better use of computer-synthesized speech to complete the dubbing becomes a topic worth exploring.

Accordingly, one object of the present invention is to provide a voice configuration method that helps improve the dubbing of multiple characters.

The voice configuration method of the present invention is implemented by a voice configuration system and includes: (A) obtaining voice requirement information and a voice data set, wherein the voice requirement information includes a plurality of character data entries respectively corresponding to a plurality of characters, each character data entry corresponding to one of a plurality of character types, and the voice data set includes a plurality of voice groups, each voice group corresponding to one of the character types and including a plurality of voice data entries, the voice data entries of each voice group respectively corresponding to a plurality of specific voices and including an initial voice data entry and one or more derived voice data entries generated at least by performing audio frequency range adjustment on the initial voice data entry; and (B) for each character data entry, selecting from the voice groups a matching voice group that corresponds to the same character type as the character data entry, and setting one of the voice data entries of the matching voice group as a paired voice data entry corresponding to the character data entry, thereby assigning the specific voice corresponding to the paired voice data entry to the character corresponding to that character data entry.
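As an illustration only (not taken from the patent text), steps (A) and (B) can be sketched roughly as follows; all names and data shapes here are hypothetical:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class VoiceData:
    voice_id: str          # identifies one specific voice
    derived: bool = False  # True if produced by audio-range adjustment

@dataclass
class VoiceGroup:
    character_type: str        # e.g. "adult male"
    voices: List[VoiceData]    # one initial entry plus derived entries

def assign_voices(characters, voice_groups):
    """Step (B): for each (name, character_type) pair, find a voice group
    of the same character type and pair the character with one of its
    voices (here simply the first; the patent refines this choice)."""
    pairing = {}
    for name, ctype in characters:
        group = next(g for g in voice_groups if g.character_type == ctype)
        pairing[name] = group.voices[0].voice_id
    return pairing
```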

In some embodiments of the voice configuration method of the present invention, in step (A), one or more of the character data entries each include a voice feature tag related to the audio frequency range, and each such entry is treated as important character data. In step (B), if a character data entry is important character data, the voice configuration system, after selecting the matching voice group from the voice groups, sets the voice data entry of the matching voice group that matches the voice feature tag of the important character data as the paired voice data entry corresponding to that important character data.
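A minimal sketch of the tag-matching rule, assuming each voice data entry carries a pitch label (field names hypothetical, and the fallback to the first entry is an assumption of this sketch, not specified by the patent):

```python
def pick_by_tag(group_voices, tag):
    """Within the matching voice group, return the voice data entry whose
    pitch label equals the character's voice feature tag; fall back to
    the first entry when no label matches."""
    for voice in group_voices:
        if voice.get("pitch") == tag:
            return voice
    return group_voices[0]
```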

In some embodiments of the voice configuration method of the present invention, in step (B), for multiple important character data entries corresponding to the same character type, if multiple ones of the voice groups correspond to that same character type, the voice configuration system selects the paired voice data entries respectively corresponding to those important character data entries from those voice groups in turn.
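The in-turn selection described above amounts to a round-robin over the same-type voice groups; a sketch (names hypothetical):

```python
from itertools import cycle

def round_robin_assign(character_names, matching_groups):
    """Rotate over the matching voice groups so that same-type characters
    draw their paired voices from different groups in turn."""
    rotation = cycle(matching_groups)
    return {name: next(rotation) for name in character_names}
```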

In some embodiments of the voice configuration method of the present invention, in step (A), at least one of the character data entries is treated as main character data, and the voice group used to determine the paired voice data entry corresponding to the main character data is treated as a protagonist voice group. In step (B), for multiple important character data entries corresponding to the same character type, if multiple ones of the voice groups correspond to that same character type but those voice groups include the protagonist voice group, the voice configuration system does not select the paired voice data entries for those important character data entries from the protagonist voice group.
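The protagonist-group exclusion can be sketched as a simple filter on the candidate groups; the fallback when no other group remains is an assumption of this sketch:

```python
def candidate_groups(same_type_groups, protagonist_group):
    """Exclude the protagonist's voice group when choosing paired voices
    for important supporting characters of the same character type."""
    pool = [g for g in same_type_groups if g != protagonist_group]
    return pool if pool else same_type_groups
```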

In some embodiments of the voice configuration method of the present invention, in step (A), one or more of the character data entries that do not include a voice feature tag are each treated as secondary character data. In step (B), for multiple secondary character data entries corresponding to the same character type, if multiple ones of the voice groups correspond to that same character type, the voice configuration system selects the paired voice data entries respectively corresponding to those secondary character data entries from those voice groups in turn.

In some embodiments of the voice configuration method of the present invention, the voice configuration method further includes, after step (B): (C) for one examined character data entry among the character data entries and the paired voice data entry it currently corresponds to, determining, based on text data corresponding to the voice requirement information, whether a voice conflict condition is met, wherein the text data includes a plurality of dialogue portions, each dialogue portion corresponding to one of the character data entries and to the paired voice data entry that character data entry currently corresponds to, each dialogue portion corresponding to the examined character data is treated as an examined dialogue portion, and the voice conflict condition includes: among the dialogue portions there exists a neighboring dialogue portion that meets a proximity condition and whose corresponding character data is not the examined character data, and the paired voice data entry corresponding to the neighboring dialogue portion is the same as the paired voice data entry corresponding to the examined character data, wherein, for each dialogue portion other than the examined dialogue portion(s), the proximity condition means that the number of characters separating that dialogue portion from any examined dialogue portion in the text data is less than or equal to a preset character count threshold; and (D) outputting a voice conflict prompt indicating the examined character data when the voice conflict condition is determined to be met.
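The conflict check in steps (C) and (D) can be sketched as follows, with each dialogue portion reduced to a (character, voice, character-offset) triple; all names are hypothetical:

```python
def has_voice_conflict(dialogues, examined, threshold):
    """dialogues: list of (character, voice_id, start_offset) in text order.
    A conflict exists when a line of another character uses the same voice
    within `threshold` characters of any line of the examined character."""
    examined_lines = [d for d in dialogues if d[0] == examined]
    if not examined_lines:
        return False
    voice = examined_lines[0][1]  # the examined character's paired voice
    for character, voice_id, offset in dialogues:
        if character == examined or voice_id != voice:
            continue
        if any(abs(offset - e[2]) <= threshold for e in examined_lines):
            return True
    return False
```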

In some embodiments of the voice configuration method of the present invention, in step (D), when the voice conflict condition is determined to be met, the voice configuration system first updates the voice data entry corresponding to the examined character data, and outputs the voice conflict prompt only after the voice data entry corresponding to the examined character data has been updated and the voice conflict condition has been determined to be met again.
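The update-then-recheck behavior is a small piece of control flow; a sketch with the conflict check and the reassignment passed in as callables (both hypothetical):

```python
def resolve_or_prompt(conflict_found, reassign_voice):
    """Reassign the examined character's voice once when a conflict is
    detected; emit the conflict prompt only if the conflict persists
    after the update."""
    if not conflict_found():
        return None
    reassign_voice()
    return "voice conflict" if conflict_found() else None
```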

Another object of the present invention is to provide a voice configuration system that helps improve the dubbing of multiple characters.

The voice configuration system of the present invention includes a processing unit and a storage unit electrically connected to the processing unit, wherein the processing unit is configured to: obtain voice requirement information and a voice data set, the voice requirement information including a plurality of character data entries respectively corresponding to a plurality of characters, each character data entry corresponding to one of a plurality of character types, and the voice data set including a plurality of voice groups, each voice group corresponding to one of the character types and including a plurality of voice data entries, the voice data entries of each voice group respectively corresponding to a plurality of specific voices and including an initial voice data entry and one or more derived voice data entries generated at least by performing audio frequency range adjustment on the initial voice data entry; and, for each character data entry, select from the voice groups a matching voice group that corresponds to the same character type as the character data entry, and set one of the voice data entries of the matching voice group as a paired voice data entry corresponding to the character data entry, thereby assigning the specific voice corresponding to the paired voice data entry to the character corresponding to that character data entry.

In some embodiments of the voice configuration system of the present invention, one or more of the character data entries each include a voice feature tag related to the audio frequency range, and each such entry is treated as important character data. If a character data entry is important character data, the processing unit, after selecting the matching voice group from the voice groups, sets the voice data entry of the matching voice group that matches the voice feature tag of the important character data as the paired voice data entry corresponding to that important character data.

In some embodiments of the voice configuration system of the present invention, for multiple important character data entries corresponding to the same character type, if multiple ones of the voice groups correspond to that same character type, the processing unit selects the paired voice data entries respectively corresponding to those important character data entries from those voice groups in turn.

In some embodiments of the voice configuration system of the present invention, at least one of the character data entries is treated as main character data, and the voice group used to determine the paired voice data entry corresponding to the main character data is treated as a protagonist voice group. For multiple important character data entries corresponding to the same character type, if multiple ones of the voice groups correspond to that same character type but those voice groups include the protagonist voice group, the processing unit does not select the paired voice data entries for those important character data entries from the protagonist voice group.

In some embodiments of the voice configuration system of the present invention, one or more of the character data entries that do not include a voice feature tag are each treated as secondary character data. For multiple secondary character data entries corresponding to the same character type, if multiple ones of the voice groups correspond to that same character type, the processing unit selects the paired voice data entries respectively corresponding to those secondary character data entries from those voice groups in turn.

In some embodiments of the voice configuration system of the present invention, the processing unit is further configured to: after setting the paired voice data entry corresponding to each character data entry, for one examined character data entry among the character data entries and the paired voice data entry it currently corresponds to, determine, based on text data corresponding to the voice requirement information, whether a voice conflict condition is met, wherein the text data includes a plurality of dialogue portions, each dialogue portion corresponding to one of the character data entries and to the paired voice data entry that character data entry currently corresponds to, each dialogue portion corresponding to the examined character data is treated as an examined dialogue portion, and the voice conflict condition includes: among the dialogue portions there exists a neighboring dialogue portion that meets a proximity condition and whose corresponding character data is not the examined character data, and the paired voice data entry corresponding to the neighboring dialogue portion is the same as the paired voice data entry corresponding to the examined character data, wherein, for each dialogue portion other than the examined dialogue portion(s), the proximity condition means that the number of characters separating that dialogue portion from any examined dialogue portion in the text data is less than or equal to a preset character count threshold; and output a voice conflict prompt indicating the examined character data when the voice conflict condition is determined to be met.

In some embodiments of the voice configuration system of the present invention, when the voice conflict condition is determined to be met, the processing unit first updates the voice data entry corresponding to the examined character data, and outputs the voice conflict prompt only after the voice data entry corresponding to the examined character data has been updated and the voice conflict condition has been determined to be met again.

A further object of the present invention is to provide a computer program product that helps improve the dubbing of multiple characters.

The computer program product of the present invention includes an application program which, when loaded and executed by an electronic device, causes the electronic device to implement the voice configuration method described in any of the foregoing embodiments.

The effect of the present invention is that, when the number of voices with suitable accent and timbre is limited, the voice configuration system can use a suitable specific voice (i.e., the specific voice corresponding to an initial voice data entry) as a basis to expand into more specific voices with suitable accent and timbre (i.e., the specific voices corresponding to the derived voice data entries), thereby increasing the number of suitable voices available for selection, which helps achieve a better dubbing result when suitable voices are scarce.
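As a rough illustration of how one initial voice could be expanded by audio-range adjustment, the sketch below shifts a pitch parameter while keeping all other settings; the semitone-style shift parameterization and field names are assumptions of this sketch, not the patent's:

```python
def derive_voices(initial, shifts=(-2, 2)):
    """Expand one suitable voice into several by shifting its pitch
    range while keeping its accent and timbre settings unchanged."""
    derived = []
    for s in shifts:
        v = dict(initial)  # copy all voice settings
        v["pitch_shift"] = initial.get("pitch_shift", 0) + s
        v["voice_id"] = f"{initial['voice_id']}{s:+d}"
        derived.append(v)
    return [initial] + derived
```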

Before the present invention is described in detail, it should be noted that, unless otherwise defined, the term "electrically connected" in this specification describes a "coupled" relationship between pieces of computer hardware (e.g., electronic systems, devices, apparatuses, units, components), and broadly covers both "wired electrical connections", in which pieces of computer hardware are physically connected to one another through conductive/semiconductor materials, and "wireless electrical connections", in which wireless data transmission is achieved through wireless communication technologies (such as, but not limited to, wireless networks, Bluetooth, and electromagnetic induction). Furthermore, unless otherwise defined, "electrically connected" also broadly covers "direct electrical connections", in which pieces of computer hardware are directly coupled to one another, and "indirect electrical connections", in which pieces of computer hardware are indirectly coupled through other computer hardware. In addition, the term "voice" in this specification refers to the overall auditory character of speech, and covers acoustic features such as the audio frequency (also called pitch) range, timbre, accent (also called intonation), and speaking rate of the speech, where "timbre" is a sensory attribute of sound that allows a listener to distinguish two sounds having the same loudness and pitch.

Referring to FIG. 1, an embodiment of the voice configuration system 1 of the present invention is implemented as, for example, a computer device (e.g., a desktop computer, a laptop computer, or a server), and the voice configuration system 1 includes a processing unit 11 and a storage unit 12 electrically connected to the processing unit 11. More specifically, in this embodiment, the processing unit 11 is a processor implemented as an integrated circuit and capable of data computation and instruction transmission and reception, and the storage unit 12 is a data storage device for storing digital data (e.g., a hard drive or another kind of computer-readable recording medium). However, in similar implementations, the processing unit 11 may also be a collection of multiple processors or a processing circuit that includes a processor, and the storage unit 12 may also be a collection of multiple storage devices of the same or different kinds. Moreover, in other embodiments, the voice configuration system 1 may be implemented as a different type of electronic device, such as a tablet computer or a smartphone. Accordingly, the actual hardware implementation of the voice configuration system 1 is not limited to this embodiment.

In this embodiment, the storage unit 12 stores, for example, a virtual voice database DB, which is created, for example, by the processing unit 11 running computer speech synthesis software, and which includes, for example, a plurality of voice profiles respectively corresponding to a variety of virtual voices. Specifically, each voice profile allows the processing unit 11 to control a speaker (not shown) to produce the virtual voice corresponding to that voice profile; therefore, by pairing a text string with a voice profile, the processing unit 11 can control the speaker to play the text string in the virtual voice corresponding to that voice profile, thereby producing computer-synthesized speech output.
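The pairing of a text string with a voice profile can be pictured as a profile lookup plus a call into a synthesis backend; the profile fields, the sample database, and the backend callable below are all hypothetical:

```python
VOICE_DB = {
    "m1": {"pitch": 1.0, "rate": 1.0, "timbre": "warm"},
    "f1": {"pitch": 1.4, "rate": 1.1, "timbre": "bright"},
}

def render_line(text, voice_id, synthesize):
    """Look up the voice profile and hand the text plus its settings to a
    synthesis backend supplied by the caller."""
    return synthesize(text, VOICE_DB[voice_id])
```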

It is further noted that both the computer speech synthesis software and the voice profiles can be implemented with existing technology, so their details are not elaborated here.

With reference to FIG. 2, the following describes in detail, by way of example, how the voice configuration system 1 of this embodiment implements a voice configuration method.

First, in step S1, the processing unit 11 obtains text data and voice requirement information D1 (shown in FIG. 3) corresponding to the text data.

In this embodiment, the text data is, for example, the text file of a novel, and serves as the target to be dubbed. The text data indicates a plurality of characters and includes a plurality of dialogue portions, where each dialogue portion is the dialogue or monologue of one of the characters and includes one or more sentences in natural language.
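For illustration, a dialogue portion could be pulled from such a text file with a pattern like the following; the `Name: "line"` convention is purely an assumption of this sketch (real novels require the semantic analysis this embodiment performs with a language model):

```python
import re

def extract_dialogues(text):
    """Treat lines shaped like `Name: "..."` as dialogue portions and
    record (speaker, line, character offset in the text)."""
    pattern = re.compile(r'^(\w+):\s*"([^"]+)"', re.M)
    return [(m.group(1), m.group(2), m.start()) for m in pattern.finditer(text)]
```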

As shown in FIG. 3, the voice requirement information D1 includes a plurality of character data entries 10, which respectively correspond to the characters indicated by the text data.

Each character data entry 10 includes a character tag 101, which indicates the character type corresponding to that entry, such as, but not limited to, "adult male", "adult female", "young male", and "young female" as shown in FIG. 3. It should be added that, depending on the content of the text data, multiple character data entries 10 may correspond to the same character type; for example, character data 10A, character data 10D, and character data 10E shown in FIG. 3 all correspond to the "adult male" character type.

In this embodiment, multiple character data entries 10 (e.g., character data 10A through 10E shown in FIG. 3) each further include a voice feature tag 102. For each character data entry 10 that includes a voice feature tag 102, the tag indicates which pitch of voice within the corresponding character type is suitable for dubbing the character corresponding to that entry. Taking character data 10B in FIG. 3 as an example, it corresponds to the "adult female" character type and its voice feature tag 102 indicates "high pitch", meaning that the corresponding character is suitable for being dubbed with a "high pitch" voice of the "adult female" character type.

In this embodiment, one of the character data entries 10 (e.g., character data 10A in FIG. 3) is treated as main character data 10*; in other words, the character corresponding to the main character data 10* serves as a protagonist (e.g., the male lead, but not limited thereto) in the text data. Among the remaining character data entries 10, each entry that includes a voice feature tag 102 (e.g., character data 10B through 10E in FIG. 3) is treated as important character data 10', meaning that the corresponding characters serve as important supporting roles in the text data, while each entry that does not include a voice feature tag 102 (e.g., character data 10F through 10J in FIG. 3) is treated as secondary character data 10", meaning that the corresponding characters serve as minor supporting roles.
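The three-way split of character data in this embodiment can be sketched as follows (field names hypothetical):

```python
def classify(character):
    """Main character data is flagged explicitly; among the rest, entries
    carrying a voice feature tag are important, untagged ones secondary."""
    if character.get("is_main"):
        return "main"
    return "important" if "voice_tag" in character else "secondary"
```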

It should be noted that, depending on the actual content of the text data, in other implementations of this embodiment only a single character data entry 10 may include a voice feature tag 102, and more than one character data entry 10 may be treated as main character data 10*. It should be understood that FIG. 3 merely shows one exemplary form of the voice requirement information D1, so the actual form of the voice requirement information D1 is certainly not limited to this embodiment.

In addition, in this embodiment, the processing unit 11, for example, first obtains the text data and then uses a language model to segment the text data, split it into sentences, and perform natural language understanding, so as to derive the characters indicated by the text data and the voice qualities suitable for each character, thereby generating the voice requirement information D1. In other words, in this embodiment, the character data entries 10 and the character tags 101 and voice feature tags 102 they include are determined by the processing unit 11 through semantic analysis of the text data. Optionally, however, the voice requirement information D1 may also be generated by the processing unit 11 through semantic analysis of the text data combined with manual editing by a user. Moreover, in other embodiments, the processing unit 11 may receive the text data and the voice requirement information D1 from another electronic device in advance and store them in the storage unit 12, and then read the storage unit 12 when the voice configuration method starts in order to obtain them; therefore, the voice requirement information D1 is not necessarily generated by the processing unit 11, and the way the processing unit 11 obtains it is not limited to this embodiment.

After the processing unit 11 obtains the text data and the voice requirement information D1, the flow proceeds to step S2.

在步驟S2中,該處理單元11根據該虛擬人聲資料庫DB獲得一聲線資料組合D2(示於圖4)。In step S2, the processing unit 11 obtains a sound ray data set D2 (shown in FIG4 ) based on the virtual human voice database DB.

如圖4所示,該聲線資料組合D2包含多個聲線群組20(在本實施例中以六個為例,但並不以此為限),其中,每一聲線群組20對應於該等角色類型的其中一者,例如但不限於前述之「成年男」、「成年女」、「幼年男」及「幼年女」的其中一者。在本實施例中,該聲線資料組合D2中的該等聲線群組20要對應哪些角色類型可例如是預先設定好的,但在其他實施例中也可以是由該處理單元11根據該等角色資料10的角色標記101所即時決定出的。As shown in FIG4 , the voice ray data set D2 includes a plurality of voice ray groups 20 (six in this embodiment, but not limited thereto). Each voice ray group 20 corresponds to one of the character types, such as, but not limited to, the aforementioned "adult male," "adult female," "child male," and "child female." In this embodiment, the character types to which the voice ray groups 20 in the voice ray data set D2 correspond can be pre-determined. However, in other embodiments, the processing unit 11 can also determine these character types in real time based on the character tags 101 in the character data 10.

對於每一聲線群組20,該聲線群組20包括多筆分別對應於多個特定聲線的聲線資料201,且該等特定聲線適合被用來詮釋該聲線群組20所對應之角色類型的角色聲音。以圖4中的聲線群組20A為例,該聲線群組20A所包括的該五筆聲線資料201便是分別對應於五個適合用來詮釋「成年男」之角色但音調彼此不同的特定聲線。其中,每一聲線資料201例如包括多個語音設定參數,而且,該等語音設定參數是用於共同定義出該聲線資料201所對應之該特定聲線的音頻範圍、音色、口音及語速等語音聲學特徵,藉此,若將一串文字與該聲線資料201配合,該處理單元11便能控制揚聲器以該聲線資料201所對應的該特定聲線播放出該串文字。Each voice ray group 20 includes a plurality of voice ray data 201 corresponding to a plurality of specific voice lines suitable for interpreting the voices of the character type to which the voice ray group 20 corresponds. For example, in voice ray group 20A in FIG4 , the five voice ray data 201 included in voice ray group 20A correspond to five specific voice lines with different pitches suitable for interpreting the character "adult male." Each sound ray data 201 includes, for example, a plurality of voice setting parameters. Furthermore, these voice setting parameters are used to collectively define the voice acoustic characteristics, such as the frequency range, timbre, accent, and speaking speed, of the specific sound ray corresponding to the sound ray data 201. Thus, if a string of text is combined with the sound ray data 201, the processing unit 11 can control the speaker to play the string of text using the specific sound ray corresponding to the sound ray data 201.

在本實施例中,對於同一個聲線群組20之該等聲線資料201所分別對應的該等特定聲線,該等特定聲線所呈現出的口音彼此相同(例如皆為某特定地區的腔調),但至少在音調的高低上彼此不同。換句話說,對於同一個聲線群組20之該等聲線資料201所分別對應的該等特定聲線,若以該等特定聲線對同一個句子產生語音,則該等特定聲線對於該句子所呈現出的詞語發音模式彼此相同,但呈現出的整體音調高低則會彼此不同。In this embodiment, the specific voice lines corresponding to the voice line data 201 of the same voice line group 20 exhibit the same accent (e.g., the accent of a specific region), but differ at least in pitch. In other words, if the specific voice lines corresponding to the voice line data 201 of the same voice line group 20 are used to produce speech for the same sentence, the specific voice lines will exhibit the same pronunciation pattern for the sentence, but will exhibit different overall pitch.

更詳細地說,對於同一個聲線群組20中的該等聲線資料201,該等聲線資料201中的其中一筆聲線資料201為一筆由該處理單元11從該虛擬人聲資料庫DB中所選出的初始聲線資料,且該初始聲線資料可例如為該虛擬人聲資料庫DB所包含的其中一個語音設定檔。進一步地,在同一個聲線群組20的該等聲線資料201中,除了該初始聲線資料以外的其他每一筆聲線資料201為一筆衍生聲線資料,而且,每一衍生聲線資料例如是由該處理單元11至少藉由對該初始聲線資料進行基音的音頻範圍調整所產生的。所以,每一衍生聲線資料所對應的特定聲線相當於是該處理單元11以該初始聲線資料所對應的該特定聲線作為基礎,並至少藉由提高或降低其基音的音頻範圍所衍生出的另一個音調較為高亢或低沉的聲線。More specifically, for the sound ray data 201 in the same sound ray group 20, one piece of sound ray data 201 is an initial sound ray data selected by the processing unit 11 from the virtual voice database DB. This initial sound ray data may, for example, be one of the voice profiles contained in the virtual voice database DB. Furthermore, within the sound ray data 201 in the same sound ray group 20, each piece of sound ray data 201 other than the initial sound ray data is a piece of derived sound ray data. Each piece of derived sound ray data is generated by, for example, the processing unit 11 by adjusting the audio frequency range of the fundamental pitch of the initial sound ray data. Therefore, the specific sound ray corresponding to each derived sound ray data is equivalent to another sound ray with a higher or lower pitch derived by the processing unit 11 based on the specific sound ray corresponding to the initial sound ray data and at least by increasing or decreasing the audio frequency range of its fundamental tone.

以圖4具體舉例來說,在每一聲線群組20中,標記有「中音」的該聲線資料201例如是該聲線群組20中由該處理單元11從該虛擬人聲資料庫DB中所選出的初始聲線資料。另一方面,標記有「低音」或「極低音」的聲線資料201例如是由該處理單元11藉由降低該初始聲線資料之基音音頻範圍而被產生的衍生聲線資料。再一方面,標記有「高音」或「極高音」的聲線資料201則例如是由該處理單元11藉由提高該初始聲線資料之基音音頻範圍而被產生的衍生聲線資料。藉此,本實施例能針對單一種特定聲線進行基音音頻範圍的調整,從而擴展出音調高低不同的其他特定聲線,如此,即便該虛擬人聲資料庫DB中之口音及音色合適的特定聲線數量有限,本實施例也能以合適的特定聲線作為基礎而擴展出更多口音及音色合適的特定聲線,從而提升合適的特定聲線數量。Taking Figure 4 as an example, within each sound ray group 20, the sound ray data 201 labeled "Mid-range" is, for example, the initial sound ray data selected by the processing unit 11 from the virtual human voice database DB within that sound ray group 20. Meanwhile, the sound ray data 201 labeled "Bass" or "Extreme Bass" is, for example, derived sound ray data generated by the processing unit 11 by lowering the fundamental frequency range of the initial sound ray data. Furthermore, the sound ray data 201 labeled "High-pitched" or "Extreme-high-pitched" is, for example, derived sound ray data generated by the processing unit 11 by raising the fundamental frequency range of the initial sound ray data. In this way, the present embodiment can adjust the fundamental frequency range of a single specific voice line, thereby expanding other specific voice lines with different pitches. In this way, even if the number of specific voice lines with suitable accents and timbres in the virtual human voice database DB is limited, the present embodiment can expand more specific voice lines with suitable accents and timbres based on the suitable specific voice line, thereby increasing the number of suitable specific voice lines.
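The pitch-range expansion described above can be sketched as follows. For illustration only, a derived voice is represented as the initial voice with a semitone offset applied to its fundamental pitch; the offsets, labels, and field names are hypothetical and non-limiting:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class VoiceData:
    """One voice data entry (聲線資料 201): parameters of a specific voice (illustrative)."""
    pitch_shift: int  # fundamental-pitch offset in semitones (the 音頻範圍 adjustment)
    timbre: str       # 音色
    accent: str       # 口音

def build_voice_group(initial: VoiceData):
    """Derive a voice group (聲線群組 20) from one initial voice by pitch-range shifts.

    The five labels mirror FIG. 4; the offset values are invented — any scheme
    that raises or lowers the fundamental pitch would serve the same purpose.
    """
    offsets = {"極低音": -8, "低音": -4, "中音": 0, "高音": +4, "極高音": +8}
    return {label: replace(initial, pitch_shift=initial.pitch_shift + off)
            for label, off in offsets.items()}

# One initial voice expands into five voices sharing timbre and accent:
group_a = build_voice_group(VoiceData(pitch_shift=0, timbre="warm", accent="taipei"))
```

Note how every derived voice keeps the initial voice's timbre and accent, matching the requirement that voices within one group differ at least in pitch but share the same accent.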

進一步地,在本實施例的一種進階實施態樣中,該處理單元11在對其中一筆初始聲線資料進行基音音頻範圍調整而產生一筆對應的衍生聲線資料之後,該處理單元11可例如將該筆衍生聲線資料作為另外一個對應不同角色類型之聲線群組20的初始聲線資料。舉一例來說,假設其中一筆初始聲線資料是對應於「成年女」的角色類型,則該處理單元11例如在將該筆初始聲線資料的基音音頻範圍提高而產生一筆對應之聲線更為尖銳的衍生聲線資料之後,例如根據該筆衍生聲線資料產生另一個對應於「幼年男」之角色類型的聲線群組20,並將該筆衍生聲線資料作為對應「幼年男」之該聲線群組20所包含的初始聲線資料。因此,藉由對該初始聲線資料進行基音的音頻範圍調整而產生衍生聲線資料,本實施例能夠利用同一種口音/音色合適的特定聲線擴展出其他適用於不同年齡/性別之角色的特定聲線。Furthermore, in an advanced implementation of this embodiment, after the processing unit 11 adjusts the fundamental audio frequency range of one of the initial sound ray data to generate a corresponding derivative sound ray data, the processing unit 11 can, for example, use the derivative sound ray data as the initial sound ray data of another sound ray group 20 corresponding to a different character type. For example, assuming one of the initial voice ray data corresponds to the "adult female" character type, the processing unit 11, after increasing the fundamental pitch frequency range of the initial voice ray data to generate corresponding derivative voice ray data with a sharper voice, generates another voice ray group 20 corresponding to the "young male" character type based on the derived voice ray data, and uses the derived voice ray data as the initial voice ray data included in the voice ray group 20 corresponding to the "young male" character type. Therefore, by adjusting the fundamental pitch frequency range of the initial voice ray data to generate the derived voice ray data, this embodiment can utilize the same specific voice ray with a suitable accent/timbre to expand other specific voices suitable for characters of different ages/genders.

補充說明的是,在不同的實施態樣中,該等聲線群組20中的其中一或多個聲線群組20也可以只包含單一筆衍生聲線資料。此外,可選地,在同一個聲線群組20中,任一衍生聲線資料可以是由該處理單元11藉由對該初始聲線資料同時進行基音的音頻範圍調整以及語速調整而被產生的。It should be noted that, in various implementations, one or more of the sound ray groups 20 may include only a single piece of derived sound ray data. Furthermore, optionally, within the same sound ray group 20, any piece of derived sound ray data may be generated by the processing unit 11 by simultaneously adjusting the audio frequency range of the fundamental pitch and the speech rate of the initial sound ray data.

在本實施例中,基於某些角色類型之聲線的需求量相對較高,該等聲線群組20的其中多個聲線群組20是對應於同一個角色類型。舉例來說,圖4中的聲線群組20A及聲線群組20B都是對應於「成年男」的角色類型,而聲線群組20C及聲線群組20D則都是對應於「成年女」的角色類型。特別說明的是,每一個聲線群組20在音色特徵及口音特徵的其中至少一方面具有唯一性,因此,即便其中兩個聲線群組20是對應於同一個角色類型,該兩聲線群組20所分別對應的兩群特定聲線在音色及口音呈現的其中至少一方面也會彼此不同。換個方式說,對於任一聲線群組20之任一聲線資料201所對應的該特定聲線,該特定聲線所呈現出的語音在音調、音色及口音的組合上具有唯一性。In this embodiment, because certain character types have a relatively high demand for voices, multiple voice groups 20 within the voice groupings 20 correspond to the same character type. For example, voice groupings 20A and 20B in FIG4 both correspond to the "adult male" character type, while voice groupings 20C and 20D both correspond to the "adult female" character type. It should be noted that each voice grouping 20 is unique in at least one of its timbre and accent characteristics. Therefore, even if two voice groups 20 correspond to the same character type, the two specific voices corresponding to those two voice groups 20 will differ in at least one of their timbre and accent characteristics. In other words, for a specific voice ray corresponding to any voice ray data 201 of any voice ray group 20, the speech presented by the specific voice ray is unique in the combination of pitch, timbre and accent.

在該處理單元11獲得該聲線資料組合D2之後,流程進行至步驟S3。After the processing unit 11 obtains the sound ray data combination D2, the process proceeds to step S3.

在步驟S3中,該處理單元11將該聲線需求資訊D1的該等角色資料10與該聲線資料組合D2的該等聲線資料201進行配對,以建立每一角色資料10與該等聲線資料201之其中一者之間的對應關係。在本實施例中,該處理單元11是優先決定該主要角色資料10*所要對應的聲線資料201,接著決定該等重要角色資料10’所要對應的聲線資料201,最後再決定該等次要角色資料10”所要對應的聲線資料201,但並不以此為限。In step S3, the processing unit 11 matches the character data 10 in the voice requirement information D1 with the voice data 201 in the voice data combination D2 to establish a correspondence between each character data 10 and one of the voice data 201. In this embodiment, the processing unit 11 prioritizes the voice data 201 corresponding to the primary character data 10*, then determines the voice data 201 corresponding to the important character data 10', and finally determines the voice data 201 corresponding to the secondary character data 10", but this is not a limitation.

更具體地說,對於每一角色資料10,該處理單元11是先從該等聲線群組20中選出其中一個與該角色資料10對應於同一個角色類型的匹配聲線群組,再將該匹配聲線群組之該等聲線資料201中的其中一者設定為一對應於該角色資料10的配對聲線資料,藉此將該配對聲線資料所對應的該特定聲線配置給該角色資料10所對應的該角色。舉例來說,圖3中的該角色資料10B是對應於「成年女」的角色類型,則該處理單元11便會選出圖4中同樣對應於「成年女」之角色類型的其中一個聲線群組20(例如聲線群組20C或聲線群組20D)來作為匹配聲線群組,再從其中選出該角色資料10B所對應的配對聲線資料。More specifically, for each character data 10, the processing unit 11 first selects a matching voice group from the voice groups 20 that corresponds to the same character type as the character data 10, and then sets one of the voice data 201 in the matching voice group as a matching voice data corresponding to the character data 10, thereby allocating the specific voice corresponding to the matching voice data to the character corresponding to the character data 10. For example, if the character data 10B in FIG3 corresponds to the character type of "adult female", the processing unit 11 will select one of the voice groups 20 (such as voice group 20C or voice group 20D) in FIG4 that also corresponds to the character type of "adult female" as a matching voice group, and then select the matching voice data corresponding to the character data 10B from them.

對於每一角色資料10,若該角色資料10是包含聲線特徵標籤102的主要角色資料10*或重要角色資料10’,則該處理單元11在從該等聲線群組20中選出對應的匹配聲線群組之後,是將該匹配聲線群組的該等聲線資料201中與該角色資料10之該聲線特徵標籤102匹配的其中一者設定為該角色資料10所對應的該配對聲線資料。For each character data 10, if the character data 10 is main character data 10* or important character data 10' including a voice feature tag 102, the processing unit 11 selects a corresponding matching voice group from the voice groups 20 and then sets one of the voice data 201 of the matching voice group that matches the voice feature tag 102 of the character data 10 as the matching voice data corresponding to the character data 10.

以圖3中的角色資料10A舉一例來說,該角色資料10A對應於「成年男」的角色類型,並且包含指示出「低音」的聲線特徵標籤102,因此,假設該處理單元11是選出圖4中的聲線群組20A來作為匹配聲線群組,則該處理單元11會進一步從聲線群組20A中選出匹配於「低音」之聲線特徵標籤102的該聲線資料201來作為該角色資料10A所對應的配對聲線資料。並且,由於該角色資料10A為該主要角色資料10*,因此,在此例中,該聲線群組20A會被作為該等聲線群組20中被用於決定該主要角色資料10*所對應之配對聲線資料的一個主角聲線群組20*。For example, taking character data 10A in FIG3 , character data 10A corresponds to the character type "adult male" and includes a voice feature tag 102 indicating "bass." Therefore, assuming that processing unit 11 selects voice group 20A in FIG4 as a matching voice group, processing unit 11 will further select voice data 201 matching voice feature tag 102 of "bass" from voice group 20A as the matching voice data corresponding to character data 10A. Furthermore, since character data 10A is the main character data 10*, in this example, voice group 20A will be used as a main voice group 20* among the voice groups 20 to determine the matching voice data corresponding to the main character data 10*.

以圖3中的角色資料10B舉另一例來說,該角色資料10B對應於「成年女」的角色類型,並且包含指示出「高音」的聲線特徵標籤102,因此,假設該處理單元11是選出圖4中的聲線群組20C來作為匹配聲線群組,則該處理單元11會進一步從聲線群組20C中選出匹配於「高音」之聲線特徵標籤102的該聲線資料201來作為該角色資料10B所對應的配對聲線資料。Taking the character data 10B in FIG. 3 as another example, the character data 10B corresponds to the character type of "adult female" and includes a voice feature label 102 indicating a "high pitch". Therefore, assuming that the processing unit 11 selects the voice group 20C in FIG. 4 as the matching voice group, the processing unit 11 will further select the voice data 201 that matches the voice feature label 102 of "high pitch" from the voice group 20C as the matching voice data corresponding to the character data 10B.
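The selection performed in the two examples above — first a matching voice group by character type, then the voice data matching the character's voice feature label 102 — can be sketched as follows. Group contents and identifiers are hypothetical and non-limiting:

```python
def pick_paired_voice(role_type, voice_label, groups):
    """Pick, from the first group of the same character type that offers one,
    the voice data matching the character's voice feature label (illustrative)."""
    for group in groups:
        if group["role_type"] == role_type and voice_label in group["voices"]:
            return group["voices"][voice_label]
    return None  # no matching group or voice; handled by other rules

# Hypothetical groups loosely mirroring FIG. 4:
groups = [
    {"role_type": "adult_male",   "voices": {"低音": "20A-low", "中音": "20A-mid"}},
    {"role_type": "adult_female", "voices": {"高音": "20C-high", "中音": "20C-mid"}},
]
```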

在本實施例中，對於該等角色資料10中對應於同一個角色類型的其中多筆重要角色資料10’，若該等聲線群組20中的其中多個聲線群組20是與其中該等重要角色資料10’對應於同一個角色類型，且其中該等聲線群組20中不存在主角聲線群組20*，則該處理單元11是輪流從其中該等聲線群組20中選出該等重要角色資料10’所分別對應的該等配對聲線資料。舉例來說，圖3中的角色資料10B、角色資料10C以及圖4中的聲線群組20C、聲線群組20D都是對應於「成年女」的角色類型，且聲線群組20C及聲線群組20D皆非屬於主角聲線群組20*，在此情況下，該處理單元11會按照該角色資料10B及該角色資料10C所對應之該兩角色於該文本資料中首次出現的順序，而例如先從該聲線群組20C中選出該角色資料10B所對應的配對聲線資料，再從該聲線群組20D中選出該角色資料10C所對應的配對聲線資料，而使得該角色資料10B及該角色資料10C所分別對應的配對聲線資料是來自於不同的聲線群組20。如此一來，即便該角色資料10B及該角色資料10C都是對應於「成年女」的角色類型，本實施例仍能替該角色資料10B及該角色資料10C配置音色或口音彼此不同的兩個特定聲線，而有助於盡量避免多個重要配角之間的聲音太過相似。In this embodiment, for multiple important character data 10' corresponding to the same character type in the character data 10, if multiple voice groups 20 in the voice groups 20 correspond to the same character type as the important character data 10', and there is no protagonist voice group 20* in the voice groups 20, then the processing unit 11 selects the matching voice data corresponding to the important character data 10' respectively from the voice groups 20 in turn. For example, the character data 10B and character data 10C in Figure 3 and the voice group 20C and voice group 20D in Figure 4 all correspond to the character type of "adult female", and the voice group 20C and voice group 20D do not belong to the protagonist voice group 20*. In this case, the processing unit 11 will follow the order in which the two characters corresponding to the character data 10B and the character data 10C first appear in the text data, and for example, first select the matching voice data corresponding to the character data 10B from the voice group 20C, and then select the matching voice data corresponding to the character data 10C from the voice group 20D, so that the matching voice data corresponding to the character data 10B and the character data 10C respectively come from different voice groups 20. In this way, even if the character data 10B and the character data 10C both correspond to the "adult female" character type, this embodiment can still configure two specific voices with different timbres or accents for the character data 10B and the character data 10C, which helps to avoid the voices of multiple important supporting characters being too similar.

在另一種情形中，對於該等角色資料10中對應於同一個角色類型的其中多筆重要角色資料10’，若該等聲線群組20中的其中多個聲線群組20是與其中該等重要角色資料10’對應於同一個角色類型，但其中該等聲線群組20中存在已被用於決定該主要角色資料10*所對應之配對聲線資料的主角聲線群組20*，則該處理單元11不從該主角聲線群組20*中選出該等重要角色資料10’所要對應的配對聲線資料。舉例來說，圖3中的角色資料10D、角色資料10E以及圖4中的聲線群組20A、聲線群組20B都是對應於「成年男」的角色類型，但該聲線群組20A為主角聲線群組20*，在此情況下，該處理單元11便只會從該聲線群組20B中選出該角色資料10D及該角色資料10E所對應的配對聲線資料，而不會從該主角聲線群組20*（在此例中為該聲線群組20A）中選擇該角色資料10D及該角色資料10E所對應的配對聲線資料，如此便能確保該角色資料10D及該角色資料10E所分別對應的配對聲線資料與該主要角色資料10*所對應的配對聲線資料是來自於不同的聲線群組20，以避免重要配角與主角的聲音太過相似。In another case, for multiple important character data 10' corresponding to the same character type in the character data 10, if multiple voice groups 20 in the voice groups 20 correspond to the same character type as the important character data 10', but those voice groups 20 include a protagonist voice group 20* that has already been used to determine the matching voice data corresponding to the main character data 10*, then the processing unit 11 does not select the matching voice data for the important character data 10' from the protagonist voice group 20*. For example, the character data 10D and the character data 10E in FIG3 and the voice group 20A and the voice group 20B in FIG4 all correspond to the character type of "adult male", but the voice group 20A is the protagonist voice group 20*. In this case, the processing unit 11 will only select the matching voice data corresponding to the character data 10D and the character data 10E from the voice group 20B, and will not select it from the protagonist voice group 20* (in this case, the voice group 20A). This ensures that the matching voice data corresponding to the character data 10D and the character data 10E, respectively, and the matching voice data corresponding to the main character data 10* are from different voice groups 20, thereby preventing the voices of important supporting characters from being too similar to that of the main character.
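The two assignment rules above — rotating among the same-type voice groups, and excluding the protagonist voice group 20* when one exists — can be sketched together as follows. All identifiers are hypothetical and non-limiting:

```python
from itertools import cycle

def assign_important(characters, same_type_groups, protagonist_group=None):
    """Rotate over the same-type voice groups, in the characters' order of
    first appearance, skipping the protagonist voice group if one is present
    (illustrative sketch)."""
    usable = [g for g in same_type_groups if g != protagonist_group]
    rotation = cycle(usable)
    return {c: next(rotation) for c in characters}

# No protagonist group among the candidates: 10B and 10C alternate 20C / 20D.
case1 = assign_important(["10B", "10C"], ["20C", "20D"])
# 20A is the protagonist group: 10D and 10E both draw from 20B only.
case2 = assign_important(["10D", "10E"], ["20A", "20B"], protagonist_group="20A")
```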

對於該等角色資料10中對應於同一個角色類型的其中多筆次要角色資料10”，若該等聲線群組20中的其中多個聲線群組20是與其中該等次要角色資料10”對應於同一個角色類型，則該聲線配置系統1是輪流從其中該等聲線群組20中選出該等次要角色資料10”所分別對應的該等配對聲線資料。舉例來說，圖3中的角色資料10F、角色資料10G及角色資料10H以及圖4中的聲線群組20C、聲線群組20D都是對應於「成年女」的角色類型，且角色資料10F、角色資料10G及角色資料10H皆屬於次要角色資料10”，在此情況下，該處理單元11會按照該角色資料10F、角色資料10G及角色資料10H所對應的該三個次要配角於該文本資料中首次出現的順序，而例如先從該聲線群組20C中選出該角色資料10F所對應的配對聲線資料，接著從該聲線群組20D中選出該角色資料10G所對應的配對聲線資料，然後再從該聲線群組20C中選出該角色資料10H所對應的配對聲線資料，藉此確保該角色資料10F至該角色資料10H所分別對應的該等配對聲線資料不會全部來自於同一個聲線群組20。For multiple secondary character data 10" corresponding to the same character type in the character data 10, if multiple voice groups 20 in the voice groups 20 correspond to the same character type as the secondary character data 10", the voice configuration system 1 selects the matching voice data corresponding to the secondary character data 10" from those voice groups 20 in turn. For example, the character data 10F, the character data 10G and the character data 10H in FIG3 and the voice group 20C and the voice group 20D in FIG4 all correspond to the character type of "adult female", and the character data 10F, the character data 10G and the character data 10H all belong to the secondary character data 10". In this case, the processing unit 11 will, in the order in which the three secondary supporting characters corresponding to the character data 10F, 10G and 10H first appear in the text data, for example first select the matching voice data corresponding to the character data 10F from the voice group 20C, then select the matching voice data corresponding to the character data 10G from the voice group 20D, and then select the matching voice data corresponding to the character data 10H from the voice group 20C, thereby ensuring that the matching voice data corresponding to the character data 10F through the character data 10H do not all come from the same voice group 20.

補充說明的是,對於每一次要角色資料10”,該處理單元11在選出與該次要角色資料10”對應於同一個角色類型的匹配聲線群組之後,可例如是將該匹配聲線群組的該等聲線資料201中尚未與其他任何角色資料10存在對應關係,或者是所對應之其他角色資料10之數量最少的該聲線資料201設定為與該次要角色資料10”對應的配對聲線資料,但並不以此為限。It should be noted that for each secondary character data 10", after the processing unit 11 selects a matching voice line group corresponding to the same character type as the secondary character data 10", it may, for example, set the voice line data 201 in the matching voice line group that has not yet been matched with any other character data 10, or the voice line data 201 with the least number of corresponding other character data 10, as the matching voice line data corresponding to the secondary character data 10", but the present invention is not limited to this.
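The preference just described — an as-yet-unused voice first, otherwise the voice currently shared by the fewest other characters — can be sketched as follows. Identifiers are hypothetical and non-limiting:

```python
def pick_for_minor(voices, usage_count):
    """Return the voice data in the matching group that is assigned to the
    fewest characters so far; an unused voice has count 0 and wins
    (illustrative sketch)."""
    return min(voices, key=lambda v: usage_count.get(v, 0))

# "20C-high" is unused, so it is preferred over the voices already assigned:
choice = pick_for_minor(["20C-mid", "20C-low", "20C-high"],
                        {"20C-mid": 2, "20C-low": 1})
```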

在該處理單元11設定每一角色資料10所對應的配對聲線資料之後,流程進行至步驟S4。After the processing unit 11 sets the matching voice ray data corresponding to each character data 10, the process proceeds to step S4.

在步驟S4中,對於該等次要角色資料10”,該處理單元11將每一筆次要角色資料10”作為一受檢角色資料,並對於該受檢角色資料以及該受檢角色資料當前所對應的該配對聲線資料,根據該文本資料判斷一聲線衝突條件是否符合。In step S4, for the secondary character data 10", the processing unit 11 treats each secondary character data 10" as a checked character data, and determines whether a voice conflict condition is met based on the text data for the checked character data and the paired voice data currently corresponding to the checked character data.

首先說明的是,在步驟S3執行完畢之後,該處理單元11已對每一角色資料10設定其所對應的配對聲線資料,在此情況下,該文本資料所包含的每一個台詞部分不但對應於該等角色資料10的其中一筆角色資料10,還對應於該其中一角色資料10當前所對應的該配對聲線資料。First, it should be noted that after step S3 is completed, the processing unit 11 has set the corresponding matching voice data for each character data 10. In this case, each line portion contained in the text data not only corresponds to one of the character data 10, but also corresponds to the matching voice data currently corresponding to one of the character data 10.

並且,為了便於描述,在此將該等台詞部分中與該受檢角色資料對應的每一台詞部分作為一受檢台詞部分。Furthermore, for the convenience of description, each of the dialogue parts corresponding to the inspected character data is referred to as an inspected dialogue part.

對於該受檢角色資料以及該受檢角色資料當前所對應的該配對聲線資料，該聲線衝突條件代表：該等台詞部分中存在一個符合一鄰近條件且所對應之角色資料10非為該受檢角色資料的鄰近台詞部分，且該鄰近台詞部分所對應的該配對聲線資料與該受檢角色資料所對應的該配對聲線資料相同。其中，對於該(等)受檢台詞部分以外的其他每一台詞部分，該鄰近條件代表該台詞部分在該文本資料中與任一受檢台詞部分之間相隔的字元數量小於等於一預設字元數量門檻值(例如600個字元，但並不以此為限)。For the inspected character data and the paired voice data to which the inspected character data currently corresponds, the voice conflict condition means that there exists, among the dialogue portions, a neighboring dialogue portion that meets a proximity condition and whose corresponding character data 10 is not the inspected character data, and that the paired voice data corresponding to that neighboring dialogue portion is the same as the paired voice data corresponding to the inspected character data. For each dialogue portion other than the inspected dialogue portion(s), the proximity condition means that the number of characters between that dialogue portion and any inspected dialogue portion in the text data is less than or equal to a preset character-count threshold (e.g., 600 characters, but not limited thereto).

舉一例來說,假設圖3中的該角色資料10F被作為受檢角色資料,若該處理單元11對於該角色資料10F及其所對應的配對聲線資料判定該聲線衝突條件符合,表示該角色資料10F所對應的至少一個台詞部分在該文本資料中與另一角色資料10(例如圖3中的角色資料10B)對應之台詞部分之間相隔的字元數量小於等於該預設字元數量門檻值,且該角色資料10F與角色資料10B是對應於相同的配對聲線資料,亦即被配置到相同的特定聲線。也就是說,若以該角色資料10F及角色資料10B當前所對應的配對聲線資料進行實際配音,將會發生該角色資料10F及角色資料10B所對應之該兩角色以完全相同的聲線在短時間內先後發言的情況,而容易造成不佳的聆聽感受。For example, assuming that the character data 10F in Figure 3 is taken as the inspected character data, if the processing unit 11 determines that the voice conflict condition is met for the character data 10F and its corresponding paired voice data, it means that the number of characters between at least one dialogue portion corresponding to the character data 10F and the dialogue portion corresponding to another character data 10 (for example, the character data 10B in Figure 3) in the text data is less than or equal to the preset character number threshold, and the character data 10F and the character data 10B correspond to the same paired voice data, that is, they are configured to the same specific voice. That is, if the actual dubbing is performed using the paired voice data currently corresponding to the character data 10F and the character data 10B, the two characters corresponding to the character data 10F and the character data 10B will speak in exactly the same voice one after another in a short period of time, which may easily lead to a poor listening experience.
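The voice conflict condition checked in step S4 can be sketched as follows. For illustration only, each dialogue portion is reduced to its character offset in the text data plus its speaking character; the 600-character threshold is the example value given above, and all other names are hypothetical:

```python
CHAR_THRESHOLD = 600  # example value of the preset character-count threshold

def voice_conflict(dialogues, inspected, paired_voice, threshold=CHAR_THRESHOLD):
    """True if some other character speaks within `threshold` characters of any
    dialogue portion of `inspected` while sharing the same paired voice data.

    `dialogues`: list of (char_offset, character) for each dialogue portion;
    `paired_voice`: character -> paired voice data. Both are illustrative
    simplifications of the structures described in the text.
    """
    inspected_positions = [pos for pos, who in dialogues if who == inspected]
    for pos, who in dialogues:
        if who == inspected:
            continue
        nearby = any(abs(pos - p) <= threshold for p in inspected_positions)
        if nearby and paired_voice[who] == paired_voice[inspected]:
            return True
    return False
```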

若該處理單元11對於任一受檢角色資料及其所對應之配對聲線資料判斷出聲線衝突條件符合,流程進行至步驟S5。另一方面,若該處理單元11對於每一受檢角色資料及其所對應之配對聲線資料皆未判斷出聲線衝突條件符合,流程則進行至步驟S9。If the processing unit 11 determines that the sound ray conflict condition is met for any of the examined character data and its corresponding paired sound ray data, the process proceeds to step S5. On the other hand, if the processing unit 11 does not determine that the sound ray conflict condition is met for any of the examined character data and its corresponding paired sound ray data, the process proceeds to step S9.

在步驟S5中，一旦該處理單元11對於任一受檢角色資料及其當前對應之配對聲線資料判斷出聲線衝突條件符合，該處理單元11自動地更新造成該聲線衝突條件符合之該受檢角色資料所對應的聲線資料201，亦即將該受檢角色資料所對應的配對聲線資料從原本的聲線資料201更換為不同的另一個聲線資料201，以嘗試排除兩個不同角色以相同的聲線在短時間內先後發言的情況。舉例來說，假設該受檢角色資料所對應的配對聲線資料原本是圖4的該聲線群組20C中標記「低音」的該聲線資料201，則該處理單元11可例如是將該受檢角色資料所對應的配對聲線資料更換成該聲線群組20C中的下一筆聲線資料201（例如該聲線群組20C中標記「中音」的該聲線資料201），或者是更換成對應於相同角色類型之另一聲線群組20（例如該聲線群組20D）中的任一聲線資料201。具體而言，該處理單元11可例如是根據該等聲線資料201之間的順序來更換該受檢角色資料所對應的配對聲線資料，也可例如是以隨機的方式來更換該受檢角色資料所對應的配對聲線資料。In step S5, once the processing unit 11 determines that a voice conflict condition is met for any of the examined character data and its currently corresponding paired voice data, the processing unit 11 automatically updates the voice data 201 corresponding to the examined character data that causes the voice conflict condition to be met, that is, the paired voice data corresponding to the examined character data is replaced from the original voice data 201 with another, different voice data 201, in an attempt to eliminate the situation where two different characters speak with the same voice in a short period of time. For example, assuming that the paired voice data corresponding to the examined character data is originally the voice data 201 marked "bass" in the voice group 20C in Figure 4, the processing unit 11 may, for example, replace the paired voice data corresponding to the examined character data with the next voice data 201 in the voice group 20C (for example, the voice data 201 marked "middle" in the voice group 20C), or replace it with any voice data 201 in another voice group 20 corresponding to the same character type (for example, the voice group 20D). Specifically, the processing unit 11 may, for example, replace the paired voice data corresponding to the examined character data according to the order of the voice data 201, or may, for example, replace the paired voice data corresponding to the examined character data in a random manner.

在該處理單元11更新該受檢角色資料所對應的聲線資料201之後,流程進行至步驟S6。After the processing unit 11 updates the sound ray data 201 corresponding to the examined character data, the process proceeds to step S6.

在步驟S6中,該處理單元11再次對於每一受檢角色資料以及該受檢角色資料所對應之配對聲線資料判斷該聲線衝突條件是否符合。若該處理單元11再次判斷出該聲線衝突條件符合,流程進行至步驟S7,另一方面,若該處理單元11判斷出該聲線衝突條件並未再次符合,流程則進行至步驟S9。In step S6, the processing unit 11 again determines whether the sound ray conflict condition is met for each piece of examined character data and the paired sound ray data corresponding to the examined character data. If the processing unit 11 again determines that the sound ray conflict condition is met, the process proceeds to step S7. On the other hand, if the processing unit 11 again determines that the sound ray conflict condition is not met, the process proceeds to step S9.

在步驟S7中,該處理單元11判斷其本身判定該聲線衝突條件符合的累積次數是否已達到一預設衝突次數門檻值(例如三次,但並不以此為限)。若判斷結果為是,流程進行至步驟S8,另一方面,若判斷結果為否,流程則例如從步驟S5再次開始進行。In step S7, the processing unit 11 determines whether the cumulative number of times it has determined that the sound ray conflict condition has been met has reached a preset conflict threshold (e.g., three times, but not limited thereto). If the determination is yes, the process proceeds to step S8. On the other hand, if the determination is no, the process restarts from step S5, for example.

在步驟S8中，一旦該處理單元11判定該聲線衝突條件符合的累積次數已達到該預設衝突次數門檻值，該處理單元11產生一聲線衝突提示，並將該聲線衝突提示輸出。其中，該聲線衝突提示例如指示出造成該聲線衝突條件符合的該受檢角色資料，並且，該處理單元11輸出該聲線衝突提示的方式，可例如是控制一顯示裝置將該聲線衝突提示以顯示的方式輸出，但並不以此為限。補充說明的是，該聲線衝突提示例如是用來提示使用者以手動設定的方式來決定該受檢角色資料所要對應的配對聲線資料，但並不以此為限。In step S8, once the processing unit 11 determines that the cumulative number of times the voice conflict condition is met has reached the preset conflict threshold, the processing unit 11 generates a voice conflict prompt and outputs the voice conflict prompt. The voice conflict prompt, for example, indicates the examined character data that caused the voice conflict condition to be met, and the processing unit 11 outputs the voice conflict prompt by, for example, controlling a display device to display the voice conflict prompt, but the present invention is not limited thereto. It should be noted that the voice conflict prompt is used, for example, to prompt the user to manually determine the matching voice data corresponding to the examined character data, but the present invention is not limited thereto.
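Steps S5 through S8 form a bounded retry loop, which can be sketched as follows. The threshold of three attempts is the example value given above; the callback and identifiers are hypothetical and non-limiting:

```python
MAX_CONFLICTS = 3  # example value of the preset conflict-count threshold

def resolve_voice_conflict(inspected, candidate_voices, has_conflict,
                           max_conflicts=MAX_CONFLICTS):
    """Swap the inspected character's paired voice for the next candidate and
    recheck (steps S5-S6); once the cumulative conflict count reaches
    `max_conflicts`, give up and signal that a voice-conflict alert should be
    issued (steps S7-S8). Returns the conflict-free voice, or None when the
    alert is needed (illustrative sketch)."""
    conflicts = 1  # counts the conflict already found in step S4
    for voice in candidate_voices:
        if not has_conflict(inspected, voice):
            return voice          # conflict cleared; proceed to step S9
        conflicts += 1
        if conflicts >= max_conflicts:
            return None           # emit the voice-conflict alert (step S8)
    return None
```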

在步驟S9中,該處理單元11產生一聲線配置結果,並將該聲線配置結果輸出。其中,該聲線配置結果例如指示出所有該等角色資料10以及該等角色資料10所分別對應的該等配對聲線資料,並且,該處理單元11輸出該聲線配置結果的方式,可例如是控制該顯示裝置將該聲線配置結果以顯示的方式輸出,及/或將該聲線配置結果輸出至該儲存單元12儲存,但並不以此為限。補充說明的是,該聲線配置結果能用於供該處理單元11據以對該文本資料執行實際的配音程序,從而產生一對應於該文本資料且能被揚聲器播放的配音結果。In step S9, the processing unit 11 generates a voice ray configuration result and outputs the voice ray configuration result. The voice ray configuration result, for example, indicates all of the character data 10 and the corresponding paired voice ray data. The processing unit 11 may output the voice ray configuration result by, for example, controlling the display device to display the voice ray configuration result and/or outputting the voice ray configuration result to the storage unit 12 for storage, but the present invention is not limited thereto. It should be noted that the voice ray configuration result can be used by the processing unit 11 to perform an actual dubbing process on the text data, thereby generating a dubbing result corresponding to the text data and capable of being played by a speaker.

以上即為本實施例之聲線配置系統1如何實施該聲線配置方法的示例說明。The above is an example of how the sound ray configuration system 1 of this embodiment implements the sound ray configuration method.

特別說明的是,本實施例的步驟S1至步驟S8及圖2的流程圖僅是用於示例說明本發明聲線配置方法的其中一種可實施方式。應當理解,即便將步驟S1至步驟S8進行合併、拆分或順序調整,若合併、拆分或順序調整之後的流程與本實施例相比是以實質相同的方式達成實質相同的功效,便仍屬於本發明聲線配置方法的可實施態樣,因此,本實施例的步驟S1至步驟S8及圖2的流程圖並非用於限制本發明的可實施範圍。It should be noted that steps S1 through S8 of this embodiment and the flowchart in FIG2 are merely exemplary of one possible implementation of the sound ray configuration method of the present invention. It should be understood that even if steps S1 through S8 are combined, separated, or their order is adjusted, as long as the resulting process achieves substantially the same effect as the present embodiment in a substantially identical manner, it will still fall within the scope of the sound ray configuration method of the present invention. Therefore, steps S1 through S8 of this embodiment and the flowchart in FIG2 are not intended to limit the scope of implementation of the present invention.

本發明還提供了一種電腦程式產品的一實施例,其中,該電腦程式產品包含一能被儲存於電腦可讀取紀錄媒體且能被一電子裝置(例如但不限於桌上型電腦、筆記型電腦、平板電腦、智慧型手機或伺服器)所載入並運行的應用程式,並且,當該電子裝置載入並運行該電腦程式產品的該應用程式時,該應用程式能使該電子裝置被作為該聲線配置系統1,並且實施前述的該聲線配置方法。The present invention also provides an embodiment of a computer program product, wherein the computer program product includes an application that can be stored in a computer-readable recording medium and can be loaded and run by an electronic device (for example, but not limited to a desktop computer, laptop computer, tablet computer, smart phone or server), and when the electronic device loads and runs the application of the computer program product, the application enables the electronic device to be used as the sound ray configuration system 1 and implement the aforementioned sound ray configuration method.

綜上所述,藉由實施該聲線配置方法,該聲線配置系統1能在口音及音色合適的聲線數量有限的情況下,利用合適的特定聲線(即初始聲線資料所對應的特定聲線)作為基礎,而擴展出更多口音及音色合適的特定聲線(即衍生聲線資料所對應的特定聲線),藉此提升可選的合適聲線數量。並且,該聲線配置系統1有助於避免重要配角與主角的聲音太過相似,以及避免多個重要配角之間的聲音彼此太過相似。進一步地,該聲線配置系統1還能在設定每一角色資料所對應的配對聲線資料之後,主動偵測是否存在不同角色在短時間內以相同聲線先後發言的情況,並嘗試予以排除。因此,該聲線配置系統1有助於在合適聲線數量有限的情況下達成較佳的配音效果,而確實能達成本發明之目的。In summary, by implementing the voice configuration method, the voice configuration system 1 can, when the number of voices with a suitable accent and timbre is limited, use a suitable specific voice (i.e., the specific voice corresponding to the initial voice data) as a basis to expand into more specific voices with a suitable accent and timbre (i.e., the specific voices corresponding to the derived voice data), thereby increasing the number of suitable voices available for selection. Furthermore, the voice configuration system 1 helps prevent the voice of an important supporting character from being too similar to that of the main character, and prevents the voices of multiple important supporting characters from being too similar to one another. Further, after setting the paired voice data corresponding to each character data, the voice configuration system 1 can proactively detect whether different characters speak with the same voice within a short span of text and attempt to eliminate such situations. Therefore, the voice configuration system 1 helps achieve a better dubbing effect when the number of suitable voices is limited, and indeed achieves the object of the present invention.
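The expansion described above — deriving extra voices from one initial voice by adjusting its audio range — can be sketched as follows; the frequency-range representation and the semitone shifts are assumptions for illustration, not details from the patent:

```python
# Illustrative sketch: derive additional voices from an initial voice by
# shifting its audio (pitch) range; the representation is assumed, not
# taken from the patent.
def derive_voices(initial_range_hz, shifts_semitones=(-2, 2, 4)):
    """initial_range_hz: (low, high) pitch range of the initial voice."""
    lo, hi = initial_range_hz
    derived = []
    for semitones in shifts_semitones:
        factor = 2.0 ** (semitones / 12.0)  # equal-tempered shift factor
        derived.append((round(lo * factor, 1), round(hi * factor, 1)))
    return derived

# One initial voice expands into several derived voices with shifted ranges.
derived = derive_voices((165.0, 255.0))
```

In a real system the adjustment would be applied to the audio itself (e.g., a pitch-shifting step in the synthesis pipeline); the sketch only shows how one initial voice yields several selectable variants.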

惟以上所述者,僅為本發明之實施例而已,當不能以此限定本發明實施之範圍,凡是依本發明申請專利範圍及專利說明書內容所作之簡單的等效變化與修飾,皆仍屬本發明專利涵蓋之範圍內。However, the above are merely embodiments of the present invention and should not be used to limit the scope of implementation of the present invention; all simple equivalent changes and modifications made according to the claims and the specification of the present invention remain within the scope covered by the present patent.

1:聲線配置系統 11:處理單元 12:儲存單元 DB:虛擬人聲資料庫 D1:聲線需求資訊 10、10A~10J:角色資料 101:角色標記 102:聲線特徵標籤 10*:主要角色資料 10’:重要角色資料 10”:次要角色資料 D2:聲線資料組合 20、20A~20D:聲線群組 20*:主角聲線群組 201:聲線資料 S1~S9:步驟 1: Voice Configuration System 11: Processing Unit 12: Storage Unit DB: Virtual Voice Database D1: Voice Requirement Information 10, 10A-10J: Character Data 101: Character Tag 102: Voice Feature Label 10*: Primary Character Data 10': Important Character Data 10”: Secondary Character Data D2: Voice Data Group 20, 20A-20D: Voice Groups 20*: Main Character Voice Group 201: Voice Data S1-S9: Steps

本發明之其他的特徵及功效,將於參照圖式的實施方式中清楚地呈現,其中: 圖1是一方塊示意圖,示例性地表示本發明聲線配置系統的一實施例; 圖2是一流程圖,用於示例性地說明該實施例如何實施一聲線配置方法; 圖3是一示意圖,示例性地繪示該實施例在執行該聲線配置方法的過程中所利用的一聲線需求資訊;及 圖4是一示意圖,示例性地繪示該實施例在執行該聲線配置方法的過程中所利用的一聲線資料組合。 Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which: Figure 1 is a block diagram exemplarily illustrating an embodiment of the voice configuration system of the present invention; Figure 2 is a flowchart exemplarily illustrating how the embodiment implements a voice configuration method; Figure 3 is a diagram exemplarily illustrating voice requirement information utilized by the embodiment in executing the voice configuration method; and Figure 4 is a diagram exemplarily illustrating a voice data set utilized by the embodiment in executing the voice configuration method.

S1~S9:步驟 S1~S9: Steps

Claims (7)

一種聲線配置方法,由一聲線配置系統實施,該聲線配置系統包含一處理單元及一電連接該處理單元的儲存單元,該聲線配置方法包含: (A)獲得一聲線需求資訊及一聲線資料組合,其中: 該聲線需求資訊包含多筆分別對應於多個角色的角色資料,且每一角色資料對應於多個角色類型的其中一者,該等角色資料的其中一或多筆角色資料各包含一與音頻範圍相關的聲線特徵標籤,而各被作為一重要角色資料; 該聲線資料組合包含多個聲線群組,每一聲線群組對應於該等角色類型的其中一者且包括多筆聲線資料,每一聲線群組的該等聲線資料分別對應於多個特定聲線,且包含一初始聲線資料及一或多筆至少藉由對該初始聲線資料進行音頻範圍調整而被產生的衍生聲線資料; (B)對於每一角色資料,從該等聲線群組中選出其中一個與該角色資料對應同一個角色類型的匹配聲線群組,再將該匹配聲線群組之該等聲線資料中的其中一者設定為一對應於該角色資料的配對聲線資料,藉此將該配對聲線資料所對應的該特定聲線配置給該角色資料所對應的該角色,其中,若該角色資料屬於該重要角色資料,該聲線配置系統在從該等聲線群組中選出該匹配聲線群組之後,是將該匹配聲線群組的該等聲線資料中與該重要角色資料之該聲線特徵標籤匹配的其中一者設定為該重要角色資料所對應的該配對聲線資料; (C)對於該等角色資料中的其中一筆受檢角色資料以及該受檢角色資料當前所對應的該配對聲線資料,根據一對應於該聲線需求資訊的文本資料判斷一聲線衝突條件是否符合,其中,該文本資料包含多個台詞部分,每一台詞部分對應於該等角色資料的其中一角色資料以及該其中一角色資料當前所對應的該配對聲線資料,且該受檢角色資料所對應的每一台詞部分被作為一受檢台詞部分,並且,該聲線衝突條件包含:該等台詞部分中存在一個符合一鄰近條件且所對應之角色資料非為該受檢角色資料的鄰近台詞部分,且該鄰近台詞部分所對應的該配對聲線資料與該受檢角色資料所對應的該配對聲線資料相同,其中,對於該(等)受檢台詞部分以外的其他每一台詞部分,該鄰近條件代表該台詞部分在該文本資料中與任一受檢台詞部分之間相隔的字元數量小於等於一預設字元數量門檻值;及 (D)在判斷該聲線衝突條件符合的情況下輸出一指示出該受檢角色資料的聲線衝突提示。 A voice configuration method is implemented by a voice configuration system comprising a processing unit and a storage unit electrically connected to the processing unit. The method comprises: (A) obtaining voice requirement information and a voice data set, wherein: the voice requirement information comprises a plurality of character data respectively corresponding to a plurality of characters, each character data corresponding to one of a plurality of character types; one or more of the character data each comprises a voice feature label related to an audio range and is each treated as an important character data; the voice data set includes a plurality of voice groups, each voice group corresponding to one of the character types and including a plurality of voice data, the voice data in each voice group respectively corresponding to a plurality of specific voices and including an initial voice data and one or more derived voice data generated at least by performing an audio range adjustment on the initial voice data;
(B) for each character data, selecting from the voice groups a matching voice group corresponding to the same character type as the character data, and then setting one of the voice data in the matching voice group as a paired voice data corresponding to the character data, thereby assigning the specific voice corresponding to the paired voice data to the character corresponding to the character data, wherein, if the character data belongs to the important character data, after selecting the matching voice group from the voice groups, the voice configuration system sets one of the voice data in the matching voice group that matches the voice feature label of the important character data as the paired voice data corresponding to the important character data; (C) for one examined character data among the character data and the paired voice data currently corresponding to the examined character data, determining whether a voice conflict condition is met based on a text data corresponding to the voice requirement information, wherein the text data includes a plurality of dialogue parts, each dialogue part corresponds to one of the character data and the paired voice data currently corresponding to that character data, each dialogue part corresponding to the examined character data is treated as an examined dialogue part, and the voice conflict condition includes: there is, among the dialogue parts, a neighboring dialogue part that meets a proximity condition and whose corresponding character data is not the examined character data, and the paired voice data corresponding to the neighboring dialogue part is identical to the paired voice data corresponding to the examined character data, wherein, for each dialogue part other than the examined dialogue part(s), the proximity condition indicates that the number of characters between the dialogue part and any examined dialogue part in the text data is less than or equal to a preset character count threshold; and (D) 
if the voice conflict condition is determined to be met, outputting a voice conflict prompt indicating the examined character data. 如請求項1所述的聲線配置方法,其中,在步驟(B)中,對於該等角色資料中對應於同一個角色類型的其中多筆重要角色資料,若該等聲線群組中的其中多個聲線群組是與其中該等重要角色資料對應於同一個角色類型,則該聲線配置系統是輪流從其中該等聲線群組中選出該等重要角色資料所分別對應的該等配對聲線資料。The voice configuration method of claim 1, wherein, in step (B), for multiple important character data corresponding to the same character type among the character data, if multiple voice groups among the voice groups correspond to the same character type as those important character data, the voice configuration system selects the paired voice data respectively corresponding to those important character data from those voice groups in turn. 如請求項1所述的聲線配置方法,其中: 在步驟(A)中,該等角色資料的其中至少一者被作為一主要角色資料,且該等聲線群組中被用於決定該主要角色資料所對應之配對聲線資料的該聲線群組被作為一主角聲線群組;及 在步驟(B)中,對於該等角色資料中對應於同一個角色類型的其中多筆重要角色資料,若該等聲線群組中的其中多個聲線群組是與其中該等重要角色資料對應於同一個角色類型,但其中該等聲線群組中存在該主角聲線群組,則該聲線配置系統不從該主角聲線群組中選出該等重要角色資料所要對應的配對聲線資料。 The voice configuration method of claim 1, wherein: in step (A), at least one of the character data is treated as a main character data, and the voice group among the voice groups used to determine the paired voice data corresponding to the main character data is treated as a protagonist voice group; and in step (B), for multiple important character data corresponding to the same character type among the character data, if multiple voice groups among the voice groups correspond to the same character type as those important character data but include the protagonist voice group, the voice configuration system does not select the paired voice data for those important character data from the protagonist voice group. 
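The proximity check recited in step (C) of claim 1 can be sketched in a few lines; the data layout (each dialogue part as a tuple of character, paired voice, and start/end character offsets) is an assumption for illustration, not part of the claims:

```python
# Hedged sketch of the voice-conflict condition in step (C): flag the
# examined character if another character with the same paired voice has a
# dialogue part within a preset character-count threshold of any of the
# examined character's dialogue parts. The tuple layout is assumed.
def has_voice_conflict(dialogue_parts, examined, threshold):
    """dialogue_parts: list of (character, voice, start, end) tuples, where
    start/end are character offsets of the part within the text data."""
    examined_parts = [p for p in dialogue_parts if p[0] == examined]
    voice = examined_parts[0][1]  # paired voice of the examined character
    for other in dialogue_parts:
        if other[0] == examined or other[1] != voice:
            continue  # skip the examined character and different voices
        for part in examined_parts:
            # Characters separating the two parts (negative if overlapping).
            gap = max(other[2] - part[3], part[2] - other[3])
            if gap <= threshold:  # proximity condition is met
                return True
    return False

# Characters A and B share voice v1 and speak only 5 characters apart.
parts = [("A", "v1", 0, 10), ("B", "v1", 15, 25), ("C", "v2", 100, 110)]
```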
如請求項1所述的聲線配置方法,其中, 在步驟(A)中,該等角色資料的其中一或多筆不包含聲線特徵標籤的角色資料各被作為一次要角色資料; 在步驟(B)中,對於該等角色資料中對應於同一個角色類型的其中多筆次要角色資料,若該等聲線群組中的其中多個聲線群組是與其中該等次要角色資料對應於同一個角色類型,則該聲線配置系統是輪流從其中該等聲線群組中選出該等次要角色資料所分別對應的該等配對聲線資料。 The voice configuration method of claim 1, wherein: in step (A), one or more of the character data that do not include a voice feature label are each treated as a secondary character data; and in step (B), for multiple secondary character data corresponding to the same character type among the character data, if multiple voice groups among the voice groups correspond to the same character type as those secondary character data, the voice configuration system selects the paired voice data respectively corresponding to those secondary character data from those voice groups in turn. 如請求項1所述的聲線配置方法,其中,在步驟(D)中,在判斷該聲線衝突條件符合的情況下,該聲線配置系統是先更新該受檢角色資料所對應的聲線資料,並於更新該受檢角色資料所對應的聲線資料且再次判斷出該聲線衝突條件符合之後,才輸出該聲線衝突提示。The voice configuration method of claim 1, wherein, in step (D), when the voice conflict condition is determined to be met, the voice configuration system first updates the voice data corresponding to the examined character data, and outputs the voice conflict prompt only after updating the voice data corresponding to the examined character data and again determining that the voice conflict condition is met. 一種聲線配置系統,包含一處理單元及一電連接該處理單元的儲存單元,該聲線配置系統被配置為實施如請求項1至5其中任一項所述的聲線配置方法。A voice configuration system, comprising a processing unit and a storage unit electrically connected to the processing unit, the voice configuration system being configured to implement the voice configuration method of any one of claims 1 to 5. 
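The round-robin selection recited in claims 2 and 4 might look like the sketch below; the data shapes (lists of ids) are assumptions for illustration only:

```python
# Hypothetical sketch of round-robin pairing (claims 2 and 4): characters
# of the same character type draw their paired voice data from the matching
# voice groups in turn, so consecutive characters use different groups.
from itertools import cycle

def round_robin_assign(characters, voice_groups):
    """characters: character ids of one type; voice_groups: lists of voice
    data ids, one list per voice group of that same type."""
    group_order = cycle(range(len(voice_groups)))
    next_index = [0] * len(voice_groups)  # next unused voice per group
    assignment = {}
    for character in characters:
        g = next(group_order)
        voices = voice_groups[g]
        assignment[character] = voices[next_index[g] % len(voices)]
        next_index[g] += 1
    return assignment
```

Cycling over the groups before exhausting any single group is what keeps successive characters of the same type from drawing on the same voice pool back to back.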
一種電腦程式產品,包含一應用程式,其中,當該應用程式被一電子裝置載入並執行時,能使該電子裝置實施如請求項1至5其中任一項所述的聲線配置方法。A computer program product, comprising an application program which, when loaded and executed by an electronic device, causes the electronic device to implement the voice configuration method of any one of claims 1 to 5.
TW112128777A 2023-08-01 2023-08-01 Sound ray configuration method, system and computer program product TWI894591B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW112128777A TWI894591B (en) 2023-08-01 2023-08-01 Sound ray configuration method, system and computer program product


Publications (2)

Publication Number Publication Date
TW202507709A TW202507709A (en) 2025-02-16
TWI894591B true TWI894591B (en) 2025-08-21

Family

ID=95555206

Family Applications (1)

Application Number Title Priority Date Filing Date
TW112128777A TWI894591B (en) 2023-08-01 2023-08-01 Sound ray configuration method, system and computer program product

Country Status (1)

Country Link
TW (1) TWI894591B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070220575A1 (en) * 2006-03-03 2007-09-20 Verimatrix, Inc. Movie studio-based network distribution system and method
TW201322743A (en) * 2011-11-18 2013-06-01 Onlive Inc Graphical user interface, system and method for controlling a video stream
CN107683449A (en) * 2015-04-10 2018-02-09 索尼互动娱乐股份有限公司 Control personal space content presented via a head-mounted display
CN114783403A (en) * 2022-02-18 2022-07-22 腾讯科技(深圳)有限公司 Method, device, equipment, storage medium and program product for generating audio reading material


