CN115910069A

CN115910069A - Automatic Chinese-English bilingual subtitle generation system for traditional Chinese medicine videos

Info

Publication number: CN115910069A
Application number: CN202211579294.9A
Authority: CN
Inventors: 刘秀峰; 吴雨璐; 王坤; 杨兴钊; 陈平平; 李荣耀
Original assignee: Guangzhou University of Traditional Chinese Medicine
Current assignee: Guangzhou University of Traditional Chinese Medicine
Priority date: 2022-12-06
Filing date: 2022-12-06
Publication date: 2023-04-04

Abstract

The invention discloses a system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine (TCM) videos. The system comprises a front-end information acquisition module, an analysis processing module, and a back-end writing module based on the system architecture; the front-end information acquisition module acquires video data and calls data from a TCM terminology database according to the type of video data acquired. Compared with the prior art, the invention has the following advantages: the system architecture and process steps are reasonably designed; a self-built corpus is imported into the speech recognition engine for model training and is dynamically updated through knowledge graph technology, so that the model is continuously built and refined to raise speech recognition accuracy in the TCM field. This improves the efficiency and accuracy of bilingual translation of TCM videos, reduces the workload of translators, and makes the system broadly applicable and easy to promote.

Description

A system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine videos

Technical Field

The invention relates to the field of traditional Chinese medicine (TCM), and in particular to a system for automatically generating Chinese-English bilingual subtitles for TCM videos.

Background

TCM theory derives from the summation of medical experience. Its content includes the theory of essence and qi, qi, blood and body fluids, visceral manifestation, meridians and collaterals, constitution, etiology, onset of disease, pathogenesis, treatment principles, and health preservation. More than two thousand years ago, the TCM monograph "Huangdi Neijing" (黄帝内经) appeared and laid the foundation of Chinese medicine. To this day, the theories, diagnostic methods, and treatment methods of traditional Chinese medicine can all be traced back to this book.

Summary of the Invention

The technical problem to be solved by the invention is that TCM videos currently use Chinese as the narration audio, and translating them relies mostly on manual work. This makes translation slow and not very accurate, and it increases the workload of translators.

To solve the above technical problem, the invention provides the following technical solution: a system for automatically generating Chinese-English bilingual subtitles for TCM videos, comprising a front-end information acquisition module, an analysis processing module, and a back-end writing module based on the system architecture. The front-end information acquisition module acquires video data and calls data from the TCM terminology database according to the type of video data acquired. The analysis processing module recognizes the speech content through the iFlytek (科大讯飞) interface, parses the result to obtain text, and generates the corresponding subtitles. The analysis processing module has a built-in judgment program that checks whether the generated subtitle result needs to be translated: 1) if translation is needed, the system calls the Baidu machine translation interface and the Chinese-English bilingual TCM terminology translation database to translate the subtitle result and generate bilingual subtitles; 2) if translation is not needed, the subtitle result is exported directly. The judgment program also extracts keywords from the subtitles, and the extracted keyword data is transmitted over a channel to the back-end writing module. The back-end writing module is built as a web page and uses the channel to receive the instruction data sent by the front-end information acquisition module and the analysis processing module for system production.

Compared with the prior art, the invention has the following advantages: the system architecture and process steps are reasonably designed; a self-built corpus is imported into the speech recognition engine for model training and is dynamically updated through knowledge graph technology, so that the model is continuously built and refined to raise speech recognition accuracy in the TCM field. This improves the efficiency and accuracy of bilingual translation of TCM videos, reduces the workload of translators, and makes the system broadly applicable and easy to promote.

Further, the audio acquisition method of the front-end information acquisition module includes speech signal extraction and acoustic feature parameter extraction.

Further, the analysis processing module parses the recognized speech data using Alibaba's open-source library FastJson.
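The parsing step can be illustrated with a minimal sketch. It uses Python's standard `json` module in place of FastJson, which is a Java library, and the payload layout (the `ws`, `cw`, and `w` fields) is an assumption made for illustration only; the patent does not describe the structure of the recognition result.

```python
import json


def extract_text(payload: str) -> str:
    """Join the first candidate word of every segment into one transcript."""
    result = json.loads(payload)
    words = []
    for segment in result.get("ws", []):
        candidates = segment.get("cw", [])
        if candidates:
            # Assumption: the first candidate is the recognizer's best guess.
            words.append(candidates[0].get("w", ""))
    return "".join(words)


# Hypothetical recognition result containing two word segments.
sample = json.dumps({"ws": [{"cw": [{"w": "气血"}]}, {"cw": [{"w": "津液"}]}]})
print(extract_text(sample))
```

Missing fields fall back to empty values, so a malformed or empty result yields an empty transcript rather than an exception.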

Further, the analysis processing module generates the subtitle result based on TextView.

Further, the functions of the back-end writing module include: 1) retrieval using knowledge graph technology; 2) subtitle editing; 3) vocal extraction; 4) audio transcription; 5) voice broadcast.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the workflow of the system for automatically generating Chinese-English bilingual subtitles for TCM videos.

Fig. 2 is a schematic diagram of Embodiment 1.

Fig. 3 is a schematic diagram of Embodiment 2.

Fig. 4 is a schematic diagram of Embodiment 3.

Detailed Description

The invention is described in further detail below with reference to the accompanying drawings.

In a specific implementation of the invention, in the embodiment shown in Fig. 1, the front-end information acquisition module acquires video data and calls data from the TCM terminology database according to the type of video data acquired. The analysis processing module recognizes the speech content through the iFlytek (科大讯飞) interface, parses the result to obtain text, and generates the corresponding subtitles. The analysis processing module has a built-in judgment program that checks whether the generated subtitle result needs to be translated: 1) if translation is needed, the system calls the Baidu machine translation interface and the Chinese-English bilingual TCM terminology translation database to translate the subtitle result and generate bilingual subtitles; 2) if translation is not needed, the subtitle result is exported directly. The judgment program also extracts keywords from the subtitles, and the extracted keyword data is transmitted over a channel to the back-end writing module. The back-end writing module is built as a web page and uses the channel to receive the instruction data sent by the front-end information acquisition module and the analysis processing module for system production. Each imported video can generate a small knowledge graph corresponding to that video.
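The translate-or-export decision described above might look roughly like the following sketch. The patent does not say how the terminology translation database and the machine translation interface are combined; placeholder protection (shielding known terms, translating, then restoring the library's English rendering) is one common approach and is used here purely as an assumption. `TERM_LIBRARY`, `stub_translate`, `translate_with_terms`, and `make_subtitle` are hypothetical names, and the stub stands in for a real translation call.

```python
from typing import Callable, Dict, List

# Hypothetical excerpt of a bilingual term library; entries are illustrative only.
TERM_LIBRARY: Dict[str, str] = {"气血": "qi and blood", "经络": "meridians"}


def stub_translate(text: str) -> str:
    """Stand-in for a machine translation call; it only tags the input."""
    return f"[MT] {text}"


def translate_with_terms(text: str, translate: Callable[[str], str],
                         terms: Dict[str, str]) -> str:
    """Shield library terms with placeholders, translate, then restore them."""
    protected = text
    slots: Dict[str, str] = {}
    # Longer terms first, so a short term never splits a longer one.
    ordered = sorted(terms.items(), key=lambda item: -len(item[0]))
    for index, (chinese, english) in enumerate(ordered):
        if chinese in protected:
            token = f"__TERM{index}__"
            protected = protected.replace(chinese, token)
            slots[token] = english
    translated = translate(protected)
    for token, english in slots.items():
        translated = translated.replace(token, english)
    return translated


def make_subtitle(text: str, need_translation: bool,
                  translate: Callable[[str], str] = stub_translate,
                  terms: Dict[str, str] = TERM_LIBRARY) -> List[str]:
    """Return one line (Chinese only) or two lines (Chinese plus English)."""
    if not need_translation:
        return [text]  # Branch 2: export the subtitle result directly.
    return [text, translate_with_terms(text, translate, terms)]  # Branch 1.


print(make_subtitle("气血运行于经络", need_translation=True))
```

Because the translation function is passed in as a parameter, the same flow can be exercised offline with the stub and later pointed at a real engine.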

Further, the audio acquisition method of the front-end information acquisition module includes speech signal extraction and acoustic feature parameter extraction. The analysis processing module parses the recognized speech data using Alibaba's open-source library FastJson. The analysis processing module generates the subtitle result based on TextView. The functions of the back-end writing module include: 1) retrieval using knowledge graph technology; 2) subtitle editing; 3) vocal extraction; 4) audio transcription; 5) voice broadcast.

In one embodiment of the invention, as shown in Fig. 2, this embodiment sets out the technologies the system relies on, specifically: 1) speech recognition technology, which covers acoustic feature parameter extraction (Mel-frequency cepstral coefficients, acoustic model training, language model training, linear predictive analysis), real-time speech transcription (handshake stage, real-time communication stage), and speech signal acquisition (A/D conversion, sampling, quantization, encoding);

2) system development technology (web page production, back-end development);

3) subtitle generation (TextView, Observab listening for speech recognition results);

4) TCM terminology database (cross-referenced against the national TCM terminology standards and the Chinese Pharmacopoeia TCM and Chinese Medicine Network (中国药典中医中药网));

5) knowledge graph (parsing TCM keywords, constructing a knowledge organization graph);

6) bilingual translation technology (Alibaba's open-source library FastJson, the Baidu machine translation engine, a multilingual translation term base (TCM term translation reference standards, TCM term translation reference dictionaries), ViewModel);

7) neo4j, used to retrieve the specialized term keywords in the database and the relationships between them.
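The speech signal acquisition steps named in item 1) (sampling, quantization, encoding) can be sketched as follows. This is only an illustration of the general technique: the 16 kHz sample rate, the 16-bit depth, and the synthetic sine input are assumptions, not values given in the patent.

```python
import math
import struct
from typing import List


def sample_sine(freq_hz: float, sample_rate: int, n_samples: int) -> List[float]:
    """Sampling: evaluate a unit-amplitude sine wave at discrete time steps."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def quantize_16bit(samples: List[float]) -> List[int]:
    """Quantization: map amplitudes in [-1.0, 1.0] to signed 16-bit levels."""
    levels = []
    for value in samples:
        clipped = max(-1.0, min(1.0, value))  # Guard against overshoot.
        levels.append(int(round(clipped * 32767)))
    return levels


def encode_pcm(levels: List[int]) -> bytes:
    """Encoding: pack the levels as little-endian 16-bit PCM bytes."""
    return struct.pack(f"<{len(levels)}h", *levels)


samples = sample_sine(freq_hz=1000.0, sample_rate=16000, n_samples=16)
pcm = encode_pcm(quantize_16bit(samples))
print(len(pcm), "bytes of PCM data")
```

Each sample occupies two bytes after encoding, so the byte count is always twice the number of samples.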

In one embodiment of the invention, as shown in Fig. 3, this embodiment sets out the working principle of the system. Text is obtained through speech recognition supported by the TCM terminology database and is parsed to obtain keywords, which are then retrieved using knowledge graph technology. The retrieval results refine the knowledge graph and dynamically expand the TCM terminology database, while the TCM terminology database in turn checks the soundness of the knowledge graph through term recognition accuracy.
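The feedback between keyword retrieval and the terminology database can be sketched as below. The source names neo4j for relationship retrieval; to stay self-contained, this sketch swaps in a plain in-memory graph. `TermGraph`, `expand_term_library`, and the `related_to` relation label are hypothetical, and the sample relations are illustrative rather than taken from any TCM standard.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


class TermGraph:
    """Tiny undirected term graph: term -> set of (relation, neighbour)."""

    def __init__(self) -> None:
        self.edges: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add_relation(self, head: str, relation: str, tail: str) -> None:
        # Store both directions so retrieval works from either term.
        self.edges[head].add((relation, tail))
        self.edges[tail].add((relation, head))

    def neighbours(self, term: str) -> Set[str]:
        # .get avoids creating empty entries for unknown terms.
        return {other for _, other in self.edges.get(term, set())}


def expand_term_library(library: Set[str], graph: TermGraph,
                        keywords: Iterable[str]) -> Set[str]:
    """Add one-hop neighbours of the keywords to the library; return new terms."""
    discovered: Set[str] = set()
    for keyword in keywords:
        discovered |= graph.neighbours(keyword)
    new_terms = discovered - library
    library |= new_terms  # The library is updated in place.
    return new_terms


graph = TermGraph()
graph.add_relation("气血", "related_to", "津液")
graph.add_relation("气血", "related_to", "经络")

library = {"气血", "经络"}
print(expand_term_library(library, graph, ["气血"]))
```

Only terms the library does not yet contain are reported, so running the same keywords twice adds nothing the second time.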

In one embodiment of the invention, as shown in Fig. 4, this embodiment sets out the cyclic workflow of the system. The system extracts the audio and calls the TCM terminology database, then performs speech recognition through the iFlytek (科大讯飞) interface and Alibaba's open-source library FastJson. The recognition result is passed through the "convert to English?" decision logic: if yes, the Chinese-English bilingual TCM terminology database and the translation interface are called and the subtitles are then generated; if no, the subtitles are generated directly (in Chinese). The system parses the subtitles and extracts keywords, and the extracted keywords are used for knowledge graph retrieval. Refining the knowledge graph yields many TCM terms, which are used to improve the TCM terminology corpus. In terms of detailed design, speech recognition is performed with a model trained on a self-training platform in combination with the TCM terminology database.
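The patent does not specify how keywords are extracted from the subtitles. One simple possibility, shown here as an assumption, is forward maximum matching against the terminology library: at each position the longest library term that matches is taken. `extract_keywords` and the sample library are hypothetical.

```python
from typing import List, Set


def extract_keywords(subtitle: str, term_library: Set[str]) -> List[str]:
    """Return library terms found in the subtitle, longest match first,
    in order of first appearance and without duplicates."""
    max_len = max((len(term) for term in term_library), default=0)
    found: List[str] = []
    position = 0
    while position < len(subtitle):
        match = None
        longest = min(max_len, len(subtitle) - position)
        # Try the longest window first, then shrink it one character at a time.
        for size in range(longest, 0, -1):
            candidate = subtitle[position:position + size]
            if candidate in term_library:
                match = candidate
                break
        if match is None:
            position += 1  # No term starts here; move on one character.
        else:
            if match not in found:
                found.append(match)
            position += len(match)
    return found


terms = {"黄帝内经", "气血", "经络", "气"}
print(extract_keywords("黄帝内经论述气血与经络", terms))
```

Preferring the longest match keeps a compound term such as 气血 from being reported as the shorter term 气.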

The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited by the above embodiments; the embodiments and the description merely illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection of the invention is defined by the appended claims and their equivalents.

Claims

1. A system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine videos, comprising a front-end information acquisition module, an analysis processing module, and a back-end writing module based on a system architecture, characterized in that: the front-end information acquisition module acquires video data and calls data from a traditional Chinese medicine terminology database according to the type of video data acquired; the analysis processing module recognizes the speech content through the iFlytek (科大讯飞) interface, parses the result to obtain text, and generates the corresponding subtitles; the analysis processing module has a built-in judgment program that checks whether the generated subtitle result needs to be translated, specifically: 1) if translation is needed, the system calls the Baidu machine translation interface and the Chinese-English bilingual traditional Chinese medicine terminology translation database to translate the subtitle result and generate bilingual subtitles; 2) if translation is not needed, the subtitle result is exported directly; the judgment program extracts keywords from the subtitles, and the extracted keyword data is transmitted over a channel to the back-end writing module; the back-end writing module is built as a web page and uses the channel to receive the instruction data sent by the front-end information acquisition module and the analysis processing module for system production.

2. The system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine videos according to claim 1, characterized in that: the audio acquisition method of the front-end information acquisition module comprises speech signal extraction and acoustic feature parameter extraction.

3. The system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine videos according to claim 1, characterized in that: the analysis processing module parses the recognized speech data using Alibaba's open-source library FastJson.

4. The system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine videos according to claim 1, characterized in that: the analysis processing module generates the subtitle result based on TextView.

5. The system for automatically generating Chinese-English bilingual subtitles for traditional Chinese medicine videos according to claim 1, characterized in that the functions of the back-end writing module comprise: 1) retrieval using knowledge graph technology; 2) subtitle editing; 3) vocal extraction; 4) audio transcription; 5) voice broadcast.