WO2018227761A1 - Correction device for recorded and broadcast teaching data - Google Patents
- Publication number
- WO2018227761A1 (application PCT/CN2017/099055)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- text
- voice data
- voice
- unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Definitions
- the invention relates to a network teaching recording and broadcasting technology, which can be used for recording and playing a teaching activity or a conference process based on network teaching or online conference, and particularly relates to a device capable of correcting recorded teaching voice data.
- the recorder mainly includes a camera and a wireless digital microphone to record video information and voice data of the courseware.
- the first network transmits the courseware information to the server.
- the server is used on the one hand to further process the courseware information, to generate courseware data, and on the other hand to search and call the courseware data in the database, and then convert the courseware data back to the courseware information.
- the database is used to store the courseware data.
- the second network is used to connect the client to the server.
- the client is used to facilitate the user to query courseware information and invoke courseware information.
- the patent application discloses a fairly typical technique for recording courses in a streaming media format. Its main disadvantages are that the recorded files are relatively large, uploading and downloading are slow, and the required storage space is large.
- recent teaching recording technologies, such as CN105306861A (publication date February 3, 2016), disclose an effective classroom teaching recording method and system.
- a multimedia whiteboard can be provided for users.
- the functional voice, lecture/speech voice, communication with other users, and/or coaching, etc. are recorded as different data streams, and the network teaching recording system generates a unified time stamp for the various data streams.
- the terminal obtains the data streams, reproduces them according to the time stamp, and plays them in combination for the user, thereby completing on-demand browsing.
- the patent application discloses a classroom recording method that records and stores classroom teaching data separately in three data stream formats according to time stamps.
- CN101354748A (publication date January 28, 2009) discloses a character recognition apparatus including an image pickup device, a character recognition device, a voice conversion device, and a voice output device.
- the image pickup device is configured to photograph text information and send the captured text information as a picture to the character recognition device;
- the character recognition device is configured to identify the text information in the picture and send the text information to the voice conversion device;
- the voice conversion device is configured to convert the text information into voice data and send the voice data to the voice output device;
- the voice output device is configured to play the voice data.
- the patent application discloses a technique for capturing and recognizing text symbols in image information and then converting the text symbols into speech.
- CN102956231A discloses an apparatus and method, in the field of speech recognition technology, for recording key voice information based on semi-automatic correction. The apparatus comprises a key information extraction unit and an information correction unit connected to it. The key information extraction unit obtains the uncorrected text data, extracts the key information, and outputs it to the information correction unit, and the information correction unit outputs the text data confirmed by user feedback.
- that invention reduces the workload of manual correction by using a semi-automatic information correction unit; uses a database to correct special nouns such as place names and professional tool names, thereby reducing the influence of the operator's limited knowledge in manual correction; and extracts key information from the voice data, thereby increasing the amount of usable information in the record.
- the patent application aims to solve the problem of semi-automatic correction of text data after speech has been converted into text.
- CN105159870A (publication date December 16, 2015) discloses a processing system for accurately converting continuous natural speech into text. The processing system comprises a cloud speech recognition engine and a speech recognition post-correction platform connected to the cloud speech recognition engine. The post-correction platform comprises a display unit, a correction operation unit, a control unit, and a three-dimensional integrated generation unit, and the correction operation unit supports voice correction, keyboard correction, mouse correction, and combined keyboard-and-mouse correction modes. The application also discloses that the voice file to be recognized can be finely segmented to achieve accurate recognition.
- CN105808197A discloses an information processing method applied to an electronic device having a speech recognition module. The method comprises: receiving input voice data; and, after the voice data has been recognized and a recognition result obtained, when first information in the recognition result needs to be corrected, the first information being at least one character in the recognition result, correcting the first information by input through an operating body. Because the correction is made through the operating body, only the targeted part is corrected, and the user does not need to input the voice data again.
- the intended result can thus be obtained, the operation process is simple, and the overall speed of information input is improved.
- the patent application discloses that after speech recognition only the first information that needs correction has to be corrected, which speeds up correction, but such correction applies only to the recognized text data. During speech recognition, the information to be recognized is compared with standard voice data, which improves recognition accuracy.
- CN106328145A (publication date January 11, 2017) discloses a voice correction method and apparatus, comprising: acquiring voice data input by a user; recognizing the voice data to obtain the corresponding text content; when the text content includes a first preset keyword, dividing the text content into original text and edit text according to the first preset keyword, the edit text being used to correct the original text; extracting the text to be corrected from the original text according to the edit text; and correcting the original text according to the edit text and the text to be corrected to obtain the corrected text.
- the patent application discloses that the text to be edited in the original text can be obtained by means of keyword recognition, and the correction is then made in a targeted manner.
- CN102215233A (publication date October 12, 2011) discloses an information system client installed on a user's terminal device, which can be applied to a microblog, blog, forum, personal space, and the like. It includes a user interaction module and a voice module connected to the user interaction module, and preferably further includes a feedback module and a conversion module.
- the voice module includes a voice collection unit, a voice recognition unit, and a voice synthesis unit. The voice collection unit is configured to collect the user's voice; the voice recognition unit recognizes the collected voice as text and outputs it to the user interaction module; and the voice synthesis unit converts the text that the user interaction module obtains from the information system server into voice and outputs it to the user. The feedback module is connected to the voice recognition unit and is configured to confirm whether the voice recognition is correct: if it is, the feedback module outputs the text to the user interaction module; if not, the feedback module has the voice collection unit re-collect the user's voice, or the voice recognition unit corrects the text, until the text is confirmed to be correct.
- the patent application discloses a technology for converting between voice and text, which aims to convert information in one format into information in another format. If the output text information is incorrect, the feedback module re-collects the user's voice or directly corrects the output text information.
- CN106486113A discloses a method of recording a meeting, comprising: obtaining a voice signal; converting the voice signal into corresponding text information with voice conversion software and displaying the text information in a document, the text information including correct text information and erroneous text information; marking the erroneous text information in the document and linking the marked erroneous text information to the corresponding voice signal; when the erroneous text information is clicked, using the voice conversion software to recognize the linked voice signal a second time and displaying the re-recognized text information editably in the document; and correcting and editing the erroneous text information in the editable display to obtain corrected text information, which replaces the erroneous text information.
- the present invention aims to provide a device for correcting teaching recording and broadcasting data. On the basis of correcting the text converted from the voice, it replaces the voice segment in the originally recorded voice data that corresponds to the corrected text content with standard voice data of the corrected text, forming standard voice data and corresponding text. When the recording is played back afterwards, correct voice that differs from the originally recorded voice data can be played, and correct subtitle information can be displayed.
- the invention aims to provide a teaching recording and broadcasting data correction device with a voice correction function. A recording device converts the voice signal in network teaching or an online conference into time-stamped original voice data, and a voice recognition model converts the original voice data into original text data. The original text data is corrected by replacing the old text content to be corrected with new text content, forming corrected text data. Using time-stamp positioning, the voice data segment corresponding to the old text content is then replaced with standard voice data of the new text content, forming the corrected voice data.
- although the description mainly describes an embodiment of the present invention in terms of a network teaching recording system or a network conference system, it can be understood that the apparatus of the present invention can also be used for recording and playing back other online network communication processes.
- that is, the invention relates to a method, system, and computer program product for recording and playing back a teaching or conference process in network teaching, online training, emergency command (map annotation and voice recording), financial systems (marketing explanation), or online conferences.
- where the recording of voice data is involved, the text data obtained by recognizing and converting the voice data is corrected, and the corresponding originally recorded voice data is replaced with standard voice data of the corrected text content, so that the recorded voice data itself is corrected.
- the invention provides a teaching recording and broadcasting data correction device for use in recording and on-demand review of a multimedia classroom (or network classroom) or the like. When a multimedia classroom is recorded, the voice data, the action data on the multimedia (electronic) whiteboard, the operation data on the screen of the user terminal, the video data recorded by the recording device, and so on are time-stamped and saved in data stream format, forming the recorded data.
- the user terminal obtains the recorded data over a wired network, a wireless local area network, or a wide area network and uses the time stamps to reproduce or simulate the classroom teaching process, thereby realizing review playback or on-demand playback of the recorded classroom.
- the teaching recording data correction device of the present invention comprises a file identification generating unit, a voice data collecting unit, a voice data correcting unit, an other-data collecting unit, a recorded data playing unit, and an error information feedback unit, wherein:
- the file identification generating unit is configured to generate a file identification ID when the recorded teaching process starts;
- the voice data collecting unit is configured to convert a voice signal into original voice data by using an audio collecting device and save it in a voice data stream format;
- the voice data correcting unit is configured to correct the voice data that needs correction in the original voice data, forming corrected voice data;
- the other-data collecting unit is configured to collect at least one of the following: action data on the multimedia whiteboard, operation data on the screen of the user terminal, and video data from the video recording device; to add the time stamp to each kind of data collected and save each separately in a data stream format; and, together with the corrected voice data stream and the corrected text data, to form playable recorded data;
- the recorded data playing unit lets the user acquire the recorded data through a network using the terminal and combine the different data streams according to the time stamp, thereby playing the recorded data on the terminal, reproducing and/or simulating the teaching process, and enabling the user to learn from and/or review it;
- the error information feedback unit lets a user who is playing the recorded data on the terminal select and submit erroneous text content found in the corrected text data. After the administrator handles the feedback, the corrected text data is updated, and the voice data replacing unit is run again to update the corrected voice data.
- the voice data correcting unit further includes a voice data recognition unit, a text data correction unit, and a voice data replacing unit, wherein:
- the voice data recognition unit is configured to recognize the original voice data and convert it into original text data;
- the text data correction unit is configured to correct the original text data, changing the old text content that needs correction into accurate new text content to form corrected text data;
- the voice data replacing unit is configured to replace the voice data stream segment of the old text content in the original voice data with standard voice data of the new text content, forming a corrected voice data stream.
- the voice data collecting unit is configured to collect at least one piece of voice data from at least one voice source, add a time stamp, and save it in a voice data stream format;
- the voice data recognition unit is configured to recognize the voice data stream and convert it into text data that includes the time stamp, so that the time coordinate of each piece of text content in the text data can be determined from the time stamp;
- the voice data replacing unit is configured to retrieve standard voice data of the new text content from a standard voice database and, according to the time stamp, replace the voice data stream segment corresponding to the old text content in the original voice data with that standard voice data, forming a corrected voice data stream.
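As a rough illustration of the time-stamp-based replacement described above, the sketch below treats voice data as a list of samples at an assumed fixed sample rate and swaps the samples between two time stamps for a replacement segment. The names `splice_segment`, `SAMPLE_RATE`, and the toy `standard_voice_db` dictionary are hypothetical and do not come from the patent, which does not specify an audio representation.

```python
from typing import Dict, List

SAMPLE_RATE = 16_000  # assumed samples per second; the patent does not specify one


def splice_segment(original: List[int], start_s: float, end_s: float,
                   replacement: List[int]) -> List[int]:
    """Return a copy of `original` with the samples in [start_s, end_s) replaced."""
    start = round(start_s * SAMPLE_RATE)
    end = round(end_s * SAMPLE_RATE)
    if not 0 <= start <= end <= len(original):
        raise ValueError("time stamps fall outside the recording")
    return original[:start] + replacement + original[end:]


# Toy "standard voice database": text -> pre-recorded samples (hypothetical data).
standard_voice_db: Dict[str, List[int]] = {"Fujian": [7] * 8_000}

recording = [0] * 48_000  # 3 s of original voice data
corrected = splice_segment(recording, 1.0, 1.5, standard_voice_db["Fujian"])
```

Here the replacement happens to be exactly as long as the segment it replaces, so the total length is unchanged; when the durations differ, the rest of the recording shifts in time unless the replacement is first stretched or compressed.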
- the corrected text data is displayed on the screen of the terminal as subtitles according to the time stamp, preferably in the screen area in which the video data is played; more preferably, the text data is displayed in a specific area of the terminal in an editable and selectable manner.
- a correction history record is formed, which may include correction time, correction content, correction operator, problem finder, and the like.
- the voice data replacing unit is configured to calculate a smoothing coefficient from the pronunciation time of the replaced old text content in the original voice data and the pronunciation time of the standard voice data of the new text content, and to adjust the pronunciation time of the new text content according to the smoothing coefficient, so that the voice data before and after replacement is smooth and synchronized.
- the old text content may be empty, that is, text is missing at that position and the new text content needs to be added.
- the new text content may be empty, that is, the old text content being replaced is redundant and needs to be deleted.
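The three cases (replace, add when the old content is empty, delete when the new content is empty) can be sketched on a simple list of text tokens. This is a minimal illustration under assumed names; `correct_text` and the token-list representation are hypothetical and not taken from the patent.

```python
from typing import List


def correct_text(tokens: List[str], position: int, old: str, new: str) -> List[str]:
    """Apply one correction at `position` and return a new token list.

    old == "" means text was missing and `new` is inserted;
    new == "" means `old` is redundant and is deleted;
    otherwise `old` is replaced by `new`.
    """
    result = list(tokens)
    if old == "":
        if new:
            result.insert(position, new)
        return result
    if result[position] != old:
        raise ValueError(f"expected {old!r} at position {position}")
    if new == "":
        del result[position]
    else:
        result[position] = new
    return result
```

For example, `correct_text(["visit", "today"], 1, "", "Fujian")` adds the missing word, while `correct_text(["visit", "um", "Fujian"], 1, "um", "")` drops a redundant one.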
- the quality of classroom recording is improved: the various kinds of data are saved separately and identified by time stamp, the voice data is recognized and converted into text, the text data is corrected, and the content that needs correction in the originally recorded voice data is then corrected according to the corrected text content. This overcomes the problems caused by saying too little, misspeaking, or leaving things out in the classroom, and yields both corrected speech data and corrected text data (subtitle information).
- FIG. 1 is a block diagram of a recording and broadcasting system according to the present invention.
- FIG. 2 is a flow chart showing the recording and broadcasting steps in accordance with the present invention.
- FIG. 3 is a flow chart of speech correction in accordance with the present invention.
- the network teaching in the invention is not limited to classroom teaching between students and teachers. It may include online, remote, and local network teaching with teachers, students, or trainers as participants; online, remote, and local web conferencing in which employees of enterprises and institutions participate; and other forms of communication/interaction that use the network for online communication and/or presentation of file content, such as remote collaborative work.
- the teacher 1 and the student 2 each connect to the teaching server 3 via the Internet using a terminal device on which a client of the network teaching recording and broadcasting system is installed, thereby realizing network lecturing/listening/recording/on-demand playback/review of the multimedia classroom.
- the terminal device includes: a processor, a network module, a control module, a display module, and a smart operating system, and can be a smart phone, a PAD, a notebook computer, a desktop computer, or the like.
- the terminal may be provided with a plurality of data interfaces for connecting various extension devices and accessories through a data bus.
- the intelligent operating system includes Windows, Android and its improvements, iOS, on which application software can be installed and run, and functions of various application software, services, and application stores/platforms under the intelligent operating system are realized.
- terminal devices can be connected to the Internet via RJ45/Wi-Fi/Bluetooth/2G/3G/4G/G.hn/Zigbee/Z-Wave/RFID connections and, via the Internet, to other terminals or other computers and devices.
- various expansion devices and accessories can be connected through data interfaces or buses such as 1394/USB/Serial/SATA/SCSI/PCI-E/Thunderbolt/data card interfaces, and through audio and video interfaces such as HDMI/YPbPr/SPDIF/AV/DVI/VGA/TRS/SCART/DisplayPort, to form a conference/teaching equipment interactive system.
- the reading device provides image access, sound access, use control and screen recording of the electronic whiteboard, and an RFID reading function, and can access and control mobile storage devices, digital devices, and other devices through the corresponding interfaces; DLNA/IGRS technology and Internet technology are used to implement functions such as control, interaction, and screen switching between multi-screen devices.
- a processor is defined to include, but is not limited to, an instruction execution system such as a computer/processor-based system, an application-specific integrated circuit (ASIC), a computing device, or a hardware and/or software system that can fetch or obtain logic from a non-transitory storage medium or non-transitory computer-readable storage medium and execute the instructions contained therein.
- the processor may also include any controller, state machine, microprocessor, internetwork-based entity, service or feature, or any other analog, digital, and/or mechanical implementation thereof.
- the Internet may include a local area network and a wide area Internet, and may be a wired Internet or a wireless Internet, or any combination of these networks.
- the main steps of the network teaching recording according to the present invention are as follows:
- the user logs in using the terminal, and the intelligent electronic whiteboard, the teacher-terminal screen operation capture program, the cameras, the microphones, and other multimedia teaching equipment enter the working state. There may be more than one camera, and there is at least one microphone, used to capture the teacher's voice and the students' voices respectively. The teaching server of the recording and broadcasting system can be used to generate digital time stamps.
- S200 Start online teaching: the teacher starts classroom teaching, and the recording and broadcasting system generates a teaching document ID.
- the teacher uses the intelligent electronic whiteboard for display (as a teaching board or for working through problems) and uses real-time voice for explanation and real-time interactive voice communication. The teacher can also display and explain electronic documents such as PPT documents on the teacher terminal, so as to carry out multimedia teaching and interactive question-and-answer communication with students.
- S300 Recording data saving: during recording, the actions on the intelligent whiteboard are transmitted and saved as "action data stream + time stamp"; the voice in the teaching and interaction process is transmitted and saved as "voice data stream + time stamp"; the operation actions on electronic documents such as PPT documents on the teacher terminal are transmitted and saved as "electronic document operation data stream + time stamp"; and the collected video data is transmitted and saved as "video data stream + time stamp". All of these data streams for the whole course are bound to the teaching document ID, which identifies the recorded course. Data streams can be added or removed as needed.
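One way to picture the "data stream + time stamp" records bound to a teaching document ID is the small data model below. It is only a sketch under assumptions: the class names `StreamEvent` and `Recording`, the millisecond time stamps, and the byte payloads are hypothetical choices, not details given in the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StreamEvent:
    timestamp_ms: int  # unified time stamp shared by all streams (assumed unit)
    payload: bytes     # encoded action, voice, document-operation, or video data


@dataclass
class Recording:
    document_id: str  # teaching document ID that ties the streams together
    streams: Dict[str, List[StreamEvent]] = field(default_factory=dict)

    def append(self, stream: str, timestamp_ms: int, payload: bytes) -> None:
        """Add one time-stamped event to the named stream, creating it if needed."""
        self.streams.setdefault(stream, []).append(StreamEvent(timestamp_ms, payload))

    def remove_stream(self, stream: str) -> None:
        """Drop a whole stream; streams can be added or removed as needed."""
        self.streams.pop(stream, None)


lesson = Recording(document_id="lesson-001")
lesson.append("whiteboard_action", 0, b"pen-down")
lesson.append("voice", 0, b"\x00\x01")
lesson.append("voice", 20, b"\x02\x03")
```

Keeping each stream in its own list mirrors the separate storage described in the text, while the shared time stamp is what later allows the streams to be recombined.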
- the recorded data includes voice data, video data, and PPT document presentation data.
- the PPT document presentation data can usually be displayed in the form of video data; otherwise it must be reproduced from the recorded operation actions.
- recording by category and split-screen display are relatively mature technologies.
- the various data recorded can be saved to a local database or a terminal database, and then uploaded to the remote teaching server through the network, or directly saved to the remote teaching server.
- a voice acquisition device, such as any available microphone, can be used to acquire the voice signal, which is converted into voice data and stored in a data stream format.
- the gender of the voice source can be marked so that standard speech of the corresponding gender can be selected for subsequent speech correction (replacement) operations.
- when there are multiple voice sources, each source can be identified separately, its gender marked, and its time-stamped data saved separately; the details are not repeated here.
- S400 Voice data conversion: For the recorded original voice data, the original text data is first formed by the voice model, and then the original text data is corrected. When the original text data is formed, the time stamp of the original voice data is added to the text data so that the text content in the text data can be time-located.
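To make the idea of time-locatable text concrete, the sketch below attaches a start and end time to each recognized token and looks up the span of a given piece of text. The `TimedToken` structure, the `locate` helper, and the millisecond values are hypothetical; the patent only states that the text data carries the time stamp of the original voice data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TimedToken:
    text: str
    start_ms: int  # time stamp copied from the original voice data (assumed unit)
    end_ms: int


def locate(tokens: List[TimedToken], target: str) -> List[Tuple[int, int]]:
    """Return the (start_ms, end_ms) span of every token whose text equals `target`."""
    return [(t.start_ms, t.end_ms) for t in tokens if t.text == target]


transcript = [
    TimedToken("welcome", 0, 600),
    TimedToken("to", 600, 750),
    TimedToken("Hu Jian", 750, 1400),
]
span = locate(transcript, "Hu Jian")
```

The returned spans are what a later replacement step would use to find the matching segment in the voice data stream.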
- the text content may be at least one character, word, sentence, or paragraph in the text data.
- time positioning gives the position of a data segment along the time dimension of the audio data, that is, the clock value of the time point at which a given data segment within a piece of audio data is located.
- the original speech data can be recognized and converted into the original text data by using any available speech model. When the conversion is performed, the gender of the speech source is recognized first, and the gender information is added to the text data.
- Proofreading corrections for text data include manual proofreading, semi-automatic proofreading, and voice proofreading.
- voice proofreading: the original text data is corrected using a voice correction instruction, that is, using a voice proofreading method (CN106406807A), but the present invention is not limited thereto.
- the voice proofreading unit performs the following: receiving a voice correction instruction; identifying, in the text data to be corrected, all text that sounds the same as the voice correction instruction, together with the time stamp of that text content; determining the to-be-corrected text among all the recognized texts; displaying the alternative text list corresponding to the to-be-corrected text; accepting an alternative text selection instruction; and performing the replacement to form corrected text data, thereby completing the text correction.
- the standard pronunciation information of the corrected text is retrieved from the standard speech database, and the corresponding speech data segment is replaced with the standard pronunciation information according to the time stamp of the corrected text to form the corrected speech data.
- the standard speech database may include a female standard speech database, a male standard speech database, and/or a personalized standard speech database.
- the personalized standard voice database is either a standard voice database formed by recording a specific speaker or one generated from a voice model of that speaker obtained by corpus training; such a model can be used for voice recognition and can also be used to generate the personalized standard voice database.
- the corresponding standard voice is selected according to the voice source gender information of the original text data, or other personalized information.
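The selection of a standard voice by gender or other personalized information might look like the lookup below. This is a sketch, not the patent's method: the `DATABASES` layout, the `speaker:` key prefix, and the policy of preferring a personalized database before falling back to the gender database are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical databases: each maps text to a stored standard pronunciation.
DATABASES: Dict[str, Dict[str, bytes]] = {
    "female": {"Fujian": b"female-fujian"},
    "male": {"Fujian": b"male-fujian"},
    "speaker:teacher-1": {"Fujian": b"teacher-1-fujian"},
}


def standard_voice(text: str, gender: str, speaker_id: Optional[str] = None) -> bytes:
    """Prefer a personalized database when it has the text, else fall back to gender."""
    if speaker_id is not None:
        personal = DATABASES.get(f"speaker:{speaker_id}", {})
        if text in personal:
            return personal[text]
    return DATABASES[gender][text]
```

With this layout, `standard_voice("Fujian", "female", "teacher-1")` returns the personalized recording, and an unknown speaker simply falls back to the gender-matched database.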
- the old text content may be empty, that is, text is missing at that position and the new text content needs to be added.
- the new text content may be empty, that is, the old text content being replaced is superfluous and needs to be deleted.
- the specific steps of the speech correction are as follows:
- the voice correction instruction is received. For example, the user can issue the voice instruction “select Hu Jian” through the unit to initiate correction of the problem text “Hu Jian”.
- the user can clarify which text needs to be corrected by using a further voice instruction.
- for example, if the texts recognized as sounding like “hujian”, from first to last, are “Hu Jian”, “mutual see”, “shoulder shoulder”, and so on, and the user wants to correct the first recognized text, the user can say “first” to make the first recognized text the current text to be corrected.
- a list of homophone alternative texts is displayed near the text to be corrected so that the user can then select an alternative. For example, if the first text in the text data that sounds like “hujian”, namely “Hu Jian”, is determined to be the text to be corrected, a list of alternative texts is displayed near it: 1, Fujian; 2, accessories; 3, shoulder pads; 4, mutual see, ...
- the user can speak the position of the alternative text in the alternative text list by voice, and complete the work of selecting the alternative text. For example, use Fujian to replace Hu Jian.
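The selection steps above (find every text with the same sound, pick one occurrence by a spoken ordinal, then pick an alternative by a spoken ordinal) can be sketched as follows. The pronunciation lexicon `HOMOPHONES`, the `ORDINALS` table, the per-token `sounds` list, and both helper functions are hypothetical stand-ins; the actual voice proofreading method is the one referenced in CN106406807A.

```python
from typing import Dict, List

# Hypothetical pronunciation lexicon: romanized sound -> alternative texts.
HOMOPHONES: Dict[str, List[str]] = {
    "hujian": ["Fujian", "accessories", "shoulder pads", "mutual see"],
}

ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3}


def find_occurrences(sounds: List[str], target_sound: str) -> List[int]:
    """Indices of every recognized text whose sound matches the correction command."""
    return [i for i, s in enumerate(sounds) if s == target_sound]


def pick(items: list, spoken_ordinal: str):
    """Resolve a spoken ordinal such as "first" to an item of a list."""
    return items[ORDINALS[spoken_ordinal]]


sounds = ["women", "hujian", "de", "hujian"]  # assumed per-token pronunciations
texts = ["we", "Hu Jian", "of", "mutual see"]

hits = find_occurrences(sounds, "hujian")              # every text sounding like "hujian"
position = pick(hits, "first")                         # user says "first" occurrence
texts[position] = pick(HOMOPHONES["hujian"], "first")  # user picks alternative 1
```

After these steps the first matching text has been replaced by the first alternative, and the other occurrence is left untouched.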
- the time position information of the text to be corrected is marked with a time stamp, thereby accurately positioning the time position information of the voice data corresponding to the corrected text.
- a correction history record is formed, the correction history record including correction time, correction content, correction operator, and the like.
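A correction history entry with the fields listed here could be modelled as below. The `CorrectionRecord` class, its field names, and the in-memory `history` list are hypothetical; the optional `reported_by` field corresponds to the "problem finder" that the summary of the invention also lists for this record.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class CorrectionRecord:
    corrected_at: datetime             # correction time
    old_text: str                      # correction content: what was replaced
    new_text: str                      # correction content: what replaced it
    operator: str                      # correction operator
    reported_by: Optional[str] = None  # problem finder, if the fix came from feedback


history: List[CorrectionRecord] = []


def log_correction(old_text: str, new_text: str, operator: str,
                   reported_by: Optional[str] = None) -> CorrectionRecord:
    """Append a time-stamped entry to the correction history and return it."""
    record = CorrectionRecord(datetime.now(timezone.utc), old_text, new_text,
                              operator, reported_by)
    history.append(record)
    return record
```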
- the standard speech data is looked up according to the alternative text; if the alternative text consists of several words or sentences, their standard speech data is combined into one new speech data segment.
- the text data includes the gender information of the voice source, so the lookup can return a female or male pronunciation, or other voice data such as various higher- or lower-pitched voices, according to that information.
- the corresponding voice data segment in the original voice data is replaced with the new voice data segment according to the previously described time position information, forming new voice data.
- the pronunciation times of the original segment and the new segment are not necessarily the same.
- the pronunciation times of the two speech segments may therefore be calculated first, together with a smoothing coefficient; the standard pronunciation is then sped up or slowed down according to the smoothing coefficient so that the pronunciation duration of the replaced content is the same after replacement as before.
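The patent does not give a formula for the smoothing coefficient. One plausible reading, shown here purely as an assumption, is the ratio of the original segment's duration to the standard pronunciation's duration, applied by resampling the replacement so that it fills the same time span. The function names and the nearest-neighbour resampling are hypothetical choices.

```python
from typing import List


def smoothing_coefficient(old_duration_s: float, new_duration_s: float) -> float:
    """Assumed definition: factor by which the standard pronunciation is stretched
    (> 1, slowed down) or compressed (< 1, sped up) to match the original duration."""
    if old_duration_s <= 0 or new_duration_s <= 0:
        raise ValueError("durations must be positive")
    return old_duration_s / new_duration_s


def stretch(samples: List[int], coefficient: float) -> List[int]:
    """Nearest-neighbour resampling to about len(samples) * coefficient samples.

    This naive approach also shifts pitch; a real system would more likely use a
    pitch-preserving time-stretch.
    """
    if not samples:
        return []
    target_len = max(1, round(len(samples) * coefficient))
    last = len(samples) - 1
    return [samples[min(last, int(i / coefficient))] for i in range(target_len)]
```

For instance, if the original segment lasted 1.0 s and the standard pronunciation lasts 0.5 s, the coefficient is 2.0 and the replacement is slowed to twice its length before being spliced in.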
- the user logs in to the recording and broadcasting system from a terminal over the Internet and can review recorded classes or play them back on demand.
- these recorded classes may also be process record files of online conferences. The recording and broadcasting system sends the teaching file ID that the user requested for review or on-demand playback to the teaching server through the Socket encrypted channel.
- through the teaching file ID, the time-stamped action data stream, voice data stream, electronic document operation data stream, video data stream, and text data of the course are obtained and sent to the user terminal that requested the corresponding teaching file ID, and the user terminal locally restores, reproduces, or simulates them according to the timestamps.
- these data streams can be displayed, or switched between, in the functional areas of the user terminal. Video can generally be reproduced directly on the user terminal, whereas electronic whiteboard operations can be reproduced in simulation by the simulation program of the electronic whiteboard.
- the user can choose to play at least one of these data streams, for example listening only to the voice.
- the text data can be displayed as subtitles in a specific area of the user terminal, such as the video display area.
- the text data serving as captions can be displayed in a specific editable area in which the user can select text, so that when non-standard voice data or text information is found, the user only needs to select the corresponding text and the information can be fed back.
- the administrator of the recording and broadcasting system verifies user feedback after receiving it. If an error is found, the earlier correction steps for the text data and the voice data stream are repeated, so that the text data and the voice data are continuously improved.
- the terminal and the server are configured to be connectable to a communication network including the Internet. Therefore, the medium may be one that carries the program code in a streaming manner via the communication network.
- when the program code is downloaded from the communication network as described above, the download program may be stored in the main device or installed from another recording medium.
- the present invention can also be realized with the above-described program code in the form of a computer data signal embodied in a carrier wave through electronic transmission.
- the teaching recording data correction device improves the quality of classroom recordings. It stores the various data with timestamp identification, recognizes and converts the voice data, corrects the text data, and uses the corrected text content to correct the content of the originally recorded voice data that needs correction. This overcomes the problems caused by "less talk, wrong talk and miss talk" in the classroom, and yields doubly corrected speech data and text data (subtitle information).
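The replacement step described above (locate the segment by its timestamps, match the duration of the standard pronunciation to the original, then splice) can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: audio is modelled as a plain list of samples, the "smoothing coefficient" is assumed to be the ratio of the standard duration to the original duration, and the names `stretch_to_length` and `replace_segment` are hypothetical. The linear-interpolation resampling used here also shifts pitch, so a real system would likely substitute a pitch-preserving time-stretch.

```python
from typing import List, Tuple


def stretch_to_length(samples: List[float], target_len: int) -> List[float]:
    """Resample by linear interpolation so the result has target_len samples.

    Naive stand-in for speeding up or slowing down a pronunciation.
    """
    if target_len <= 0 or not samples:
        return []
    if len(samples) == 1 or target_len == 1:
        return [samples[0]] * target_len
    step = (len(samples) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def replace_segment(
    original: List[float],
    standard: List[float],
    start_s: float,
    end_s: float,
    sample_rate: int,
) -> Tuple[List[float], float]:
    """Replace original[start_s:end_s] with the standard pronunciation.

    Returns the corrected audio and the assumed smoothing coefficient
    (standard duration divided by original duration).
    """
    start = round(start_s * sample_rate)
    end = round(end_s * sample_rate)
    if not (0 <= start < end <= len(original)):
        raise ValueError("time span lies outside the recording")
    if not standard:
        raise ValueError("standard pronunciation is empty")
    coefficient = len(standard) / (end - start)
    # Fit the standard pronunciation into the original time slot so that
    # every later timestamp in the recording stays valid.
    fitted = stretch_to_length(standard, end - start)
    corrected = original[:start] + fitted + original[end:]
    return corrected, coefficient
```

Because the fitted segment has exactly the length of the slot it replaces, the total length of the recording is unchanged, which keeps the voice data aligned with the other time-stamped streams.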
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The present invention relates to a correction device for recorded and broadcast teaching data. Voice recording equipment converts the voice signals generated during network teaching or an online meeting into original voice data with timestamps; a speech recognition model recognizes and converts the original voice data into original text data; the original text data are revised so that old text content to be corrected is replaced with new text content, thereby correcting the original text data to form corrected text data; the timestamps are used for positioning; and standard voice data of the new text content replace the corresponding voice data segments of the old text content to form corrected voice data. The device can correct recorded and broadcast teaching data, solving the problem of users being confused or misled in teaching recording and broadcasting systems by slips of the tongue, omissions, and non-standard expressions in classroom teaching.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710444172.1A CN107220228B (zh) | 2017-06-13 | 2017-06-13 | Teaching recording-and-broadcasting data correction device |
| CN201710444172.1 | 2017-06-13 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018227761A1 (fr) | 2018-12-20 |
Family
ID=59948760
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2017/099055 Ceased WO2018227761A1 (fr) | 2017-06-13 | 2017-08-25 | Correction device for recorded and broadcast teaching data |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN107220228B (fr) |
| WO (1) | WO2018227761A1 (fr) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
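| CN110459233A (zh) * | 2019-03-19 | 2019-11-15 | 深圳壹秘科技有限公司 | Speech processing method, device, and computer-readable storage medium |
| CN110534100A (zh) * | 2019-08-27 | 2019-12-03 | 北京海天瑞声科技股份有限公司 | Chinese speech proofreading method and device based on speech recognition |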
Families Citing this family (19)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109324811B (zh) * | 2017-07-28 | 2021-10-15 | 深圳市鹰硕技术有限公司 | Device for updating teaching recording-and-broadcasting data |
| CN107767871B (zh) * | 2017-10-12 | 2021-02-02 | 安徽听见科技有限公司 | Text display method, terminal, and server |
| JP7069631B2 (ja) * | 2017-10-16 | 2022-05-18 | 富士フイルムビジネスイノベーション株式会社 | Information processing device and information processing program |
| CN107820112A (zh) * | 2017-11-15 | 2018-03-20 | 安徽声讯信息技术有限公司 | Audio-text live broadcast system |
| CN108320318B (zh) * | 2018-01-15 | 2023-07-28 | 腾讯科技(深圳)有限公司 | Image processing method, device, computer equipment, and storage medium |
| CN110390930A (zh) * | 2018-04-15 | 2019-10-29 | 高翔 | Method and system for audio-text proofreading |
| CN108962293B (zh) * | 2018-07-10 | 2021-11-05 | 武汉轻工大学 | Video recording correction method, system, terminal device, and storage medium |
| CN110858492A (zh) * | 2018-08-23 | 2020-03-03 | 阿里巴巴集团控股有限公司 | Audio clipping method, device, equipment, and system, and data processing method |
| CN109300468B (zh) * | 2018-09-12 | 2022-09-06 | 科大讯飞股份有限公司 | Speech annotation method and device |
| CN109243484A (zh) * | 2018-10-16 | 2019-01-18 | 上海庆科信息技术有限公司 | Method for generating conference speech records and related device |
| CN109782986A (zh) * | 2018-12-14 | 2019-05-21 | 浙江学海教育科技有限公司 | Teaching courseware production method, storage medium, and application system |
| CN109858005B (zh) * | 2019-03-07 | 2024-01-12 | 百度在线网络技术(北京)有限公司 | Document updating method, device, equipment, and storage medium based on speech recognition |
| CN110880316A (zh) * | 2019-10-16 | 2020-03-13 | 苏宁云计算有限公司 | Audio output method and system |
| CN110930997B (zh) * | 2019-12-10 | 2022-08-16 | 四川长虹电器股份有限公司 | Method for annotating audio using a deep learning model |
| CN111399800A (zh) * | 2020-03-13 | 2020-07-10 | 胡勇军 | Voice input method system |
| CN113571061B (zh) * | 2020-04-28 | 2024-12-13 | 阿里巴巴集团控股有限公司 | Speech transcription text editing system, method, device, and equipment |
| CN112562638B (zh) * | 2020-11-26 | 2025-01-07 | 北京达佳互联信息技术有限公司 | Voice preview method, device, and electronic equipment |
| CN113590871B (zh) * | 2021-02-05 | 2025-08-29 | 腾讯科技(深圳)有限公司 | Audio classification method, device, and computer-readable storage medium |
| CN116524910B (zh) * | 2023-06-25 | 2023-09-08 | 安徽声讯信息技术有限公司 | Microphone-based manuscript preparation method and system |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103366731A (zh) * | 2012-03-31 | 2013-10-23 | 盛乐信息技术(上海)有限公司 | Speech synthesis method and system |
| CN103366741A (zh) * | 2012-03-31 | 2013-10-23 | 盛乐信息技术(上海)有限公司 | Voice input error correction method and system |
| CN105306861A (zh) * | 2015-10-15 | 2016-02-03 | 深圳市时尚德源文化传播有限公司 | Network teaching recording-and-broadcasting method and system |
| CN106710597A (zh) * | 2017-01-04 | 2017-05-24 | 广东小天才科技有限公司 | Method and device for recording voice data |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103207769B (zh) * | 2012-01-16 | 2016-10-05 | 联想(北京)有限公司 | Voice correction method and user equipment |
| CN105244022B (zh) * | 2015-09-28 | 2019-10-18 | 科大讯飞股份有限公司 | Audio/video subtitle generation method and device |
2017
- 2017-06-13: CN application CN201710444172.1A, patent CN107220228B (zh), status: Active
- 2017-08-25: WO application PCT/CN2017/099055, patent WO2018227761A1 (fr), status: Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| CN107220228A (zh) | 2017-09-29 |
| CN107220228B (zh) | 2019-08-16 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2018227761A1 (fr) | Correction device for recorded and broadcast teaching data | |
| CN109324811B (zh) | Device for updating teaching recording-and-broadcasting data | |
| US12462808B2 (en) | Systems and methods for team cooperation with real-time recording and transcription of conversations and/or speeches | |
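| CN111538851B (zh) | Method, system, device, and storage medium for automatically generating demonstration videos | |
| CN109698920B (zh) | Follow-along teaching system based on an Internet teaching platform | |
| JP6472898B2 (ja) | Recording and playback method and system for online education | |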
| US7458013B2 (en) | Concurrent voice to text and sketch processing with synchronized replay | |
| CN104408983B (zh) | Intelligent teaching information processing system based on recording-and-broadcasting equipment | |
| WO2019095446A1 (fr) | Follow-along teaching system with speech evaluation function | |
| JP2002202941A (ja) | Multimedia electronic learning system and learning method | |
| Valor Miró et al. | Evaluating intelligent interfaces for post-editing automatic transcriptions of online video lectures | |
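| KR20130115484A (ko) | Lecture content providing system and lecture content providing method using data synchronized with lecture materials | |
| KR101858204B1 (ko) | Method and device for generating interactive multimedia content | |
| KR101198091B1 (ko) | Method and system for providing a learning content service | |
| KR100395883B1 (ko) | Real-time lecture recording device and file recording method therefor | |
| CN116312083A (zh) | Course file generation method, device, electronic equipment, and storage medium | |
| JP2004266578A (ja) | Moving image editing method and device | |
| JP4085015B2 (ja) | Stream data generation device, stream data generation system, stream data generation method, and program | |
| CN118200299A (zh) | Metaverse conference hosting method, device, equipment, storage medium, and program product | |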
| Paul | Building a specialised audio-visual corpus | |
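| KR20030025771A (ko) | System and method for providing educational content over the Internet | |
| KR20200039907A (ko) | Smart language learning service using scripts and service providing method | |
| KR20240113179A (ko) | Method for providing lecture content using a virtual human | |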
| US20210397783A1 (en) | Rich media annotation of collaborative documents | |
| JP3816901B2 (ja) | Stream data editing method, editing system, and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17913820; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 17913820; Country of ref document: EP; Kind code of ref document: A1 |