
CN106782509A - Corpus labeling method and device, and terminal

Info

Publication number: CN106782509A
Application number: CN201611097247.5A
Authority: CN (China)
Prior art keywords: list, text, audio, cell, corpus
Other languages: Chinese (zh)
Inventor: 焦玉娜
Current and original assignees: Leshi Zhixin Electronic Technology Tianjin Co Ltd; LeTV Holding Beijing Co Ltd
Application filed 2016-12-02 by Leshi Zhixin Electronic Technology Tianjin Co Ltd and LeTV Holding Beijing Co Ltd
Priority to CN201611097247.5A, priority date 2016-12-02
Publication of CN106782509A: 2017-05-31
Legal status: Pending


Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/26 - Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a corpus labeling method, device, and terminal. The corpus labeling method includes: acquiring and displaying a batch corpus list, wherein the batch corpus list includes an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list; when an audio playing command indicating that a target audio in the audio list should be played is received, playing the target audio; and receiving the text content corresponding to the target audio selected from a plurality of preset text contents as the text content marked on the cell corresponding to the target audio in the text list. With the invention, the user only needs to trigger audio playback in the batch corpus list and mark the corresponding text content; the terminal plays the audio and receives the text content marked by the user, and the labeling of the corpus is completed without spending time matching audio corpora with audio names, which reduces the time cost of corpus labeling and improves working efficiency.

Description

Corpus labeling method and apparatus, and terminal
Technical Field
The invention relates to the technical field of speech recognition, and in particular to a corpus labeling method, a corpus labeling device, and a terminal.
Background
Speech recognition technology relies on two key resources, a language model and an acoustic model. The acoustic model requires manually labeled corpus content, and the labeled corpus content is used as a training set for training. To improve the recognition rate, the corpus content can be classified, for example by gender, age, and noise type, and targeted training is carried out after the corpus is organized, so as to improve the recognition rate.
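As a concrete illustration (not part of the patent), the following Python sketch shows how manually labeled corpus records might be grouped by such categories to form targeted training subsets for an acoustic model; the record fields and category values are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class LabeledUtterance:
    audio_path: str    # path to the recorded utterance
    transcript: str    # manually labeled text content
    gender: str        # e.g. "male" / "female"
    age_group: str     # e.g. "adult" / "child"
    noise_type: str    # e.g. "clean" / "street" / "music"

def group_for_targeted_training(records: List[LabeledUtterance]
                                ) -> Dict[Tuple[str, str, str], List[LabeledUtterance]]:
    """Bucket labeled utterances by (gender, age group, noise type) so each
    bucket can serve as a targeted training subset for the acoustic model."""
    buckets: Dict[Tuple[str, str, str], List[LabeledUtterance]] = defaultdict(list)
    for r in records:
        buckets[(r.gender, r.age_group, r.noise_type)].append(r)
    return buckets
```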
Most existing corpus labeling is done locally: the corpus is distributed to workers, and each worker creates a new file to record audio names and the text content corresponding to each audio name. During labeling, an audio player is used to play the corpus audio files one by one; the worker then finds the audio name corresponding to the audio being played and labels the text content for that audio name. This labeling workflow has a high time cost, because workers spend most of their time matching audio corpora with audio names. Local storage is also inconvenient to manage, and working across multiple files reduces efficiency.
Disclosure of Invention
In view of this, embodiments of the present invention provide a corpus tagging method, a corpus tagging device, and a terminal, so as to solve the problems of high time cost and low working efficiency of corpus tagging in the prior art.
According to a first aspect, an embodiment of the present invention provides a corpus tagging method, which is applicable to a terminal with a display screen, and the corpus tagging method includes: acquiring and displaying a batch corpus list, wherein the batch corpus list comprises an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list; when an audio playing command for indicating target audio in the audio list to be played is received, playing the target audio; and receiving the text content corresponding to the target audio selected from a plurality of preset text contents as the text content marked on the cell corresponding to the target audio in the text list.
Optionally, the method further comprises: when a file export command is received, acquiring a file corresponding to a batch corpus list marked with text content; and exporting the file corresponding to the batch corpus list marked with the text content, wherein the file export command is used for indicating that the file corresponding to the batch corpus list marked with the text content is exported in batch.
Optionally, when an audio play command for instructing a target audio in the audio list to be played is received, playing the target audio includes: judging whether the operation of selecting the cell in the audio list or the text list exists or not; when the operation of selecting the cells in the audio list or the text list exists, searching the selected cells in the audio list or the text list; and playing the audio corresponding to the selected cell.
Optionally, receiving a text content corresponding to the target audio selected from a plurality of preset text contents includes: receiving a command for opening a drop-down list of cells corresponding to the target audio in the text list, and acquiring the plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; receiving a selection command which is input through a mouse and used for selecting a text from the preset text contents, and marking the text contents indicated by the selection command on the cell corresponding to the target audio; or receiving a command for opening a drop-down list of cells corresponding to the target audio in the text list, and acquiring the plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; receiving a selected command which is input through a keyboard and used for selecting a text from the preset text contents, determining the text content corresponding to the selected command, and marking the text content on the cell corresponding to the target audio.
Optionally, after receiving a text content corresponding to the target audio selected from a plurality of preset text contents, the method further includes: judging whether an operation of selecting another cell in the text list exists or not; and when the operation of selecting another cell in the text list exists, saving the text content marked on the previous cell in the text list.
According to a second aspect, an embodiment of the present invention provides a corpus tagging device, which is suitable for a terminal with a display screen, where the corpus tagging device includes: the system comprises a first acquisition unit, a second acquisition unit and a display unit, wherein the first acquisition unit is used for acquiring and displaying a batch corpus list, the batch corpus list comprises an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list; the playing unit is used for playing the target audio when receiving an audio playing command for indicating the target audio in the audio list to be played; and the receiving unit is used for receiving the text content corresponding to the target audio selected from the plurality of preset text contents as the text content marked on the cell corresponding to the target audio in the text list.
Optionally, the method further comprises: the second acquisition unit is used for acquiring a file corresponding to the batch corpus list marked with the text content when a file export command is received; and the exporting unit is used for exporting the files corresponding to the batch corpus list marked with the text contents, wherein the file exporting command is used for indicating the batch export of the files corresponding to the batch corpus list marked with the text contents.
Optionally, the playback unit includes: the judging module is used for judging whether the operation of selecting the cells in the audio list or the text list exists or not; the searching module is used for searching the selected cell in the audio list or the text list when the operation of selecting the cell in the audio list or the text list exists; and the playing module is used for playing the audio corresponding to the selected cell.
Optionally, the receiving unit includes: a first receiving module, configured to receive a command for opening a drop-down list of cells corresponding to the target audio in the text list, and obtain the multiple preset text contents in the drop-down list, where each cell in the text list corresponds to one drop-down list; the second receiving module is used for receiving a selection command which is input by a mouse and used for selecting a text from the preset text contents, and marking the text content indicated by the selection command on a cell corresponding to the target audio; or, a third receiving module, configured to receive a command for opening a drop-down list of cells corresponding to the target audio in the text list, and obtain the multiple preset text contents in the drop-down list, where each cell in the text list corresponds to one drop-down list; and the fourth receiving module is used for receiving a selected command which is input through a keyboard and is used for selecting a text from the preset text contents, determining the text content corresponding to the selected command, and marking the text content on the cell corresponding to the target audio.
Optionally, the method further comprises: the judging unit is used for judging whether an operation of selecting another cell in the text list exists after receiving the text content corresponding to the target audio selected from a plurality of preset text contents; and the storage unit is used for storing the text content marked on the previous cell in the text list when the operation of selecting another cell in the text list exists.
According to a third aspect, an embodiment of the present invention provides a terminal, including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to execute the corpus tagging method according to the first aspect or any one of the optional manners of the first aspect.
According to a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the corpus tagging method according to the first aspect or any one of the alternatives of the first aspect.
According to a fifth aspect, an embodiment of the present invention provides a computer program product, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the corpus annotation method of the first aspect or any one of the alternatives of the first aspect.
According to the embodiments of the invention, the batch corpus list associates each audio file in advance with the text-list cell in which its content is to be entered, so the user only needs to trigger audio playback in the batch corpus list and label the corresponding text content. The terminal plays the audio and receives the text content labeled by the user, and the labeling of the corpus is completed without spending time matching audio corpora with audio names, which reduces the time cost of corpus labeling and improves working efficiency.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings. In the drawings, like reference numerals refer to similar elements, and the figures are not drawn to scale unless otherwise specified. In the drawings:
FIG. 1 is a flow chart of a corpus tagging method according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a batch corpus list according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating another batch corpus list according to an embodiment of the present invention;
FIG. 4 is a flow chart of a corpus tagging method according to another embodiment of the present invention;
FIG. 5 is a flow chart of a corpus tagging method according to yet another embodiment of the present invention;
FIG. 6 is a diagram illustrating another batch corpus list according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating a corpus tagging device according to an embodiment of the present invention;
FIG. 8 is a diagram illustrating a corpus tagging device according to another embodiment of the present invention;
FIG. 9 is a diagram illustrating a corpus tagging apparatus according to a further embodiment of the present invention;
FIG. 10 is a schematic hardware structure diagram of a terminal for executing a corpus tagging method according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a first aspect of the embodiments of the present invention, a corpus tagging method is provided, where the method is applied to a terminal with a display screen, and is executed by the terminal, as shown in fig. 1, where the method includes:
step S101, a batch corpus list is obtained and displayed, wherein the batch corpus list comprises an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list.
The batch corpus list in this embodiment is preset. When a worker executes a corpus tagging task, the worker directly pulls the relevant task from the system and obtains the batch corpus list corresponding to that task. That is, in the embodiment of the present invention, corpus tagging tasks are distributed to workers by issuing tasks. The corpus tagging task can be executed on a web page: after acquiring the batch corpus list, the terminal displays it in the form of a web page.
Because the batch corpus list comprises the audio list and the text list, and the cells on the audio list and the text list are in one-to-one correspondence, the audio on the audio list can be directly triggered to be played, so that a user (or a worker) can conveniently label the audio.
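As a minimal, non-authoritative sketch of the structure described above (the field names are assumptions for illustration, not the patent's own data model), the batch corpus list can be represented as rows that pair one audio cell with one text cell, plus the preset text contents offered in the drop-down lists:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class CorpusRow:
    """One row of the batch corpus list: an audio cell paired with a text cell."""
    audio_url: str                  # audio cell (the playable corpus file)
    label: Optional[str] = None     # text cell (content marked by the annotator)
    saved: bool = False             # mirrors the "not saved" / "saved" status column

@dataclass
class BatchCorpusList:
    rows: List[CorpusRow] = field(default_factory=list)
    presets: List[str] = field(default_factory=list)   # preset text contents for the drop-down lists

    def cell_pair(self, index: int) -> CorpusRow:
        # each cell in the audio list corresponds to exactly one cell in the text list
        return self.rows[index]
```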
Step S102, when an audio playing command for indicating the target audio in the audio list to play is received, the target audio is played.
After the batch corpus list is displayed on the terminal, the user may input an audio playing command to the terminal through an input device, such as a mouse or a keyboard, and the terminal responds to play corresponding audio.
Step S103, receiving a text content corresponding to a target audio selected from a plurality of preset text contents as a text content labeled on a cell corresponding to the target audio in the text list.
After the user hears the audio file played by the terminal, the user marks the heard content on the cell of the text list corresponding to that audio. The terminal receives the text content selected by the user from the plurality of preset text contents, stores it in the corresponding cell, and completes the labeling of the audio corpus.
According to the embodiment of the invention, the batch corpus list associates each audio file in advance with the text-list cell in which its content is to be entered, so the user only needs to trigger audio playback in the batch corpus list and label the corresponding text content. The terminal plays the audio and receives the text content labeled by the user, and the labeling of the corpus is completed without spending time matching audio corpora with audio names, which reduces the time cost of corpus labeling and improves working efficiency.
FIG. 2 is a diagram illustrating a batch corpus list according to an embodiment of the present invention. As shown in fig. 2, one column of "audio" represents the above-mentioned audio list, and one column of "correct" represents the above-mentioned text list.
FIG. 3 is a diagram illustrating another batch corpus list according to an embodiment of the present invention. This batch corpus list is labeled for Text To Speech (TTS): the "audio" column still represents the audio list, the "original text" column shows the text content corresponding to each audio, and the "correct" column still represents the text list. Here, however, the content labeled in the text list is the portion that cannot be heard clearly when the corresponding audio is played. For example, if the text content corresponding to an audio is "2012/13 season English Premier League round 12" but "round 12" cannot be heard, that portion is labeled accordingly; the content available for labeling can be preset and then provided for the user to select.
Fig. 4 is a flowchart illustrating a corpus tagging method according to another embodiment of the present invention. The method is suitable for a terminal with a display screen, and is executed by the terminal, as shown in fig. 4, and the method comprises the following steps:
step S401, acquiring and displaying a batch corpus list, where the batch corpus list includes an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list.
In step S402, when an audio playing command for instructing the target audio in the audio list to be played is received, the target audio is played.
Step S403, receiving a text content corresponding to the target audio selected from a plurality of preset text contents as a text content marked on a cell corresponding to the target audio in the text list.
In this embodiment, steps S401 to S403 are similar to steps S101 to S103 shown in fig. 1, and refer to the above description specifically.
In the embodiment of the present invention, the user needs to label each audio file during corpus labeling, so the terminal repeatedly executes steps S402 and S403 until the corpus labeling task is completed.
Step S404, when a file export command is received, acquiring a file corresponding to the batch corpus list marked with the text content.
Step S405, exporting files corresponding to the batch corpus list marked with the text content, wherein the file export command is used for indicating the batch export of the files corresponding to the batch corpus list marked with the text content.
In this embodiment, the file export command is input by the user. After the task is completed, the user can click the file export command on the terminal; after receiving the command, the terminal acquires the file of the batch corpus list marked with the text content and then exports the file.
According to the embodiment of the invention, after the corpus tagging task is completed, the user can export the files of the batch corpus list through the file export command, realizing batch tagging of the corpus and batch export of the files, which improves working efficiency.
Optionally, in the process of exporting the file, the user may also select filtering conditions (such as corpus time, whether the corpus is labeled, the person in charge, etc.), export only the required files, and perform statistical analysis.
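A minimal sketch of such a filtered export, assuming each row carries "audio", "label", "owner", and "date" fields (these field names and the CSV format are illustrative assumptions, not part of the patent):

```python
import csv
from datetime import date
from typing import Iterable, Optional

def export_labeled_list(rows: Iterable[dict], path: str,
                        since: Optional[date] = None,
                        only_labeled: bool = True,
                        owner: Optional[str] = None) -> int:
    """Write the batch corpus list to a CSV file, applying optional filters
    (corpus date, labeled or not, person in charge) before exporting."""
    exported = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["audio", "label", "owner", "date"])
        writer.writeheader()
        for row in rows:
            if only_labeled and not row.get("label"):
                continue
            if since is not None and row["date"] < since:
                continue
            if owner is not None and row["owner"] != owner:
                continue
            writer.writerow({k: row[k] for k in ("audio", "label", "owner", "date")})
            exported += 1
    return exported
```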
FIG. 5 is a flowchart illustrating a corpus tagging method according to another embodiment of the present invention. The method is suitable for a terminal with a display screen, and is executed by the terminal, as shown in fig. 5, and the method comprises the following steps:
step S501, a batch corpus list is obtained and displayed, where the batch corpus list includes an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list.
In this embodiment, step S501 is similar to step S101 shown in fig. 1, and refer to the above description specifically.
Step S502, judging whether the operation of selecting the cell in the audio list or the text list exists.
In step S503, when there is an operation of selecting a cell in the audio list or the text list, the selected cell in the audio list or the text list is searched.
And step S504, playing the audio corresponding to the selected cell.
In this embodiment, a selected cell is used as the condition for triggering audio playback. Since the cells of the audio list and the text list correspond one to one, the selected cell may be a cell in the audio list or a cell in the text list. That is, audio playback is triggered by the active cell: when the active cell moves to the next cell, the audio file corresponding to that next cell is played.
As shown in FIG. 2, when the batch corpus list is opened, the active cell defaults to the first cell of the text list and the corresponding audio file is played; when the active cell moves to the second cell, the audio of the second cell is automatically played.
Optionally, in the embodiment of the present invention, the user may also control the audio to pause playing by clicking a mouse or inputting a shortcut key, and after receiving a corresponding pause command, the terminal pauses the currently played audio.
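The following is a minimal controller sketch (reusing the BatchCorpusList structure assumed earlier) of how selecting a cell could trigger playback of the paired audio and how a shortcut key could pause it; the player interface is an assumption for illustration, not the patent's implementation:

```python
class PlaybackController:
    """Plays the audio paired with whichever cell is currently active and
    pauses on a shortcut key, mirroring steps S502 to S504."""

    def __init__(self, corpus_list, player):
        self.corpus_list = corpus_list   # a BatchCorpusList as sketched above
        self.player = player             # any object exposing play(url) and pause()
        self.active_index = None

    def on_cell_selected(self, index):
        # selecting a cell in either the audio list or the text list triggers
        # playback of the audio in the corresponding row
        if index != self.active_index:
            self.active_index = index
            self.player.play(self.corpus_list.rows[index].audio_url)

    def on_pause_shortcut(self):
        # e.g. bound to the "A" key in the full-keyboard workflow
        self.player.pause()
```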
Step S505, receiving a text content corresponding to the target audio selected from a plurality of preset text contents as a text content labeled on a cell corresponding to the target audio in the text list.
In this embodiment, step S505 is similar to step S103 shown in fig. 1, and refer to the above description specifically.
According to the embodiment of the invention, the audio playing is automatically carried out according to the selected cell, and the user does not need to click, so that the working efficiency is improved.
As an optional implementation manner, receiving a text content corresponding to the target audio selected from a plurality of preset text contents includes: receiving a command for opening a drop-down list of cells corresponding to a target audio in a text list, and acquiring a plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; receiving a selection command which is input through a mouse and used for selecting a text from a plurality of preset text contents, and marking the text contents indicated by the selection command on the cells corresponding to the target audio.
In the embodiment of the invention, a corresponding drop-down list is configured in advance for each cell in the text list, and the drop-down list contains the selectable text contents. When the user performs corpus annotation, the corresponding text content can be selected directly from the drop-down list with the mouse, and the selection command is then clicked to confirm; the terminal receives the text content selected by the user and marks it at the corresponding position. This enables quick labeling by mouse selection and further improves corpus annotation efficiency. As shown in FIG. 6, when the first column is being labeled, the user clicks the cell with the mouse to call up the drop-down list; the terminal receives the command and displays the drop-down list for the user to select from.
As another optional implementation manner, receiving text content corresponding to a target audio selected from a plurality of preset text contents includes: receiving a command for opening a drop-down list of cells corresponding to a target audio in a text list, and acquiring a plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; receiving a selected command which is input through a keyboard and used for selecting a text from a plurality of preset text contents, determining the text content corresponding to the selected command, and marking the text content on a cell corresponding to the target audio.
The difference from the above scheme is that in this embodiment the user selects the corresponding text content from the drop-down list with preset shortcut keys, for example moving the highlighted entry up and down with the "↑" and "↓" keys and pressing the Enter key to confirm the selection, or entering the sequence number of an entry in the drop-down list: for example, if the drop-down menu contains 5 sentences and the character "1" is entered, the 1st sentence is selected. By selecting the text content for labeling with preset shortcut keys, the embodiment of the invention further increases the labeling speed.
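A small sketch of how such keyboard input could be mapped to a drop-down selection (the key names and the return convention are assumptions for illustration):

```python
from typing import List, Optional, Tuple

def resolve_dropdown_selection(presets: List[str], key: str,
                               highlighted: int = 0) -> Tuple[int, Optional[str]]:
    """Map one key press on an open drop-down list to (new_highlight, chosen_text).

    chosen_text stays None until the selection is confirmed. A digit key picks
    the nth entry directly (e.g. "1" selects the first sentence); Up/Down move
    the highlight; Enter confirms the highlighted entry.
    """
    if key.isdigit():
        n = int(key)
        if 1 <= n <= len(presets):
            return n - 1, presets[n - 1]
    elif key == "Down":
        return min(highlighted + 1, len(presets) - 1), None
    elif key == "Up":
        return max(highlighted - 1, 0), None
    elif key == "Enter":
        return highlighted, presets[highlighted]
    return highlighted, None
```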
As an optional implementation manner, the corpus tagging method according to the embodiment of the present invention, after receiving a text content corresponding to the target audio selected from a plurality of preset text contents, further includes: judging whether an operation of selecting another cell in the text list exists or not; and when the operation of selecting another cell in the text list exists, saving the text content marked on the previous cell in the text list.
In this embodiment, after the target audio is labeled with the corresponding text content, if the active cell jumps to another cell, that is, selects another cell, the text content corresponding to the target audio is automatically saved. As shown in FIG. 2 and FIG. 3, the column of "state" represents the storage state of the annotation content corresponding to the audio, and when another cell is selected after the annotation is completed, the "not saved" state is automatically changed into the "saved" state.
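A minimal sketch of this auto-save behaviour (the persistence callback is a hypothetical stand-in for whatever storage the terminal actually uses):

```python
from typing import Callable, Optional

def on_active_cell_changed(corpus_list, previous_index: Optional[int],
                           new_index: int, persist: Callable) -> int:
    """When the annotator moves to another cell, persist the label entered on
    the previous cell and flip its status from "not saved" to "saved".

    `persist` is any callable that writes a CorpusRow back to storage; the
    actual storage backend is not specified here.
    """
    if previous_index is not None and previous_index != new_index:
        prev = corpus_list.rows[previous_index]
        if prev.label is not None and not prev.saved:
            persist(prev)          # hypothetical persistence call
            prev.saved = True
    return new_index
```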
It should be noted that, in the embodiment of the present invention, when performing corpus tagging, a user may input a command through a keyboard, may input a command through a mouse, or input a command in a manner of combining a keyboard and a mouse. Specifically, the description is made with reference to fig. 2 and 3, respectively.
The batch corpus list shown in fig. 2 may be labeled by using a full keyboard, or by using a keyboard and a mouse in combination, where the full keyboard labeling means that the corpus can be labeled only by using the keyboard during the labeling process.
Most labels entered through this batch corpus list are fixed phrases and are selected with shortcut keys. The labeling process is as follows:
1) after the task is picked up, enter the labeling page; the corpus is played automatically, playback can be paused with a shortcut key (such as the A key), and the focus, i.e. the active cell, is automatically placed in the first cell;
2) move the focus to the next cell with the Tab key or the right arrow key, and select content from the drop-down list with a shortcut key (for example, to select the first entry, press the corresponding shortcut key and then Enter); the text is matched automatically and filled into the corresponding position, or the text can be typed in directly for labeling; the shortcut keys of the drop-down list are configured in the database and can be changed;
3) when the focus moves to the right, the previous record is automatically saved, and the corresponding state changes from "not saved" to "saved";
4) label the entries one by one and continue turning pages.
The batch corpus list shown in fig. 3 can be labeled using only the mouse. Specifically, this is TTS labeling, where the text content is already known and the parts that cannot be heard clearly in the recording are labeled. The process is as follows:
1) enter the labeling page and click play; the corresponding audio is played automatically;
2) select the text with the mouse and it is automatically displayed in the correction column; if the wrong text is selected, click the reset button; fill in the remark content of the corpus;
3) after labeling a page, click the save button before turning the page; the state changes from "not saved" to "saved" and the labeling is successful.
In a second aspect of the embodiments of the present invention, there is provided a corpus tagging device, which is applicable to a terminal with a display screen, and whose functions are implemented by the terminal. As shown in fig. 7, the device includes: a first obtaining unit 701, a playing unit 702, and a receiving unit 703.
The first obtaining unit 701 is configured to obtain and display a batch corpus list, where the batch corpus list includes an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list.
The batch corpus list in this embodiment is preset. When a worker executes a corpus tagging task, the worker directly pulls the relevant task from the system, and the first obtaining unit 701 obtains the batch corpus list corresponding to that task. That is, in the embodiment of the present invention, corpus tagging tasks are distributed to workers by issuing tasks. The corpus tagging task may be executed on a web page: the first obtaining unit 701 obtains the batch corpus list and then displays it in the form of a web page.
Because the batch corpus list comprises the audio list and the text list, and the cells on the audio list and the text list are in one-to-one correspondence, the audio on the audio list can be directly triggered to be played, so that a user (or a worker) can conveniently label the audio.
The playing unit 702 is configured to play the target audio when receiving an audio playing command for instructing the target audio in the audio list to be played.
After the corpus tagging device displays the batch corpus list, the user may input an audio playing command to the terminal through an input device, such as a mouse or a keyboard, and the corpus tagging device responds, and the playing unit 702 plays a corresponding audio.
The receiving unit 703 is configured to receive a text content corresponding to the target audio selected from a plurality of preset text contents, as the text content marked on the cell corresponding to the target audio in the text list.
After the user hears the audio file played by the playing unit 702, the user marks the heard content on the cell of the text list corresponding to that audio. The receiving unit 703 receives the text content selected by the user from the plurality of preset text contents, stores it in the corresponding cell, and completes the labeling of the audio corpus.
According to the embodiment of the invention, the batch corpus list associates each audio file in advance with the text-list cell in which its content is to be entered, so the user only needs to trigger audio playback in the batch corpus list and label the corresponding text content. The corpus tagging device plays the audio and receives the text content labeled by the user, and the labeling of the corpus is completed without spending time matching audio corpora with audio names, which reduces the time cost of corpus labeling and improves working efficiency.
FIG. 8 is a diagram illustrating a corpus tagging device according to another embodiment of the present invention. The device is suitable for a terminal with a display screen, and its functions are implemented by the terminal. As shown in fig. 8, in addition to the first obtaining unit 701, the playing unit 702, and the receiving unit 703, the device further includes: a second obtaining unit 704 and an exporting unit 705.
The second obtaining unit 704 is configured to obtain a file corresponding to the batch corpus list marked with the text content when receiving a file export command.
The exporting unit 705 is configured to export a file corresponding to the batch corpus list marked with the text content, where the file exporting command is used to instruct to export the file corresponding to the batch corpus list marked with the text content in batches.
In this embodiment, the file export command is input by the user. After the task is completed, the user can click the file export command on the terminal; after receiving the command, the terminal acquires the file of the batch corpus list marked with the text content and then exports the file.
According to the embodiment of the invention, after the corpus tagging task is completed, the user can export the files of the batch corpus list through the file export command, realizing batch tagging of the corpus and batch export of the files, which improves working efficiency.
FIG. 9 is a diagram illustrating a corpus tagging device according to another embodiment of the present invention. The device is suitable for a terminal with a display screen, and its functions are implemented by the terminal. As shown in fig. 9, the device comprises: a first obtaining unit 701, a playing unit 702, and a receiving unit 703, wherein the playing unit 702 includes: a judging module 7021, a searching module 7022, and a playing module 7023.
The judging module 7021 is configured to judge whether there is an operation of selecting a cell in the audio list or the text list.
The searching module 7022 is configured to search for a selected cell in the audio list or the text list when there is an operation of selecting a cell in the audio list or the text list.
The playing module 7023 is configured to play the audio corresponding to the selected cell.
In this embodiment, a selected cell is used as the condition for triggering audio playback. Since the cells of the audio list and the text list correspond one to one, the selected cell may be a cell in the audio list or a cell in the text list. That is, audio playback is triggered by the active cell: when the active cell moves to the next cell, the audio file corresponding to that next cell is played.
As shown in FIG. 2, when the batch corpus list is opened, the active cell defaults to the first cell of the text list and the corresponding audio file is played; when the active cell moves to the second cell, the audio of the second cell is automatically played.
As an optional implementation, the receiving unit includes: the first receiving module is used for receiving a command for opening a drop-down list of cells corresponding to a target audio in a text list, and acquiring a plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; and the second receiving module is used for receiving a selection command which is input by a mouse and used for selecting a text from a plurality of preset text contents, and marking the text content indicated by the selection command on a cell corresponding to the target audio.
In the embodiment of the invention, a corresponding drop-down list is configured in advance for each cell in the text list, and the drop-down list contains the selectable text contents. When the user performs corpus annotation, the corresponding text content can be selected directly from the drop-down list with the mouse, and the selection command is then clicked to confirm; the terminal receives the text content selected by the user and marks it at the corresponding position. This enables quick labeling by mouse selection and further improves corpus annotation efficiency. As shown in FIG. 6, when the first column is being labeled, the user clicks the cell with the mouse to call up the drop-down list; the terminal receives the command and displays the drop-down list for the user to select from.
As another optional implementation, the receiving unit includes: the third receiving module is used for receiving a command for opening a drop-down list of cells corresponding to the target audio in the text list, and acquiring a plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; and the fourth receiving module is used for receiving a selected command which is input through a keyboard and is used for selecting a text from a plurality of preset text contents, determining the text content corresponding to the selected command, and marking the text content on the cell corresponding to the target audio.
The difference from the above scheme is that in this embodiment the user selects the corresponding text content from the drop-down list with preset shortcut keys, for example moving the highlighted entry up and down with the "↑" and "↓" keys and pressing the Enter key to confirm the selection, or entering the sequence number of an entry in the drop-down list: for example, if the drop-down menu contains 5 sentences and the character "1" is entered, the 1st sentence is selected. By selecting the text content for labeling with preset shortcut keys, the embodiment of the invention further increases the labeling speed.
As an optional implementation manner, the corpus tagging device according to the embodiment of the present invention further includes: the judging unit is used for judging whether the operation of selecting another cell in the text list exists after receiving the text content corresponding to the target audio selected from the plurality of preset text contents; and the storage unit is used for storing the text content marked on the previous cell in the text list when the operation of selecting another cell in the text list exists.
In this embodiment, after the target audio is labeled with the corresponding text content, if the active cell jumps to another cell, that is, selects another cell, the text content corresponding to the target audio is automatically saved. As shown in FIG. 2 and FIG. 3, the column of "state" represents the storage state of the annotation content corresponding to the audio, and when another cell is selected after the annotation is completed, the "not saved" state is automatically changed into the "saved" state.
Fig. 10 is a schematic diagram of a hardware structure of a terminal for performing a corpus tagging method according to an embodiment of the present invention, and as shown in fig. 10, the device includes one or more processors 100 and a memory 200, where one processor 100 is taken as an example in fig. 10.
The memory 200 stores instructions executable by the at least one processor 100, and the instructions are executed by the at least one processor 100 to enable the at least one processor 100 to execute the corpus tagging method according to the embodiment of the present invention.
The apparatus for performing the corpus tagging method may further include: an input device 300 and an output device 400.
The processor 100, the memory 200, the input device 300, and the output device 400 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 10.
The processor 100 may be a Central Processing Unit (CPU). The processor 100 may also be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or any combination thereof. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 200, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules corresponding to the corpus tagging method in the embodiment of the present application (for example, the first obtaining unit 701, the playing unit 702, and the receiving unit 703 shown in fig. 7). The processor 100 executes various functional applications of the server and data processing by running the non-transitory software programs, instructions and modules stored in the memory 200, so as to implement the corpus tagging method of the above method embodiment.
The memory 200 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created by use of the corpus tagging device, and the like. Further, the memory 200 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 200 may optionally include memory located remotely from the processor 100, and these remote memories may be connected to the corpus tagging device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 300 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the corpus tagging device. The output device 400 may include a display device such as a display screen.
The one or more modules are stored in the memory 200 and, when executed by the one or more processors 100, perform the method shown in fig. 1-3.
The product can execute the method provided by the embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method. Details of the technique not described in detail in the present embodiment may be specifically referred to the related description in the embodiments shown in fig. 1 to 3.
The embodiment of the invention also provides a non-transitory computer storage medium, wherein the computer storage medium stores computer executable instructions, and the computer executable instructions can execute the corpus tagging method in any method embodiment. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard disk (Hard disk Drive, abbreviated as HDD), a Solid State Drive (SSD), or the like; the storage medium may also comprise a combination of memories of the kind described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Although the embodiments of the present invention have been described in conjunction with the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and such modifications and variations fall within the scope defined by the appended claims.

Claims (11)

1. A corpus tagging method is characterized by comprising the following steps:
acquiring and displaying a batch corpus list, wherein the batch corpus list comprises an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list;
when an audio playing command for indicating target audio in the audio list to be played is received, playing the target audio;
and receiving the text content corresponding to the target audio selected from a plurality of preset text contents as the text content marked on the cell corresponding to the target audio in the text list.
2. The corpus tagging method according to claim 1, further comprising:
when a file export command is received, acquiring a file corresponding to a batch corpus list marked with text content;
and exporting the file corresponding to the batch corpus list marked with the text content, wherein the file export command is used for indicating that the file corresponding to the batch corpus list marked with the text content is exported in batch.
3. The corpus tagging method according to claim 1, wherein said playing the target audio upon receiving an audio playing command for instructing playing of the target audio in the audio list comprises:
judging whether the operation of selecting the cell in the audio list or the text list exists or not;
when the operation of selecting the cells in the audio list or the text list exists, searching the selected cells in the audio list or the text list;
and playing the audio corresponding to the selected cell.
4. The corpus tagging method according to claim 1, wherein receiving a text content corresponding to the target audio selected from a plurality of preset text contents comprises:
receiving a command for opening a drop-down list of cells corresponding to the target audio in the text list, and acquiring the plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; receiving a selection command which is input through a mouse and used for selecting a text from the preset text contents, and marking the text contents indicated by the selection command on the cell corresponding to the target audio; or,
receiving a command for opening a drop-down list of cells corresponding to the target audio in the text list, and acquiring the plurality of preset text contents in the drop-down list, wherein each cell in the text list corresponds to one drop-down list; receiving a selected command which is input through a keyboard and used for selecting a text from the preset text contents, determining the text content corresponding to the selected command, and marking the text content on the cell corresponding to the target audio.
5. The corpus tagging method according to claim 1, after receiving a text content corresponding to the target audio selected from a plurality of preset text contents, further comprising:
judging whether an operation of selecting another cell in the text list exists or not;
and when the operation of selecting another cell in the text list exists, saving the text content marked on the previous cell in the text list.
6. A corpus tagging device, comprising:
the system comprises a first acquisition unit, a second acquisition unit and a display unit, wherein the first acquisition unit is used for acquiring and displaying a batch corpus list, the batch corpus list comprises an audio list and a text list, and each cell in the audio list corresponds to one cell in the text list;
the playing unit is used for playing the target audio when receiving an audio playing command for indicating the target audio in the audio list to be played;
and the receiving unit is used for receiving the text content corresponding to the target audio selected from the plurality of preset text contents as the text content marked on the cell corresponding to the target audio in the text list.
7. The corpus tagging device of claim 6, further comprising:
the second acquisition unit is used for acquiring a file corresponding to the batch corpus list marked with the text content when a file export command is received;
and the exporting unit is used for exporting the files corresponding to the batch corpus list marked with the text contents, wherein the file exporting command is used for indicating the batch export of the files corresponding to the batch corpus list marked with the text contents.
8. The corpus tagging device of claim 6, wherein said playing unit comprises:
the judging module is used for judging whether the operation of selecting the cells in the audio list or the text list exists or not;
the searching module is used for searching the selected cell in the audio list or the text list when the operation of selecting the cell in the audio list or the text list exists;
and the playing module is used for playing the audio corresponding to the selected cell.
9. The corpus tagging device of claim 6, wherein the receiving unit comprises:
a first receiving module, configured to receive a command for opening a drop-down list of cells corresponding to the target audio in the text list, and obtain the multiple preset text contents in the drop-down list, where each cell in the text list corresponds to one drop-down list; the second receiving module is used for receiving a selection command which is input by a mouse and used for selecting a text from the preset text contents, and marking the text content indicated by the selection command on a cell corresponding to the target audio; or,
a third receiving module, configured to receive a command for opening a drop-down list of cells corresponding to the target audio in the text list, and obtain the multiple preset text contents in the drop-down list, where each cell in the text list corresponds to one drop-down list; and the fourth receiving module is used for receiving a selected command which is input through a keyboard and is used for selecting a text from the preset text contents, determining the text content corresponding to the selected command, and marking the text content on the cell corresponding to the target audio.
10. The corpus tagging device of claim 6, further comprising:
the judging unit is used for judging whether an operation of selecting another cell in the text list exists after receiving the text content corresponding to the target audio selected from a plurality of preset text contents;
and the storage unit is used for storing the text content marked on the previous cell in the text list when the operation of selecting another cell in the text list exists.
11. A terminal, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the corpus tagging method of any one of claims 1-5.
CN201611097247.5A, filed 2016-12-02 (priority date 2016-12-02): Corpus labeling method and device, and terminal. Status: Pending. Publication: CN106782509A (en).

Priority Applications (1)

Application number: CN201611097247.5A (published as CN106782509A) | Priority date: 2016-12-02 | Filing date: 2016-12-02 | Title: Corpus labeling method and device, and terminal

Applications Claiming Priority (1)

Application number: CN201611097247.5A (published as CN106782509A) | Priority date: 2016-12-02 | Filing date: 2016-12-02 | Title: Corpus labeling method and device, and terminal

Publications (1)

Publication number: CN106782509A (en) | Publication date: 2017-05-31

Family

ID=58883160

Family Applications (1)

Application number: CN201611097247.5A (CN106782509A, Pending) | Priority date: 2016-12-02 | Filing date: 2016-12-02 | Title: Corpus labeling method and device, and terminal

Country Status (1)

CN: CN106782509A (en)


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897869A (en) * 2018-06-29 2018-11-27 北京百度网讯科技有限公司 Corpus labeling method, device, equipment and storage medium
CN108897869B (en) * 2018-06-29 2020-10-27 北京百度网讯科技有限公司 Corpus labeling method, apparatus, device and storage medium
CN109582925A (en) * 2018-11-08 2019-04-05 厦门快商通信息技术有限公司 A kind of corpus labeling method and system of man-computer cooperation
CN109582925B (en) * 2018-11-08 2023-02-14 厦门快商通信息技术有限公司 Man-machine combined corpus labeling method and system
CN112185351A (en) * 2019-07-05 2021-01-05 北京猎户星空科技有限公司 Voice signal processing method and device, electronic equipment and storage medium
CN112185351B (en) * 2019-07-05 2024-05-24 北京猎户星空科技有限公司 Voice signal processing method and device, electronic equipment and storage medium
CN113407745A (en) * 2021-06-30 2021-09-17 北京百度网讯科技有限公司 Data annotation method and device, electronic equipment and computer readable storage medium


Legal Events

PB01: Publication
WD01: Invention patent application deemed withdrawn after publication (application publication date: 2017-05-31)