
CN113569040A - A question generation method, device and computing device based on judicial trial - Google Patents

Publication number
CN113569040A
Authority
CN
China
Prior art keywords
vector
trial
text
splicing
dialogue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010357575.4A
Other languages
Chinese (zh)
Inventor
姬长阵
周鑫
张雅婷
孙常龙
张琼
司罗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010357575.4A
Publication of CN113569040A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/18 Legal services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Tourism & Hospitality (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Economics (AREA)
  • Technology Law (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a question generation method, apparatus and computing device based on judicial trial. The method includes: encoding a complaint text with an encoder to generate a first semantic vector for the complaint text; processing the first semantic vector with a classifier to obtain a category label representing the case category of the complaint text; splicing the first semantic vector with the category label to obtain a first spliced vector; and decoding the first spliced vector with a decoder to generate the first question sentence of the trial judge.

Description

Question generation method, apparatus and computing device based on judicial trial
Technical Field
The invention relates to the field of natural language processing, and in particular to a question generation method, apparatus and computing device based on judicial trial.
Background
With social and economic development and rising levels of education, people's legal awareness has grown stronger, and more and more cases are heard in court. In a traditional court trial, the trial judge puts questions to the plaintiff and the defendant, who answer them, and the exchange forms the court trial record. This mode of trial takes a long time, consumes considerable manpower and material resources, and is inefficient.
In view of this, how to provide an intelligent questioning scheme based on judicial trial, and thereby improve court trial efficiency, has become an urgent technical problem.
Disclosure of Invention
In view of the above, the present invention has been developed to provide a judicial-trial-based question generation method, apparatus and computing device that overcome, or at least partially address, the above-discussed problems.
According to one aspect of the present invention, there is provided a question generation method based on judicial trial, including:
encoding a complaint text by using an encoder to generate a first semantic vector for the complaint text;
processing the first semantic vector by using a classifier to obtain a category label representing the case category of the complaint text;
splicing the first semantic vector with the category label to obtain a first spliced vector;
and decoding the first spliced vector by using a decoder to generate the first question sentence of the trial judge.
Optionally, the question generation method according to the present invention further includes: encoding the historical dialogue of the court trial by using the encoder to generate a second semantic vector for the historical dialogue, wherein the historical dialogue comprises question sentences of the trial judge and answer sentences of the plaintiff and defendant; splicing the second semantic vector with the category label to obtain a second spliced vector; and decoding the second spliced vector by using the decoder to generate a subsequent question sentence of the trial judge.
Optionally, in the question generation method according to the present invention, encoding the historical dialogue of the court trial with the encoder includes: obtaining a preset number of the latest dialogue sentences from the historical dialogue; and encoding the obtained dialogue sentences by using the encoder.
Optionally, in the question generation method according to the present invention, obtaining a preset number of the latest dialogue sentences from the historical dialogue includes: when the number of dialogue sentences in the historical dialogue is less than the preset number, obtaining all the dialogue sentences in the historical dialogue.
Optionally, in the question generation method according to the present invention, the encoder and the decoder employ an RNN network, an LSTM network, or a GRU network.
Optionally, in the question generation method according to the present invention, the classifier is a SoftMax classifier.
According to another aspect of the present invention, there is provided a question generation apparatus based on judicial trial, including:
an encoder adapted to encode a complaint text and generate a first semantic vector for the complaint text;
a classifier adapted to process the first semantic vector to obtain a category label representing the case category of the complaint text;
a splicing unit adapted to splice the first semantic vector with the category label to obtain a first spliced vector;
and a decoder adapted to decode the first spliced vector to generate the first question sentence of the trial judge.
Optionally, in the question generation apparatus according to the present invention, the encoder is further adapted to encode the historical dialogue of the court trial, which includes question sentences of the trial judge and answer sentences of the plaintiff and defendant, to generate a second semantic vector for the historical dialogue; the splicing unit is further adapted to splice the second semantic vector with the category label to obtain a second spliced vector; and the decoder is further adapted to decode the second spliced vector to generate a subsequent question sentence of the trial judge.
According to yet another aspect of the invention, there is provided a computing device comprising: at least one processor; and a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor, the program instructions comprising instructions for performing the above-described method.
According to yet another aspect of the present invention, there is provided a readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform the above-described method.
The present invention applies a neural network language model to the judicial field: case category information is determined automatically from the complaint text, the first question sentence of the trial judge is generated from the complaint text and the case category information, and each subsequent question sentence is generated automatically from the historical court trial dialogue and the case category information, which can significantly improve court trial efficiency.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 illustrates a schematic diagram of a judicial court trial-based questioning system 100, according to one embodiment of the present invention;
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention;
FIG. 3 illustrates a flow diagram of a judicial-trial-based question generation method 300, according to one embodiment of the invention;
FIG. 4 illustrates a schematic diagram of a judicial-trial-based question generation apparatus 400, according to one embodiment of the invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
FIG. 1 shows a schematic diagram of a judicial court trial-based questioning system 100, according to one embodiment of the present invention. As shown in fig. 1, the questioning system 100 includes a terminal apparatus 110 and a computing apparatus 200.
The terminal device 110 may specifically be a personal computer such as a desktop computer and a notebook computer, and may also be a mobile phone, a tablet computer, a multimedia device, a smart speaker, a smart wearable device, and the like, but is not limited thereto. Computing device 200 is used to provide services to terminal device 110, which may be implemented as a server, such as an application server, a Web server, or the like; but may also be implemented as a desktop computer, a notebook computer, a processor chip, a tablet computer, etc., but is not limited thereto.
According to one embodiment, the computing device 200 may provide intelligent court trial services, and the terminal device 110 may establish a connection with the computing device 200 via the internet, so that a user can hold a human-computer conversation with the computing device 200 through the terminal device 110. Specifically, the computing device 200 may automatically determine case category information based on the complaint text, automatically generate the first question sentence of the trial judge based on the complaint text and the case category information, and send the question sentence to the terminal device 110, which broadcasts it.
The terminal device 110 may also collect voice data of the user, such as the voice data of the plaintiff or defendant answering a question sentence, and perform speech recognition on the voice data to obtain an answer sentence. Alternatively, the terminal device 110 may transmit the voice data to the computing device 200, which performs speech recognition on it to obtain the answer sentence.
Further, the computing device 200 may automatically generate the next question sentence based on the historical court trial dialogue (including the automatically generated question sentences and the answer sentences of the plaintiff and defendant) and the case category information, and send it to the terminal device 110.
In one embodiment, the judicial-trial-based questioning system 100 further includes a data storage device 120. The data storage device 120 may be a relational database such as MySQL or Access, or a non-relational (NoSQL) database. It may be a local database residing in the computing device 200, or a distributed database, such as HBase, deployed across a plurality of geographic locations. In short, the data storage device 120 is used for storing data, and the present invention does not limit its specific deployment or configuration. The computing device 200 may connect to the data storage device 120 and retrieve the data stored in it. For example, the computing device 200 may read the data directly (when the data storage device 120 is a local database of the computing device 200), or may access the internet in a wired or wireless manner and obtain the data through a data interface.
In an embodiment of the present invention, the data storage device 120 is adapted to store a text generation model, which generates question sentences of the trial judge based on the complaint text and/or the court trial dialogue, together with the case category information. The text generation model is a sequence-to-sequence (seq2seq) model comprising an encoder and a decoder. The encoder is adapted to encode the complaint text or the court trial dialogue, generating a first semantic vector for the complaint text or a second semantic vector for the court trial dialogue. The text generation model further comprises a classifier adapted to process the first semantic vector to obtain a category label representing the case category of the complaint text. The input of the decoder is either a first spliced vector, formed from the first semantic vector and the category label, or a second spliced vector, formed from the second semantic vector and the category label. Decoding the first spliced vector generates the first question sentence of the trial judge, and decoding the second spliced vector generates a subsequent question sentence of the trial judge.
The data storage device 120 is further adapted to store historical court trial data, which the computing device 200 may use as a training sample set to train the text generation model described above. Specifically, each training sample in the training sample set comprises a complaint text annotated with a case category label and a court trial record, the record comprising the question sentences of the trial judge and the answer sentences of the plaintiff and defendant.
The judicial court trial-based problem generation method of embodiments of the present invention may be performed in the computing device 200. FIG. 2 shows a block diagram of a computing device 200, according to one embodiment of the invention. As shown in FIG. 2, in a basic configuration 202, a computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a Digital Signal Processor (DSP), or any combination thereof. The processor 204 may include one or more levels of cache, such as a level one cache 210 and a level two cache 212, a processor core 214, and registers 216. Example processor cores 214 may include Arithmetic Logic Units (ALUs), Floating Point Units (FPUs), digital signal processing cores (DSP cores), or any combination thereof. The example memory controller 218 may be used with the processor 204, or in some implementations the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, system memory 206 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 206 may include an operating system 220, one or more applications 222, and program data 224. The application 222 is in effect a set of program instructions that direct the processor 204 to perform corresponding operations. In some embodiments, application 222 may be arranged to cause processor 204 to operate with program data 224 on an operating system.
Computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to the basic configuration 202 via the bus/interface controller 230. The example output device 242 includes a graphics processing unit 248 and an audio processing unit 250. They may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 252. Example peripheral interfaces 244 can include a serial interface controller 254 and a parallel interface controller 256, which can be configured to facilitate communications with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 258. An example communication device 246 may include a network controller 260, which may be arranged to facilitate communications with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or dedicated-line network, and various wireless media such as acoustic, Radio Frequency (RF), microwave, Infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
In a computing device 200 according to the present invention, the application 222 comprises a judicial-trial-based question generation apparatus 400, the apparatus 400 comprising a plurality of program instructions that may direct the processor 204 to perform the judicial-trial-based question generation method 300.
FIG. 3 illustrates a flow diagram of a judicial court trial-based problem generation method 300, according to one embodiment of the invention. The method 300 is suitable for execution in a computing device, such as the computing device 200 described above.
As shown in FIG. 3, the method 300 begins at step S310. In step S310, the encoder encodes the complaint text and generates a first semantic vector for the complaint text. The complaint text is the text corresponding to a pleading document filed in the case. If the pleading is a paper copy, an OCR scanning system can be used: the characters in the paper document are optically converted into a black-and-white dot-matrix image file, and recognition software converts the characters in the image into text. If the pleading is an electronic copy in a text format, it can be processed directly.
Before the complaint text is input into the encoder, it is first split into sentences; each sentence is then segmented into words; and finally each word is converted into a word vector (embedding), yielding a word vector sequence for the complaint text. The word vectors in the sequence are input to the encoder in order, and the encoder outputs the first semantic vector for the complaint text.
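The preprocessing described above (sentence splitting, word segmentation, embedding lookup) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the punctuation-based sentence splitter, the regex tokenizer (a stand-in for a real Chinese word segmenter), and the tiny embedding table are all assumptions.

```python
import re

def split_sentences(text):
    # Assumed rule: a sentence ends at common Chinese or English end punctuation.
    parts = re.findall(r"[^。！？.!?]+[。！？.!?]?", text)
    return [p.strip() for p in parts if p.strip()]

def split_words(sentence):
    # Placeholder segmentation: runs of word characters. A real system would
    # use a proper word segmenter for Chinese text.
    return re.findall(r"\w+", sentence)

def to_word_vectors(text, embeddings, unk):
    # Build the word vector sequence that is fed to the encoder in order.
    vectors = []
    for sentence in split_sentences(text):
        for word in split_words(sentence):
            vectors.append(embeddings.get(word.lower(), unk))
    return vectors

# Hypothetical two-dimensional embedding table.
embeddings = {"loan": [1.0, 0.0], "due": [0.0, 1.0]}
sequence = to_word_vectors("The loan is due. Pay now!", embeddings, unk=[0.0, 0.0])
```

Words missing from the table fall back to the `unk` vector, which is one common way to handle out-of-vocabulary words.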
The encoder may employ a time series based neural network, such as an RNN network, an LSTM network, or a GRU network.
In one implementation, an attention mechanism (attention) may also be introduced in the encoder. Specifically, when each word vector in the word vector sequence is processed in the encoder, a hidden vector is correspondingly generated, and the attention weight corresponding to each hidden vector is obtained, and the hidden vector sequences corresponding to the word vector sequence are subjected to weighted summation based on the attention weight, so that the first semantic vector with attention can be obtained.
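The weighted summation over hidden vectors can be sketched as below. The source does not say how the attention weights are computed, so this sketch assumes dot-product scoring against a hypothetical `query` vector followed by a softmax; only the final weighted sum is taken from the description.

```python
import math

def attention_pool(hidden_vectors, query):
    # Assumed scoring function: dot product of each hidden vector with a query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_vectors]
    # Softmax over the scores (subtracting the max for numerical stability).
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the hidden vectors gives the semantic vector with attention.
    dim = len(hidden_vectors[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_vectors)) for d in range(dim)]
    return pooled, weights

hidden = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(hidden, query=[0.0, 0.0])
```

With an all-zero query every hidden vector receives the same weight, so the result is the plain average of the hidden vectors.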
In step S320, the first semantic vector is processed by the classifier to obtain a category tag indicating a case category of the complaint text. The classifier can adopt a SoftMax classifier or other known classifiers, and after the first semantic vector is input into the classifier, the classifier can output a corresponding class label. The classifier can classify the complaint texts at multiple levels, for example, case categories include several major categories such as civil disputes, criminal disputes and administrative disputes, the civil disputes are further divided into several minor categories such as property disputes, divorce disputes, damage compensation disputes, contract disputes and copyright disputes, and each category has a category label (category code).
For example, the input complaint text is: "The loan of 800,000 yuan between the plaintiff and the defendant in this case is the same loan as in case No. 1881 (2013) of the people's court, which has already been concluded by settlement; the present suit is repeated or false litigation, and the court is requested to dismiss the plaintiff's claim."
After classification by the classifier, the determined case category is: civil dispute, property dispute.
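A SoftMax classifier over the first semantic vector can be sketched as a single linear layer followed by a softmax. The label strings, the weights and the two-dimensional input here are hypothetical; the category names loosely follow the major and minor categories named above.

```python
import math

# Hypothetical category codes combining major and minor case categories.
LABELS = ["civil/property", "civil/divorce", "civil/contract", "criminal", "administrative"]

def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(semantic_vector, weights, biases):
    # One linear layer: logit_k = w_k . x + b_k, then softmax over the categories.
    logits = [sum(w * x for w, x in zip(row, semantic_vector)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

# Toy parameters chosen by hand for illustration, not trained values.
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [-1.0, 0.0], [0.0, -1.0]]
biases = [0.0] * 5
label, probs = classify([2.0, 0.0], weights, biases)
```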
In step S330, the first semantic vector and the category label are spliced to obtain a first spliced vector.
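One simple reading of the splicing step is vector concatenation. The sketch below assumes the category label is encoded as a one-hot vector before being appended; the source does not specify the label encoding, and a learned label embedding would work the same way.

```python
def one_hot(index, num_classes):
    # Assumed label encoding: 1.0 at the label's position, 0.0 elsewhere.
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

def splice(semantic_vector, label_index, num_classes):
    # Concatenate the semantic vector with the encoded category label.
    return list(semantic_vector) + one_hot(label_index, num_classes)

spliced = splice([0.2, -0.1], label_index=1, num_classes=3)
```

The spliced vector has length `len(semantic_vector) + num_classes`, so the decoder input size must account for both parts.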
In step S340, the decoder decodes the first spliced vector to generate the first question sentence of the trial judge. The decoder may employ a time series based neural network, such as an RNN network, LSTM network, or GRU network. At each step, the decoder computes a probability for every word in the vocabulary and selects the word with the highest probability as the next word to generate, until the first question sentence has been fully decoded.
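Selecting the highest-probability word at each step is commonly called greedy decoding. A minimal sketch follows, with the decoder network replaced by a scripted toy step function; `step_fn`, the end-of-sentence token, the length cap and the vocabulary are all assumptions for illustration. In a real model the initial state would be derived from the spliced vector.

```python
def greedy_decode(step_fn, start_state, vocab, eos="<eos>", max_len=20):
    state, words = start_state, []
    for _ in range(max_len):
        # step_fn returns a probability for every vocabulary word plus the new state.
        probs, state = step_fn(state, words[-1] if words else None)
        next_word = vocab[max(range(len(vocab)), key=probs.__getitem__)]
        if next_word == eos:
            break
        words.append(next_word)
    return words

VOCAB = ["<eos>", "is", "there", "new", "evidence"]
SCRIPT = [
    [0.1, 0.6, 0.1, 0.1, 0.1],  # "is"
    [0.1, 0.1, 0.6, 0.1, 0.1],  # "there"
    [0.1, 0.1, 0.1, 0.6, 0.1],  # "new"
    [0.1, 0.1, 0.1, 0.1, 0.6],  # "evidence"
    [0.6, 0.1, 0.1, 0.1, 0.1],  # "<eos>"
]

def toy_step(state, prev_word):
    # Stand-in for one decoder step: reads a fixed distribution per position.
    return SCRIPT[min(state, len(SCRIPT) - 1)], state + 1

question = greedy_decode(toy_step, 0, VOCAB)
```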
For example, the first question sentence generated for the above complaint text is:
"In accordance with the rule of the Civil Procedure Law that whoever asserts a claim must prove it, and the provisions of the Supreme People's Court on civil litigation: do the parties have any new evidence to submit?"
After the computing device 200 generates the first question sentence, it sends the question sentence to the terminal device 110, which broadcasts it so that the plaintiff and defendant can answer. The terminal device 110 may collect voice data of the answer, and automatic speech recognition (ASR) may be performed on the voice data by the terminal device 110 or the computing device 200 to obtain an answer sentence (text). ASR transcribes speech to text in real time; to avoid ambiguity, noise reduction is first applied to the court trial speech. During transcription, the smallest unit typically transcribed is the word and the largest is the sentence. The ASR result is then smoothed, which includes correcting sentence-break errors, deleting spoken repetitions, and correcting entity recognition errors and legal-term recognition errors.
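One of the smoothing operations, deleting spoken repetitions, can be illustrated with a deliberately naive rule that collapses immediately repeated tokens. This is an assumption about what the step might look like, not the method the source uses, and it would also remove legitimate repetitions.

```python
def remove_repeats(tokens):
    # Drop a token when it is identical to the one kept just before it.
    cleaned = []
    for token in tokens:
        if not cleaned or cleaned[-1] != token:
            cleaned.append(token)
    return cleaned

transcript = "I I did did not borrow the the money".split()
smoothed = remove_repeats(transcript)
```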
In step S350, the encoding process is performed on the historical dialogue of the court trial by using the encoder, and a second semantic vector for the historical dialogue is generated.
The historical dialogue is the court trial dialogue up to the current time in the court trial, comprising the question sentences of the trial judge and the answer sentences of the plaintiff and defendant. A preset number (for example, 5) of the latest dialogue sentences may be acquired from the historical dialogue.
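Taking the latest sentences from the history, and falling back to the whole history when it is shorter than the preset number, can be sketched with a list slice. The function name is hypothetical; the default of 5 follows the example value given above.

```python
def latest_dialogue(history, preset_number=5):
    # Return the most recent sentences; a shorter history is returned whole.
    if preset_number <= 0:
        return []
    return list(history[-preset_number:])

short_history = ["Q1", "A1", "Q2"]
long_history = ["Q1", "A1", "Q2", "A2", "Q3", "A3", "Q4"]
recent_short = latest_dialogue(short_history)
recent_long = latest_dialogue(long_history)
```

Python slicing already handles the short-history case, since `history[-5:]` on a three-item list returns all three items.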
Before inputting the acquired historical dialogue into an encoder, sentence division processing is firstly carried out on the historical dialogue to obtain a plurality of sentences, then word division processing is carried out on each sentence to obtain a plurality of words aiming at each sentence, and finally each word is converted into a word vector (embedding) to obtain a word vector sequence aiming at the historical dialogue. And inputting the word vectors in the word vector sequence to an encoder in sequence, and outputting a second semantic vector aiming at the historical dialogue by the encoder.
When each word vector in the word vector sequence is processed in the encoder, a hidden vector is correspondingly generated, and a second semantic vector with attention can be obtained by obtaining the attention weight corresponding to each hidden vector and performing weighted summation on the hidden vector sequence corresponding to the word vector sequence based on the attention weight.
In step S360, the second semantic vector is spliced with the category label to obtain a second spliced vector.
In step S370, the decoder decodes the second spliced vector to generate a subsequent question sentence of the trial judge.
Steps S350 to S370 are repeated until the court trial is finished, forming a court trial record.
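The repetition of steps S350 to S370 can be pictured as a loop that alternates answers and generated questions while appending both to the record. In this sketch `generate_next` stands in for the encode, splice and decode steps and `get_answer` for the speech recognition step; both callables, the turn limit and the `None` stop signal are assumptions, since the source does not state how the trial ends.

```python
def run_trial(first_question, generate_next, get_answer, max_turns=3, window=5):
    history = [first_question]
    question = first_question
    for _ in range(max_turns):
        # Record the answer to the current question.
        history.append(get_answer(question))
        # Generate the next question from the most recent dialogue sentences.
        question = generate_next(history[-window:])
        if question is None:  # assumed signal that no further question is needed
            break
        history.append(question)
    return history

# Stub callables for illustration only.
record = run_trial(
    "Q0",
    generate_next=lambda recent: None if len(recent) >= 4 else "Q%d" % len(recent),
    get_answer=lambda q: "A:" + q,
)
```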
The following describes a training process of the text generation model in the embodiment of the present invention.
As previously mentioned, the text generation model is a sequence-to-sequence (seq2seq) model that includes an encoder and a decoder, and in embodiments of the present invention, the text generation model further includes a classifier coupled to the encoder and the decoder.
The text generation model may be trained using historical court trial data as the training sample set. Specifically, each training sample comprises a complaint text annotated with a case category label and a court trial record, the record comprising the question sentences of the trial judge and the answer sentences of the plaintiff and defendant. A training sample is input into the text generation model to be trained; a first loss is determined from the output of the classifier and the annotated category label; a second loss is determined from the output of the decoder and the question sentences in the training sample; and the model parameters are adjusted based on the sum of the first loss and the second loss until the model converges, yielding the trained text generation model.
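The joint objective can be sketched as the sum of a classification loss and a decoding loss. The source only states that the two losses are summed; using cross-entropy for both, and averaging the decoder loss over the words of the question, are assumptions made here for illustration.

```python
import math

def cross_entropy(probs, target_index):
    # Negative log-probability assigned to the correct class or word.
    return -math.log(probs[target_index])

def joint_loss(class_probs, class_target, step_probs, word_targets):
    # First loss: classifier output against the annotated category label.
    first_loss = cross_entropy(class_probs, class_target)
    # Second loss: decoder output against the reference question, word by word.
    second_loss = sum(cross_entropy(p, t) for p, t in zip(step_probs, word_targets))
    second_loss /= len(word_targets)
    return first_loss + second_loss

loss = joint_loss(
    class_probs=[0.5, 0.5], class_target=0,
    step_probs=[[0.25, 0.75], [0.5, 0.5]], word_targets=[1, 0],
)
```

Because both losses are backpropagated through the shared encoder, the semantic vector is pushed to carry information useful for classification as well as for question generation.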
Fig. 4 illustrates a schematic diagram of a judicial court trial-based problem generation apparatus 400, the apparatus 400 residing in a computing device, according to one embodiment of the invention. Referring to fig. 4, the apparatus 400 includes:
the encoder 410 is suitable for encoding the complaint text and generating a first semantic vector aiming at the complaint text;
a classifier 420 adapted to process the first semantic vector to obtain a category label representing a case category of the complaint text;
the splicing unit 430 is adapted to splice the first semantic vector and the category label to obtain a first spliced vector;
and the decoder 440 is adapted to decode the first spliced vector to generate the first question sentence of the trial judge.
The encoder 410 is further adapted to encode the historical dialogue of the court trial to generate a second semantic vector for the historical dialogue, wherein the historical dialogue comprises question sentences of the trial judge and answer sentences of the plaintiff and defendant;
the splicing unit 430 is further adapted to splice the second semantic vector with the category label to obtain a second spliced vector;
the decoder 440 is further adapted to decode the second spliced vector to generate a subsequent question sentence of the trial judge.
For the specific processing performed by the encoder 410, the classifier 420, the splicing unit 430 and the decoder 440, refer to the method 300; the details are not repeated here.
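As a rough illustration of the classifier and splicing steps of the apparatus, the following sketch classifies a semantic vector with a softmax layer and appends the resulting category label to the vector. It is a sketch under assumptions: the function names, the single linear layer in front of the softmax, and the one-hot encoding of the category label are hypothetical choices, since the description only says that the semantic vector and the category label are spliced. The encoder and decoder are omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(semantic_vector, weights, bias):
    """Return the index of the most probable case category.

    weights holds one row per category; the linear layer is an assumption.
    """
    logits = [
        sum(w * x for w, x in zip(row, semantic_vector)) + b
        for row, b in zip(weights, bias)
    ]
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def splice(semantic_vector, label, num_classes):
    """Concatenate the semantic vector with a one-hot category label."""
    one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]
    return list(semantic_vector) + one_hot


# Usage with toy values: a 2-dimensional semantic vector and 3 categories.
vector = [1.0, 2.0]
label = classify(vector, [[1, 0], [0, 1], [-1, -1]], [0, 0, 0])
spliced = splice(vector, label, 3)
print(label, spliced)  # prints: 1 [1.0, 2.0, 0.0, 1.0, 0.0]
```

The same `splice` call would serve both the first spliced vector (complaint text) and the second spliced vector (historical dialogue), because the category label is computed once from the complaint text and reused.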
The application scenario of the above embodiment is an online court trial system designed for courts. On a similar principle, the question generation method of the embodiments of the invention can also be applied to other scenarios, for example in other judicial bodies such as public security organs and procuratorates. In a public security interrogation scenario, question sentences of the public security officers can be generated automatically; in a procuratorate examination scenario, question sentences of the procurators can be generated automatically.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual machine, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Claims (10)

1. A question generation method based on judicial trial, comprising:
encoding a complaint text with an encoder to generate a first semantic vector for the complaint text;
processing the first semantic vector with a classifier to obtain a category label representing the case category of the complaint text;
splicing the first semantic vector and the category label to obtain a first spliced vector;
decoding the first spliced vector with a decoder to generate the first question sentence of the trial judge.

2. The method of claim 1, further comprising:
encoding the historical dialogue of the court trial with the encoder to generate a second semantic vector for the historical dialogue, wherein the historical dialogue includes question sentences of the trial judge and answer sentences of the plaintiff and the defendant;
splicing the second semantic vector and the category label to obtain a second spliced vector;
decoding the second spliced vector with the decoder to generate a subsequent question sentence of the trial judge.

3. The method of claim 2, wherein encoding the historical dialogue of the court trial with the encoder comprises:
obtaining a predetermined number of the most recent dialogue sentences from the historical dialogue;
encoding the obtained dialogue sentences with the encoder.

4. The method of claim 3, wherein obtaining a predetermined number of the most recent dialogue sentences from the historical dialogue comprises:
when the number of dialogue sentences included in the historical dialogue is less than the predetermined number, obtaining all dialogue sentences in the historical dialogue.

5. The method of any one of claims 1 to 4, wherein the encoder and the decoder employ an RNN network, an LSTM network or a GRU network.

6. The method of any one of claims 1 to 5, wherein the classifier employs a SoftMax classifier.

7. A question generation apparatus based on judicial trial, comprising:
an encoder adapted to encode a complaint text and generate a first semantic vector for the complaint text;
a classifier adapted to process the first semantic vector to obtain a category label representing the case category of the complaint text;
a splicing unit adapted to splice the first semantic vector and the category label to obtain a first spliced vector;
a decoder adapted to decode the first spliced vector to generate the first question sentence of the trial judge.

8. The apparatus of claim 7, wherein:
the encoder is further adapted to encode the historical dialogue of the court trial to generate a second semantic vector for the historical dialogue, wherein the historical dialogue includes question sentences of the trial judge and answer sentences of the plaintiff and the defendant;
the splicing unit is further adapted to splice the second semantic vector and the category label to obtain a second spliced vector;
the decoder is further adapted to decode the second spliced vector to generate a subsequent question sentence of the trial judge.

9. A computing device, comprising:
at least one processor; and
a memory storing program instructions, wherein the program instructions are configured to be executed by the at least one processor and include instructions for performing the method of any one of claims 1 to 6.

10. A readable storage medium storing program instructions which, when read and executed by a computing device, cause the computing device to perform the method of any one of claims 1 to 6.
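The history selection recited in claims 3 and 4 (take a predetermined number of the most recent dialogue sentences, or all of them when fewer exist) can be illustrated with a short sketch. The function name `recent_sentences` and the representation of the dialogue as a list of strings are assumptions made for illustration only.

```python
def recent_sentences(history, max_sentences):
    """Return the most recent dialogue sentences, at most max_sentences of them.

    history is ordered oldest to newest. When it holds fewer sentences than
    max_sentences, all of them are returned, as in claim 4.
    """
    if max_sentences <= 0:
        # Guard: history[-0:] would otherwise return the whole list.
        return []
    return list(history[-max_sentences:])


# Usage with toy dialogue turns (J = judge, P = plaintiff, D = defendant).
dialogue = ["J: question 1", "P: answer 1", "J: question 2", "D: answer 2"]
print(recent_sentences(dialogue, 2))  # prints: ['J: question 2', 'D: answer 2']
print(recent_sentences(dialogue[:1], 3))  # prints: ['J: question 1']
```

Bounding the window keeps the encoder input length fixed as the trial proceeds, at the cost of dropping older turns.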
CN202010357575.4A 2020-04-29 2020-04-29 A question generation method, device and computing device based on judicial trial Pending CN113569040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010357575.4A CN113569040A (en) 2020-04-29 2020-04-29 A question generation method, device and computing device based on judicial trial

Publications (1)

Publication Number Publication Date
CN113569040A true CN113569040A (en) 2021-10-29

Family

ID=78157770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010357575.4A Pending CN113569040A (en) 2020-04-29 2020-04-29 A question generation method, device and computing device based on judicial trial

Country Status (1)

Country Link
CN (1) CN113569040A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992112A (en) * 2023-06-30 2023-11-03 百度在线网络技术(北京)有限公司 Data generation method and device, electronic equipment and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160117314A1 (en) * 2014-10-27 2016-04-28 International Business Machines Corporation Automatic Question Generation from Natural Text
CN110059174A (en) * 2019-04-28 2019-07-26 科大讯飞股份有限公司 Inquiry guidance method and device
CN110704571A (en) * 2019-08-16 2020-01-17 平安科技(深圳)有限公司 Court trial auxiliary processing method, trial auxiliary processing device, equipment and medium
CN110765246A (en) * 2019-09-29 2020-02-07 平安直通咨询有限公司上海分公司 Question answering method and device based on intelligent robot, storage medium and intelligent equipment
CN110866174A (en) * 2018-08-17 2020-03-06 阿里巴巴集团控股有限公司 Pushing method, device and system for court trial problems

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992112A (en) * 2023-06-30 2023-11-03 百度在线网络技术(北京)有限公司 Data generation method and device, electronic equipment and medium
CN116992112B (en) * 2023-06-30 2025-07-25 百度在线网络技术(北京)有限公司 Data generation method and device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination