CN102487461A

CN102487461A - Method and device for reading webpages aloud on a web television

Info

Publication number: CN102487461A
Application number: CN2010105699137A
Authority: CN
Inventors: 王丛华
Original assignee: Konka Group Co Ltd
Current assignee: Konka Group Co Ltd
Priority date: 2010-12-02
Filing date: 2010-12-02
Publication date: 2012-06-06

Abstract

The invention provides a method and a device for reading webpages aloud on a web television. The method comprises: allocating a memory buffer and cyclically reading the received webpage text data; analyzing the sentence structure of the webpage text data according to its punctuation; selecting, from a preset speech database, the speech primitive corresponding to each character in the webpage text data; performing waveform concatenation to form a voice stream; performing audio decoding on the voice stream; and finally outputting sound. The webpage browsed on the web television is thus read aloud, which makes browsing more comfortable for the user.

Description

Method and device for reading webpages aloud on a web television

Technical field

The present invention relates to the field of electronic technology, and in particular to a method and a device for reading webpages aloud on a web television.

Background technology

With the development of internet television technology, people can use a television to go online and browse webpages on the television screen.

However, viewers typically sit on a sofa 2 to 3 meters from the television screen while going online. Because the text on a webpage is very small at that distance, browsing webpages on a television is uncomfortable.

Summary of the invention

The technical problem to be solved by the present invention is therefore to provide a method and a device for reading webpages aloud on a web television, so that the webpage being browsed can be read aloud while the user is online on the television, which makes browsing comfortable.

To this end, the invention provides a method for reading webpages aloud on a web television, the method comprising:

allocating a memory buffer and cyclically reading the received webpage text data;

analyzing the sentence structure of the webpage text data according to the punctuation marks in the webpage text data, selecting from a preset speech database the speech primitive corresponding to each character in the webpage text data, and performing waveform concatenation to form a voice stream;

performing audio decoding on the voice stream and outputting sound.
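The cyclic read from the memory buffer can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `read_text_cyclically`, the chunk size, and the GB2312 encoding are all assumptions. It shows one practical detail of reading text in fixed-size chunks, namely that a two-byte character may be split across two reads, which an incremental decoder handles.

```python
import codecs
import io


def read_text_cyclically(buffer, chunk_size=64, encoding="gb2312"):
    """Yield decoded text from a binary buffer, one chunk at a time.

    The buffer, chunk size, and encoding are illustrative assumptions.
    """
    # The incremental decoder holds back a trailing half character
    # until the next chunk supplies its second byte.
    decoder = codecs.getincrementaldecoder(encoding)()
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush the decoder; raises if the data ended in the middle of a character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


page_bytes = "你好，世界。".encode("gb2312")
# A chunk size of 3 deliberately splits two-byte characters across reads.
pieces = list(read_text_cyclically(io.BytesIO(page_bytes), chunk_size=3))
text = "".join(pieces)
```

Joining the yielded pieces reproduces the original text even though no chunk boundary lines up with a character boundary.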

Before the memory buffer is allocated, the method further comprises:

the web television sends a Hypertext Transfer Protocol (HTTP) request to a website server, and the website server encapsulates Extensible Markup Language (XML) protocol data in an HTTP packet and sends it to the web television;

the web television parses the received packet into application data to be displayed on the web television screen, the application data including the webpage text data to be read aloud.

Further, the method also comprises: the web television writes the parsed webpage text data to be read aloud into a pipe in the shared memory region used for inter-thread communication, from which it can be read.

Selecting from the preset speech database the speech primitive corresponding to each webpage character and performing waveform concatenation to form a voice stream comprises:

opening the preset speech database and reading the first character in the webpage text data;

retrieving the speech primitive of the first character from the preset speech database according to the region-position code of that character, and placing it in the first row of the speech waveform data;

retrieving the speech primitive of the second character and placing it in the second row of the speech waveform data, and so on until all of the webpage text data has been retrieved, thereby splicing together the speech waveform data;

arranging the speech waveform data according to the punctuation mark following each piece of webpage text data, to form the voice stream.
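The lookup by region-position code can be illustrated as follows, assuming that the code meant here is the GB2312 zone/position (quwei) code, in which each of the two bytes of a character equals the zone or position number plus 0xA0. The function names and the row-index layout are hypothetical; the source does not describe how the database is organized internally.

```python
def location_code(char):
    """Return the (zone, position) pair of a two-byte GB2312 character."""
    raw = char.encode("gb2312")
    if len(raw) != 2:
        raise ValueError(f"{char!r} is not a two-byte GB2312 character")
    # Each GB2312 byte is the zone or position number plus 0xA0.
    return raw[0] - 0xA0, raw[1] - 0xA0


def primitive_index(char, first_zone=16):
    """Map a character to a row index in a hypothetical speech database.

    Assumes one database row per code point, 94 positions per zone,
    starting at zone 16 where the GB2312 Chinese characters begin.
    """
    zone, position = location_code(char)
    return (zone - first_zone) * 94 + (position - 1)


first = location_code("啊")        # first Chinese character in GB2312
index_a = primitive_index("啊")
index_b = primitive_index("阿")
```

A dense index of this kind lets a primitive be found by arithmetic alone, without searching, which suits a device with limited processing capability.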

Performing audio decoding on the voice stream comprises:

writing the voice stream to a pipe in the shared memory region used for inter-thread communication;

a voice output thread reading the voice stream from the pipe and writing it to the audio decoder.
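The hand-off between the speech engine and the voice output thread can be sketched with an operating-system pipe. This is only an illustration under stated assumptions: `stream_through_pipe` and `voice_output_thread` are hypothetical names, and a `bytearray` stands in for the audio decoder's input, since the source does not name a decoder interface.

```python
import os
import threading


def voice_output_thread(read_fd, decoder_input):
    """Read the voice stream from the pipe and pass it to a decoder stub."""
    with os.fdopen(read_fd, "rb") as pipe_out:
        while True:
            block = pipe_out.read(1024)
            if not block:  # the writer closed the pipe: end of stream
                break
            decoder_input.extend(block)


def stream_through_pipe(voice_stream):
    """Write a voice stream into a pipe while another thread drains it."""
    read_fd, write_fd = os.pipe()
    decoder_input = bytearray()  # stands in for the audio decoder's input
    reader = threading.Thread(
        target=voice_output_thread, args=(read_fd, decoder_input)
    )
    reader.start()
    with os.fdopen(write_fd, "wb") as pipe_in:
        pipe_in.write(voice_stream)
    reader.join()
    return bytes(decoder_input)


pcm = b"\x01\x02" * 5000
received = stream_through_pipe(pcm)
```

Because the reader drains the pipe while the writer is still producing, the full stream never has to be held in a file.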

The present invention also provides a device for reading webpages aloud on a web television, the device comprising:

a reading unit, configured to cyclically read the received webpage text data from a memory buffer;

an analysis unit, configured to analyze the sentence structure of the webpage text data read by the reading unit according to the punctuation marks in the webpage text data;

a selection and concatenation unit, configured to select from a preset speech database the speech primitive corresponding to each character in the webpage text data, and to perform waveform concatenation to form a voice stream;

an audio decoding unit, configured to perform audio decoding on the voice stream formed by the selection and concatenation unit;

a voice output unit, configured to output as sound the audio obtained by the audio decoding unit.

The device further comprises:

a sending unit, configured to send a Hypertext Transfer Protocol (HTTP) request to a website server;

a receiving unit, configured to receive the HTTP packet, containing Extensible Markup Language (XML) protocol data, sent by the website server;

a parsing unit, configured to parse the received packet into application data to be displayed on the web television screen, the application data including the webpage text data to be read aloud.

Further, the device also comprises:

a writing unit, configured to write the webpage text data to be read aloud obtained by the parsing unit into a pipe in the shared memory region used for inter-thread communication, from which it can be read.

The selection and concatenation unit comprises:

a retrieval unit, configured to open the preset speech database, read the characters in the webpage text data, and retrieve the speech primitive of each character in turn from the preset speech database according to the region-position code of that character;

a concatenation unit, configured to splice the retrieved speech primitives in order into speech waveform data.

The audio decoding unit comprises:

a voice writing unit, configured to write the voice stream to a pipe in the shared memory region used for inter-thread communication, from which it can be read;

a voice reading unit, configured to read the voice stream from the pipe and write it to the audio decoder.

The method and device for reading webpages aloud on a web television according to the invention allocate a memory buffer and cyclically read the received webpage text data; analyze the sentence structure of the webpage text data according to the punctuation marks in the webpage text data; select from a preset speech database the speech primitive corresponding to each character in the webpage text data and perform waveform concatenation to form a voice stream; perform audio decoding on the voice stream; and finally output sound. The webpage browsed on the web television can thus be read aloud, which makes browsing comfortable.

Description of drawings

Fig. 1 is a flowchart of the method for reading webpages aloud on a web television according to an embodiment of the invention;

Fig. 2 is a schematic structural diagram of the device for reading webpages aloud on a web television according to an embodiment of the invention.

Embodiment

The present invention is described below with reference to the accompanying drawings.

As shown in Fig. 1, this embodiment provides a method for reading webpages aloud on a web television. To implement the method, the web television first obtains the webpage data and displays it on the television screen.

Because of the limited processing capability of the web television's CPU, only a specific set of websites can be visited at present; these websites communicate with the web television client using the XML (Extensible Markup Language) protocol.

The web television sends an HTTP (HyperText Transfer Protocol) request to the website server. The website server encapsulates the XML protocol data in an HTTP packet and delivers it to the web television client. The client parses the received XML packet into application data to be displayed on the web television screen; this application data includes the webpage text data to be read aloud. The application data in the XML is displayed on the web television screen, the webpage text data to be read aloud is written to a pipe in the shared memory region used for inter-thread communication, from which it can be read, and other data such as pictures is filtered out.
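The parsing and filtering step can be sketched as below. The XML layout is entirely hypothetical: the source does not specify the schema the websites use, so the `page`, `title`, `text`, and `image` element names, like the function name `extract_readable_text`, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload; the real schema is not given in the source.
SAMPLE_XML = """<page>
  <title>新闻</title>
  <image src="logo.png"/>
  <text>今天天气很好。</text>
  <text>欢迎收看。</text>
</page>"""


def extract_readable_text(xml_payload, text_tags=("title", "text")):
    """Keep only the text of elements meant to be read aloud.

    Elements with other tags, such as pictures, are filtered out.
    """
    root = ET.fromstring(xml_payload)
    parts = []
    for element in root.iter():  # document order, root included
        if element.tag in text_tags and element.text:
            content = element.text.strip()
            if content:
                parts.append(content)
    return "".join(parts)


readable = extract_readable_text(SAMPLE_XML)
```

The resulting string is what would be written to the pipe for the speech engine, while the full application data still goes to the display.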

Second, the conversion from text data to speech data is implemented, that is, a speech engine is built. It is implemented as follows:

A speech database is built. Taking Chinese characters as an example, a pronunciation segment is recorded for each Chinese character and saved as a WAV multimedia file, giving a Chinese character speech library of about 6,000 characters. The speech engine is an independent process and is the core component of the invention. It allocates a buffer in memory, cyclically reads the received webpage text data, analyzes its sentence structure according to the punctuation marks in the webpage text data, selects the speech primitive corresponding to each character from the speech database, and performs waveform concatenation.
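Analyzing sentence structure by punctuation can be approximated by splitting the text at sentence-ending marks. This is a minimal sketch: the set of marks in `SENTENCE_END` and the function name `split_sentences` are assumptions, and the source does not say which punctuation marks the engine distinguishes.

```python
import re

# Assumed set of sentence-ending marks (full-width Chinese punctuation).
SENTENCE_END = "。！？；"
_SENTENCE = re.compile(f"[^{SENTENCE_END}]+[{SENTENCE_END}]?")


def split_sentences(text):
    """Split text into sentences, keeping each closing punctuation mark."""
    sentences = []
    for match in _SENTENCE.finditer(text):
        sentence = match.group().strip()
        if sentence:
            sentences.append(sentence)
    return sentences


sentences = split_sentences("今天天气很好。你好吗？再见")
```

Keeping the closing mark attached to each sentence preserves the information a later step needs to choose a pause or intonation for it.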

The specific algorithm is as follows:

(1) Open the speech database and read from it the 46-byte format header of the stored audio file, for example into CHAR HEAD[46];

(2) Using the region-position code of the first Chinese character in the current text data, retrieve the speech waveform data of that character from the speech database and place it in a speech data buffer of 80 entries of 3.2k each, for example in the first row of the array CHAR BUF[3200][80].

The two-dimensional array above thus holds the speech waveform data of a group of Chinese characters; the data length of each row depends on factors such as speaking rate and sound quality.

(3) Read the second Chinese character in the webpage text data and look it up in the speech database.

(4) Repeat steps (2) and (3) until the speech waveform data for the complete text data has been retrieved from the speech database.

(5) Adjust the intonation parameters of the speech data according to the punctuation mark at the end of each piece of text data, and output the result.
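Steps (2) to (5) can be sketched as a loop that fills one fixed-length row per character and joins the rows. Several things here are assumptions rather than the source's design: the database is modeled as an in-memory dictionary, the 3200-byte row length is one reading of the array dimensions above, and inserting rows of silence after punctuation is a simplification standing in for the intonation adjustment of step (5). Zero bytes are silence only for signed 16-bit PCM, which is also assumed.

```python
ROW_BYTES = 3200  # assumed per-character row length, read from BUF[3200][80]

# Hypothetical pause lengths, in rows of silence, for each punctuation mark.
PAUSE_ROWS = {"，": 1, "。": 2, "？": 2, "！": 2}


def synthesize(text, database):
    """Concatenate per-character waveforms into one PCM byte string."""
    rows = []
    for char in text:
        if char in PAUSE_ROWS:
            # Simplified stand-in for intonation handling: insert silence.
            rows.extend([bytes(ROW_BYTES)] * PAUSE_ROWS[char])
            continue
        primitive = database.get(char)
        if primitive is None:
            continue  # characters missing from the database are skipped
        # Truncate or zero-pad so every row has the same length.
        rows.append(primitive[:ROW_BYTES].ljust(ROW_BYTES, b"\x00"))
    return b"".join(rows)


# Toy database with synthetic data in place of recorded waveforms.
toy_db = {"你": b"\x10" * 3000, "好": b"\x20" * 3500}
stream = synthesize("你好。", toy_db)
```

Fixed-length rows keep the lookup and splice logic simple, at the cost of padding short primitives and truncating long ones.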

As can be seen, the key to waveform concatenation is the splicing of basic speech units: a large speech database is built in advance, and during synthesis the required speech primitives are selected and spliced to form the output voice stream.

Finally, the speech data is played:

Because the conversion from webpage text data to multimedia audio data is performed in real time as a stream, there is no fixed file size, so the speech data cannot be played back as a file. Instead, the speech data must be pushed to the audio player in real time as a stream, forming a voice stream that is played back.

Because the speech data produced from the webpage text data is in PCM (Pulse Code Modulation) format, the data volume is very large and both management and storage are costly. This embodiment therefore again uses the pipe technique: after the speech engine writes the data to the pipe, the voice output thread reads the voice stream from the pipe and writes it directly to the audio decoder for output.
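To give a sense of scale for the PCM data volume, the following short calculation derives the data rate of uncompressed PCM. The sampling parameters (8 kHz, 16-bit, mono) are assumptions chosen for illustration; the source does not state the format of its recordings.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample // 8 * channels


# Assumed format: 8 kHz, 16-bit, mono.
rate = pcm_bytes_per_second(8000, 16)
one_minute = rate * 60      # bytes needed for one minute of speech
row_seconds = 3200 / rate   # duration of a 3200-byte block at this rate
```

Under these assumptions one minute of speech takes close to a megabyte, and a 3200-byte block lasts a fifth of a second, which is why streaming through a pipe is preferred to storing the output.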

As can be seen, the method of this embodiment allocates a memory buffer and cyclically reads the received webpage text data; analyzes the sentence structure of the webpage text data according to the punctuation marks in the webpage text data; selects from the preset speech database the speech primitive corresponding to each character in the webpage text data and performs waveform concatenation to form a voice stream; performs audio decoding on the voice stream; and finally outputs sound. The webpage browsed on the web television can thus be read aloud, which makes browsing comfortable.

Based on the above method, as shown in Fig. 2, this embodiment also provides a device for reading webpages aloud on a web television, the device comprising:

a reading unit 21, configured to cyclically read the received webpage text data from a memory buffer;

an analysis unit 22, configured to analyze the sentence structure of the webpage text data read by the reading unit according to the punctuation marks in the webpage text data;

a selection and concatenation unit 23, configured to select from a preset speech database the speech primitive corresponding to each character in the webpage text data, and to perform waveform concatenation to form a voice stream;

an audio decoding unit 24, configured to perform audio decoding on the voice stream formed by the selection and concatenation unit;

a voice output unit 25, configured to output as sound the audio obtained by the audio decoding unit.

To implement the above method, the web television first obtains the webpage data and displays it on the television screen. To this end, the device may further comprise:

a sending unit 26, configured to send a Hypertext Transfer Protocol (HTTP) request to a website server;

a receiving unit 27, configured to receive the HTTP packet, containing Extensible Markup Language (XML) protocol data, sent by the website server;

a parsing unit 28, configured to parse the received packet into application data to be displayed on the web television screen, the application data including the webpage text data to be read aloud.

In addition, to mitigate the large data volume and the high management and storage overhead, the pipe technique is used, together with the necessary writing unit 29, configured to write the webpage text data to be read aloud obtained by the parsing unit into a pipe in the shared memory region used for inter-thread communication, from which it can be read.

The selection and concatenation unit 23 comprises:

a retrieval unit 231, configured to open the preset speech database, read the characters in the webpage text data, and retrieve the speech primitive of each character in turn from the preset speech database according to the region-position code of that character;

a concatenation unit 232, configured to splice the retrieved speech primitives in order into speech waveform data.

The audio decoding unit 24 comprises:

a voice writing unit 241, configured to write the voice stream to a pipe in the shared memory region used for inter-thread communication, from which it can be read;

a voice reading unit 242, configured to read the voice stream from the pipe and write it to the audio decoder.
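The way units 21 to 25 cooperate can be sketched as a chain of interchangeable callables. This is a structural illustration only: the class name `ReadAloudDevice`, its field names, and the stub behaviors in the usage example are all hypothetical, and the stubs do not perform real synthesis or decoding.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ReadAloudDevice:
    """Wires the five main units together; each unit is a plain callable."""

    read: Callable[[], Iterable[str]]               # reading unit 21
    analyze: Callable[[str], List[str]]             # analysis unit 22
    select_and_concatenate: Callable[[str], bytes]  # unit 23
    decode: Callable[[bytes], bytes]                # audio decoding unit 24
    output: Callable[[bytes], None]                 # voice output unit 25

    def run(self):
        for text in self.read():
            for sentence in self.analyze(text):
                stream = self.select_and_concatenate(sentence)
                self.output(self.decode(stream))


# Usage with stand-in stubs in place of real synthesis and decoding.
played = []
device = ReadAloudDevice(
    read=lambda: ["你好。再见。"],
    analyze=lambda text: [s + "。" for s in text.split("。") if s],
    select_and_concatenate=lambda sentence: sentence.encode("gb2312"),
    decode=lambda stream: stream,
    output=played.append,
)
device.run()
```

Treating each unit as a replaceable callable mirrors the unit-by-unit description above and allows any one unit to be tested in isolation.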

As can be seen, the above device allocates a memory buffer; the reading unit 21 cyclically reads the received webpage text data; the analysis unit 22 analyzes the sentence structure of the webpage text data according to the punctuation marks in the webpage text data; the selection and concatenation unit 23 selects from the preset speech database the speech primitive corresponding to each character in the webpage text data and performs waveform concatenation to form a voice stream; the audio decoding unit 24 performs audio decoding on the voice stream; and finally the voice output unit 25 outputs sound. The webpage browsed on the web television can thus be read aloud.

In summary, the method and device of this embodiment allocate a memory buffer and cyclically read the received webpage text data; analyze the sentence structure of the webpage text data according to the punctuation marks in the webpage text data; select from the preset speech database the speech primitive corresponding to each character in the webpage text data and perform waveform concatenation to form a voice stream; perform audio decoding on the voice stream; and finally output sound. The webpage browsed on the web television can thus be read aloud, which makes browsing comfortable.

The above is merely a preferred embodiment of the present invention and is not intended to limit the invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for reading webpages aloud on a web television, characterized by comprising:

performing audio decoding on the voice stream and outputting sound.

2. The method according to claim 1, characterized in that, before the memory buffer is allocated, the method further comprises:

3. The method according to claim 2, characterized by further comprising: the web television writes the parsed webpage text data to be read aloud into a pipe in the shared memory region used for inter-thread communication, from which it can be read.

4. The method according to claim 1, characterized in that selecting from the preset speech database the speech primitive corresponding to each webpage character and performing waveform concatenation to form a voice stream comprises:

5. The method according to claim 1, characterized in that performing audio decoding on the voice stream comprises:

6. A device for reading webpages aloud on a web television, characterized by comprising:

7. The device according to claim 6, characterized by further comprising:

8. The device according to claim 7, characterized by further comprising:

9. The device according to claim 6, characterized in that the selection and concatenation unit comprises:

10. The device according to claim 6, characterized in that the audio decoding unit comprises: