
CN102487461A - Method for reading aloud webpage on web television and device thereof - Google Patents

Method for reading aloud webpage on web television and device thereof

Info

Publication number
CN102487461A
Authority
CN
China
Prior art keywords
text data
web page
page text
web
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2010105699137A
Other languages
Chinese (zh)
Inventor
王丛华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Konka Group Co Ltd
Original Assignee
Konka Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Konka Group Co Ltd
Priority to CN2010105699137A
Publication of CN102487461A
Legal status: Pending

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention provides a method for reading a webpage aloud on a web television, and a device thereof. The method comprises the following steps: allocating a memory buffer; cyclically reading the received webpage text data; analyzing the sentence structure of the webpage text data according to the punctuation marks in the webpage text data; selecting, from a preset speech database, the speech primitive corresponding to each character in the webpage text data; performing waveform concatenation to form a voice stream; performing audio decoding on the voice stream; and finally outputting sound, so that the webpage browsed on the web television is read aloud and the user can follow it comfortably.

Description

Method for reading a webpage aloud on a web TV and device thereof
Technical field
The present invention relates to the field of electronic technology, and in particular to a method for reading a webpage aloud on a web TV and a device thereof.
Background
With the development of Internet television technology, people can use a television to go online and browse webpages on the television screen.
However, people typically operate the television from a sofa 2 to 3 meters away from the screen. Because the text on a webpage is very small, browsing webpages on a television is uncomfortable.
Summary of the invention
The technical problem to be solved by the present invention is therefore to provide a method for reading a webpage aloud on a web TV and a device thereof, so that when people go online with a television, the webpage being browsed can be read aloud, which makes browsing comfortable.
To this end, the present invention provides a method for reading a webpage aloud on a web TV, the method comprising:
allocating a memory buffer and cyclically reading the received webpage text data;
analyzing the sentence structure of the webpage text data according to the punctuation marks in the webpage text data, selecting from a preset speech database the speech primitive corresponding to each character in the webpage text data, and performing waveform concatenation to form a voice stream;
performing audio decoding on the voice stream and outputting sound.
Before the memory buffer is allocated, the method further comprises:
the web TV sends a Hypertext Transfer Protocol (HTTP) request to a website server, and the website server encapsulates Extensible Markup Language (XML) protocol data in an HTTP data packet and sends it to the web TV;
the web TV parses the received data packet into application data to be displayed on the web TV screen, the application data including webpage text data available for reading aloud.
Further, the method also comprises: the web TV writes the parsed webpage text data available for reading aloud into the pipe in the shared memory region used for inter-thread communication, ready to be read.
Selecting from the preset speech database the speech primitive corresponding to each webpage character, and performing waveform concatenation to form a voice stream, comprises:
opening the preset speech database and reading the first character in the webpage text data;
retrieving the speech primitive of the first character from the preset speech database according to the region-position code of the first character, and placing it in the first row of the speech waveform data;
retrieving the speech primitive of the second character and placing it in the second row of the speech waveform data, and continuing in the same way until all of the webpage text data has been retrieved and spliced into speech waveform data;
arranging the speech waveform data according to the punctuation mark that follows each piece of webpage text data, to form the voice stream.
Performing audio decoding on the voice stream comprises:
writing the voice stream into the pipe in the shared memory region used for inter-thread communication;
the voice output thread reads the voice stream from the pipe and writes it to the audio decoder.
The present invention also provides a device for reading a webpage aloud on a web TV, the device comprising:
a reading unit, used to cyclically read the received webpage text data from the memory buffer;
an analysis unit, used to analyze the sentence structure of the webpage text data read by the reading unit according to the punctuation marks in the webpage text data;
a selection and concatenation unit, used to select from the preset speech database the speech primitive corresponding to each character in the webpage text data, and to perform waveform concatenation to form a voice stream;
an audio decoding unit, used to perform audio decoding on the voice stream formed by the selection and concatenation unit;
a voice output unit, used to output as sound the audio obtained by the audio decoding unit.
The device further comprises:
a sending unit, used to send a Hypertext Transfer Protocol request to the website server;
a receiving unit, used to receive the HTTP data packet, containing Extensible Markup Language protocol data, sent by the website server;
a parsing unit, used to parse the received data packet into application data to be displayed on the web TV screen, the application data including webpage text data available for reading aloud.
Further, the device also comprises:
a writing unit, used to write the webpage text data available for reading aloud, obtained by the parsing unit, into the pipe in the shared memory region used for inter-thread communication, ready to be read.
The selection and concatenation unit comprises:
a retrieval unit, used to open the preset speech database, read the characters in the webpage text data, and retrieve the speech primitive of each character in turn from the preset speech database according to the region-position code of the character;
a concatenation unit, used to splice the retrieved speech primitives in order into speech waveform data.
The audio decoding unit comprises:
a voice writing unit, used to write the voice stream into the pipe in the shared memory region used for inter-thread communication, ready to be read;
a voice reading unit, used to read the voice stream from the pipe and write it to the audio decoder.
The method and device of the present invention for reading a webpage aloud on a web TV allocate a memory buffer, cyclically read the received webpage text data, analyze the sentence structure of the webpage text data according to the punctuation marks in it, select from a preset speech database the speech primitive corresponding to each character in the webpage text data, perform waveform concatenation to form a voice stream, perform audio decoding on the voice stream, and finally output sound. The webpage browsed on the web TV can thus be read aloud, which makes browsing comfortable.
Description of drawings
Fig. 1 is a flow chart of the method for reading a webpage aloud on a web TV according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the device for reading a webpage aloud on a web TV according to an embodiment of the present invention.
Embodiment
The present invention is described below with reference to the accompanying drawings.
As shown in Fig. 1, this embodiment provides a method for reading a webpage aloud on a web TV. To implement the method, the web TV first obtains webpage data and displays it on the television display screen.
Because of the limited processing capability of the web TV's CPU, only a specific set of websites can be accessed at present; these websites communicate with the web TV client using the XML (Extensible Markup Language) protocol.
The web TV sends an HTTP (HyperText Transfer Protocol) request to the website server, and the website server encapsulates the XML protocol data in an HTTP data packet and delivers it to the web TV client. The web TV client parses the received XML data packet into application data to be displayed on the web TV screen; this application data includes the webpage text data available for reading aloud. The application data in the XML is displayed on the web TV screen, and the webpage text data available for reading aloud is written into the pipe in the shared memory region used for inter-thread communication, ready to be read, while other data, such as pictures, is filtered out.
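The parsing and filtering step can be illustrated with a minimal Python sketch. The patent does not specify the XML schema, so the element names (`page`, `title`, `text`, `image`), the sample payload, and the function name `extract_readable_text` are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: the element names are illustrative, not from the patent.
SAMPLE_XML = """<page>
  <title>Evening News</title>
  <image src="logo.png"/>
  <text>Today is sunny.</text>
  <text>Tomorrow it will rain!</text>
</page>"""

# Assumed set of tags that carry text available for reading aloud.
READABLE_TAGS = {"title", "text"}


def extract_readable_text(xml_payload):
    """Return the text of readable elements, skipping pictures and other data."""
    root = ET.fromstring(xml_payload)
    readable = []
    for element in root.iter():
        # Keep only elements whose tag is readable and whose text is non-empty.
        if element.tag in READABLE_TAGS and element.text and element.text.strip():
            readable.append(element.text.strip())
    return readable


if __name__ == "__main__":
    for line in extract_readable_text(SAMPLE_XML):
        print(line)
```

In a full system, each extracted string would then be written to the pipe for the speech engine rather than printed.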
Secondly, the conversion from text data to speech data is implemented, that is, a speech engine is built. It is implemented as follows:
Set up the speech database. Taking Chinese characters as an example, a pronunciation clip is recorded for each Chinese character and saved as a WAV multimedia file, building a Chinese character speech library of about 6,000 characters. The speech engine is an independent process and is the core component of the invention. It allocates a buffer in memory, cyclically reads the received webpage text data, analyzes its sentence structure according to the punctuation marks in the webpage text data, picks out the speech primitive corresponding to each character from the speech database, and performs waveform concatenation.
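The cyclic read and the punctuation-based sentence analysis might look like the following Python sketch. It is an illustration under stated assumptions, not the patent's implementation: the set of sentence-ending marks in `SENTENCE_END`, the chunk size, and the function name `read_sentences` are all hypothetical, and an in-memory text stream stands in for the buffer.

```python
import io

# Assumed set of sentence-ending punctuation (full-width and ASCII forms).
SENTENCE_END = "。！？；.!?;"


def read_sentences(stream, chunk_size=16):
    """Yield (sentence, end_mark) pairs from a text stream read in chunks."""
    pending = ""
    while True:
        chunk = stream.read(chunk_size)  # cyclic read into the buffer
        if not chunk:
            break
        pending += chunk
        start = 0
        for i, ch in enumerate(pending):
            if ch in SENTENCE_END:
                sentence = pending[start:i].strip()
                if sentence:
                    yield sentence, ch
                start = i + 1
        # Keep the unfinished tail until more text arrives.
        pending = pending[start:]
    if pending.strip():
        # Text that ends without punctuation is emitted with an empty mark.
        yield pending.strip(), ""


if __name__ == "__main__":
    text = io.StringIO("今天天气很好。明天会下雨吗？谢谢")
    for sentence, mark in read_sentences(text, chunk_size=4):
        print(sentence, repr(mark))
```

Returning the end mark alongside each sentence keeps the information that a later stage would need to adjust intonation.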
The specific algorithm is as follows:
(1) Open the speech database and read from it the 46-byte format header of the stored audio file, for example CHAR HEAD [46];
(2) Retrieve the speech waveform data of the first Chinese character from the speech database according to the region-position code of the first Chinese character in the current text data, and place it in the first row of a speech data array holding 80 entries of 3.2k bytes each, for example CHAR BUF [3200] [80].
This two-dimensional array thus stores the speech waveform data of one group of Chinese characters; the data length of each row depends on factors such as the speed and sound quality of the speech.
(3) Read the second Chinese character in the webpage text data and look it up in the speech database.
(4) Repeat steps (2) and (3) until the speech waveform data of the complete text data has been retrieved from the speech database.
(5) Adjust the intonation parameters of the speech data according to the punctuation mark at the end of each piece of text data, and output the result.
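Steps (1) to (4) can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation. It assumes the region-position code is the GB2312 one, obtained by subtracting 0xA0 from each of the two bytes of the character's GB2312 encoding. The names `region_position_code`, `build_toy_database`, `synthesize`, and `TOY_DATABASE` are hypothetical, and short byte strings stand in for recorded waveforms.

```python
def region_position_code(char):
    """Return the (region, position) pair of one GB2312 character."""
    encoded = char.encode("gb2312")
    if len(encoded) != 2:
        raise ValueError(f"{char!r} is not a two-byte GB2312 character")
    # Each GB2312 byte is the region or position number plus 0xA0.
    return encoded[0] - 0xA0, encoded[1] - 0xA0


def build_toy_database(chars):
    """Map each code to fake waveform bytes (stand-ins for recorded clips)."""
    database = {}
    for index, char in enumerate(chars, start=1):
        database[region_position_code(char)] = bytes([index]) * 4
    return database


def synthesize(text, database):
    """Look up every character and splice the waveforms in text order."""
    # One row per character, mirroring the rows of the two-dimensional array.
    rows = [database[region_position_code(char)] for char in text]
    return rows, b"".join(rows)


TOY_DATABASE = build_toy_database("啊中国")

if __name__ == "__main__":
    print(region_position_code("啊"))
    rows, stream = synthesize("中国", TOY_DATABASE)
    print(len(rows), stream.hex())
```

A dictionary keyed by the code pair is used here only for brevity; a real database of roughly 6,000 clips would more likely index into a file on storage.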
As can be seen, the key to waveform concatenation is the splicing of basic speech units: a large speech database is built in advance, and at synthesis time the required speech primitives are selected and concatenated to form the output voice stream.
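Because the speech primitives are stored as WAV files, splicing them means reading past each file's header and joining only the sample data. The Python sketch below shows one way to do this with the standard `wave` module, which parses the header itself so no header length has to be hard-coded. The helper names `make_clip` and `concatenate_clips`, the 8000 Hz mono 16-bit format, and the synthetic clips are assumptions for illustration only.

```python
import io
import wave


def make_clip(sample_value, n_frames, rate=8000):
    """Build an in-memory mono 16-bit WAV clip filled with one sample value."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(sample_value.to_bytes(2, "little", signed=True) * n_frames)
    return buf.getvalue()


def concatenate_clips(clips):
    """Join the sample data of several WAV clips into one WAV byte string."""
    if not clips:
        raise ValueError("at least one clip is required")
    params = None
    frames = []
    for clip in clips:
        with wave.open(io.BytesIO(clip), "rb") as w:
            current = (w.getnchannels(), w.getsampwidth(), w.getframerate())
            if params is None:
                params = current
            elif current != params:
                raise ValueError("clips must share one audio format")
            # readframes returns only the samples, not the header.
            frames.append(w.readframes(w.getnframes()))
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(params[0])
        w.setsampwidth(params[1])
        w.setframerate(params[2])
        w.writeframes(b"".join(frames))
    return out.getvalue()


if __name__ == "__main__":
    joined = concatenate_clips([make_clip(100, 10), make_clip(-100, 20)])
    with wave.open(io.BytesIO(joined), "rb") as w:
        print(w.getnframes(), w.getframerate())
```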
Finally, the speech data is played:
Because the conversion from webpage text data to multimedia audio data is performed in real time as a stream, there is no fixed file size, so the speech data cannot be played back as a file. Instead, the speech data must also be pushed into the audio player in real time, as a stream, to form a voice stream for playback.
Because the webpage text data, once converted into speech data, takes the form of PCM (Pulse Code Modulation) recording data, the data volume is very large, and managing and storing it is costly. The present embodiment therefore again uses a pipe: after the speech engine writes the data into the pipe, the voice output thread reads the voice stream from the pipe and writes it directly to the audio decoder for output.
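The pipe between the speech engine and the voice output thread can be sketched in Python as below. This is an illustrative sketch under assumptions: both sides run as threads in one process, `os.pipe` stands in for the pipe described above, a plain list stands in for the audio decoder, and the function names are hypothetical.

```python
import os
import threading


def speech_engine(write_fd, chunks):
    """Producer: write PCM chunks into the pipe, then close the write end."""
    with os.fdopen(write_fd, "wb") as pipe_in:
        for chunk in chunks:
            pipe_in.write(chunk)


def voice_output(read_fd, sink, block_size=4096):
    """Consumer: read the stream block by block until the pipe is closed."""
    with os.fdopen(read_fd, "rb") as pipe_out:
        while True:
            block = pipe_out.read(block_size)
            if not block:
                break
            # A real system would write the block to the audio decoder here.
            sink.append(block)


def stream_through_pipe(chunks):
    """Run producer and consumer concurrently and return what was received."""
    read_fd, write_fd = os.pipe()
    received = []
    consumer = threading.Thread(target=voice_output, args=(read_fd, received))
    producer = threading.Thread(target=speech_engine, args=(write_fd, chunks))
    consumer.start()
    producer.start()
    producer.join()
    consumer.join()
    return b"".join(received)


if __name__ == "__main__":
    pcm = [b"\x01\x02" * 1000, b"\x03\x04" * 1000]
    print(len(stream_through_pipe(pcm)))
```

Because the consumer drains the pipe while the producer is still writing, the whole PCM stream never has to be held in a file, which matches the motivation given above.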
As can be seen, the method of this embodiment for reading a webpage aloud on a web TV allocates a memory buffer, cyclically reads the received webpage text data, analyzes the sentence structure of the webpage text data according to the punctuation marks in it, selects from a preset speech database the speech primitive corresponding to each character in the webpage text data, performs waveform concatenation to form a voice stream, performs audio decoding on the voice stream, and finally outputs sound. The webpage browsed on the web TV can thus be read aloud, which makes browsing comfortable.
Corresponding to the above method, as shown in Fig. 2, this embodiment also provides a device for reading a webpage aloud on a web TV, the device comprising:
a reading unit 21, used to cyclically read the received webpage text data from the memory buffer;
an analysis unit 22, used to analyze the sentence structure of the webpage text data read by the reading unit according to the punctuation marks in the webpage text data;
a selection and concatenation unit 23, used to select from the preset speech database the speech primitive corresponding to each character in the webpage text data, and to perform waveform concatenation to form a voice stream;
an audio decoding unit 24, used to perform audio decoding on the voice stream formed by the selection and concatenation unit;
a voice output unit 25, used to output as sound the audio obtained by the audio decoding unit.
To implement the above method, the web TV first obtains webpage data and displays it on the television display screen. For this purpose, the device may further comprise:
a sending unit 26, used to send a Hypertext Transfer Protocol request to the website server;
a receiving unit 27, used to receive the HTTP data packet, containing Extensible Markup Language protocol data, sent by the website server;
a parsing unit 28, used to parse the received data packet into application data to be displayed on the web TV screen, the application data including webpage text data available for reading aloud.
In addition, to reduce the drawbacks of large data volume and high management and storage overhead, a pipe is used together with the necessary writing unit 29, which is used to write the webpage text data available for reading aloud, obtained by the parsing unit, into the pipe in the shared memory region used for inter-thread communication, ready to be read.
The selection and concatenation unit 23 comprises:
a retrieval unit 231, used to open the preset speech database, read the characters in the webpage text data, and retrieve the speech primitive of each character in turn from the preset speech database according to the region-position code of the character;
a concatenation unit 232, used to splice the retrieved speech primitives in order into speech waveform data.
The audio decoding unit 24 comprises:
a voice writing unit 241, used to write the voice stream into the pipe in the shared memory region used for inter-thread communication, ready to be read;
a voice reading unit 242, used to read the voice stream from the pipe and write it to the audio decoder.
As can be seen, the above device allocates a memory buffer; the reading unit 21 cyclically reads the received webpage text data; the analysis unit 22 analyzes the sentence structure of the webpage text data according to the punctuation marks in it; the selection and concatenation unit 23 selects from the preset speech database the speech primitive corresponding to each character in the webpage text data and performs waveform concatenation to form a voice stream; the audio decoding unit 24 performs audio decoding on the voice stream; and finally the voice output unit 25 outputs sound, so that the webpage browsed on the web TV can be read aloud.
In summary, the method and device of this embodiment for reading a webpage aloud on a web TV allocate a memory buffer, cyclically read the received webpage text data, analyze the sentence structure of the webpage text data according to the punctuation marks in it, select from a preset speech database the speech primitive corresponding to each character in the webpage text data, perform waveform concatenation to form a voice stream, perform audio decoding on the voice stream, and finally output sound. The webpage browsed on the web TV can thus be read aloud, which makes browsing comfortable.
The above is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method for reading a webpage aloud on a web TV, characterized by comprising:
allocating a memory buffer and cyclically reading the received webpage text data;
analyzing the sentence structure of the webpage text data according to the punctuation marks in the webpage text data, selecting from a preset speech database the speech primitive corresponding to each character in the webpage text data, and performing waveform concatenation to form a voice stream;
performing audio decoding on the voice stream and outputting sound.
2. The method according to claim 1, characterized in that, before the memory buffer is allocated, the method further comprises:
the web TV sends a Hypertext Transfer Protocol request to a website server, and the website server encapsulates Extensible Markup Language protocol data in an HTTP data packet and sends it to the web TV;
the web TV parses the received data packet into application data to be displayed on the web TV screen, the application data including webpage text data available for reading aloud.
3. The method according to claim 2, characterized by further comprising: the web TV writes the parsed webpage text data available for reading aloud into the pipe in the shared memory region used for inter-thread communication, ready to be read.
4. The method according to claim 1, characterized in that selecting from the preset speech database the speech primitive corresponding to each webpage character, and performing waveform concatenation to form a voice stream, comprises:
opening the preset speech database and reading the first character in the webpage text data;
retrieving the speech primitive of the first character from the preset speech database according to the region-position code of the first character, and placing it in the first row of the speech waveform data;
retrieving the speech primitive of the second character and placing it in the second row of the speech waveform data, and continuing in the same way until all of the webpage text data has been retrieved and spliced into speech waveform data;
arranging the speech waveform data according to the punctuation mark that follows each piece of webpage text data, to form the voice stream.
5. The method according to claim 1, characterized in that performing audio decoding on the voice stream comprises:
writing the voice stream into the pipe in the shared memory region used for inter-thread communication;
the voice output thread reads the voice stream from the pipe and writes it to the audio decoder.
6. A device for reading a webpage aloud on a web TV, characterized by comprising:
a reading unit, used to cyclically read the received webpage text data from the memory buffer;
an analysis unit, used to analyze the sentence structure of the webpage text data read by the reading unit according to the punctuation marks in the webpage text data;
a selection and concatenation unit, used to select from the preset speech database the speech primitive corresponding to each character in the webpage text data, and to perform waveform concatenation to form a voice stream;
an audio decoding unit, used to perform audio decoding on the voice stream formed by the selection and concatenation unit;
a voice output unit, used to output as sound the audio obtained by the audio decoding unit.
7. The device according to claim 6, characterized by further comprising:
a sending unit, used to send a Hypertext Transfer Protocol request to the website server;
a receiving unit, used to receive the HTTP data packet, containing Extensible Markup Language protocol data, sent by the website server;
a parsing unit, used to parse the received data packet into application data to be displayed on the web TV screen, the application data including webpage text data available for reading aloud.
8. The device according to claim 7, characterized by further comprising:
a writing unit, used to write the webpage text data available for reading aloud, obtained by the parsing unit, into the pipe in the shared memory region used for inter-thread communication, ready to be read.
9. The device according to claim 6, characterized in that the selection and concatenation unit comprises:
a retrieval unit, used to open the preset speech database, read the characters in the webpage text data, and retrieve the speech primitive of each character in turn from the preset speech database according to the region-position code of the character;
a concatenation unit, used to splice the retrieved speech primitives in order into speech waveform data.
10. The device according to claim 6, characterized in that the audio decoding unit comprises:
a voice writing unit, used to write the voice stream into the pipe in the shared memory region used for inter-thread communication, ready to be read;
a voice reading unit, used to read the voice stream from the pipe and write it to the audio decoder.
CN2010105699137A 2010-12-02 2010-12-02 Method for reading aloud webpage on web television and device thereof Pending CN102487461A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105699137A CN102487461A (en) 2010-12-02 2010-12-02 Method for reading aloud webpage on web television and device thereof

Publications (1)

Publication Number Publication Date
CN102487461A true CN102487461A (en) 2012-06-06

Family

ID=46152909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105699137A Pending CN102487461A (en) 2010-12-02 2010-12-02 Method for reading aloud webpage on web television and device thereof

Country Status (1)

Country Link
CN (1) CN102487461A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004171111A (en) * 2002-11-18 2004-06-17 Oki Electric Ind Co Ltd Web browser control method and device
CN101156196A (en) * 2005-03-28 2008-04-02 莱塞克技术公司 Hybrid Speech Synthesizers, Methods, and Usage
US20100145686A1 (en) * 2008-12-04 2010-06-10 Sony Computer Entertainment Inc. Information processing apparatus converting visually-generated information into aural information, and information processing method thereof
US7818164B2 (en) * 2006-08-21 2010-10-19 K12 Inc. Method and system for teaching a foreign language

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017092312A1 (en) * 2015-12-01 2017-06-08 乐视控股(北京)有限公司 Method of browsing webpage on browser and device
CN111459445A (en) * 2020-02-28 2020-07-28 问问智能信息科技有限公司 Webpage end audio generation method and device and storage medium
CN113077550A (en) * 2021-04-29 2021-07-06 广东工业大学 Virtual idol system based on network teaching
CN113077550B (en) * 2021-04-29 2024-07-12 广东工业大学 Virtual idol system based on network teaching

Similar Documents

Publication Publication Date Title
JP5030617B2 (en) Method, system, and program for RSS content management for rendering RSS content on a digital audio player (RSS content management for rendering RSS content on a digital audio player)
CN101042752B (en) Method and sytem used for email administration
US9372926B2 (en) Intelligent video summaries in information access
US9318100B2 (en) Supplementing audio recorded in a media file
KR100571347B1 (en) Multimedia Contents Service System and Method Based on User Preferences and Its Recording Media
KR101462688B1 (en) Information processing apparatus, and method and system for searching for reputaion of content
JP2007242013A (en) Method, system, and program for calling content management instructions (calling content management instructions)
US20110153330A1 (en) System and method for rendering text synchronized audio
US20090043739A1 (en) Method of displaying customized data and browser agent
WO2018120821A1 (en) Method and device for producing presentation
CN103956167A (en) Visual sign language interpretation method and device based on Web
JPH11249867A (en) Voice browser system
WO2018120819A1 (en) Method and device for producing presentation
CN102880599A (en) Sentence exploring method for analyzing sentences and supporting learning of analysis
KR100859270B1 (en) Providing method and system with web contents using web page division based on mobile internet
CN102487461A (en) Method for reading aloud webpage on web television and device thereof
CN100362858C (en) Method for realizing expansion EPG function by browser technology
WO2018120820A1 (en) Presentation production method and apparatus
CN1155199C (en) Sequential chaining thermal target and continuous flow video browsing device in wanwei network browsing device
KR100689059B1 (en) Network synchronization system and method for converting Internet content into TTS and providing it to personal information processor
CN115640420A (en) ES-based audio information index database establishing and retrieving method, ES-based audio information index database establishing and retrieving equipment and ES-based audio information index database storing medium
EP2447940B1 (en) Method of and apparatus for providing audio data corresponding to a text
Owen et al. Computed synchronization for multimedia applications
CN109036383B (en) A lightweight real-time speech recognition and translation device combined with crawler technology
Draxler Speech databases

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20120606