CN105095309A - Webpage processing method and device - Google Patents
- Publication number
- CN105095309A (application number CN201410217134.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a webpage processing method and device. The webpage processing method comprises the following steps: accessing a target webpage to extract the form information of the target webpage; inserting an interception program into the form information, wherein the interception program is used for intercepting the identity information of the target webpage; submitting the form information into which the interception program has been inserted to the target webpage; when the form information is browsed, intercepting the identity information of the target webpage by running the interception program; and using the identity information to access the target webpage and delete the webpage content of the target webpage, wherein the webpage content corresponds to the form information. The invention solves the problem that the loss of user information leaves that information insecure, and thereby improves the security of user information.
Description
Technical field
The present invention relates to the internet field, and in particular to a webpage processing method and device.
Background technology
The internet holds a massive amount of user information. Because websites on the internet vary widely in trustworthiness, a user who visits an unsafe website can very easily lose user information; for example, a user's QQ password may be stolen by a phishing website. To deal with this, once such unsafe websites are detected, their servers are usually shut down, or the websites are blocked by security protection software, so that users do not visit them.
However, security protection software cannot judge whether a newly registered domain name belongs to an unsafe website, so an unsafe website can avoid being blocked simply by being set up under a new domain name. In addition, if the servers of these unsafe websites are located abroad, they cannot be shut down. Neither method can therefore actively deal with these unsafe websites, and neither can protect the user's information once it has been stolen, which leaves the user information insecure.
No effective solution has yet been proposed for the problem in the related art that the loss of user information leaves that information insecure.
Summary of the invention
The main purpose of the embodiments of the present invention is to provide a webpage processing method and device, so as to solve the problem in the related art that the loss of user information leaves that information insecure.
To achieve the above purpose, according to one aspect of the embodiments of the present invention, a webpage processing method is provided. The webpage processing method according to the embodiments of the present invention comprises: extracting the form information of a target webpage by accessing the target webpage; inserting an interception program into the form information, wherein the interception program is used for intercepting the identity information of the target webpage; submitting the form information into which the interception program has been inserted to the target webpage; when the form information is browsed, intercepting the identity information of the target webpage by running the interception program; and using the identity information to access the target webpage and delete the webpage content of the target webpage, wherein the webpage content is the webpage content corresponding to the form information.
To achieve the above purpose, according to another aspect of the embodiments of the present invention, a webpage processing device is provided. The webpage processing device according to the embodiments of the present invention comprises: an extraction unit, configured to extract the form information of a target webpage by accessing the target webpage; an insertion unit, configured to insert an interception program into the form information, wherein the interception program is used for intercepting the identity information of the target webpage; a submission unit, configured to submit the form information into which the interception program has been inserted to the target webpage; an interception unit, configured to intercept the identity information of the target webpage by running the interception program when the form information is browsed; and a deletion unit, configured to use the identity information to access the target webpage and delete the webpage content of the target webpage, wherein the webpage content is the submitted information submitted through the form information.
In the embodiments of the present invention, the form information of a target webpage is extracted by accessing the target webpage; an interception program is inserted into the form information, wherein the interception program is used for intercepting the identity information of the target webpage; the form information into which the interception program has been inserted is submitted to the target webpage; when the form information is browsed, the identity information of the target webpage is intercepted by running the interception program; and the identity information is used to access the target webpage and delete the webpage content of the target webpage, wherein the webpage content is the submitted information submitted through the form information. By extracting the form information, inserting the interception program into it, using the interception program to obtain the identity information of the target webpage, and then using that identity information to access the target webpage and delete its webpage content, the submitted information obtained by the target webpage is deleted. This solves the problem that the loss of user information leaves that information insecure, and thereby improves the security of user information.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the present invention. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a structural block diagram of a computer;
Fig. 2 is a flowchart of the webpage processing method according to an embodiment of the present invention;
Fig. 3 is a flowchart of the webpage processing method according to a preferred embodiment of the present invention;
Fig. 4 is a schematic diagram of the webpage processing system according to an embodiment of the present invention; and
Fig. 5 is a schematic diagram of the webpage processing device according to an embodiment of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings of the present invention are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
Introduction of technical terms:
Form: a form is mainly responsible for the data collection function in a webpage. A form has three elements, namely the form tag, the form fields and the form buttons. The form tag contains the URL of the CGI program used to process the form data and the method by which the data is submitted to the server; the form fields include text boxes, password boxes, hidden fields, multi-line text boxes, check boxes, radio buttons, drop-down selection boxes, file upload boxes and the like; the form buttons include the submit button, the reset button and general buttons, which are used to send data to a Common Gateway Interface (CGI) script on the server or to cancel the input, and form buttons can also be used to control other processing tasks for which a processing script has been defined.
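The three elements of a form described above can be illustrated with a minimal Python sketch built on the standard library's `html.parser`. The class name `FormElements`, the helper `parse_form` and the sample guestbook markup are assumptions made for illustration only; they do not appear in the patent.

```python
from html.parser import HTMLParser


class FormElements(HTMLParser):
    """Collect the three elements of a form: tag attributes, fields, buttons.

    Hypothetical helper for illustration; not part of the patent text.
    """

    BUTTON_TYPES = {"submit", "reset", "button"}
    FIELD_TAGS = {"input", "textarea", "select"}

    def __init__(self):
        super().__init__()
        self.form_attrs = {}   # attributes of the <form> tag (action, method, ...)
        self.fields = []       # (name, kind) pairs for data-entry fields
        self.buttons = []      # button types in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        input_type = (attrs.get("type") or "text").lower()
        if tag == "form":
            self.form_attrs = attrs
        elif tag == "button":
            self.buttons.append((attrs.get("type") or "submit").lower())
        elif tag == "input" and input_type in self.BUTTON_TYPES:
            self.buttons.append(input_type)
        elif tag in self.FIELD_TAGS:
            kind = input_type if tag == "input" else tag
            self.fields.append((attrs.get("name"), kind))


def parse_form(html):
    """Parse an HTML fragment and return the populated collector."""
    parser = FormElements()
    parser.feed(html)
    parser.close()
    return parser


# Sample guestbook form (invented markup, used only to exercise the parser).
SAMPLE = """
<form action="/cgi-bin/guestbook.cgi" method="post">
  <input type="text" name="nickname">
  <input type="password" name="secret">
  <textarea name="message"></textarea>
  <input type="submit" value="Leave message">
  <input type="reset" value="Clear">
</form>
"""

if __name__ == "__main__":
    form = parse_form(SAMPLE)
    print(form.form_attrs)  # URL of the processing script and submission method
    print(form.fields)      # form fields
    print(form.buttons)     # form buttons
```

Here the `action` and `method` attributes correspond to the CGI URL and submission method carried by the form tag, while the fields and buttons lists correspond to the other two elements.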
CGI script: physically a piece of program that can run on a server and provides an interface to the client HTML page.
XSS: a user inserts HTML code into a web page; when another user browses that page, the HTML code embedded in the web page is executed, so that information about the user browsing the page is obtained.
An embodiment of the present invention provides a webpage processing method, which can be performed by a computer or a similar computing device. Fig. 1 shows a structural block diagram of a computer. As shown in Fig. 1, the computer 100 comprises one or more processors 102 (only one is shown in the figure), a memory 104 and a transmission module 106. Those of ordinary skill in the art will understand that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the computer 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 can be used to store software programs and modules, such as the program instructions/modules corresponding to the webpage processing method and device in the embodiments of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, thereby implementing the above webpage processing method and device. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory can be connected to the computer 100 through a network. Examples of the above network include, but are not limited to, the internet, intranets, local area networks, mobile communication networks and combinations thereof.
The transmission module 106 is used to receive or send data via a network. Specific examples of the above network may include wired networks and wireless networks. In one example, the transmission module 106 comprises a network adapter (Network Interface Controller, NIC), which can be connected to a router and other network devices by a network cable so as to communicate with the internet. In one example, the transmission module 106 is an Ethernet module, which communicates with the internet over an Ethernet cable.
Embodiment 1
According to an embodiment of the present invention, a method embodiment is provided that can be used to implement the device embodiment of this application. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in an order different from the one given here.
According to an embodiment of the present invention, a webpage processing method is provided.
Fig. 2 is a flowchart of the webpage processing method according to an embodiment of the present invention. The webpage processing method is described below with reference to Fig. 2. As shown in the figure, the method comprises the following steps:
Step S202: extract the form information of the target webpage by accessing the target webpage.
If the target webpage provides a form-filling function, the target webpage has form information. A user can submit information to the target webpage by filling in the form, and the submitted information can be stored on the server corresponding to the target webpage; that is, the target webpage can obtain the user's information by offering the user a form to fill in.
For example: a personal homepage has a message board function. The user enters information in the message board area on the client and then presses the "leave message" button; the client sends the information entered by the user to the server, and the server stores it in a specified file.
For example: a phishing website uses a malicious webpage to collect user information, and when a user visits the malicious webpage it collects the information the user fills in through the form.
The webpages in the above examples all have a form function, so the form information of the target webpage can be extracted by accessing these target webpages. The URL of the target webpage can be collected from the reports of security protection software; for example, the URLs of unsafe webpages collected by Tencent Mobile Manager are retrieved, and the target webpage is accessed through its URL.
Step S204: insert an interception program into the form information, wherein the interception program is used for intercepting the identity information of the target webpage.
The interception program is a program for intercepting the identity information of the target webpage. The identity information of the target webpage is identity information that can be used to log in to the target webpage and operate on it, for example the identity information of an administrator. The interception program may be an XSS statement; writing the XSS statement into the form information makes the form information carry the interception program.
Step S206: submit the form information into which the interception program has been inserted to the target webpage.
Because information can be submitted to the target webpage through a form, such as the "leave message" function of the message board in the example above, the interception program carried in the form information can also be submitted to the target webpage by submitting the form information. When the interception program needs to be submitted to the target webpage, this is done by submitting the form information into which the interception program has been inserted.
Step S208: when the form information is browsed, intercept the identity information of the target webpage by running the interception program.
The interception program for intercepting the identity information of the target webpage has been inserted into the form information, so when the form information is browsed, the identity information of the target webpage can be intercepted by running the interception program. The form information is browsed through the back end of the target webpage, and not every user who opens the target webpage can browse the information submitted by other users; the form information is therefore generally browsed by the administrators of the target webpage. Running the interception program at that moment intercepts the identity information of the administrator of the target webpage; that is, the identity information of the target webpage intercepted by the interception program is the identity information of the person browsing the form information. The identity information of the target webpage is generally stored in the cookie of the target webpage, so the identity information can be obtained by having the interception program look up the cookie of the target webpage.
For example, for a phishing website, after form information carrying the interception program has been submitted to the phishing website, a staff member of the phishing website views the information submitted through the form under an administrator identity. During viewing, the interception program inserted into the form information runs automatically and obtains the identity information of the person browsing.
Step S210: use the identity information to access the target webpage and delete the webpage content of the target webpage, wherein the webpage content is the webpage content corresponding to the form information.
The webpage content of the target webpage is the webpage content corresponding to the form information, such as the usernames, passwords and other data that users have submitted to the target webpage through the form information. Once the identity information has been obtained, it is used to access the target webpage. The holder of the identity information has permission to operate on the target webpage, so the target webpage can be operated on under the identity represented by the identity information and its webpage content can be deleted.
The above embodiment can be applied in the following scenario:
A phishing webpage steals users' QQ numbers and QQ passwords by sending the phishing URL to users and tricking them into submitting their QQ numbers and QQ passwords on the phishing webpage. To prevent the lost QQ numbers and QQ passwords from being used maliciously, the phishing webpage can first be accessed through its URL to obtain its form information. The interception program is inserted into the form information, and the form information carrying the interception program is then submitted to the target webpage. After submission, the form information is browsed by the administrators of the phishing webpage; at the same time, the interception program in the form information runs and obtains the cookies of the target webpage, and thereby the identity information of the person browsing. After the identity information of the target webpage has been obtained, the target webpage is accessed again under the identity represented by that identity information, and the processing permission that the identity information holds over the target webpage is used to delete the webpage content of the target webpage, namely the stolen QQ numbers and QQ passwords, so that the QQ numbers and QQ passwords stolen by the phishing webpage cannot be used by it.
In the above embodiment, the form information in the target webpage is used to submit the interception program, the interception program is used to obtain the identity information of the target webpage, and the obtained identity information is then used to delete the webpage content of the target webpage, that is, the user information submitted through the form. Deleting the webpage content of the target webpage prevents the stolen user information from being used by the target webpage, which ensures the security of the user information.
Preferably, to avoid missing the form information of the target webpage, extracting the form information of the target webpage in this embodiment is described below with reference to Fig. 3 and comprises the following steps:
Step S302: obtain the page information of the target webpage.
Step S304: judge whether the page information contains form information.
Step S306: if the page information contains form information, extract the form information.
Step S308: if the page information does not contain form information, obtain the frame information of the target webpage and extract the form information through the frame information.
The page information of the target webpage includes the form information, the frame information of the webpage and so on. A form in a webpage usually has a form tag, which declares the form and defines the scope of the data to be collected; that is, the data contained between <form> and </form> is submitted to the server, as in <form action="url" method="get|post" enctype="mime" target="..."></form>. When extracting the form information of the target webpage, it is therefore possible first to judge whether the page information contains a form by searching for the form tag, and, if a form tag is found, to extract the form information in that webpage. Failing to find a form tag does not mean that the webpage has no form information, because other pages may be embedded in the target webpage. In that case the frame information of the target webpage needs to be obtained, and the form information is then extracted through the frame information. The specific steps for extracting form information through the frame information are as follows:
Step S410: judge, according to the frame information, whether an embedded webpage is embedded in the target webpage.
Step S412: if the target webpage has an embedded webpage, detect the form information in the embedded webpage.
Step S414: after the form information in the embedded webpage has been detected, extract the form information in the embedded webpage.
The frame information of the target webpage is iframe information. The iframe information is searched for in the page information, and the frame information found is used to judge whether an embedded webpage is embedded in the target webpage. If the target webpage has an embedded webpage, it is detected whether form information exists in the embedded webpage, and the form information in the embedded webpage is extracted. With this method, form information hidden in the target webpage can be detected and extracted from the embedded webpage, which avoids missing the form information of the target webpage, so that the method provided by the embodiments of the present invention can obtain the identity information of the target webpage and use it to access the target webpage and delete its webpage content.
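The form detection logic of steps S302 to S308 and S410 to S414 can be sketched in Python as follows. This is a minimal offline illustration, not the patent's implementation: the function names `scan` and `extract_forms` are hypothetical, and network access is replaced by an assumed `pages` dictionary that maps URLs to HTML that has already been fetched. The sketch also assumes that iframe `src` values are absolute URLs that appear as keys of that dictionary.

```python
from html.parser import HTMLParser


class _TagScanner(HTMLParser):
    """Record form actions and iframe sources found in one page."""

    def __init__(self):
        super().__init__()
        self.form_actions = []
        self.iframe_srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.form_actions.append(attrs.get("action") or "")
        elif tag == "iframe" and attrs.get("src"):
            self.iframe_srcs.append(attrs["src"])


def scan(html):
    """Parse one page and return the scanner holding what was found."""
    scanner = _TagScanner()
    scanner.feed(html)
    scanner.close()
    return scanner


def extract_forms(url, pages):
    """Return (page_url, form_action) pairs for the page at `url`.

    `pages` is a hypothetical dict of URL -> HTML standing in for real
    page retrieval, so the sketch runs without any network access.
    """
    top = scan(pages[url])                  # S302: obtain the page information
    if top.form_actions:                    # S304 / S306: form tag found
        return [(url, action) for action in top.form_actions]
    found = []
    for src in top.iframe_srcs:             # S308 / S410: look at frame information
        inner_html = pages.get(src)
        if inner_html is None:
            continue                        # embedded page not available
        inner = scan(inner_html)            # S412: detect forms in the embedded page
        found.extend((src, action) for action in inner.form_actions)  # S414
    return found


# Invented pages, used only to exercise both branches.
PAGES = {
    "http://example.test/direct": '<form action="save.cgi"><input name="a"></form>',
    "http://example.test/framed": '<iframe src="http://example.test/inner"></iframe>',
    "http://example.test/inner": '<form action="/cgi-bin/post.cgi"></form>',
    "http://example.test/empty": "<p>no form here</p>",
}

if __name__ == "__main__":
    for page in ("direct", "framed", "empty"):
        print(page, extract_forms("http://example.test/" + page, PAGES))
```

Returning the URL of the page that actually contains the form alongside the action keeps enough context to resolve a relative submit address later.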
Further, submitting the form information into which the interception program has been inserted to the target webpage comprises: obtaining the submit address of the form information; judging whether the submit address is an absolute address of the target webpage; if the submit address is not an absolute address, obtaining the absolute address by mapping the submit address; and submitting the form information into which the interception program has been inserted to the absolute address obtained by the mapping.
An absolute address on the internet is formed according to IPv4 rules, and any website can reach the target webpage corresponding to an absolute address directly through that address. A relative address is an address expressed relative to some object, regardless of whether it is local, and is a webpage address that contains only a local path. Therefore, in order to access the target webpage directly and submit the form information to it accurately, the submit address of the submitted information is obtained from the form information, and it is judged whether the submit address is an absolute address of the target webpage. If the submit address is not an absolute address, it is mapped to obtain the absolute address, and the form information carrying the interception program is submitted to the absolute address obtained by the mapping.
In addition, submit addresses can be crawled in batches when they are obtained. When crawling submit addresses, relative addresses are crawled faster than absolute addresses, and, for the convenience of website management, absolute addresses are usually not used. What is crawled from the form information is therefore often a non-absolute address, that is, a relative address, which needs to be mapped so that the form information can be submitted accurately to the target webpage.
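The mapping from a relative submit address to an absolute one can be sketched with Python's standard `urllib.parse` module. The function names `is_absolute` and `resolve_submit_address` are hypothetical, and treating "absolute" as "has both a scheme and a host" is an assumption of this sketch rather than a definition taken from the patent.

```python
from urllib.parse import urljoin, urlparse


def is_absolute(address):
    """Treat an address as absolute when it carries both a scheme and a host.

    This criterion is an assumption made for the sketch.
    """
    parts = urlparse(address)
    return bool(parts.scheme and parts.netloc)


def resolve_submit_address(page_url, submit_address):
    """Map a form's submit address to an absolute address.

    `page_url` is the URL of the page that contains the form; a relative
    `submit_address` is resolved against it, an absolute one is kept as-is.
    """
    if is_absolute(submit_address):
        return submit_address
    return urljoin(page_url, submit_address)


if __name__ == "__main__":
    base = "http://example.test/dir/page.html"
    for action in ("save.cgi", "/cgi-bin/save.cgi", "http://other.test/x.cgi"):
        print(action, "->", resolve_submit_address(base, action))
```

Resolving against the URL of the page that holds the form, rather than the site root, matters when the form sits in an embedded page in a different directory.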
Preferably, in order to recover the submitted information that users have lost, after the identity information of the target webpage has been intercepted by running the interception program, the method further comprises: using the identity information to obtain the webpage content of the target webpage; saving the webpage content; and sending a predetermined warning message according to the category of the webpage content.
After the identity information of the target webpage has been obtained, it can be used to process the webpage content of the target webpage, for example by deleting the webpage content as in the above embodiment, or by saving it. Because the webpage content of the target webpage is the submitted information, once the webpage content has been saved it can be viewed directly, that is, the submitted information obtained by the target webpage can be viewed. In addition, different warning messages can be sent according to the webpage content. For example, a QQ user whose QQ number has been stolen can be notified to change the password, and warning messages such as text messages can be sent to users whose bank card passwords or other personal information have been stolen, prompting them to take appropriate measures to keep their information safe.
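Choosing a predetermined warning message by content category can be sketched as a simple lookup. The category keys, the message wording and the function name `warning_for` below are all assumptions made for illustration; the patent only states that different warnings are sent for different kinds of content.

```python
# Hypothetical category names and message texts, invented for this sketch.
WARNINGS = {
    "qq_account": "Your QQ account may have been exposed. Please change your password.",
    "bank_card": "Your bank card details may have been exposed. Please contact your bank.",
}

DEFAULT_WARNING = (
    "Some of your personal information may have been exposed. "
    "Please review the security of your accounts."
)


def warning_for(category):
    """Return the predetermined warning for a content category.

    Unknown categories fall back to a generic personal-information warning.
    """
    return WARNINGS.get(category, DEFAULT_WARNING)


if __name__ == "__main__":
    for category in ("qq_account", "bank_card", "home_address"):
        print(category, "->", warning_for(category))
```

How the message is delivered (for example by text message, as the description mentions) is left outside the sketch.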
In the above embodiment, after the identity information of the target webpage has been obtained, it can be saved on a server. The identity information can be retrieved at predetermined intervals, the target webpage can be accessed under the identity represented by the identity information, and the webpage content of the target webpage can be deleted or saved.
An embodiment of the present invention also provides a webpage processing system that can perform the webpage processing method of the embodiments of the present invention. As shown in Fig. 4, the system comprises the files that run the above webpage processing method. As shown in the figure, auto.bat is the trigger file that starts the webpage processing system. After this file is triggered, URLs are extracted from url.txt in order to access the target webpages, and fishlast.pl, fishlast2.pl and fishlast3.pl are then run in turn to extract the form information from the target webpage. fishlast.pl is used to extract the form information of the target webpage; if no form information is extracted, fishlast2.pl is run to obtain the frame information of the target webpage; and when it is determined from the frame information that the target webpage contains an embedded webpage, fishlast3.pl is run to obtain the form information of the embedded webpage. The form information includes the form and the CGI address of the form. A form carrying the XSS statement is then submitted to the target webpage, and when the submitted content is browsed, the cookie of the target webpage is obtained in order to obtain the identity information. The scheduled task cral.pl accesses the obtained cookie at predetermined intervals, for example every five minutes, accesses the target webpage with the identity information in the cookie, and deletes or saves the webpage content. In this way the webpage content of the target webpage, that is, the user information obtained by the target webpage, can be seen.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps can be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.
Embodiment 2
According to an embodiment of the present invention, a webpage processing device for implementing the above webpage processing method is also provided. The webpage processing device is mainly used to perform the webpage processing method provided above in the embodiments of the present invention. The webpage processing device provided by the embodiments of the present invention is described in detail below:
Fig. 5 is a schematic diagram of the webpage processing device according to an embodiment of the present invention. As shown in the figure, the webpage processing device comprises an extraction unit 10, an insertion unit 20, a submission unit 30, an interception unit 40 and a deletion unit 50.
The extraction unit 10 is configured to extract the form information of the target webpage by accessing the target webpage.
If the target webpage provides a form-filling function, the target webpage has form information. A user can submit information to the target webpage by filling in the form, and the submitted information can be stored on the server corresponding to the target webpage; that is, the target webpage can obtain the user's information by offering the user a form to fill in.
For example: a personal homepage has a message board function. The user enters information in the message board area on the client and then presses the "leave message" button; the client sends the information entered by the user to the server, and the server stores it in a specified file.
For example: a phishing website uses a malicious webpage to collect user information, and when a user visits the malicious webpage it collects the information the user fills in through the form.
The webpages in the above examples all have a form function, so the form information of the target webpage can be extracted by accessing these target webpages. The URL of the target webpage can be collected from the reports of security protection software; for example, the URLs of unsafe webpages collected by Tencent Mobile Manager are retrieved, and the target webpage is accessed through its URL.
Insertion unit 20 is used to insert an interception program into the form information, where the interception program is used to intercept the identity information of the target web page.
The interception program is a program that intercepts the identity information of the target web page. The identity information of the target web page is identity information that can be used to log in to and operate the target web page, such as the identity information of an administrator. The interception program may be an XSS statement; writing the XSS statement into the form information makes the form information carry the interception program.
Submission unit 30 is used to submit the form information into which the interception program has been inserted to the target web page.
Because information can be submitted to the target web page through a form, like the message in the message board example above, the interception program carried in the form information can also be submitted to the target web page by submitting the form information. When the interception program needs to be submitted to the target web page, it is submitted by submitting the form information into which it has been inserted.
Interception unit 40 is used to intercept the identity information of the target web page by running the interception program when the form information is browsed.
The interception program for intercepting the identity information of the target web page has been inserted into the form information, so when the form information is browsed, the identity information of the target web page can be intercepted by running the interception program. The form information is browsed through the back end of the target web page, and not every user who opens the target web page can browse the information submitted by other users. The form information is therefore generally browsed by the administrators of the target web page, and running the interception program at that moment intercepts the administrators' identity information. That is, the identity information of the target web page intercepted by the interception program is the identity information of the person browsing the form information. The identity information of the target web page is generally stored in the cookie of the target web page, so it can be obtained by having the interception program look up the cookie of the target web page.
For example, in the case of a phishing website, after form information carrying the interception program is submitted to the phishing website, a staff member of the phishing website views the information submitted through the form with administrator identity. While the information is being viewed, the interception program inserted in the form information runs automatically and obtains the identity information of the person browsing it.
Deletion unit 50 is used to access the target web page with the identity information and delete the web page content of the target web page, where the web page content is the web page content corresponding to the form information.
The web page content of the target web page is the web page content corresponding to the form information, such as the user names and passwords that users have submitted to the target web page through the form. Once the identity information has been obtained, accessing the target web page with it grants the authority that this identity has over the target web page, so the target web page can be operated under the identity represented by the identity information and its web page content can be deleted.
The above embodiment can be applied in the following scenario:
A phishing web page steals users' QQ numbers and QQ passwords by sending the phishing URL to users and tricking them into submitting their QQ numbers and QQ passwords on the phishing web page. To prevent the lost QQ numbers and QQ passwords from being used maliciously, the phishing web page is first accessed through its URL and its form information is obtained. The interception program is inserted into the form information, and the form information carrying the interception program is then submitted to the target web page. The submitted form information is browsed by the administrators of the phishing web page; at that moment the interception program in the form information runs and obtains the cookies of the target web page, and thereby the identity information of the person browsing. After the identity information of the target web page has been obtained, the target web page is accessed again under the identity that the identity information represents. The processing authority that this identity has over the target web page is then used to delete the web page content of the target web page, namely the stolen QQ numbers and QQ passwords, so that the QQ numbers and QQ passwords stolen by the phishing web page can no longer be used by it.
In the above embodiment, the form information in the target web page is used to submit the interception program, the interception program is used to obtain the identity information of the target web page, and the obtained identity information is then used to delete the web page content in the target web page, that is, the user information submitted through the form. Once the web page content of the target web page has been deleted, the stolen user information can no longer be used by the target web page, which protects the security of the user information.
Preferably, to avoid missing the form information of the target web page, the extraction unit comprises: a first acquisition module for obtaining the page information of the target web page; a judgment module for judging whether the page information contains form information; an extraction module for extracting the form information when the page information contains form information; and a second acquisition module for obtaining the frame information of the target web page when the page information does not contain form information, and extracting the form information through the frame information.
The page information of the target web page includes the form information, the frame information of the web page, and so on. A form in a web page usually has a form tag. The form tag declares the form and defines the scope of the collected data, that is, the data contained between <form> and </form> is submitted to the server, as in <form action="url" method="get|post" enctype="mime" target="..."></form>. When extracting the form information of the target web page, it is therefore possible first to judge whether the page information contains a form by searching for the form tag; if the form tag is found, the form information in the web page is extracted. If no form tag is found, this does not mean that the web page has no form information, because other pages may be embedded in the target web page. In that case the frame information of the target web page needs to be obtained and the form information extracted through the frame information, where the second acquisition module extracts the form information in an embedded web page through the following submodules:
A judgment submodule for judging, according to the frame information, whether an embedded web page is embedded in the target web page; a detection submodule for detecting the form information in the embedded web page when the target web page has an embedded web page; and an extraction submodule for extracting the form information in the embedded web page after it has been detected.
The frame information of the target web page is iframe information. The iframe information is searched for in the page information, and the frame information found is used to judge whether an embedded web page is embedded in the target web page. If the target web page has an embedded web page, the embedded web page is checked for form information, and the form information in the embedded web page is extracted. In this way, form information hidden in the target web page can be detected and extracted from the embedded web page, which avoids missing the form information of the target web page, so that the method provided by the embodiments of the present invention can obtain the identity information of the target web page, access the target web page with the identity information, and delete the web page content of the target web page.
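The extraction order described above (look for a form tag first, then fall back to pages embedded through iframes) can be illustrated with a minimal sketch. It is not taken from the patent: the names `PageScanner`, `extract_forms`, the caller-supplied `fetch` function, and the `depth` limit are assumptions made for illustration, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageScanner(HTMLParser):
    """Collects the forms on a page and the pages it embeds through <iframe>."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per <form> found
        self.iframe_srcs = []  # src attribute of every <iframe> found
        self._current = None   # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag == "iframe":
            if attrs.get("src"):
                self.iframe_srcs.append(attrs["src"])
        elif tag in ("input", "textarea", "select") and self._current is not None:
            if attrs.get("name"):
                self._current["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def extract_forms(url, fetch, depth=2):
    """Return (page_url, form) pairs for a page.

    Embedded pages are inspected only when the page itself has no <form> tag.
    `fetch` is a hypothetical caller-supplied function mapping a URL to HTML
    text; `depth` (an assumption) bounds how far nested iframes are followed.
    """
    scanner = PageScanner()
    scanner.feed(fetch(url))
    if scanner.forms:
        return [(url, form) for form in scanner.forms]
    if depth == 0:
        return []
    found = []
    for src in scanner.iframe_srcs:
        # iframe src values are often relative, so resolve them against the page URL
        found.extend(extract_forms(urljoin(url, src), fetch, depth - 1))
    return found
```

Passing `fetch` in as a parameter keeps the sketch free of any network dependency; in a test it can simply be the lookup method of a dictionary that maps URLs to HTML strings.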
Further, the insertion unit comprises: an address acquisition module for obtaining the submission address of the form information; an address judgment module for judging whether the submission address is the specific address of the target web page; a mapping module for obtaining the specific address by mapping the submission address when the submission address is not the specific address; and a submission module for submitting the form information into which the interception program has been inserted to the specific address obtained by mapping.
A specific address (that is, an absolute address) is formed according to IPv4 rules on the Internet, and from any website the target web page corresponding to the specific address can be reached directly through it. A relative address is an address relative to some object, regardless of whether it is local; it is a web page address that contains only a local path. Therefore, so that the target web page can be accessed directly and the form information submitted to it accurately, the submission address of the submitted information is obtained from the form information, and it is judged whether the submission address is the specific address of the target web page. If the submission address is not the specific address, the submission address is mapped to obtain the specific address, and the form information carrying the interception program is submitted to the specific address obtained by mapping.
In addition, submission addresses may be crawled in batches. When crawling submission addresses, relative addresses are crawled faster than specific addresses, and for ease of website management specific addresses are usually not used. What is crawled from form information is therefore often a non-specific address, that is, a relative address, and the submission address then needs to be mapped so that the form information is submitted accurately to the target web page.
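The mapping from a relative submission address to an absolute one can be sketched with the standard library. The function names `is_absolute` and `resolve_submission_address` are hypothetical, and the rule that an empty address refers to the page itself is an assumption based on common browser behaviour rather than on the patent text.

```python
from urllib.parse import urljoin, urlparse


def is_absolute(address):
    """True if the address carries both a scheme and a host."""
    parts = urlparse(address)
    return bool(parts.scheme and parts.netloc)


def resolve_submission_address(page_url, action):
    """Map a form's submission address to an absolute URL."""
    if not action:
        # Assumption: an empty submission address means the page itself.
        return page_url
    if is_absolute(action):
        return action
    # Relative address: resolve it against the URL of the page holding the form.
    return urljoin(page_url, action)
```

For instance, a form found on `http://example.test/a/login.html` (a placeholder URL) with the submission address `save.php` resolves to `http://example.test/a/save.php`.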
Preferably, so that the submitted information a user has lost can be recovered for the user, the device further comprises: an acquisition unit for obtaining the web page content of the target web page with the identity information after the identity information of the target web page has been intercepted by running the interception program; a storage unit for saving the web page content; and an alarm unit for sending predetermined alarm information according to the category of the web page content.
After the identity information of the target web page has been obtained, it can be used to process the web page content of the target web page, for example to delete the web page content as in the above embodiment, or to save it. Because the web page content of the target web page is the submitted information, once the web page content has been saved it can be viewed directly, that is, the submitted information obtained by the target web page can be viewed. In addition, different alarm information can be sent according to the web page content. For example, users whose QQ numbers have been stolen can be notified to change their passwords, and users whose bank card passwords or other personal information have been stolen can be sent alarm information such as a text message, prompting them to take appropriate measures to keep their information secure.
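How alarm information might be chosen by category can be shown with a small sketch. The source does not enumerate the categories, the field names, or the message texts, so everything named here (`ALERTS`, `classify`, `build_alert`, and the field names used for matching) is a hypothetical illustration.

```python
# Hypothetical categories and alert texts; the source does not list them.
ALERTS = {
    "qq_account": "Your QQ account may have been stolen; please change your password.",
    "bank_card": "Your bank card details may have been stolen; please contact your bank.",
    "other": "Your personal information may have been exposed; please review your accounts.",
}


def classify(record):
    """Pick a category from the field names present in a saved record."""
    keys = {key.lower() for key in record}
    if keys & {"qq", "qq_number"}:
        return "qq_account"
    if keys & {"card", "card_number", "bank_card"}:
        return "bank_card"
    return "other"


def build_alert(record):
    """Return the (category, alert text) pair for one saved record."""
    category = classify(record)
    return category, ALERTS[category]
```

Keeping the category-to-message table separate from the classification rule means new categories can be added without touching the matching logic.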
The serial numbers of the above embodiments of the present invention are for description only and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in an actual implementation there may be other ways of dividing, for example multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection between units or modules may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforesaid storage medium includes any medium that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a portable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A web page processing method, characterized by comprising:
extracting form information of a target web page by accessing the target web page;
inserting an interception program into the form information, wherein the interception program is used to intercept identity information of the target web page;
submitting the form information into which the interception program has been inserted to the target web page;
when the form information is browsed, intercepting the identity information of the target web page by running the interception program; and
accessing the target web page with the identity information and deleting web page content of the target web page, wherein the web page content is the web page content corresponding to the form information.
2. The method according to claim 1, characterized in that extracting the form information of the target web page comprises:
obtaining page information of the target web page;
judging whether the page information contains the form information;
if the page information contains the form information, extracting the form information; and
if the page information does not contain the form information, obtaining frame information of the target web page and extracting the form information through the frame information.
3. The method according to claim 2, characterized in that extracting the form information through the frame information comprises:
judging, according to the frame information, whether an embedded web page is embedded in the target web page;
if the target web page has the embedded web page, detecting the form information in the embedded web page; and
after the form information in the embedded web page has been detected, extracting the form information in the embedded web page.
4. The method according to claim 1, characterized in that submitting the form information into which the interception program has been inserted to the target web page comprises:
obtaining a submission address of the form information;
judging whether the submission address is a specific address of the target web page;
if the submission address is not the specific address, obtaining the specific address by mapping the submission address; and
submitting the form information into which the interception program has been inserted to the specific address obtained by mapping.
5. The method according to claim 1, characterized in that, after the identity information of the target web page has been intercepted by running the interception program, the method further comprises:
obtaining the web page content of the target web page with the identity information;
saving the web page content; and
sending predetermined alarm information according to the category of the web page content.
6. A web page processing device, characterized by comprising:
an extraction unit for extracting form information of a target web page by accessing the target web page;
an insertion unit for inserting an interception program into the form information, wherein the interception program is used to intercept identity information of the target web page;
a submission unit for submitting the form information into which the interception program has been inserted to the target web page;
an interception unit for intercepting the identity information of the target web page by running the interception program when the form information is browsed; and
a deletion unit for accessing the target web page with the identity information and deleting web page content of the target web page, wherein the web page content is the web page content corresponding to the form information.
7. The device according to claim 6, characterized in that the extraction unit comprises:
a first acquisition module for obtaining page information of the target web page;
a judgment module for judging whether the page information contains the form information;
an extraction module for extracting the form information when the page information contains the form information; and
a second acquisition module for obtaining frame information of the target web page when the page information does not contain the form information, and extracting the form information through the frame information.
8. The device according to claim 7, characterized in that the second acquisition module comprises:
a judgment submodule for judging, according to the frame information, whether an embedded web page is embedded in the target web page;
a detection submodule for detecting the form information in the embedded web page when the target web page has the embedded web page; and
an extraction submodule for extracting the form information in the embedded web page after the form information in the embedded web page has been detected.
9. The device according to claim 6, characterized in that the insertion unit comprises:
an address acquisition module for obtaining a submission address of the form information;
an address judgment module for judging whether the submission address is a specific address of the target web page;
a mapping module for obtaining the specific address by mapping the submission address when the submission address is not the specific address; and
a submission module for submitting the form information into which the interception program has been inserted to the specific address obtained by mapping.
10. The device according to claim 6, characterized in that the device further comprises:
an acquisition unit for obtaining the web page content of the target web page with the identity information after the identity information of the target web page has been intercepted by running the interception program;
a storage unit for saving the web page content; and
an alarm unit for sending predetermined alarm information according to the category of the web page content.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410217134.9A CN105095309A (en) | 2014-05-21 | 2014-05-21 | Webpage processing method and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410217134.9A CN105095309A (en) | 2014-05-21 | 2014-05-21 | Webpage processing method and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN105095309A (en) | 2015-11-25 |
Family
ID=54575759
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410217134.9A Pending CN105095309A (en) | 2014-05-21 | 2014-05-21 | Webpage processing method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN105095309A (en) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106021561A (en) * | 2016-05-30 | 2016-10-12 | 中国南方电网有限责任公司 | Page form processing method and device |
| CN110244896A (en) * | 2019-06-24 | 2019-09-17 | 北京向上一心科技有限公司 | Screenshot method, device, controller and storage medium in webpage |
| CN110602134A (en) * | 2019-09-24 | 2019-12-20 | 杭州安恒信息技术股份有限公司 | Method, device and system for identifying illegal terminal access based on session label |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101023419A (en) * | 2004-05-14 | 2007-08-22 | 模比莱普斯有限公司 | Method of providing a web page with inserted content |
| US20110154464A1 (en) * | 2009-12-23 | 2011-06-23 | Puneet Agarwal | Systems and methods for intercepting and automatically filling in forms by the appliance for single-sign on |
| CN102387135A (en) * | 2011-09-29 | 2012-03-21 | 北京邮电大学 | User identity filtering method and firewall |
| CN103023869A (en) * | 2012-11-02 | 2013-04-03 | 北京奇虎科技有限公司 | Malicious attack prevention method and browser |
Non-Patent Citations (1)
| Title |
|---|
| XFK: "XSS获取cookie并利用", 《HTTP://WWW.FREEBUF.COM/ARTICLES/WEB/6204.HTML》 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11036855B2 (en) | Detecting frame injection through web page analysis | |
| CN108932426B (en) | Unauthorized vulnerability detection method and device | |
| US9838419B1 (en) | Detection and remediation of watering hole attacks directed against an enterprise | |
| US10904286B1 (en) | Detection of phishing attacks using similarity analysis | |
| US10601865B1 (en) | Detection of credential spearphishing attacks using email analysis | |
| CN108259514B (en) | Vulnerability detection method and device, computer equipment and storage medium | |
| WO2013044757A1 (en) | Method, device and system for detecting security of download link | |
| CN104253785B (en) | Dangerous network address recognition methods, apparatus and system | |
| CN108183900B (en) | A mining script detection method, server, system, terminal device and storage medium | |
| CN103501306B (en) | A kind of network address knows method for distinguishing, server and system | |
| CN107612924A (en) | Attacker's localization method and device based on wireless network invasion | |
| CN110035075A (en) | Detection method, device, computer equipment and the storage medium of fishing website | |
| CN103632084A (en) | Building method for malicious feature data base, malicious object detecting method and device of malicious feature data base | |
| CN111371778B (en) | Attack group identification method, device, computing equipment and medium | |
| CN103888480B (en) | Network information security authentication method and cloud device based on cloud monitoring | |
| CN108573146A (en) | A kind of malice URL detection method and device | |
| CN104580203A (en) | Website malicious program detection method and device | |
| WO2020082763A1 (en) | Decision trees-based method and apparatus for detecting phishing website, and computer device | |
| US10621337B1 (en) | Application-to-application device ID sharing | |
| CN104820667A (en) | Method, device and system for reporting webpage click rate | |
| CN105959324A (en) | Regular matching-based network attack detection method and apparatus | |
| CN108900496A (en) | A kind of quick detection website is implanted the detection method and device of digging mine wooden horse | |
| CN107465702A (en) | Method for early warning and device based on wireless network invasion | |
| CN105095309A (en) | Webpage processing method and device | |
| CN109831451A (en) | Preventing Trojan method based on firewall |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20151125 |