CN119341816B - Phishing website detection method based on YOLOv5 and Resnet-101 - Google Patents

Phishing website detection method based on YOLOv5 and Resnet-101

Info

Publication number
CN119341816B
CN119341816B CN202411472624.3A CN202411472624A CN119341816B CN 119341816 B CN119341816 B CN 119341816B CN 202411472624 A CN202411472624 A CN 202411472624A CN 119341816 B CN119341816 B CN 119341816B
Authority
CN
China
Prior art keywords
detection
target
phishing
legal
website
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202411472624.3A
Other languages
Chinese (zh)
Other versions
CN119341816A (en)
Inventor
朱二周
刘豪
赵俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui University
Original Assignee
Anhui University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui University
Priority to CN202411472624.3A
Publication of CN119341816A
Application granted
Publication of CN119341816B
Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a phishing website detection method based on YOLOv5 and Resnet-101. The method comprises constructing and training a phishing website detection network model that includes a target detection module and a similarity calculation module. The target detection module combines a YOLOv5s network with an attention module, and the similarity calculation module is based on a Resnet-101 network. Target information of the website to be detected is obtained through the target detection module; the similarity calculation module then extracts features from the detection target Logo image and the legal Logo image and calculates their cosine similarity. When the cosine similarity is higher than a set threshold, the detection target Logo image is judged to be correct and the website to be detected is marked as a legal website. The invention requires no training on phishing data and offers low time cost, high detection accuracy and strong extensibility.

Description

Phishing website detection method based on YOLOv5 and Resnet-101
Technical Field
The invention belongs to the field of network security technology, and particularly relates to a phishing website detection method based on YOLOv5 and Resnet-101.
Background
In recent years, the number of phishing incidents has increased dramatically, and phishing detection methods based on URLs, HTML and webpage screenshots have emerged to cope with the increasingly severe phishing threat. Phishing detection based on target detection aims to identify the key information of a webpage, namely the legal brand Logo, in a website screenshot. A binary phishing report (legal or illegal) is then generated for the user in combination with the domain name extracted from the URL.
Common conventional target detection algorithms include HOG+SVM, DPM and others. These methods have several limitations in target detection tasks: feature extraction depends on manual design, sliding windows and candidate region generation are inefficient, detection precision is low, and multi-target detection is difficult to handle. A Logo in a webpage is usually a small target, so low detection accuracy leads to a large number of missed detections and false alarms.
With the development of deep learning, many target detection algorithms based on convolutional neural networks have appeared. They learn features automatically from data, remove the step of manually designing features, and improve both detection speed and detection precision. Among them, YOLO is one of the representative algorithms; it has distinctive characteristics and advantages and has been improved continuously since it first appeared. YOLOv5 inherits the high speed and efficiency of the YOLO family and is further optimized on that basis, making it more lightweight and achieving a better balance between accuracy and speed. With its diverse model versions, ease of use and strong community support, YOLOv5 has become one of the most widely used target detection algorithms, suitable for practical applications in environments ranging from embedded devices to high-performance computing. YOLOv5 is therefore used to detect legal Logos in webpages.
The original YOLOv5 model has relatively low detection precision on small targets. To improve the detection precision of YOLOv5 on small targets, an attention module is embedded in the YOLOv5 feature extraction network, which enhances the model's ability to learn features of small targets. Such modules make the model focus more on the key regions of small targets by dynamically adjusting the weights of the channel and spatial dimensions.
Thanks to the very fast detection speed of YOLOv5, the detection result can be verified by other means after detection is completed, which further improves detection accuracy. A Resnet-101 pretrained on the large ImageNet dataset is used to extract features from the detection results of the improved YOLOv5 and from the legal brand Logo images, and the cosine similarity between the two is calculated. When the similarity is higher than a certain threshold, the detection is considered correct; otherwise, the detection is considered to have failed.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and to provide a phishing website detection method based on YOLOv5 and Resnet-101.
To this end, the invention extracts the legal brand Logo in a webpage with an improved YOLOv5, and extracts features of the target detection result and of the real brand Logo with Resnet-101. An interpretable phishing detection report is finally generated for the user. The interpretable report provides detailed information that helps users decide whether to access the target webpage.
The technical scheme is as follows: the phishing website detection method based on YOLOv5 and Resnet-101 comprises the following steps:
Step S1, acquiring legal Logos and building an image dataset, which is split into a training set and a verification set for the YOLOv5s+SE target detection module; acquiring the URL addresses and corresponding webpage screenshots of legal websites, then acquiring the URL addresses and corresponding webpage screenshots of phishing websites; merging the two types of data and then dividing them into a phishing detection dataset and a target detection dataset;
Specifically, in step S1, the URL addresses of legal websites and the corresponding legal Logo images are acquired first, then the URL addresses of phishing websites and the corresponding webpage screenshots are acquired; the two types of data are combined as the phishing detection dataset, and the target detection dataset, which comprises the legal Logo images and the phishing webpage screenshots, is divided out at the same time;
Step S2, constructing and training a phishing website detection network model, wherein the phishing website detection network model comprises a target detection module and a similarity calculation module, the target detection module combines a YOLOv5s network with an attention module, and the similarity calculation module is based on a Resnet-101 network;
Step S3, inputting the URL address of the website to be detected and the corresponding webpage screenshot into the trained phishing website detection network model, and obtaining target information of the website to be detected through the target detection module;
The target information comprises category information and coordinate information; a legal Logo image and the corresponding URL domain name are obtained according to the category information, a target area is cropped from the webpage screenshot according to the coordinate information, and the cropped target area is scaled into a detection target Logo image whose height matches that of the obtained legal Logo image;
Step S4, using the similarity calculation module to extract features from the detection target Logo image and the legal Logo image obtained in step S3, calculating the cosine similarity value, and judging the detection target Logo image to be correct when the cosine similarity value is higher than a set threshold;
Step S5, combining the domain name in the URL address with the correct detection target Logo image to generate an interpretable phishing detection report.
Further, in step S1, the legal Logo images are randomly rotated and scaled and different backgrounds are added, and invalid URL addresses in the dataset are deleted.
Further, the target detection module is based on the YOLOv5s network, and an attention module (a CBAM attention module or an SE attention module) is added before the Spatial Pyramid Pooling - Fast (SPPF) module of the YOLOv5s network. SPPF accelerates the spatial pyramid pooling computation while retaining the ability to fuse multi-scale features, and the added attention module improves the final detection precision.
A legal Domain database and a legal Logo image database are queried according to the category information in the target information to obtain the legal domain name and the legal Logo image corresponding to the detection target.
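The channel reweighting performed by an SE-style attention module can be illustrated with a small pure-Python sketch. This is only an illustration under stated assumptions, not the patented network: the feature map, the weight matrices `w1` and `w2`, and the function name `se_reweight` are hypothetical, biases are omitted, and a real model would learn the weights and run on tensors inside YOLOv5s.

```python
import math

def se_reweight(feature_map, w1, w2):
    """Squeeze-and-Excitation style channel reweighting (illustrative only).

    feature_map: list of C channels, each an H x W list of lists.
    w1: (C/r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C/r) weights of the expansion layer (hypothetical values).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Excitation, part 1: reduce the dimension, then apply ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, part 2: restore the dimension, then apply a sigmoid
    # so that every channel gets a gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates

# Toy input: 2 channels of size 2x2, reduction ratio 2 (hidden size 1).
fmap = [[[1.0, 1.0], [1.0, 1.0]], [[4.0, 4.0], [4.0, 4.0]]]
w1 = [[0.5, 0.5]]
w2 = [[1.0], [-1.0]]
scaled, gates = se_reweight(fmap, w1, w2)
```

With these toy weights the first channel receives a gate close to 1 and the second a gate close to 0, which is the sense in which the module "adjusts the weights of the channels".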
Further, the processing procedure of the similarity calculation module is as follows:
Step 4.1, first resizing the input image to a fixed size and normalizing it so that the pixel values are scaled to a specific range, then inputting the resulting image into the ResNet-101 model and performing forward propagation through each layer of the ResNet-101 model to obtain a feature vector of dimension 1x1x2048;
Step 4.2, calculating the cosine similarity between the feature vector of step 4.1 and the features of the legal Logo image.
Further, the specific method for generating the interpretable phishing report in the step S5 is as follows:
Based on the correct Logo image and the corresponding legal domain name list in the webpage screenshot, four phishing detection reports are obtained by combining domain name information extracted from the URL:
(a) Legal websites;
(b) A phishing website that uses a legal Logo to deceive the user;
(c) A phishing website that uses a legitimate domain name to fool the user;
(d) Phishing websites without legal information.
The advantages of the invention are that no training on phishing data is required, and that it offers low time cost, high detection accuracy and strong extensibility. The target detection module uses the Spatial Pyramid Pooling - Fast (SPPF) module of YOLOv5s, which accelerates the spatial pyramid pooling computation while retaining the ability to fuse multi-scale features; in addition, an attention module is added before the SPPF module, ahead of feature fusion, which improves detection accuracy.
Drawings
FIG. 1 is a schematic overall flow chart of the present invention;
FIG. 2 is a schematic diagram of a target detection module according to the present invention;
FIG. 3 is a schematic diagram of a target detection result in an embodiment;
FIG. 4 is a schematic diagram of the Resnet-101 structure in an embodiment;
FIG. 5 is a schematic diagram of the number of labels of 16 categories of the target detection training set in an embodiment;
FIG. 6 is a graph showing the performance of 10 models at different cosine similarity values in the examples;
FIG. 7 is a diagram of the interpretable phishing detection reports of the present invention.
Detailed Description
The technical scheme of the present invention is described in detail below, but the scope of the present invention is not limited to the embodiments.
As shown in FIG. 1, the phishing website detection method based on YOLOv5 and Resnet-101 of the invention comprises the following steps:
Step S1, acquiring legal Logos and building an image dataset, which is split into a training set and a verification set for the YOLOv5s+SE target detection module; acquiring the URL addresses and corresponding webpage screenshots of legal websites, then acquiring the URL addresses and corresponding webpage screenshots of phishing websites; merging the two types of data and then dividing them into a phishing detection dataset and a target detection dataset;
Step S2, constructing and training a phishing website detection network model, wherein the phishing website detection network model comprises a target detection module and a similarity calculation module, the target detection module combines a YOLOv5s network with an attention module, and the similarity calculation module is based on a Resnet-101 network;
Step S3, inputting the URL address of the website to be detected and the corresponding webpage screenshot into the trained phishing website detection network model, and obtaining target information of the website to be detected through the target detection module;
The target information comprises category information and coordinate information; a legal Logo image and the corresponding URL domain name are obtained according to the category information, a target area is cropped from the webpage screenshot according to the coordinate information, and the cropped target area is scaled into a detection target Logo image whose height matches that of the obtained legal Logo image;
Step S4, using the similarity calculation module to extract features from the detection target Logo image and the legal Logo image obtained in step S3, calculating the cosine similarity value, and judging the detection target Logo image to be correct when the cosine similarity value is higher than a set threshold;
Step S5, combining the domain name in the URL address with the correct detection target Logo image to generate an interpretable phishing detection report.
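The cropping and height matching described in step S3 can be sketched in plain Python as follows. This is a hedged illustration rather than the embodiment's code: the image is modelled as a list of pixel rows, the box format `(x_min, y_min, x_max, y_max)` and all function names are hypothetical, and preserving the aspect ratio with nearest-neighbour resampling is an assumption, since the description only states that the heights are made consistent.

```python
def crop_box(image, box):
    """Crop a region from an image stored as a list of pixel rows.

    box = (x_min, y_min, x_max, y_max) in pixels; max bounds are exclusive.
    """
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]

def size_for_height(width, height, target_height):
    """Return (new_width, new_height) with the requested height.

    Assumption: the aspect ratio of the crop is preserved.
    """
    scale = target_height / height
    return max(1, round(width * scale)), target_height

def resize_nearest(image, new_w, new_h):
    """Nearest-neighbour resize of a list-of-rows image."""
    h, w = len(image), len(image[0])
    return [
        [image[y * h // new_h][x * w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

# Toy 4x6 "screenshot" whose pixel value encodes its row and column.
screenshot = [[r * 10 + c for c in range(6)] for r in range(4)]
detected_box = (1, 1, 4, 3)      # hypothetical detector output
legal_logo_height = 4            # hypothetical height of the legal Logo image

crop = crop_box(screenshot, detected_box)
new_w, new_h = size_for_height(len(crop[0]), len(crop), legal_logo_height)
target_logo = resize_nearest(crop, new_w, new_h)
```

In practice an image library would perform the crop and resize; the sketch only makes the order of operations explicit: crop by the predicted coordinates first, then scale to the legal Logo's height.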
In step S1 of this embodiment, the legal Logo images are randomly rotated and scaled and different backgrounds are added, and invalid URL addresses in the dataset are deleted.
For example, in this embodiment, 16 kinds of Logos from 5 brands are collected first, as shown in FIG. 5: google, search, chrome, mail, map, play, chat, picture, meeting, amazon_1, amazon_2, alibaba, twitter_1, twitter_2, facebook_1 and facebook_2. The legal Logo images are then rotated, scaled and placed on different backgrounds, and LabelMe is used to label the legal Logos in the images, producing the training set and the verification set for the target detection module. The training set of the target detection module contains 435 images and the verification set contains 201 images.
Meanwhile, the PhishTank website is accessed to collect the URLs and webpage screenshots of phishing websites reported since 2023; the data are cleaned and URLs that are no longer valid are deleted. The cleaned phishing webpage images are divided into two parts: one part contains images only, in which the legal Logos are labeled with LabelMe, and serves as the test set of the target detection module; the other part contains the images and their URLs and is used as the phishing detection test set. In addition, 131 legal website URLs and the corresponding webpage screenshots are collected manually and divided between the test set of the target detection module and the phishing detection dataset. The test set of the target detection module contains 2203 URLs and their website images, of which 2122 are phishing websites and 81 are legal websites. The phishing detection test set contains 1108 URLs and their website images, of which 1058 are phishing websites and 50 are legal websites.
The target detection module is based on the YOLOv5s network, with an attention module added before the Spatial Pyramid Pooling - Fast (SPPF) module of the YOLOv5s network. A legal Domain database and a legal Logo image database are queried according to the category information in the target information to obtain the legal domain name and the legal Logo image corresponding to the detected target. The processing procedure of the similarity calculation module in this embodiment is as follows:
Step 4.1, first resizing the input image to a fixed size and normalizing it so that the pixel values are scaled to a specific range, then inputting the resulting image into the ResNet-101 model and performing forward propagation through each layer of the ResNet-101 model to obtain a feature vector of dimension 1x1x2048;
Step 4.2, calculating the cosine similarity between the feature vector of step 4.1 and the features of the legal Logo image.
For example, all the target crops obtained in the previous step are taken as a dataset, denoted Target, and a csv file records two values per row, the category and the judgment value, in the same order as the images in Target. The judgment value is either 0 or 1, where 1 means the detection is correct and 0 means it is wrong. Ten models are selected, namely Resnet-101, Resnet-50 and EfficientNet-B0 to B7, and the following steps are carried out for each model:
A Target image is obtained, and the legal Logo image database is queried according to the category in the corresponding csv row to obtain the legal Logo. Features are extracted from the two images with the model, and their similarity is computed with the cosine similarity formula. The result is 1 when the similarity is higher than a given threshold, indicating that the two images are similar, and 0 otherwise, indicating that they are dissimilar. Finally, the result is compared with the judgment value to determine whether the model's decision is correct.
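The comparison between thresholded similarities and the recorded judgment values can be sketched as below. The similarity scores and judgment values are made-up toy data, the function name `evaluate_threshold` is hypothetical, and the use of `>=` is an assumption (the description uses both "higher than" and "greater than or equal to" for the threshold test).

```python
def evaluate_threshold(similarities, judgments, threshold):
    """Compare thresholded similarities with the 0/1 judgment values.

    similarities: cosine similarity of each Target image to its legal Logo.
    judgments: 1 if the detection is truly correct, 0 otherwise.
    """
    # Assumption: a similarity equal to the threshold counts as "similar".
    preds = [1 if s >= threshold else 0 for s in similarities]
    pairs = list(zip(preds, judgments))
    tp = sum(1 for p, j in pairs if p == 1 and j == 1)
    fp = sum(1 for p, j in pairs if p == 1 and j == 0)
    fn = sum(1 for p, j in pairs if p == 0 and j == 1)
    tn = sum(1 for p, j in pairs if p == 0 and j == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Toy data, not taken from the experiments in this document.
sims = [0.92, 0.71, 0.64, 0.58, 0.35, 0.62]
truth = [1, 1, 0, 1, 0, 1]
metrics_at_06 = evaluate_threshold(sims, truth, 0.6)

# Sweeping several thresholds yields curves of the kind plotted per model.
sweep = {t: evaluate_threshold(sims, truth, t) for t in (0.5, 0.6, 0.7)}
```

Running the same sweep once per feature extractor would give one curve per model, which is how a threshold can be chosen by its F1-Score.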
Examples
mAP, Accuracy, Precision, Recall and F1-Score are used as the evaluation metrics for the models.
In the field of computer vision, Average Precision (AP) is commonly used to evaluate the performance of object detection and image classification models. AP measures the average precision of the model when predicting a particular class of object over the complete dataset. As shown in equation (1), the calculation of AP considers both precision and recall by forming a precision-recall curve (recall on the x-axis, precision on the y-axis), and the AP value is the area under that curve:
$$AP = \int_0^1 P(R)\,dR \qquad (1)$$
where P(R) is the precision at recall R.
For multi-category target detection tasks, the AP value of a single category cannot fully measure the effectiveness of the detection model. It is therefore necessary to average the AP values of all categories in the dataset. As shown in equation (2), the mean AP (mAP) reflects the overall performance of the detection model:
$$mAP = \frac{1}{N}\sum_{i=1}^{N} AP_i \qquad (2)$$
where N is the number of categories and AP_i is the AP of category i.
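A minimal sketch of how AP and mAP might be computed from ranked detections is given below. It assumes that each detection has already been matched against the ground truth (for example by an IoU test, which is not shown), it uses the common all-point interpolation of the precision-recall curve, and the function names and toy detections are hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """Area under the precision-recall curve for one category.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of labeled objects of this category.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangles under the curve wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap

def mean_average_precision(ap_per_class):
    """Average the per-category AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)

# Toy detections for two categories (values are illustrative only).
ap_google = average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)
ap_amazon = average_precision([(0.95, True)], 1)
m_ap = mean_average_precision({"google": ap_google, "amazon_1": ap_amazon})
```

Detection toolkits typically report this quantity at a fixed IoU threshold; the sketch leaves that matching step outside the function on purpose.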
TABLE 1 mAP of eight target detection models
TABLE 2 Performance of the five YOLOv5 models
Table 1 gives the experimental results for the 8 target detection models. In this table, mAP_Train and mAP_test are the mAP values for the training and testing phases, respectively; Time is the average time cost of detecting one image; and detectedNum is the number of Logos detected. The results in Table 1 show that all 8 models achieve high mAP values during the test phase of target detection. However, the YOLOv8 models (YOLOv8s, YOLOv8m and YOLOv8l) were not selected, for two reasons. (1) The YOLOv8 series models identify many background elements as targets. The target detection test set has only 201 labeled Logos, yet, as shown in column 5 of Table 1, YOLOv8m and YOLOv8l each identified more than 800 Logos. The YOLOv8 models therefore produce too many false positives. (2) The YOLOv8 models incur high time costs. As shown in column 4 of Table 1, the minimum detection time of the YOLOv8 series models is higher than that of the YOLOv5 series models.
To determine the most effective target detection model, the target detection results of the remaining 5 models (YOLOv5s, YOLOv5m, YOLOv5l, and the YOLOv5s+CBAM and YOLOv5s+SE models of the present invention) are first compared with the actual annotation data in the target detection verification set. Then the values of the evaluation metrics Precision, Accuracy, Recall and F1-Score are calculated, as shown in FIG. 6.
The results are shown in Table 2. Adding the CBAM or SE attention module to YOLOv5s significantly improves all four metric values compared with YOLOv5s alone. Furthermore, YOLOv5s+CBAM and YOLOv5s+SE outperform the more complex YOLOv5m and YOLOv5l models. Adding an attention mechanism to YOLOv5s is therefore an effective approach. The experimental results in Table 2 also show that the Precision of YOLOv5s+SE is 20.2% higher than that of YOLOv5s. This means that YOLOv5s+SE produces very few false positives in target prediction and reduces the error of identifying background elements as Logos. Its F1-Score, the largest among the models, indicates that the overall performance of YOLOv5s+SE is the best.
In the similarity calculation module, the corresponding legal Logo image is retrieved from the legal Logo image library based on the Logo and category that the target detection module detected in the webpage screenshot. A feature extraction module then extracts the features of the detected Logo and of the corresponding legal Logo image, and the cosine similarity between the two images is calculated.
To verify the performance of the feature extraction module in the technical scheme of the invention, 10 different models, namely Resnet-50, Resnet-101 and EfficientNet-B0 to B7, are selected for extracting features from Logo images. The experimental results show that Resnet-101, the model used in the invention, performs best.
After the detection target Logo image is obtained, the cosine similarity between its feature vector and that of the legal Logo image is calculated. Cosine similarity evaluates how similar two vectors are by the cosine of the angle between them. Specifically, the closer the angle between the two vectors is to 0, that is, the closer the cosine value is to 1, the more similar the two vectors are. Conversely, the closer the angle is to 180 degrees, the closer the cosine value is to -1, and the less similar the two vectors are. The range of cosine similarity is [-1, 1], where 1 represents two vectors pointing in exactly the same direction, -1 represents two exactly opposite vectors, and 0 represents two orthogonal or independent vectors.
For example, given two vectors a and b, where a is the feature of the detected Logo image extracted through Resnet-101 and b is the feature of the legal Logo image extracted through Resnet-101, their cosine similarity can be calculated by the following formula:
$$\cos\theta = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert} = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$
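A minimal pure-Python sketch of the cosine similarity between two feature vectors a and b follows. The short vectors stand in for the 2048-dimensional Resnet-101 features, the function names are hypothetical, and returning 0.0 for a zero-length vector is an assumption made here to avoid division by zero.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat an all-zero feature vector as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)

def logo_is_correct(detected_features, legal_features, threshold=0.6):
    """Apply a similarity threshold; 0.6 is the value chosen in this embodiment."""
    return cosine_similarity(detected_features, legal_features) >= threshold

# Toy 2-dimensional stand-ins for the real feature vectors.
similar = cosine_similarity([3.0, 4.0], [4.0, 3.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Because only the angle matters, scaling either vector leaves the result unchanged, which is why the features need no extra length normalization before the comparison.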
The experimental results in FIG. 6 show that when the cosine similarity threshold is set between 0.5 and 0.6, the Recall of the Resnet-50 and Resnet-101 models is close to 1 and the Precision values of Resnet-101 are all higher than 0.85, as shown in FIG. 6 (a); furthermore, the Accuracy and F1-Score of Resnet-101 are the best among the 10 models, as shown in FIGS. 6 (b) and (c). The results show that two different images can be distinguished by combining the feature extraction capability of the Resnet-101 model with the cosine similarity calculation. Resnet-101 obtains its highest F1-Score (0.926) when the cosine similarity threshold reaches 0.6. Since the F1-Score is a comprehensive metric reflecting the overall performance of the model, the similarity threshold is set to 0.6. In other words, when the similarity is greater than or equal to the threshold, the target detection result is considered correct.
Step S5 of this embodiment generates four types of interpretable phishing reports, as shown in FIG. 7:
(a) Legal website. A URL is considered a legal website only if the domain name extracted from the URL is successfully matched against the legal domain names corresponding to the correct Logo image. For example, the report reads "Legal// Domain: di Logo li", where di is the domain name and li is the category of the correct Logo image.
(b) A phishing website that uses a legal Logo to deceive the user. The target detection and similarity calculation modules determine that the Logo image on the target webpage is authentic, but the input URL does not contain the domain name to which the Logo belongs. In this case, the report is "Phish// No domain. Deceive the User with Legal Logo".
(c) A phishing website that uses a legal domain name to deceive the user. The domain name contained in the input URL is consistent with a legal domain name in the legal domain name database, but the Logo image on the screenshot is determined by the target detection and similarity calculation modules not to be authentic. In this case, the report is "Phish// Fake logo. Deceive User with Legal Domain".
(d) A phishing website without legal information. The webpage screenshot does not contain any Logo image, and the URL does not contain a legal domain name either. In this case, the report is "Phish// No Legal Information".
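The four report types can be sketched as a small decision function. Everything named here is hypothetical: the `LEGAL_DOMAINS` table stands in for the legal Domain database, the exact-or-subdomain match is an assumed matching rule (the description does not specify one), and the report texts are normalized versions of the strings quoted above.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the legal Domain database, keyed by Logo category.
LEGAL_DOMAINS = {
    "google": ["google.com"],
    "amazon_1": ["amazon.com"],
}

def host_matches(host, domains):
    """Assumed rule: the host equals a legal domain or is a subdomain of it."""
    return any(host == d or host.endswith("." + d) for d in domains)

def phishing_report(url, category, logo_is_correct, db=LEGAL_DOMAINS):
    """Combine the domain check and the Logo check into one of four reports."""
    host = (urlparse(url).hostname or "").lower()
    if logo_is_correct and category in db:
        if host_matches(host, db[category]):
            # (a) correct Logo and a matching legal domain
            return f"Legal// Domain: {host} Logo: {category}"
        # (b) authentic-looking Logo on a domain that does not own it
        return "Phish// No domain. Deceive the User with Legal Logo"
    all_domains = [d for domains in db.values() for d in domains]
    if host_matches(host, all_domains):
        # (c) legal domain but no authentic Logo
        return "Phish// Fake logo. Deceive User with Legal Domain"
    # (d) neither a correct Logo nor a legal domain
    return "Phish// No Legal Information"

report_a = phishing_report("https://accounts.google.com/login", "google", True)
report_b = phishing_report("http://google.com.evil.example/", "google", True)
report_c = phishing_report("https://www.amazon.com/x", "google", False)
report_d = phishing_report("http://evil.example/", None, False)
```

Matching on the parsed hostname rather than on the raw URL string keeps a host such as `google.com.evil.example` from being mistaken for the legal domain.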

Claims (4)

1. A phishing website detection method based on YOLOv5 and Resnet-101, characterized by comprising the following steps:
Step S1: obtaining legal Logos and creating an image dataset, which is split into a training set and a verification set for the target detection module; obtaining the URL addresses and corresponding webpage screenshots of legal websites, then obtaining the URL addresses and corresponding webpage screenshots of phishing websites; merging the two types of data and then dividing them into a phishing detection dataset and a target detection dataset;
Step S2: constructing and training a phishing website detection network model, wherein the phishing website detection network model includes a target detection module and a similarity calculation module; the target detection module combines a YOLOv5s network and an attention module, and the similarity calculation module includes a Resnet-101 network;
Step S3: inputting the URL address of the website to be detected and the corresponding webpage screenshot into the trained phishing website detection network model, and obtaining the target information of the website to be detected through the target detection module;
the target information includes category information and coordinate information; a legal Logo image and the corresponding URL domain name are obtained according to the category information; the target area is cropped from the webpage screenshot according to the coordinate information and scaled into a detection target Logo image whose height matches that of the obtained legal Logo image;
Step S4: using the similarity calculation module to extract features from the detection target Logo image and the legal Logo image obtained in step S3, and then calculating the cosine similarity value; when the cosine similarity value is higher than a set threshold, the detection target Logo image is judged to be correct;
the URL domain name corresponding to the legal Logo image obtained in step S3 according to the category information is compared with the URL domain name of the website to be detected; if the two are consistent, the URL domain name of the website to be detected is judged to be legal;
Step S5: combining the domain name detection result and the Logo image detection result to generate an interpretable phishing detection report; the specific method for generating the interpretable phishing report is as follows:
based on the correct Logo image in the webpage screenshot and its corresponding list of legal domain names, combined with the domain name information extracted from the URL, four phishing detection reports are obtained:
(a) legal websites;
(b) phishing websites that use legal Logos to deceive users;
(c) phishing websites that use legal domain names to deceive users;
(d) phishing websites without legal information.
2. The phishing website detection method based on YOLOv5 and Resnet-101 according to claim 1, characterized in that in step S1 the legal Logo images are randomly rotated and scaled, and different backgrounds are added; and invalid URL addresses in the dataset are deleted.
3. The phishing website detection method based on YOLOv5 and Resnet-101 according to claim 1, characterized in that the target detection module is based on the YOLOv5s network, and an attention module is added before the Spatial Pyramid Pooling - Fast (SPPF) module of the YOLOv5s network;
the legal Domain database and the legal Logo image database are queried according to the category information in the target information to obtain the legal domain name and the legal Logo image corresponding to the detection target.
4. The phishing website detection method based on YOLOv5 and Resnet-101 according to claim 1, characterized in that the processing procedure of the similarity calculation module is as follows:
Step S4.1: first resizing the input image to a fixed size and normalizing it so that the pixel values are scaled to a specific range, then inputting the resulting image into the ResNet-101 model and performing forward propagation through the layers of the ResNet-101 model to obtain a feature vector of dimension 1x1x2048;
Step S4.2: calculating the cosine similarity between the feature vector obtained in step S4.1 and the features of the legal Logo image.
CN202411472624.3A 2024-10-22 2024-10-22 Phishing website detection method based on YOLOv5 and Resnet-101 Active CN119341816B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202411472624.3A CN119341816B (en) 2024-10-22 2024-10-22 Phishing website detection method based on YOLOv5 and Resnet-101

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202411472624.3A CN119341816B (en) 2024-10-22 2024-10-22 Phishing website detection method based on YOLOv5 and Resnet-101

Publications (2)

Publication Number Publication Date
CN119341816A CN119341816A (en) 2025-01-21
CN119341816B CN119341816B (en) 2025-10-21

Family

ID=94267494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202411472624.3A Active CN119341816B (en) 2024-10-22 2024-10-22 Phishing website detection method based on YOLOv5 and Resnet-101

Country Status (1)

Country Link
CN (1) CN119341816B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120034399B (en) * 2025-04-22 2026-01-02 西南石油大学 A High-Efficiency Cascaded Multi-Stage Adaptive Threshold Phishing Detection Method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117093990A (en) * 2023-08-18 2023-11-21 中国电信股份有限公司技术创新中心 Training method, phishing website identification method, device and storage medium
CN118314327A (en) * 2024-04-26 2024-07-09 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) Integrated circuit flag detection method, apparatus, computer device, readable storage medium, and program product

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3888335A4 (en) * 2018-11-26 2022-08-10 Cyberfish Ltd. PHISHING PROTECTION METHODS AND SYSTEMS
CN112468501B (en) * 2020-11-27 2022-10-25 安徽大学 URL-oriented phishing website detection method
CN113098874B (en) * 2021-04-02 2022-04-26 安徽大学 A phishing website detection method based on URL string random rate feature extraction
CN112990792B (en) * 2021-05-11 2021-08-31 北京智源人工智能研究院 A method, device and electronic device for automatic detection of infringement risk
CN114448664B (en) * 2021-12-22 2024-01-02 深信服科技股份有限公司 Method and device for identifying phishing webpage, computer equipment and storage medium
CN114070653B (en) * 2022-01-14 2022-06-24 浙江大学 Hybrid phishing website detection method and device, electronic equipment and storage medium
CN114978624B (en) * 2022-05-09 2023-11-03 深圳大学 Phishing webpage detection method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN119341816A (en) 2025-01-21

Similar Documents

Publication Publication Date Title
CN112541476B (en) A method for identifying malicious web pages based on semantic feature extraction
Wang et al. You are your photographs: Detecting multiple identities of vendors in the darknet marketplaces
CN112990792B (en) A method, device and electronic device for automatic detection of infringement risk
CN103294813A (en) Sensitive image search method and device
Yao et al. Deep learning for phishing detection
Biswas et al. Recognition of service domains on TOR dark net using perceptual hashing and image classification techniques
CN116722992A (en) Fraud website identification method and device based on multi-mode fusion
Ouyang et al. Robust copy-move forgery detection method using pyramid model and Zernike moments
Diwan et al. Keypoint based comprehensive copy‐move forgery detection
Zhou et al. Visual similarity based anti-phishing with the combination of local and global features
CN114070653B (en) Hybrid phishing website detection method and device, electronic equipment and storage medium
CN108270754B (en) Method and device for detecting phishing website
CN110781876B (en) A lightweight detection method and system for counterfeit domain names based on visual features
CN109543674A An image copy detection method based on generative adversarial network
Warif et al. A comprehensive evaluation procedure for copy-move forgery detection methods: results from a systematic review
Zhao et al. Source camera identification via low dimensional PRNU features
CN119341816B (en) Phishing website detection method based on YOLOv5 and Resnet-101
CN107798080B (en) A Similar Sample Set Construction Method for Phishing URL Detection
Hao et al. It doesn't look like anything to me: Using diffusion model to subvert visual phishing detectors
Karageogiou et al. Evolution of detection performance throughout the online lifespan of synthetic images
Wan et al. Efficient virtual data search for annotation‐free vehicle reidentification
CN104123382B (en) A kind of image set abstraction generating method under Social Media
Li et al. Meaod: Model extraction attack against object detectors
CN101436210B (en) Method and system for recognizing counterfeit web page
CN117079180A (en) A video detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant