
CN119996406A - A browser large file download method and device - Google Patents


Info

Publication number
CN119996406A
Authority
CN
China
Prior art keywords
shard
file
download
data
browser
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202510384367.6A
Other languages
Chinese (zh)
Inventor
范开鑫
周祥龙
魏子重
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Science Research Institute Co Ltd
Original Assignee
Shandong Inspur Science Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Inspur Science Research Institute Co Ltd
Priority to CN202510384367.6A
Publication of CN119996406A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and device for downloading large files in a browser, belonging to the technical field of front-end Web. It addresses the technical problems of conventional large file download schemes in a browser environment: low memory-use and data-storage efficiency, insufficient handling of network interruptions, lack of effective integration of underlying browser technologies, and difficulty in achieving offline caching and efficient recovery. The method comprises: fragmenting a target file in a file download task according to a preset fragmentation rule and creating a fragment download task for each fragment; dynamically distributing the fragment download tasks through a parallel task pool; after each fragment download task is completed, storing the downloaded fragment data and its metadata in an IndexedDB database and performing an integrity check on the fragment data; re-downloading the current fragment according to the result of the integrity check; and, after all fragment download tasks have been executed, reading all fragment data in fragment order and splicing it into a complete file.

Description

Method and device for downloading large files of browser
Technical Field
The invention relates to the technical field of front-end Web, in particular to a method and equipment for downloading a large file of a browser.
Background
Downloading large files in a conventional browser environment faces multiple challenges and limitations. The first is memory use: conventional schemes require the entire file to be loaded into memory during download, which easily triggers out-of-memory errors, crashes the browser, and seriously affects user experience and data integrity.
Second, network stability is another major obstacle. The conventional HTTP request mechanism copes poorly with network interruptions and lacks automatic resumption; once the connection is interrupted during a download, the user often has to download again from the beginning, which wastes time and bandwidth and greatly reduces download efficiency.
Furthermore, the limitations of existing storage schemes cannot be ignored. Existing browser storage technologies such as LocalStorage have limited capacity and can hardly meet the requirements of storing large-file fragments. Even if the data can be stored in fragments, merging them requires loading all fragment data into memory, which is inefficient and can again cause memory problems.
In addition, the prior art has not fully integrated the advanced underlying capabilities of the browser, such as Service Worker and IndexedDB. These technologies can provide offline caching and efficient data recovery, but their potential has not been exploited for lack of a corresponding integration strategy. This greatly limits the use of conventional download schemes in high-load, high-demand scenarios such as cloud storage, and fails to meet the growing demand for online data storage and access.
Disclosure of Invention
An embodiment of the invention provides a method and device for downloading large files in a browser, to solve the technical problems of conventional large file download schemes in a browser environment: low memory-use and data-storage efficiency, insufficient handling of network interruptions, lack of effective integration of underlying browser technologies, and difficulty in achieving offline caching and efficient recovery.
The embodiment of the invention adopts the following technical scheme:
In one aspect, an embodiment of the invention provides a method for downloading large files in a browser, comprising: intercepting a file download request of the browser through a Service Worker script, and delegating the file download task to a background thread for execution;
fragmenting the target file in the file download task according to a preset fragmentation rule, and creating a fragment download task for each fragment;
creating a parallel task pool, and dynamically distributing the fragment download tasks through the parallel task pool;
after each fragment download task is completed, storing the downloaded fragment data and its metadata in an IndexedDB database of the browser, and performing an integrity check on the fragment data;
re-downloading the current fragment according to the result of the fragment integrity check;
after all fragment download tasks have been executed, reading all fragment data from the IndexedDB database in fragment order and splicing it into a complete file;
and after the complete file has been written to a user-specified path, deleting the corresponding fragment data and its metadata from the IndexedDB database to release storage space.
In a possible implementation, fragmenting the target file in the file download task according to a preset fragmentation rule and creating a fragment download task for each fragment specifically comprises:
acquiring the file size of the target file and invoking the currently configured preset fragmentation rule, wherein the preset fragmentation rule at least comprises fragmenting by a fixed fragment size, fragmenting by a preset percentage of the target file size, and fragmenting by a user-defined fragment size;
calculating, based on the preset fragmentation rule and the file size, the number of fragments and the size of each fragment, so as to fragment the target file;
creating an identifier and a fragment download task for each fragment;
and creating a target file metadata record table and a fragment metadata record table in an IndexedDB database for storing the metadata of the target file and the metadata of the fragment data, wherein the metadata of the target file at least comprises the file size, file type, number of fragments, and hash value of each fragment of the target file, and the metadata of the fragment data at least comprises the fragment identifier, start byte, end byte, download state, hash value, retry count, and last update time of the fragment data.
In a possible implementation, creating a parallel task pool and dynamically distributing the fragment download tasks through the parallel task pool specifically comprises:
acquiring the concurrency limit of the current browser and creating a parallel task pool, wherein the number of parallel threads in the parallel task pool equals the concurrency limit;
and cyclically filling the parallel task pool through a task queue mechanism, storing tasks not yet executed in a waiting queue, so as to dynamically distribute the fragment download tasks.
In a possible implementation, storing the downloaded fragment data and its metadata in an IndexedDB database of the browser after each fragment download task is completed, and performing an integrity check on the fragment data, specifically comprises:
after each fragment download task is completed, storing the downloaded binary fragment data in the IndexedDB database and associating it with the corresponding fragment metadata record table in the IndexedDB database;
calculating the hash value of the downloaded fragment data and comparing it with the hash value of that fragment recorded in the target file metadata record table; if they are consistent, updating the download state in the fragment metadata record table to completed; if they are inconsistent, triggering automatic retry logic to download the fragment data again;
and after the retry limit is exceeded, marking the download state of the fragment data as failed and prompting the user, through a user interface, to intervene manually.
In a possible implementation, before the current fragment is re-downloaded according to the result of the fragment integrity check, the method further comprises:
detecting interruptions during fragment download and determining the cause of the interruption, wherein the causes comprise network interruption and other causes;
when the cause is a network interruption, storing the information of unfinished fragment requests in an offline task table of the IndexedDB database, wherein the fragment request information at least comprises the fragment URL, fragment index, and request time;
when recovery of the network connection is detected, automatically triggering resumption of the offline tasks by listening for a sync event, reading the unfinished fragment download tasks from the offline task table, re-adding them to the task queue, and executing them with priority;
and when the interruption has another cause, reading the list of downloaded fragments from the IndexedDB database to determine the breakpoint, and re-adding the unfinished fragment download tasks after the breakpoint to the task queue to resume the download from the breakpoint.
In a possible implementation, intercepting a file download request of the browser through a Service Worker script and delegating the file download task to the Service Worker thread for execution specifically comprises:
registering the Service Worker script through the navigator.serviceWorker.register method and designating its scope as the root path, to ensure that all file download requests can be intercepted;
and intercepting the file download request of the browser through the registered Service Worker script, and delegating the intercepted file download task to a background thread for execution.
In a possible embodiment, the method further comprises:
during download of the fragment data, writing the data stream of the current fragment into the persistent disk space of the browser in real time through the Streams API, while a parallel thread in memory reads the data stream in real time for downloading, and the downloaded data stream is written directly to a disk file.
In a possible embodiment, the method further comprises:
after processing of each fragment is completed, immediately releasing the memory resources, and forcibly cleaning up residual data through the WeakMap garbage collection mechanism to prevent memory leaks;
for repeated download requests for the same target file, determining whether the target file has changed;
if it has not changed, directly reading the fragment data stored in the IndexedDB database;
if it has changed, reading the fragment data of the unchanged part from the IndexedDB database and re-downloading only the fragment data of the changed part, so as to reduce redundant traffic.
In a possible implementation, the method further comprises:
acquiring, through an event listening mechanism, the number of fragments whose download is complete from metadata download-state change events of the IndexedDB database;
determining the download progress of the target file according to the number of downloaded fragments, and displaying the download progress in the front-end interface in real time;
when the user clicks the pause button, aborting all ongoing fragment requests through AbortController and saving the current state to the IndexedDB database; and when the user clicks resume, reloading the saved state and continuing the download.
In another aspect, an embodiment of the invention further provides a device for downloading large files in a browser, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor, wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the browser large file download method described above.
Compared with the prior art, the method and the device for downloading the large browser file have the following beneficial effects:
The invention provides an efficient download solution for large files in a browser environment that combines a dynamic fragmentation strategy with parallel download control, significantly improving data transmission efficiency. By introducing an IndexedDB database as the persistent storage medium, it reliably preserves fragment data and download state and supports both resumable download and strict verification of fragment integrity. In addition, by means of the Service Worker, the method implements silent background download and intelligent scheduling of offline task management.
At the technical level, the scheme adopts stream processing and writes fragments directly to disk, which greatly reduces memory usage. Combined with a fine-grained priority scheduling algorithm, it also improves the utilization of network bandwidth. These optimizations address the persistent problems of the conventional single-threaded download mode, namely the risk of memory overflow, the inability to recover after an interrupted download, and performance bottlenecks.
The method and device are therefore particularly suitable for large file transfer on the Web. They provide dynamic fragmentation, intelligent scheduling, resumable download, memory optimization, and offline management, while significantly improving download speed, enhancing the stability and reliability of the download process, and giving users a smoother and more efficient download experience.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present invention, and other drawings may be obtained according to the drawings without inventive effort to those skilled in the art. In the drawings:
FIG. 1 is a flowchart of a method for downloading a large file of a browser according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a browser large file download device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solution of the present invention better understood by those skilled in the art, the technical solution of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present invention.
The embodiment of the invention provides a method for downloading a large file of a browser, which specifically comprises the following steps S101-S106 as shown in FIG. 1:
S101, intercepting a file download request of the browser through a Service Worker script, and delegating the file download task to a background thread for execution.
Specifically, the front-end main thread first registers the Service Worker script through the navigator.serviceWorker.register method and designates its scope as the root path, to ensure that all file download requests can be intercepted.
Further, the file download request of the browser is intercepted through the registered Service Worker script, and the intercepted file download task is delegated to a background thread for execution.
In one embodiment, the download request of the browser is intercepted by the registered Service Worker script and the task is delegated to a background thread. For example, when the user clicks the download button, the main thread captures the task information (such as the URL and fragment configuration) and sends it to the background Service Worker thread, which independently manages fragment download, storage, and progress reporting.
As a possible implementation, the Service Worker listens for the fetch event and intercepts all fragment download requests (for example, requests whose URL path contains /chunk/). After interception it first looks for a cached response in Cache Storage. If the cache is hit and has not expired (for example, the entry was cached within the last 30 days), the cached data is returned directly; if the cache is missed or expired, a request is sent to the network and the response is cached in Cache Storage.
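By way of illustration only, the cache-first lookup with expiry described above might be sketched as follows. The names CachedEntry, FragmentCache, isFragmentRequest, and cacheFirst are hypothetical and do not appear in the original text; the cache and the network fetch are passed in as parameters so that the decision logic can be read on its own, whereas an actual Service Worker would presumably back them with Cache Storage and fetch.

```typescript
// Illustrative sketch only. CachedEntry, FragmentCache, and cacheFirst are
// hypothetical names; a real Service Worker would back FragmentCache with
// Cache Storage and call cacheFirst from its fetch event handler.
const MAX_AGE_MS = 30 * 24 * 60 * 60 * 1000; // 30 days, as in the example above

interface CachedEntry {
  body: Uint8Array;
  storedAt: number; // epoch milliseconds when the response was cached
}

interface FragmentCache {
  get(url: string): Promise<CachedEntry | undefined>;
  put(url: string, entry: CachedEntry): Promise<void>;
}

// Only fragment requests are intercepted (the URL contains /chunk/).
function isFragmentRequest(url: string): boolean {
  return url.indexOf("/chunk/") !== -1;
}

async function cacheFirst(
  url: string,
  cache: FragmentCache,
  fetchFromNetwork: (url: string) => Promise<Uint8Array>,
  now: number
): Promise<{ body: Uint8Array; source: "cache" | "network" }> {
  const hit = await cache.get(url);
  if (hit !== undefined && now - hit.storedAt <= MAX_AGE_MS) {
    return { body: hit.body, source: "cache" }; // fresh hit: no network request
  }
  const body = await fetchFromNetwork(url); // miss or expired entry
  await cache.put(url, { body, storedAt: now });
  return { body, source: "network" };
}
```

Passing the current time as a parameter keeps the expiry rule deterministic; a caller would normally supply Date.now().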
S102, fragmenting the target file in the file download task according to a preset fragmentation rule, and creating a fragment download task for each fragment.
Specifically, the file size of the target file is acquired and the currently configured preset fragmentation rule is invoked, wherein the preset fragmentation rule at least comprises fragmenting by a fixed fragment size, fragmenting by a preset percentage of the target file size, and fragmenting by a user-defined fragment size.
Further, based on the preset fragmentation rule and the file size, the number of fragments and the size of each fragment are calculated so as to fragment the target file.
As a possible implementation, the file is divided into several fragments according to the fragment size, and different fragmentation strategies may be set: a fixed fragment size, a set percentage of the file size, or a user-defined size. For example, the file may be split by a fixed proportion (for example, one fragment per 1% to 5% of the file), or by a fixed fragment size: if the file size exceeds 1 GB, the default fragment size is 10 MB, and if the file size is less than 100 MB, the fragment size is 1 MB, so as to balance the number of fragments against download efficiency.
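The three fragmentation rules and the resulting byte ranges could be computed roughly as in the following sketch. The type and function names (FragmentRule, FragmentRange, fragmentSize, planFragments) are hypothetical, and the 5 MB size used for files between 100 MB and 1 GB is an assumption, since the text gives no value for that interval.

```typescript
// Illustrative sketch only; all names are hypothetical.
const MB = 1024 * 1024;
const GB = 1024 * MB;

type FragmentRule =
  | { kind: "fixed" }                        // size chosen from the file size
  | { kind: "percentage"; percent: number }  // e.g. 1 to 5 percent of the file
  | { kind: "custom"; sizeBytes: number };   // user-defined fragment size

interface FragmentRange {
  index: number;
  startByte: number; // inclusive
  endByte: number;   // inclusive
}

function fragmentSize(fileSize: number, rule: FragmentRule): number {
  switch (rule.kind) {
    case "percentage":
      return Math.max(1, Math.ceil((fileSize * rule.percent) / 100));
    case "custom":
      return rule.sizeBytes;
    case "fixed":
      if (fileSize > GB) return 10 * MB;      // over 1 GB: 10 MB fragments
      if (fileSize < 100 * MB) return 1 * MB; // under 100 MB: 1 MB fragments
      return 5 * MB; // assumption: the text does not specify this interval
  }
}

function planFragments(fileSize: number, rule: FragmentRule): FragmentRange[] {
  const size = fragmentSize(fileSize, rule);
  if (!(size > 0)) throw new RangeError("fragment size must be positive");
  const ranges: FragmentRange[] = [];
  for (let start = 0, index = 0; start < fileSize; start += size, index++) {
    // The last fragment may be shorter than the others.
    ranges.push({ index, startByte: start, endByte: Math.min(start + size, fileSize) - 1 });
  }
  return ranges;
}
```

Under these assumptions a 2 GB file with the fixed rule yields 205 fragments, the first covering bytes 0 to 10485759.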
Further, an identifier and a fragment download task are created for each fragment. At the same time, a target file metadata record table and a fragment metadata record table are created in an IndexedDB database for storing the metadata of the target file and the metadata of the fragment data, wherein the metadata of the target file at least comprises the file size, file type, number of fragments, and hash value of each fragment of the target file, and the metadata of the fragment data at least comprises the fragment identifier, start byte, end byte, download state, hash value, retry count, and last update time of the fragment data.
As a possible implementation, the front end obtains the metadata of the target file from the server through an HTTP request, including the total file size (e.g., 2 GB), the number of fragments (pre-computed or dynamically generated by the server), the hash value of each fragment (e.g., SHA-256), the file type (e.g., video/mp4), and so on. The server returns the metadata in JSON format, and the front end parses it and stores it in browser memory or a temporary cache. A unique identifier (e.g., fileId_index) is generated for each fragment, and a metadata record table is created in IndexedDB to record the fragment index, download state (not started/downloading/completed/failed), hash value, and retry count (initially 0).
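The two metadata tables could be modelled along the following lines. This is a minimal sketch in which the field names, the DownloadState labels, and the helper initFragmentRecords are all hypothetical; only the listed fields and the fileId_index identifier pattern come from the text, and persisting the records to IndexedDB is left out.

```typescript
// Illustrative sketch only; field and function names are hypothetical.
type DownloadState = "not_started" | "downloading" | "completed" | "failed";

interface FileMetadata {
  fileId: string;
  fileSize: number;
  fileType: string;          // e.g. "video/mp4"
  fragmentCount: number;
  fragmentHashes: string[];  // one hash per fragment, e.g. SHA-256 in hex
}

interface FragmentMetadata {
  id: string;                // fileId_index
  index: number;
  startByte: number;
  endByte: number;
  state: DownloadState;
  hash: string;
  retryCount: number;
  lastUpdated: number;       // epoch milliseconds
}

// Builds the initial fragment records from the file metadata returned by the server.
function initFragmentRecords(
  file: FileMetadata,
  fragmentSize: number,
  now: number
): FragmentMetadata[] {
  return file.fragmentHashes.map((hash, index): FragmentMetadata => {
    const startByte = index * fragmentSize;
    return {
      id: `${file.fileId}_${index}`,
      index,
      startByte,
      endByte: Math.min(startByte + fragmentSize, file.fileSize) - 1,
      state: "not_started",
      hash,
      retryCount: 0, // initially 0, as in the text
      lastUpdated: now,
    };
  });
}
```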
S103, creating a parallel task pool, and dynamically distributing the fragment download tasks through the parallel task pool.
Specifically, the concurrency limit of the current browser is acquired and a parallel task pool is created, wherein the number of parallel threads in the parallel task pool equals the concurrency limit.
Further, the parallel task pool is cyclically filled through a task queue mechanism, and tasks not yet executed are stored in a waiting queue, so that the fragment download tasks are dynamically distributed and bandwidth utilization is maximized.
As a possible implementation, the front-end main thread uses a task queue mechanism to cap the number of parallel downloads at 6, based on the browser's concurrent request limit (for example, under HTTP/1.x a browser allows 6 concurrent requests per domain name). With the parallelism set to 6, six fragments are downloaded at the same time; tasks not yet executed wait in the task queue and are triggered in sequence.
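A task pool with a fixed parallelism could look roughly like the sketch below. The names Task and runPool are hypothetical; each of the (at most) six workers takes the next waiting task as soon as its current one finishes, which is one way of realising the cyclic filling described above rather than the only one.

```typescript
// Illustrative sketch only; Task and runPool are hypothetical names.
type Task<T> = () => Promise<T>;

// Runs the tasks with at most `limit` in flight at any time and returns the
// results in the original task order.
async function runPool<T>(tasks: Task<T>[], limit = 6): Promise<T[]> {
  const results: T[] = new Array<T>(tasks.length);
  let next = 0; // index of the first task still in the waiting queue

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // claim the next waiting task
      results[i] = await tasks[i]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  const workers: Promise<void>[] = [];
  for (let w = 0; w < workerCount; w++) workers.push(worker());
  await Promise.all(workers); // rejects if any task rejects
  return results;
}
```

In this sketch a rejected task rejects the whole pool; per-fragment retry would be handled inside each task.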
Even if the user closes the browser or switches tabs, the Service Worker can continue the download tasks in the background once the user has granted the "background sync" permission. For example, the Background Sync API automatically wakes the Service Worker when the network is available so that the remaining fragment downloads can be completed. A timeout threshold is set for each fragment download task; if the task does not finish within the threshold, the request is aborted automatically and the fragment is marked as "to be retried"; after the retry count reaches its upper limit, an error callback is triggered and the user is prompted.
S104, after each fragment download task is completed, storing the downloaded fragment data and its metadata in an IndexedDB database of the browser, performing an integrity check on the fragment data, and re-downloading the current fragment according to the result of the integrity check.
Specifically, after each fragment download task is completed, the downloaded binary fragment data is stored in the IndexedDB database and associated with the corresponding fragment metadata record table in the IndexedDB database.
Further, the hash value of the downloaded fragment data is calculated and compared with the hash value of that fragment recorded in the target file metadata record table. If they are consistent, the download state in the fragment metadata record table is updated to completed; if they are inconsistent, automatic retry logic is triggered to download the fragment data again. Automatic retry is triggered at most 3 times, and the retry count field of the fragment in the IndexedDB database is updated before each retry. After the retry limit is exceeded, the download state of the fragment data is marked as failed and the user is prompted, through a user interface, to intervene manually.
If the download succeeds, the binary data of each fragment (in ArrayBuffer format) is stored in a "fragment data table" of IndexedDB whose primary key is the unique identifier of the fragment (e.g., fileId_index). Writes use IndexedDB transactions to ensure atomicity and avoid inconsistent state caused by partially written data.
As a possible implementation, each fragment is requested through an HTTP Range request whose request header specifies a byte range (e.g., bytes=0-10485759 denotes the first 10 MB fragment). After the download is completed, the front end uses the browser's built-in crypto.subtle.digest API to calculate the SHA-256 hash value of the fragment data and compares it with the hash value stored in the IndexedDB database; if they are consistent the fragment is marked as completed, and if they are inconsistent the automatic retry logic is triggered. If a fragment download fails (for example, because of a network interruption or a failed hash check), the system automatically retries at most 3 times and updates the "retry count" field of the fragment in IndexedDB before each retry. After the retry limit is exceeded, the fragment is marked as failed and the user is prompted, through a user interface, to intervene manually.
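The download, verify, and retry cycle could be sketched as follows. FragmentJob, Fetcher, Hasher, and downloadVerified are hypothetical names; the network request and the hash function are injected so that the control flow stands on its own. In a browser the Hasher would presumably wrap crypto.subtle.digest with SHA-256, and onRetry would update the retry count field in IndexedDB.

```typescript
// Illustrative sketch only; all names are hypothetical.
interface FragmentJob {
  id: string;
  startByte: number; // inclusive
  endByte: number;   // inclusive
  expectedHash: string;
}

type Fetcher = (rangeHeader: string) => Promise<Uint8Array>;
type Hasher = (data: Uint8Array) => Promise<string>;
type VerifiedResult =
  | { state: "completed"; data: Uint8Array; retries: number }
  | { state: "failed"; retries: number };

const MAX_RETRIES = 3; // at most 3 automatic retries, as in the text

// Value of the HTTP Range request header for one fragment.
function rangeHeader(job: FragmentJob): string {
  return `bytes=${job.startByte}-${job.endByte}`;
}

async function downloadVerified(
  job: FragmentJob,
  fetchRange: Fetcher,
  hash: Hasher,
  onRetry: (retryCount: number) => void
): Promise<VerifiedResult> {
  for (let attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    if (attempt > 0) onRetry(attempt); // record the retry count before each retry
    try {
      const data = await fetchRange(rangeHeader(job));
      if ((await hash(data)) === job.expectedHash) {
        return { state: "completed", data, retries: attempt };
      }
      // Hash mismatch: fall through to the next attempt.
    } catch {
      // Network error: fall through to the next attempt.
    }
  }
  return { state: "failed", retries: MAX_RETRIES }; // caller prompts the user
}
```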
Further, interruption detection is performed during fragment download and the cause of the interruption is determined, wherein the causes comprise network interruption and other causes.
When the cause is a network interruption, the information of unfinished fragment requests is stored in an offline task table of the IndexedDB database, wherein the fragment request information at least comprises the fragment URL, fragment index, and request time. When recovery of the network connection is detected, resumption of the offline tasks is triggered automatically by listening for the sync event; the unfinished fragment download tasks are read from the offline task table, re-added to the task queue, and executed with priority.
When the interruption has another cause, the list of downloaded fragments is read from the IndexedDB database to determine the breakpoint, and the unfinished fragment download tasks after the breakpoint are re-added to the task queue to resume the download from the breakpoint.
As a possible implementation, when the network is interrupted, the Service Worker stores the information of unfinished fragment requests (including the fragment URL, fragment index, and request time) in the "offline task table" of IndexedDB to ensure data persistence. When the browser detects that the network connection has been restored, the Service Worker automatically triggers resumption of the offline tasks by listening for the sync event (with the event tag retry-flush-tasks); the system reads the unfinished fragment tasks from IndexedDB, re-adds them to the download queue, and executes them with priority. If the download was interrupted for another reason, then when it is initiated again the list of downloaded fragments is read directly from IndexedDB and only the unfinished fragments are requested. For example, if the total file size is 1 GB and 600 MB has been downloaded, only the fragments corresponding to the remaining 400 MB are requested, and the stored data is skipped.
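The two recovery paths could be expressed as the pure functions sketched below. FragmentStatus, OfflineTask, remainingFragments, and requeueOffline are hypothetical names, and ordering the re-queued offline tasks by request time is an assumption; reading the tables from IndexedDB and handling the sync event itself are omitted.

```typescript
// Illustrative sketch only; all names are hypothetical.
interface FragmentStatus {
  index: number;
  startByte: number;
  completed: boolean;
}

interface OfflineTask {
  url: string;
  index: number;
  requestedAt: number; // epoch milliseconds
}

// Breakpoint recovery: only fragments not yet downloaded are requested again.
function remainingFragments(all: FragmentStatus[]): FragmentStatus[] {
  return all
    .filter((f) => !f.completed)
    .sort((a, b) => a.startByte - b.startByte);
}

// Network recovery: offline tasks go to the front of the queue, oldest first
// (an assumption), and are not queued twice.
function requeueOffline(queue: number[], offline: OfflineTask[]): number[] {
  const priority = offline
    .slice()
    .sort((a, b) => a.requestedAt - b.requestedAt)
    .map((t) => t.index);
  const seen = new Set<number>(priority);
  return [...priority, ...queue.filter((i) => !seen.has(i))];
}
```

For instance, with ten fragments of which the first six are stored, remainingFragments returns fragments 6 to 9 only.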
Further, during download the invention stores data by streaming writes to disk: while fragment data is being downloaded, the data stream of the current fragment is written in real time into the persistent disk space of the browser through the Streams API, while a parallel thread in memory reads the data stream in real time for downloading, and the downloaded data stream is written directly to a disk file.
As a possible implementation, the fragment data stream is written in real time through the Streams API into the persistent disk space allocated by the browser (e.g., through Chrome's File System Access API), which avoids loading the complete fragment data into memory. For example, during fragment download the data stream is read chunk by chunk through response.body.getReader() and written directly to a disk file, so that only the data chunk currently being processed by the thread is kept in memory.
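The chunk-by-chunk copy could be sketched as below. ChunkReader, ChunkSink, and pipeChunks are hypothetical names; the interfaces are deliberately minimal so that, in a browser, the reader returned by response.body.getReader() and a writable file stream obtained through the File System Access API could plausibly be passed in, although that pairing is an assumption rather than something the text specifies.

```typescript
// Illustrative sketch only; all names are hypothetical.
interface ChunkReader {
  read(): Promise<{ done: boolean; value?: Uint8Array }>;
}

interface ChunkSink {
  write(chunk: Uint8Array): Promise<void>;
  close(): Promise<void>;
}

// Copies the stream to the sink one chunk at a time, so that only the chunk
// currently being processed is held in memory. Returns the bytes written.
async function pipeChunks(reader: ChunkReader, sink: ChunkSink): Promise<number> {
  let written = 0;
  try {
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      if (value !== undefined) {
        await sink.write(value); // wait for the write before reading more
        written += value.byteLength;
      }
    }
  } finally {
    await sink.close(); // always release the file handle
  }
  return written;
}
```

Awaiting each write before the next read provides simple backpressure: the download cannot run ahead of the disk.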
S105, after all fragment download tasks have been executed, reading all fragment data from the IndexedDB database in fragment order and splicing it into a complete file.
Specifically, after processing of each fragment is completed, the memory resources are released immediately and residual data is forcibly cleaned up through the WeakMap garbage collection mechanism, so as to prevent memory leaks.
Further, for repeated download requests for the same target file, it is determined whether the target file has changed. If it has not changed, the fragment data stored in the IndexedDB database is read directly; if it has changed, the fragment data of the unchanged part is read from the IndexedDB database and only the fragment data of the changed part is downloaded again, so as to reduce redundant traffic.
Furthermore, the invention monitors the download progress: through an event listening mechanism, the number of fragments whose download is complete is acquired from metadata download-state change events of the IndexedDB database. The download progress of the target file is determined from the number of downloaded fragments and displayed in the front-end interface in real time. When the user clicks the pause button, all ongoing fragment requests are aborted through AbortController and the current state is saved to the IndexedDB database; when the user clicks resume, the saved state is reloaded and the download continues.
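Progress tracking and pausing could be combined in a small session object such as the one sketched below. FragmentSession and SavedState are hypothetical names; the sketch keeps one AbortController per in-flight fragment, and the object returned by pause() stands in for the state that the text saves to IndexedDB.

```typescript
// Illustrative sketch only; FragmentSession and SavedState are hypothetical.
interface SavedState {
  completed: number[]; // fragment indices already stored
  pending: number[];   // fragment indices to request on resume
}

class FragmentSession {
  private readonly total: number;
  private readonly inFlight = new Map<number, AbortController>();
  private readonly completed = new Set<number>();

  constructor(total: number) {
    this.total = total;
  }

  // Call when a fragment request starts; pass the signal to the request.
  begin(index: number): AbortSignal {
    const controller = new AbortController();
    this.inFlight.set(index, controller);
    return controller.signal;
  }

  // Call when a fragment has been downloaded and verified.
  finish(index: number): void {
    this.inFlight.delete(index);
    this.completed.add(index);
  }

  // Progress as a whole percentage of completed fragments.
  progressPercent(): number {
    if (this.total === 0) return 100;
    return Math.floor((this.completed.size / this.total) * 100);
  }

  // Aborts every ongoing request and returns the state to persist.
  pause(): SavedState {
    this.inFlight.forEach((controller) => controller.abort());
    this.inFlight.clear();
    const pending: number[] = [];
    for (let i = 0; i < this.total; i++) {
      if (!this.completed.has(i)) pending.push(i);
    }
    const completed = Array.from(this.completed).sort((a, b) => a - b);
    return { completed, pending };
  }
}
```

On resume, the pending indices from the saved state would be placed back in the task queue.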
Further, after all download tasks are completed, all fragment data in the downloaded state is read from IndexedDB in fragment order (ascending start byte). The fragment data is then spliced into a complete file through the Blob API to generate a Blob object of the final file. The showSaveFilePicker method of the File System Access API is called and the Blob is written to the user-specified path (for example, the download directory) to complete the download.
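The ordering and splicing step could be illustrated as follows. StoredFragment and assemble are hypothetical names, and the sketch concatenates into a single Uint8Array purely for demonstration; in a browser the ordered parts would instead be handed to the Blob constructor, as the text describes, so that the whole file need not sit in one buffer. The gap check is an added safeguard, not something stated in the text.

```typescript
// Illustrative sketch only; StoredFragment and assemble are hypothetical.
interface StoredFragment {
  startByte: number;
  data: Uint8Array;
}

// Orders fragments by ascending start byte and joins them, rejecting any
// gap or overlap between consecutive fragments.
function assemble(fragments: StoredFragment[]): Uint8Array {
  const ordered = [...fragments].sort((a, b) => a.startByte - b.startByte);
  const total = ordered.reduce((n, f) => n + f.data.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const f of ordered) {
    if (f.startByte !== offset) {
      throw new Error(`gap or overlap at byte ${offset}`);
    }
    out.set(f.data, offset);
    offset += f.data.byteLength;
  }
  return out;
}
```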
And S106, after the complete file is written into a user-specified path, deleting the corresponding fragment data and the metadata thereof in the IndexedDB database, and releasing the storage space.
Specifically, after the file is downloaded successfully, the corresponding fragment metadata and binary data in IndexedDB are deleted, and the storage space is released.
For fragment data whose download has not completed, a timed task that runs once a day is registered (via setInterval) after the system starts, and the "creation time" field of all files in IndexedDB is checked. If a file has remained undownloaded for more than 7 days, or has been completed but not cleaned up for more than 3 days, all of its fragment data and metadata are automatically deleted to release the storage space.
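The expiry rule can be expressed as a small predicate. This sketch assumes both thresholds are measured from the "creation time" field (the only timestamp the text names) and uses a hypothetical `completed` flag; the record shape is illustrative.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical file record; only the "creation time" field is named in the text.
interface FileRecord {
  createdAt: number;  // "creation time", in milliseconds since the epoch
  completed: boolean; // assumed flag: true once the whole file has been downloaded
}

// True when the record's fragment data and metadata should be deleted.
function isExpired(record: FileRecord, now: number): boolean {
  const ageDays = (now - record.createdAt) / DAY_MS;
  // Completed but not cleaned: more than 3 days. Not downloaded: more than 7 days.
  return record.completed ? ageDays > 3 : ageDays > 7;
}

// Select the records that a daily cleanup pass would remove.
function selectExpired(records: readonly FileRecord[], now: number): FileRecord[] {
  return records.filter((r) => isExpired(r, now));
}
```

A daily pass would then amount to something like `setInterval(runCleanup, DAY_MS)`, where `runCleanup` (a hypothetical helper) reads the records, calls `selectExpired`, and deletes the matching entries.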
In addition, an embodiment of the present invention also provides a browser large file download device. As shown in fig. 2, the device specifically comprises:
at least one processor; and a memory communicatively coupled to the at least one processor, wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform:
intercepting a file download request of the browser through a Service Worker script, and proxying the file download task to a background thread for execution;
fragmenting the target file in the file download task according to a preset fragmentation rule, and creating a fragment download task for each fragment;
creating a parallel task pool, and dynamically allocating the fragment download tasks through the parallel task pool;
after each fragment download task is completed, storing the downloaded fragment data and its metadata in the IndexedDB database of the browser, and performing an integrity check on the fragment data;
re-downloading the current fragment according to the fragment integrity check result;
after all fragment download tasks have been executed, reading all fragment data from the IndexedDB database in fragment order and splicing it into a complete file;
and after the complete file is written to a user-specified path, deleting the corresponding fragment data and its metadata from the IndexedDB database to release the storage space.
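As an illustration of the fragmentation step listed above, the sketch below computes inclusive byte ranges for a fixed fragment size and derives a fragment size from a percentage of the file size. All names are hypothetical, and using an HTTP `Range` header to request each fragment is an assumption rather than something the text specifies.

```typescript
// Hypothetical description of one fragment's byte range (both ends inclusive).
interface FragmentRange {
  index: number;
  startByte: number;
  endByte: number;
}

// Split a file of fileSize bytes into consecutive ranges of at most fragmentSize bytes.
function planFragments(fileSize: number, fragmentSize: number): FragmentRange[] {
  if (!Number.isInteger(fileSize) || fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  if (!Number.isInteger(fragmentSize) || fragmentSize <= 0) {
    throw new RangeError("fragmentSize must be a positive integer");
  }
  const ranges: FragmentRange[] = [];
  for (let start = 0, index = 0; start < fileSize; start += fragmentSize, index++) {
    // The last fragment may be shorter than fragmentSize.
    const endByte = Math.min(start + fragmentSize, fileSize) - 1;
    ranges.push({ index, startByte: start, endByte });
  }
  return ranges;
}

// Fragment size for a "percentage of file size" rule, rounded up and at least 1 byte.
function fragmentSizeFromPercent(fileSize: number, percent: number): number {
  return Math.max(1, Math.ceil((fileSize * percent) / 100));
}

// Value for an HTTP Range request header covering one fragment.
function rangeHeader(range: FragmentRange): string {
  return `bytes=${range.startByte}-${range.endByte}`;
}
```

For example, a 10-byte file with 4-byte fragments yields the ranges 0-3, 4-7, and 8-9; the inclusive end byte matches the form used by the `Range` header.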
The embodiments of the present invention are described in a progressive manner. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, device, and non-volatile computer storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant details, refer to the corresponding parts of the method embodiments.
The foregoing describes certain embodiments of the present invention. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing is merely exemplary of the present invention and is not intended to limit the present invention. Various modifications and changes may be made to the embodiments of the invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the embodiments of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A browser large file download method, characterized in that the method comprises:
intercepting a file download request of a browser through a Service Worker script, and proxying the file download task to a background thread for execution;
fragmenting the target file in the file download task according to a preset fragmentation rule, and creating a fragment download task for each fragment;
creating a parallel task pool, and dynamically allocating the fragment download tasks through the parallel task pool;
after each fragment download task is completed, storing the downloaded fragment data and its metadata in the IndexedDB database of the browser, and performing an integrity check on the fragment data;
re-downloading the current fragment according to the fragment integrity check result;
after all fragment download tasks have been executed, reading all fragment data from the IndexedDB database in fragment order and splicing it into a complete file;
after the complete file is written to a user-specified path, deleting the corresponding fragment data and its metadata from the IndexedDB database to release storage space.
2. The browser large file download method according to claim 1, characterized in that fragmenting the target file in the file download task according to a preset fragmentation rule, and creating a fragment download task for each fragment, specifically comprises:
obtaining the file size of the target file and retrieving the currently set preset fragmentation rule, wherein the preset fragmentation rule at least includes: fragmenting by a fixed fragment size, fragmenting by a preset percentage of the target file size, and fragmenting by a user-defined fragment size;
calculating, based on the preset fragmentation rule and the file size, the number of fragments and the fragment size of each piece of fragment data, so as to fragment the target file;
creating an identifier for each piece of fragment data, and creating a fragment download task for each fragment;
meanwhile, creating a target file metadata record table and a fragment metadata record table in the IndexedDB database to store the metadata of the target file and the metadata of the fragment data, wherein the metadata of the target file at least includes the file size, file type, and number of fragments of the target file and the hash value of each fragment, and the metadata of the fragment data at least includes the fragment identifier, start byte, end byte, download status, hash value, retry count, and last update time of the fragment data.
3. The browser large file download method according to claim 1, characterized in that creating a parallel task pool, and dynamically allocating the fragment download tasks through the parallel task pool, specifically comprises:
obtaining the concurrency limit of the current browser and creating a parallel task pool, wherein the number of parallel threads in the parallel task pool is equal to the concurrency limit;
cyclically filling the parallel task pool through a task queue mechanism and storing unexecuted tasks in a waiting queue, so as to dynamically allocate the fragment download tasks.
4. The browser large file download method according to claim 1, characterized in that, after each fragment download task is completed, storing the downloaded fragment data and its metadata in the IndexedDB database of the browser, performing an integrity check on the fragment data, and re-downloading the current fragment according to the fragment integrity check result, specifically comprises:
after each fragment download task is completed, storing the downloaded binary fragment data in the IndexedDB database and associating it with the corresponding fragment metadata record table in the IndexedDB database;
calculating the hash value of the downloaded fragment data and comparing it with the hash value of that fragment recorded in the target file metadata record table; if the values match, updating the download status in the fragment metadata record table to completed; if the values do not match, triggering automatic retry logic to re-download the fragment data, wherein the automatic retry is triggered at most 3 times and the retry count field of the fragment in the IndexedDB database is updated before each retry;
after the retry limit is exceeded, marking the download status of the fragment data as "failed" and prompting the user through the user interface to intervene manually.
5. The browser large file download method according to claim 1, characterized in that, before re-downloading the current fragment according to the fragment integrity check result, the method further comprises:
performing interruption detection during fragment download to determine the interruption cause, wherein the interruption cause includes network interruption and other causes;
when the interruption cause is a network interruption, storing the unfinished fragment request information in an offline task table of the IndexedDB database, wherein the fragment request information at least includes the fragment URL, fragment index, and request time;
when network reconnection is detected, automatically triggering resumption of the offline tasks by listening for the sync event, reading the unfinished fragment download tasks from the offline task table, re-adding them to the task queue, and executing them with priority;
when the interruption cause is another cause, reading the list of downloaded fragments from the IndexedDB database to determine the breakpoint, and re-adding the unfinished fragment download tasks after the breakpoint to the task queue to resume the download from the breakpoint.
6. The browser large file download method according to claim 1, characterized in that intercepting the file download request of the browser through a Service Worker script, and proxying the file download task to the Service Worker thread for execution, specifically comprises:
registering the Service Worker script through the navigator.serviceWorker.register method and specifying the root path as its scope, to ensure that all file download requests can be intercepted;
intercepting the file download request of the browser through the registered Service Worker script, and proxying the intercepted file download task to a background thread for execution.
7. The browser large file download method according to claim 1, characterized in that the method further comprises:
during fragment data download, writing the data stream of the current fragment data to the persistent disk space of the browser in real time through the Streams API, while parallel threads in memory read the data stream from the persistent disk space in real time for downloading, and writing the downloaded data stream directly to a disk file.
8. The browser large file download method according to claim 1, characterized in that the method further comprises:
after each fragment is processed, releasing memory resources immediately, and forcibly cleaning up residual data through the WeakMap garbage collection mechanism to prevent memory leaks;
for repeated download requests for the same target file, determining whether the target file has changed;
if it has not changed, directly reading the fragment data stored in the IndexedDB database;
if it has changed, reading the fragment data of the unchanged part from the IndexedDB database and re-downloading only the fragment data of the changed part, to reduce redundant traffic consumption.
9. The browser large file download method according to claim 1, characterized in that the method specifically comprises:
obtaining, through an event listening mechanism, the number of fragments that have finished downloading from the download-status change events of the metadata in the IndexedDB database;
determining the download progress of the target file according to the number of fragments that have finished downloading, and displaying the download progress in the front-end interface in real time;
when the user clicks the pause button, aborting all in-progress fragment requests through AbortController and saving the current state to the IndexedDB database; when the user clicks resume, reloading the saved state and continuing the download.
10. A browser large file download device, characterized in that the device comprises:
at least one processor; and
a memory communicatively connected to the at least one processor, wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the browser large file download method according to any one of claims 1-9.
CN202510384367.6A 2025-03-28 2025-03-28 A browser large file download method and device Pending CN119996406A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510384367.6A CN119996406A (en) 2025-03-28 2025-03-28 A browser large file download method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510384367.6A CN119996406A (en) 2025-03-28 2025-03-28 A browser large file download method and device

Publications (1)

Publication Number Publication Date
CN119996406A (en) 2025-05-13

Family

ID=95624598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510384367.6A Pending CN119996406A (en) 2025-03-28 2025-03-28 A browser large file download method and device

Country Status (1)

Country Link
CN (1) CN119996406A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN120935167A (en) * 2025-10-10 2025-11-11 冠骋信息技术(苏州)有限公司 Asynchronous file generation and downloading realization method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination