CN118885492A - A label updating method based on LRU - Google Patents
- Publication number
- CN118885492A CN202410962164.6A
- Authority
- CN
- China
- Prior art keywords
- time
- history
- historical
- linked list
- real
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
- G06F16/2379—Updates performed during online database operations; commit processing
- G06F16/2358—Change logging, detection, and notification
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Description
Technical Field
The present invention relates to the technical field of data processing, and in particular to an LRU-based tag updating method.
Background Art
With the rapid development of the information society, data processing technology is applied ever more widely across industries. Tags are an important element of data processing and play a key role in applications such as classification, search, and recommendation.
Chinese patent application publication No. CN117648608A discloses a time-series-based user tag updating method, system, device, and storage medium. The method includes: acquiring user status data and a first user tag in real time, and inputting the user status data into a preset trend analysis model to obtain time-series data; acquiring a first time data feature from the time-series data, and generating tag update data from the first time data feature; inputting the tag update data into the first user tag and updating the first user tag accordingly to obtain a second user tag; and acquiring tag association information from the second user tag and the first user tag, and constructing a tag mapping table from the tag association information.
Existing tag updating methods usually rely on scheduled or batch updates. Because the update frequency and update volume of tags are difficult to predict accurately, these approaches waste resources.
Summary of the Invention
To this end, the present invention provides an LRU-based tag updating method that can solve the problem of wasted resources.
To achieve the above object, the present invention provides an LRU-based tag updating method, the method comprising:
acquiring a number of historical behavior data items of a user within a historical period, and determining a number of history tags and their corresponding access timestamps based on the historical behavior data;
creating a history storage linked list based on the history tags and the access timestamps corresponding to the history tags;
determining an initial update frequency of any of the history tags based on the access timestamps;
collecting real-time behavior data of the user within a preset time period, and determining a real-time tag of the user based on the real-time behavior data;
matching the real-time tag against the history storage linked list to obtain a matching result, and calculating a real-time update frequency of the tag according to the matching result;
adjusting the history length of the history storage linked list based on the initial update frequency and the real-time update frequency, and adjusting the history ordering of the history storage linked list based on the matching result, to obtain an updated storage linked list;
traversing the updated storage linked list, performing expiration detection on the update timestamps corresponding to the updated tags, and adjusting the updated storage linked list again according to the expiration detection result to obtain a final storage linked list.
Further, the step of determining a number of history tags based on the historical behavior data of the user includes:
acquiring historical transaction records of the user, the historical transaction records including historical transaction information, historical transaction time, historical transaction amount, and historical commodity type;
determining the access timestamps based on the historical transaction time;
determining the history tag based on the historical transaction information, the historical transaction amount, and the historical commodity type.
Further, the step of determining the history tag based on the historical transaction information, the historical transaction amount, and the historical commodity type includes:
determining a historical transaction frequency based on the historical transaction information, and determining transaction activity based on the transaction frequency;
determining a total historical transaction amount based on the historical transaction amount, and determining spending power based on the total historical transaction amount;
determining a commodity preference based on the historical commodity type;
determining the history tag based on the transaction activity, the spending power, and the commodity preference.
Further, the step of determining the initial update frequency of any of the history tags based on the access timestamps includes:
sorting the access timestamps corresponding to the history tag in chronological order to obtain a sorting result;
calculating the time interval between each pair of adjacent access timestamps according to the sorting result;
calculating the mean of the time intervals and taking it as the mean period;
calculating the ratio of the number of access timestamps of the history tag to the total number of access timestamps of all history tags, and taking the ratio as the access frequency;
determining the initial update frequency based on the mean period, the access frequency, and an adjustment coefficient.
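The first four sub-steps above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names and the sample timestamps are assumptions, and the final step (combining the mean period, the access frequency, and the adjustment coefficient) is left out because its formula is not given in this passage.

```python
from statistics import mean

def mean_period(timestamps):
    """Mean gap between consecutive access timestamps, after sorting."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # A single access has no interval, so the mean period is undefined.
    return mean(gaps) if gaps else None

def access_frequency(tag, history):
    """Share of all recorded accesses that belong to `tag`."""
    total = sum(len(stamps) for stamps in history.values())
    return len(history[tag]) / total

# Illustrative data: tag name -> access timestamps (arbitrary time units).
history = {"A": [25, 45, 150, 320], "B": [15, 100], "C": [50, 240]}
periods = {tag: mean_period(stamps) for tag, stamps in history.items()}
shares = {tag: access_frequency(tag, history) for tag in history}
```

With this data, tag A has intervals 20, 105, and 170, so its mean period is 295/3, and it accounts for 4 of the 8 recorded accesses.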
Further, the step of matching the real-time tag against the history storage linked list includes:
traversing the history storage linked list to look for the real-time tag, and, if the history storage linked list contains the real-time tag, marking the corresponding history tag in the history storage linked list to obtain a marked tag;
if the history storage linked list does not contain the real-time tag, calculating the similarity between the real-time tag and each history tag to obtain similarity calculation results;
if the largest similarity calculation result is greater than or equal to a preset similarity, marking the history tag corresponding to that largest result to obtain a marked tag.
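The matching logic above might look as follows. The text does not name a similarity measure, so this sketch assumes that a tag is a tuple of attribute strings and uses Jaccard similarity over those attributes; `match_tag`, `jaccard`, and the threshold of 0.5 are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two attribute collections (assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def match_tag(realtime_tag, history_tags, threshold=0.5):
    """Return (marked_tag, kind); kind is 'exact', 'similar', or 'none'."""
    # Exact match: the history already contains the real-time tag.
    if realtime_tag in history_tags:
        return realtime_tag, "exact"
    # Otherwise pick the most similar history tag.
    best, best_score = None, -1.0
    for tag in history_tags:
        score = jaccard(realtime_tag, tag)
        if score > best_score:
            best, best_score = tag, score
    # Mark it only if it reaches the preset similarity.
    if best is not None and best_score >= threshold:
        return best, "similar"
    return None, "none"

history_tags = [
    ("medium activity", "high spending power", "digital electronics"),
    ("low activity", "low spending power", "books"),
]
result = match_tag(
    ("high activity", "high spending power", "digital electronics"),
    history_tags,
)
```

Here the real-time tag shares two of four distinct attributes with the first history tag, giving a similarity of 0.5, which meets the assumed threshold.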
Further, the step of calculating the real-time update frequency of the tag according to the matching result includes:
when the matching result is that the history storage linked list contains the marked tag, acquiring the marking timestamps corresponding to the marked tag;
calculating a marking duration based on the marking timestamps;
calculating a first update frequency based on the number of marked tags and the marking duration;
when the matching result is that the history storage linked list does not contain the marked tag, acquiring the real-time timestamps corresponding to the real-time tag;
calculating a real-time duration based on the real-time timestamps;
calculating a second update frequency based on the number of real-time tags and the real-time duration.
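Both branches above compute a frequency from a count and a duration. The exact formula is not stated, so the sketch below assumes the duration is the span between the earliest and latest timestamp and the frequency is the number of occurrences divided by that span; `update_frequency` is a hypothetical helper that would serve for either the first or the second update frequency.

```python
def update_frequency(timestamps):
    """Occurrences per unit time over the span covered by `timestamps`.

    Assumption: duration = latest timestamp - earliest timestamp.
    Returns None when no span can be measured.
    """
    if len(timestamps) < 2:
        return None
    duration = max(timestamps) - min(timestamps)
    if duration == 0:
        return None
    return len(timestamps) / duration

# Four occurrences spread over 40 time units.
freq = update_frequency([0, 10, 20, 40])
```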
Further, the step of adjusting the history length of the history storage linked list based on the initial update frequency and the real-time update frequency includes:
increasing or decreasing the history length according to the result of comparing the first update frequency with the initial update frequency;
increasing the history length according to the result of comparing the second update frequency with the initial update frequency.
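A possible reading of this adjustment is sketched below. The description states the direction of each change (grow when the first update frequency is at least the initial one, shrink when it is lower; for the second update frequency, grow by the ratio or leave unchanged), but not the amounts, so scaling the length by the frequency ratio in every branch is an assumption, as are the function name and the minimum length of 1.

```python
import math

def adjust_history_length(length, initial_freq, realtime_freq, matched):
    """Return a new history length (assumed rule: scale by the frequency ratio).

    matched=True  -> realtime_freq is the first update frequency
    matched=False -> realtime_freq is the second update frequency
    """
    ratio = realtime_freq / initial_freq
    if ratio >= 1:
        # Updates arrive at least as fast as expected: keep more history.
        new_length = math.ceil(length * ratio)
    elif matched:
        # A known tag is updating more slowly: keep less history.
        new_length = math.floor(length * ratio)
    else:
        # An unmatched tag below the initial frequency: no adjustment.
        new_length = length
    return max(1, new_length)
```

Note that at a ratio of exactly 1 this sketch leaves the length unchanged, whereas the text groups that case with "increase"; the increment to use there is not specified.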
Further, the step of adjusting the history ordering of the history storage linked list based on the matching result includes:
when the matching result is that the history storage linked list contains the marked tag, acquiring the position of the marked tag in the history storage linked list;
traversing the history storage linked list using the latest marking timestamp of the marked tag, and moving the marked tag according to the traversal result;
when the matching result is that the history storage linked list does not contain the marked tag, adding the real-time tag and its corresponding real-time timestamp to the head of the history storage linked list.
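The reordering can be illustrated with a plain Python list standing in for the linked list. This sketch assumes entries are `(tag, latest_timestamp)` pairs kept in ascending timestamp order from head to tail; the `reorder` function is hypothetical. The unmatched case follows the text literally and places the new entry at the head.

```python
def reorder(entries, tag, timestamp, matched):
    """Return a new entry list after a tag is seen at `timestamp`.

    entries: list of (tag, latest_timestamp), head first, ascending order.
    """
    if not matched:
        # Unmatched real-time tag: add it at the head, as the text states.
        return [(tag, timestamp)] + entries
    # Matched tag: remove its old entry, then walk the list to find the
    # first entry that is newer than the latest marking timestamp.
    rest = [entry for entry in entries if entry[0] != tag]
    position = len(rest)
    for index, (_, stamp) in enumerate(rest):
        if stamp > timestamp:
            position = index
            break
    return rest[:position] + [(tag, timestamp)] + rest[position:]

entries = [("B", 100), ("C", 240), ("A", 320)]
moved = reorder(entries, "B", 400, matched=True)
added = reorder(entries, "D", 410, matched=False)
```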
Further, the step of performing expiration detection on the update timestamps corresponding to the updated tags includes:
calculating an expiration time as the difference between the current timestamp and the update timestamp;
comparing the expiration time with a preset time, and determining from the comparison result whether the tag has expired.
Further, the step of adjusting the updated storage linked list again according to the expiration detection result includes:
when the expiration time is greater than the preset time, removing the updated tag from the updated storage linked list and updating the ordering of the remaining tags;
when the expiration time is less than or equal to the preset time, leaving the updated storage linked list unchanged.
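The expiration check reduces to one comparison per entry. A minimal sketch, again using a list of `(tag, update_timestamp)` pairs in place of the linked list; `drop_expired`, `now`, and `max_age` (the preset time) are illustrative names.

```python
def drop_expired(entries, now, max_age):
    """Keep entries whose age (now - update timestamp) is at most max_age."""
    # Filtering preserves the relative order of the remaining entries.
    return [(tag, stamp) for tag, stamp in entries if now - stamp <= max_age]

entries = [("B", 100), ("C", 240), ("A", 320)]
# At time 400 with a preset time of 200, B (age 300) expires.
final_entries = drop_expired(entries, now=400, max_age=200)
```

An entry whose age equals the preset time is kept, matching the "less than or equal to" branch above.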
Compared with the prior art, the present invention has the following beneficial effects. Collecting and analyzing the user's historical behavior data gives insight into the user's interests and needs and provides a basis for accurate subsequent tag updates. The linked list structure manages history tags efficiently and, combined with the access timestamps, makes it convenient to implement the LRU algorithm so that the most frequently used tags are retained. Determining the initial update frequency provides data support for subsequent tag updates, making the update strategy more reasonable and effective. Collecting and analyzing real-time behavior data reflects the user's current interests and needs and provides an important basis for real-time tag updates. Matching real-time tags against history tags evaluates how tags are currently being used, and the real-time update frequency calculated from that match provides real-time feedback to the update strategy. The history storage linked list is adjusted dynamically according to the initial and real-time update frequencies, so that the most frequently used tags are retained and infrequently used tags are removed, which optimizes the use of storage resources. Expiration detection finds and removes expired tags promptly, so that the tags in the storage linked list remain valid and accurate, which again optimizes the use of storage resources.
In particular, sorting the access timestamps chronologically shows clearly when each history tag was accessed and therefore how it is used. Calculating the time intervals between adjacent access timestamps and taking their mean as the mean period reflects the average period of tag use and provides an important basis for determining the update frequency. Calculating the access frequency of each history tag quantifies the relative importance of the tag. Determining the initial update frequency of each history tag dynamically from the mean period, the access frequency, and the adjustment coefficient improves the efficiency and accuracy of tag updates. Setting a suitable initial update frequency for each history tag avoids both unnecessarily frequent updates and late updates, improves the overall performance and stability of the system, and optimizes the use of storage resources.
In particular, traversing the history storage linked list quickly determines whether a tag identical to the real-time tag exists, which improves matching efficiency. When no identical tag exists in the history storage linked list, the similarity between the real-time tag and the existing history tags is calculated to quantify how closely they are related. A preset similarity threshold allows the strictness of matching to be adjusted to actual needs, which makes tag updating more flexible. Similarity calculation and matching identify history tags that are highly related to the real-time tag, which avoids storing similar tags repeatedly and optimizes the use of storage space.
In particular, the history length of the history storage linked list is adjusted dynamically by comparing the first update frequency with the initial update frequency. When the first update frequency is higher than or equal to the initial update frequency, the history length is increased to accommodate high-frequency updates; when the first update frequency is lower than the initial update frequency, the history length is reduced to optimize the use of storage space. When the real-time update frequency is higher than or equal to the initial update frequency, the history length is increased according to the ratio of the real-time update frequency to the initial update frequency, so that the system retains enough historical data to support accurate analysis and prediction of data changes. When the real-time update frequency is lower than the initial update frequency, the history length is not adjusted, which avoids wasting storage space. Dynamically adjusting the history length helps improve data processing efficiency, and precisely controlling it avoids storing excessive data, which reduces wasted storage space, computing resources, and maintenance cost.
Brief Description of the Drawings
FIG. 1 is a schematic flow chart of the LRU-based tag updating method provided by an embodiment of the present invention;
FIG. 2 is a schematic flow chart of determining the initial update frequency in the LRU-based tag updating method provided by an embodiment of the present invention;
FIG. 3 is a schematic flow chart of matching the real-time tag against the history storage linked list in the LRU-based tag updating method provided by an embodiment of the present invention;
FIG. 4 is a schematic flow chart of performing expiration detection on the update timestamps in the LRU-based tag updating method provided by an embodiment of the present invention.
Detailed Description
To make the objects and advantages of the present invention clearer, the present invention is further described below with reference to embodiments. It should be understood that the specific embodiments described here only explain the present invention and do not limit it.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments only explain the technical principles of the present invention and do not limit its scope of protection.
It should be noted that, in the description of the present invention, terms indicating direction or positional relationship, such as "up", "down", "left", "right", "inside", and "outside", are based on the directions or positional relationships shown in the drawings. They are used only for convenience of description and do not indicate or imply that the device or element must have a specific orientation or be constructed and operated in a specific orientation, so they should not be understood as limiting the present invention.
It should also be noted that, in the description of the present invention, unless otherwise expressly specified and limited, the terms "installed", "joined", and "connected" should be understood broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediate medium, or it may be internal communication between two elements. Those skilled in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.
Referring to FIG. 1, an embodiment of the present invention provides an LRU-based tag updating method, the method comprising:
Step S100, acquiring a number of historical behavior data items of a user within a historical period, and determining a number of history tags and their corresponding access timestamps based on the historical behavior data;
Step S200, creating a history storage linked list based on the history tags and the access timestamps corresponding to the history tags;
Step S300, determining an initial update frequency of any of the history tags based on the access timestamps;
Step S400, collecting real-time behavior data of the user within a preset time period, and determining a real-time tag of the user based on the real-time behavior data;
Step S500, matching the real-time tag against the history storage linked list to obtain a matching result, and calculating a real-time update frequency of the tag according to the matching result;
Step S600, adjusting the history length of the history storage linked list based on the initial update frequency and the real-time update frequency, and adjusting the history ordering of the history storage linked list based on the matching result, to obtain an updated storage linked list;
Step S700, traversing the updated storage linked list, performing expiration detection on the update timestamps corresponding to the updated tags, and adjusting the updated storage linked list again according to the expiration detection result to obtain a final storage linked list.
Specifically, in this embodiment the step of creating the storage linked list based on the user's history tags includes:
creating a number of empty doubly linked lists to serve as history storage linked lists;
taking each history tag together with its corresponding access timestamps as a doubly linked list node;
if the history storage linked list is empty, adding the node to the head of the history storage linked list;
if the history storage linked list is not empty, traversing the history storage linked list, marking the last node in the traversal whose access timestamp is less than that of the new node, and inserting the new node after that node or at the tail of the history storage linked list.
Specifically, an example of a history storage linked list in this embodiment is as follows. Suppose the access timestamps of history tag A are [25, 45, 150, 320], those of history tag B are [15, 100], and those of history tag C are [50, 240]. The latest access timestamp of each tag is selected as its representative timestamp, and the doubly linked list is built in ascending order of representative timestamp:
initialize an empty doubly linked list as the history storage linked list;
add history tag A as a node at the head of the history storage linked list;
compare the representative timestamp 100 of history tag B with the representative timestamp 320 of node A, and, according to the comparison result, insert node B before node A;
compare the representative timestamp 240 of history tag C with the representative timestamp 100 of node B and the representative timestamp 320 of node A, and, according to the comparison result, insert node C between node B and node A;
the resulting history storage linked list, from head to tail, is therefore node B (100), node C (240), node A (320).
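The construction in this example can be reproduced with a small doubly linked list. The sketch below is illustrative only: the class and method names are assumptions, and each node stores just the representative (latest) timestamp rather than the full timestamp list.

```python
class Node:
    def __init__(self, tag, timestamp):
        self.tag = tag
        self.timestamp = timestamp  # representative (latest) access timestamp
        self.prev = None
        self.next = None

class HistoryList:
    """Doubly linked list kept in ascending timestamp order, head first."""

    def __init__(self):
        self.head = None
        self.tail = None

    def insert(self, tag, timestamp):
        node = Node(tag, timestamp)
        if self.head is None:  # empty list: the node becomes the head
            self.head = self.tail = node
            return
        # Find the last node whose timestamp is smaller than the new one.
        anchor, current = None, self.head
        while current is not None and current.timestamp < timestamp:
            anchor, current = current, current.next
        if anchor is None:  # new node is the oldest: insert before the head
            node.next = self.head
            self.head.prev = node
            self.head = node
            return
        node.prev, node.next = anchor, anchor.next
        if anchor.next is None:  # anchor was the tail
            self.tail = node
        else:
            anchor.next.prev = node
        anchor.next = node

    def forward(self):
        out, current = [], self.head
        while current is not None:
            out.append((current.tag, current.timestamp))
            current = current.next
        return out

    def backward(self):
        out, current = [], self.tail
        while current is not None:
            out.append((current.tag, current.timestamp))
            current = current.prev
        return out

history = {"A": [25, 45, 150, 320], "B": [15, 100], "C": [50, 240]}
linked = HistoryList()
for tag, stamps in history.items():
    linked.insert(tag, max(stamps))
```

Inserting A, B, and C in that order yields B (100), C (240), A (320) from head to tail, the same order as in the example.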
Specifically, in this embodiment the historical period is one year;
the preset time period is 1/12 of the historical period.
Specifically, by collecting and analyzing the user's historical behavior data, this embodiment gains insight into the user's interests and needs and provides a basis for accurate subsequent tag updates. The linked list structure manages history tags efficiently and, combined with the access timestamps, makes it convenient to implement the LRU algorithm so that the most frequently used tags are retained. Determining the initial update frequency provides data support for subsequent tag updates, making the update strategy more reasonable and effective. Collecting and analyzing real-time behavior data reflects the user's current interests and needs and provides an important basis for real-time tag updates. Matching real-time tags against history tags evaluates how tags are currently being used, and the real-time update frequency calculated from that match provides real-time feedback to the update strategy. The history storage linked list is adjusted dynamically according to the initial and real-time update frequencies, so that the most frequently used tags are retained and infrequently used tags are removed, which optimizes the use of storage resources. Expiration detection finds and removes expired tags promptly, so that the tags in the storage linked list remain valid and accurate, which again optimizes the use of storage resources.
Specifically, the step of determining a number of history tags and their corresponding access timestamps based on the historical behavior data includes:
acquiring historical transaction records of the user, the historical transaction records including historical transaction information, historical transaction time, historical transaction amount, and historical commodity type;
determining the access timestamps based on the historical transaction time;
determining the history tag based on the historical transaction information, the historical transaction amount, and the historical commodity type.
Specifically, the step of determining the history tag based on the historical transaction information, the historical transaction amount, and the historical commodity type includes:
determining a historical transaction frequency based on the historical transaction information, and determining transaction activity based on the transaction frequency;
determining a total historical transaction amount based on the historical transaction amount, and determining spending power based on the total historical transaction amount;
determining a commodity preference based on the historical commodity type;
determining the history tag based on the transaction activity, the spending power, and the commodity preference.
Specifically, an example of determining the historical tag based on the user transaction preference in an embodiment of the present invention is as follows. User L made 50 transactions on the platform in the past year, with a total transaction amount of RMB 50,000. The purchased commodities were mainly electronic products (40%) and books (30%), and the user made 2 returns and 5 exchanges. Fifty transactions in one year correspond to a transaction frequency of about 4 per month. Compared with the platform-wide average transaction frequency of 6 per month, the user's transaction activity would be medium-low; compared with the platform-wide minimum transaction frequency of 2 per month, the user's transaction activity is determined to be medium;
the user's total transaction amount of RMB 50,000 is compared with the platform-wide average total transaction amount of RMB 45,000, and the user's spending power is determined to be high;
since electronic products make up the largest share of the user's purchases (40%), the user's commodity preference is determined to be digital electronics;
based on the above, the user tag is determined to be [medium activity, high spending power, digital electronics].
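The example above can be sketched in Python as follows. This is a minimal illustration, not the patented procedure: the function name `build_history_tag`, the three-level activity rule (at or above the average is high, between the minimum and the average is medium, otherwise low) and the two-level spending rule are assumptions chosen only so that the sketch reproduces the example's outcome; the source does not state the exact thresholds.

```python
def build_history_tag(tx_count, months, total_amount, category_shares,
                      avg_freq, min_freq, avg_total):
    """Return [activity, spending power, preference] for one user.

    All thresholds below are assumptions for illustration only.
    """
    # Transaction frequency in transactions per month.
    freq = tx_count / months

    # Assumed activity rule: compare with platform average and minimum.
    if freq >= avg_freq:
        activity = "high activity"
    elif freq > min_freq:
        activity = "medium activity"
    else:
        activity = "low activity"

    # Assumed spending rule: compare with the platform-wide average total.
    if total_amount >= avg_total:
        spending = "high spending power"
    else:
        spending = "low spending power"

    # Preference is the commodity type with the largest purchase share.
    preference = max(category_shares, key=category_shares.get)
    return [activity, spending, preference]


# Values taken from the example of user L.
tag = build_history_tag(
    tx_count=50, months=12, total_amount=50_000,
    category_shares={"electronics": 0.40, "books": 0.30},
    avg_freq=6, min_freq=2, avg_total=45_000,
)
print(tag)
```

The sketch labels the preference with the raw category name; mapping "electronics" to a display label such as "digital electronics" would be a separate lookup step.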
Specifically, by analyzing the user's transaction frequency, total transaction amount and commodity types, the embodiment of the present invention determines the user's transaction activity, spending power and commodity preference, and thereby obtains the user tag more accurately, which supports subsequent tag updating and management.
Referring to FIG. 2, the step of determining the initial update frequency of any one of the historical tags based on the plurality of access timestamps includes:
Step S310: sorting the plurality of access timestamps corresponding to the historical tag in time order to obtain a sorting result;
Step S320: calculating the time interval between each pair of adjacent access timestamps according to the sorting result;
Step S330: calculating the mean of the time intervals and taking the mean as the mean period;
Step S340: calculating the ratio of the number of access timestamps corresponding to the historical tag to the total number of access timestamps of all the historical tags, and taking the ratio as the access frequency;
Step S350: determining the initial update frequency based on the mean period, the access frequency and an adjustment coefficient.
Specifically, the time order in the embodiment of the present invention is from earliest to latest.
Specifically, the adjustment coefficient in the embodiment of the present invention is obtained by dividing the sum of the maximum time interval and the minimum time interval by the difference between the maximum time interval and the minimum time interval;
Assume that a historical tag has the following access timestamps (unit: seconds) in the past week: [1, 100, 500, 1000, 1200, 1800, 2400]. The time intervals are [99, 400, 500, 200, 600, 600], and the mean period is (99+400+500+200+600+600)/6 ≈ 399.83 seconds. Assume that the total number of access timestamps of all historical tags is 1000; the tag has 7 access timestamps, so its access frequency is 7/1000 = 0.007. The adjustment coefficient is (600+99)/(600-99) ≈ 1.395. The initial update frequency can be set to the mean period multiplied by the access frequency and then by the adjustment coefficient: 399.83 × 0.007 × 1.395 ≈ 3.90 seconds.
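Steps S310 to S350 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `initial_update_frequency` is hypothetical, the tag's own timestamp count is taken to be the length of the list passed in, and the fallback coefficient of 1.0 when all intervals are equal is an assumption, since the source does not define that case.

```python
def initial_update_frequency(timestamps, total_timestamps):
    """Mean period x access frequency x adjustment coefficient."""
    # S310: sort the access timestamps from earliest to latest.
    ordered = sorted(timestamps)

    # S320: intervals between adjacent timestamps.
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not intervals:
        raise ValueError("at least two access timestamps are required")

    # S330: the mean interval is the mean period.
    mean_period = sum(intervals) / len(intervals)

    # S340: share of this tag's timestamps among all tags' timestamps.
    access_frequency = len(ordered) / total_timestamps

    # Adjustment coefficient: (max + min) / (max - min) of the intervals.
    longest, shortest = max(intervals), min(intervals)
    if longest == shortest:
        coefficient = 1.0  # assumption: undefined in the source, avoid dividing by zero
    else:
        coefficient = (longest + shortest) / (longest - shortest)

    # S350: combine the three quantities.
    return mean_period * access_frequency * coefficient


value = initial_update_frequency([1, 100, 500, 1000, 1200, 1800, 2400], 1000)
print(round(value, 2))
```

Note that n timestamps yield n-1 intervals, so the mean is taken over six intervals for the seven timestamps used here.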
Specifically, the embodiment of the present invention sorts the access timestamps in time order, which makes the sequence in which a historical tag was accessed explicit and reveals the tag's usage pattern. The time intervals between adjacent access timestamps are calculated and their mean is taken as the mean period, which reflects the average period of tag use and provides a basis for determining the update frequency. The access frequency of the historical tag is calculated to quantify the relative importance of the tag. Combining the mean period, the access frequency and the adjustment coefficient to determine the initial update frequency of each historical tag improves the efficiency and accuracy of tag updating. Setting a suitable initial update frequency for each historical tag avoids both unnecessarily frequent updates and untimely updates, which improves the overall performance and stability of the system and optimizes the use of storage resources.
Referring to FIG. 3, the step of matching the real-time tag against the historical storage linked list includes:
Step S510: traversing the historical storage linked list with the real-time tag, and if the historical storage linked list contains the real-time tag, marking the corresponding historical tag in the historical storage linked list to obtain a marked tag;
Step S520: if the historical storage linked list does not contain the real-time tag, calculating the similarity between the real-time tag and each of the historical tags to obtain similarity calculation results;
Step S530: if a similarity calculation result is greater than or equal to a preset similarity, marking the historical tag corresponding to the maximum similarity calculation result to obtain a marked tag.
Specifically, the embodiment of the present invention further includes: if the similarity calculation results are all less than the preset similarity, the matching result is that the historical storage linked list does not contain a marked tag.
Specifically, the preset similarity in the embodiment of the present invention is 85%.
Specifically, the similarity calculation in the embodiment of the present invention may be implemented with cosine similarity, Jaccard similarity or another suitable similarity algorithm. In one possible example, the real-time tag is "smart bracelet". The tags are converted into vector representations using a word embedding model (such as Word2Vec) or a TF-IDF-based method, and the cosine of the angle between the "smart bracelet" vector and each historical tag vector is calculated to determine their similarity. The similarity between "smart bracelet" and the historical tag "smart electronics" is found to be greater than the preset similarity, so "smart electronics" is taken as the marked tag.
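Steps S510 to S530 with cosine similarity can be sketched in Python as follows. This is a minimal illustration: the function names `cosine` and `match_tag` are hypothetical, a Python list stands in for the linked list, and the three-dimensional vectors are made-up stand-ins for embeddings that a model such as Word2Vec or a TF-IDF method would produce in practice.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_tag(real_time_tag, history, vectors, threshold=0.85):
    """Return the marked historical tag, or None if nothing matches."""
    # S510: exact match found while traversing the history.
    if real_time_tag in history:
        return real_time_tag
    if not history:
        return None
    # S520: similarity between the real-time tag and every historical tag.
    scores = {h: cosine(vectors[real_time_tag], vectors[h]) for h in history}
    # S530: mark the most similar tag only if it reaches the preset similarity.
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Toy vectors, invented for illustration only.
vectors = {
    "smart bracelet": [0.9, 0.4, 0.1],
    "smart electronics": [1.0, 0.3, 0.1],
    "books": [0.0, 0.2, 1.0],
    "garden hose": [0.0, 1.0, 0.0],
}
history = ["smart electronics", "books"]

print(match_tag("smart bracelet", history, vectors))  # similar tag is marked
print(match_tag("garden hose", history, vectors))     # below the threshold
```

A return value of `None` corresponds to the case in which the historical storage linked list contains no marked tag.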
Specifically, the embodiment of the present invention traverses the historical storage linked list to determine whether it contains a tag identical to the real-time tag, which improves matching efficiency. When no identical tag exists in the historical storage linked list, the similarity between the real-time tag and the existing historical tags is calculated to quantify how closely they are related. The preset similarity threshold allows the strictness of matching to be adjusted to actual needs, which makes tag updating more flexible. Similarity calculation and matching identify historical tags that are highly related to the real-time tag, which avoids storing similar tags repeatedly and optimizes the use of storage space.
Specifically, the step of calculating the real-time update frequency of the tag according to the matching result includes:
when the matching result is that the historical storage linked list contains the marked tag, obtaining a plurality of marking timestamps corresponding to the marked tag;
calculating a marking duration based on the plurality of marking timestamps;
calculating a first update frequency based on the number of the marked tags and the marking duration;
when the matching result is that the historical storage linked list does not contain the marked tag, obtaining a plurality of real-time timestamps corresponding to the real-time tag;
calculating a real-time duration based on the plurality of real-time timestamps;
calculating a second update frequency based on the number of the real-time tags and the real-time duration.
Specifically, in the embodiment of the present invention, the first update frequency is the number of marked tags divided by the marking duration;
the second update frequency is the number of real-time tags divided by the real-time duration.
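Both update frequencies are a count divided by a duration, which can be sketched in Python as follows. This is a minimal illustration under two assumptions that the source does not spell out: the "number of tags" is read as the number of timestamps recorded for the tag, and the duration is read as the span between the earliest and latest of those timestamps. The function name `update_frequency` is hypothetical.

```python
def update_frequency(timestamps):
    """Count of recorded timestamps divided by the time span they cover."""
    if len(timestamps) < 2:
        raise ValueError("at least two timestamps are required")
    # Assumed duration: latest timestamp minus earliest timestamp.
    duration = max(timestamps) - min(timestamps)
    if duration <= 0:
        raise ValueError("timestamps must span a positive duration")
    return len(timestamps) / duration


# Hypothetical marking timestamps (seconds) for a marked tag.
first_update_frequency = update_frequency([0, 10, 20, 40])
# Hypothetical real-time timestamps (seconds) for an unmatched real-time tag.
second_update_frequency = update_frequency([100, 150, 200])
print(first_update_frequency, second_update_frequency)
```

The same function serves both branches; only the source of the timestamps differs.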
Specifically, by calculating the marking duration of the marked tag and the real-time duration of the real-time tag, the embodiment of the present invention measures how active a specific tag is over a period of time, which helps the system understand user needs and tag usage trends more accurately. The calculated first update frequency and second update frequency provide a basis for tag management. Calculating the real-time update frequency allows more personalized and precise services to be provided to users according to tag activity. As a data indicator, the real-time update frequency supports tag update decisions and improves resource utilization.
Specifically, the step of adjusting the history length of the historical storage linked list based on the initial update frequency and the real-time update frequency includes:
increasing or decreasing the history length according to the result of comparing the first update frequency with the initial update frequency;
increasing the history length according to the result of comparing the second update frequency with the initial update frequency.
Specifically, in the embodiment of the present invention, when the first update frequency ≥ the initial update frequency, the ratio of the first update frequency to the initial update frequency is calculated and taken as the growth coefficient, and the increased history length = ratio × history length;
when the first update frequency < the initial update frequency, the difference obtained by subtracting the first update frequency from the initial update frequency is calculated, the ratio of this difference to the initial update frequency is calculated and taken as the reduction ratio, and the reduced history length = (1 - ratio) × history length;
when the second update frequency ≥ the initial update frequency, the ratio of the second update frequency to the initial update frequency is calculated and taken as the growth ratio, and the increased history length = (1 + ratio) × history length;
when the second update frequency < the initial update frequency, the history length is not adjusted.
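The four cases can be sketched in Python as follows. This is a minimal illustration: the function name `adjust_history_length` and the boolean `matched` (true when the historical storage linked list contains the marked tag, so the first update frequency applies) are hypothetical, the reduction case takes the difference as the initial update frequency minus the first update frequency so that the reduction ratio is positive, and the result is left as a float although a real list length would have to be rounded to an integer.

```python
def adjust_history_length(length, initial_freq, realtime_freq, matched):
    """Return the adjusted history length for one comparison."""
    if initial_freq <= 0:
        raise ValueError("initial update frequency must be positive")

    if matched:
        # First update frequency (marked tag found in the list).
        if realtime_freq >= initial_freq:
            growth = realtime_freq / initial_freq
            return growth * length
        reduction = (initial_freq - realtime_freq) / initial_freq
        return (1 - reduction) * length

    # Second update frequency (real-time tag not in the list).
    if realtime_freq >= initial_freq:
        growth = realtime_freq / initial_freq
        return (1 + growth) * length
    return length  # lower frequency: leave the length unchanged


print(adjust_history_length(100, 2.0, 3.0, matched=True))
print(adjust_history_length(100, 2.0, 1.0, matched=True))
print(adjust_history_length(100, 2.0, 3.0, matched=False))
print(adjust_history_length(100, 2.0, 1.0, matched=False))
```

With the difference defined this way, both matched branches reduce algebraically to (first update frequency / initial update frequency) × history length; they are kept separate here to mirror the text.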
Specifically, the embodiment of the present invention dynamically adjusts the history length of the historical storage linked list by comparing the first update frequency with the initial update frequency. When the first update frequency is higher than or equal to the initial update frequency, the history length is increased to accommodate high-frequency data updates. When the first update frequency is lower than the initial update frequency, the history length is reduced to optimize the use of storage space. When the second update frequency, that is, the real-time update frequency of an unmatched real-time tag, is higher than or equal to the initial update frequency, the history length is increased according to the ratio of the two, so that the system retains enough historical data to support accurate analysis and prediction of data changes. When that frequency is lower than the initial update frequency, the history length is not adjusted, which avoids wasting storage space. Dynamically adjusting the history length of the historical storage linked list helps improve data processing efficiency, and controlling the length avoids storing excess data, which reduces wasted storage space, computing resources and maintenance cost.
Specifically, the step of adjusting the history order of the historical storage linked list based on the matching result includes:
when the matching result is that the historical storage linked list contains the marked tag, obtaining the position of the marked tag in the historical storage linked list;
traversing the historical storage linked list based on the latest marking timestamp corresponding to the marked tag, and moving the marked tag according to the traversal result;
when the matching result is that the historical storage linked list does not contain the marked tag, adding the real-time tag and its corresponding real-time timestamp to the head of the historical storage linked list.
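The reordering step can be sketched in Python as follows. This is a minimal illustration: a Python list of `[tag, latest_timestamp]` pairs kept newest-first stands in for the linked list, and the function names `move_marked_tag` and `add_to_head` are hypothetical. The source does not state exactly where a moved tag lands, so the sketch assumes it is placed in front of the first entry whose timestamp is not newer than the latest marking timestamp.

```python
def move_marked_tag(history, marked_tag, latest_timestamp):
    """Move an existing tag toward the head according to its new timestamp."""
    # Locate the marked tag's current position and unlink it.
    index = next(i for i, (tag, _) in enumerate(history) if tag == marked_tag)
    entry = history.pop(index)
    entry[1] = latest_timestamp

    # Traverse from the head until an entry that is not newer is found.
    position = 0
    while position < len(history) and history[position][1] > latest_timestamp:
        position += 1
    history.insert(position, entry)
    return history


def add_to_head(history, real_time_tag, real_time_timestamp):
    """Insert an unmatched real-time tag at the head of the list."""
    history.insert(0, [real_time_tag, real_time_timestamp])
    return history


# Hypothetical list contents, head first.
history = [["books", 300], ["smart electronics", 200], ["clothing", 100]]
move_marked_tag(history, "smart electronics", 400)  # matched case
add_to_head(history, "toys", 500)                   # unmatched case
print(history)
```

A production version would more likely pair a doubly linked list with a hash map, as in a conventional LRU cache, so that locating and unlinking a tag does not require a scan.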
Specifically, in the embodiment of the present invention, when the matching result is that the historical storage linked list contains the marked tag, the list is traversed and adjusted according to the latest marking timestamp, and the most recently marked tag is moved closer to the head of the list. This keeps the most recently updated or most frequently used data at the front of the list, which improves data access efficiency and organizes the historical data more effectively. When the matching result is that the historical storage linked list does not contain the marked tag, the real-time tag and its corresponding real-time timestamp are added to the head of the list, so that new data is reflected immediately and the system can process and respond to new input in time. As new data is added at the head, older data gradually moves toward the tail, which keeps the data fresh and available. Adjusting the order of the historical storage linked list in this way allows data queries and retrieval requests to be handled more efficiently.
Referring to FIG. 4, the step of performing expiration detection on the update timestamps corresponding to the plurality of update tags includes:
Step S710: calculating an expiration time as the difference between the current timestamp and the update timestamp;
Step S720: comparing the expiration time with a preset time, and determining whether the update tag has expired according to the comparison result.
Specifically, the step of adjusting the update storage linked list again according to the expiration detection result includes:
when the expiration time is greater than the preset time, removing the update tag from the update storage linked list and updating the order of the remaining tags;
when the expiration time is less than or equal to the preset time, not adjusting the update storage linked list.
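Steps S710 and S720 and the removal rule can be sketched in Python as follows. This is a minimal illustration: a Python list of `(tag, update_timestamp)` pairs stands in for the update storage linked list, and the function name `remove_expired` and the sample values are hypothetical.

```python
def remove_expired(update_list, current_timestamp, preset_time):
    """Return the list without expired tags, keeping the remaining order."""
    kept = []
    for tag, update_timestamp in update_list:
        # S710: expiration time is the age of the tag's last update.
        expiration_time = current_timestamp - update_timestamp
        # S720: a tag expires only if its age exceeds the preset time.
        if expiration_time <= preset_time:
            kept.append((tag, update_timestamp))
    return kept


# Hypothetical update storage list, head first (timestamps in seconds).
update_list = [("smart electronics", 950), ("books", 400), ("clothing", 900)]
remaining = remove_expired(update_list, current_timestamp=1000, preset_time=100)
print(remaining)
```

A tag whose age equals the preset time exactly is kept, matching the "less than or equal to" condition in the text.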
Specifically, by calculating the expiration time and comparing it with the preset time, the embodiment of the present invention determines which update tags have expired, which avoids storing invalid data, makes data management more efficient, and keeps the data in the system accurate and current. Removing expired update tags frees storage space, so that the system uses limited resources more effectively and its overall performance improves. Removing expired tags and updating the order of the remaining tags ensures that the data in the update storage linked list is valid and up to date, which speeds up data retrieval. Adjusting the update storage linked list again keeps its internal data consistent and complete, which improves system stability and reduces potential problems caused by inconsistent or erroneous data.
The technical solutions of the present invention have thus been described with reference to the preferred embodiments shown in the accompanying drawings. However, those skilled in the art will readily understand that the protection scope of the present invention is not limited to these specific embodiments. Without departing from the principle of the present invention, those skilled in the art may make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions will fall within the protection scope of the present invention.
The above is only a preferred embodiment of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410962164.6A CN118885492B (en) | 2024-07-18 | 2024-07-18 | LRU-based tag updating method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410962164.6A CN118885492B (en) | 2024-07-18 | 2024-07-18 | LRU-based tag updating method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118885492A (en) | 2024-11-01 |
CN118885492B (en) | 2025-03-14 |
Family
ID=93233723
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410962164.6A Active CN118885492B (en) | 2024-07-18 | 2024-07-18 | LRU-based tag updating method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118885492B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119441626A (en) * | 2025-01-09 | 2025-02-14 | 东方物通科技(北京)有限公司 | A user behavior data storage and analysis method based on big data |
Also Published As
Publication number | Publication date |
---|---|
CN118885492B (en) | 2025-03-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||