CN114089947B - A multi-classification sorting method based on time series data sets - Google Patents

Info

Publication number
CN114089947B
Authority
CN
China
Prior art keywords
data
mapping table
sub
time
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111391842.0A
Other languages
Chinese (zh)
Other versions
CN114089947A (en)
Inventor
陈金皖
张江勇
温娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Helicopter Research and Development Institute
Original Assignee
China Helicopter Research and Development Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Helicopter Research and Development Institute
Priority to CN202111391842.0A
Publication of CN114089947A
Application granted
Publication of CN114089947B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/06: Arrangements for sorting, selecting, merging, or comparing data on individual record carriers
    • G06F7/08: Sorting, i.e. grouping record carriers in numerical or other ordered sequence according to the classification of at least some of the information they carry

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a multi-classification sorting method based on a time series data set, comprising: establishing a position mapping table and a time mapping table; inserting first data into a sub-data set based on the position mapping table and the time mapping table; wherein the sub-data set is established according to the attributes of the first data; sorting the data in the sub-data set based on the attributes of the first data, and updating the position mapping table and the time mapping table based on the sequence of the sorted data; when the database is full, inputting the second data, and deleting the sub-data in the first data based on the position mapping table; wherein the sub-data is the data with the earliest writing time in the database; determining the moving direction of the remaining data in the database based on the size of the sub-data set and the position of the sub-data; moving the remaining data based on the moving direction to obtain a target position, and inserting the second data into the target position.

Description

Multi-classification sorting method based on a time series data set
Technical Field
The invention belongs to the technical field of computer software, and in particular relates to a multi-classification sorting method based on a time series data set.
Background
Sorting arranges unordered data elements in keyword order by some method. It is an operation performed frequently in computing, whose purpose is to turn an "unordered" sequence of records into an "ordered" one, and it is an important class of problems that is widely studied and applied in the field of computer applications. For example, sorting algorithms are widely used in data processing, databases, data compression, distributed computing, image processing and computer graphics. Many sorting algorithms exist and are applied in various fields, but traditional multi-classification sorting algorithms often perform poorly on time series data.
Time series data streams are the most typical data in an onboard avionics system. Sensors generate streaming data continuously, and a large amount of data arrives at very high speed within a short time. The data volume is effectively unbounded, so the avionics system does not store all of it. The data changes dynamically over time, the onboard software needs to update it in real time, and the data needs to be sorted by importance, distance, target attribute and the like. When new data arrives at a sorted data set, the earliest data must be removed and the new data inserted into the sorted data set.
Although sorting methods are relatively mature, no time-series-based multi-classification sorting method is available for avionics systems.
Disclosure of Invention
The invention is used to rapidly delete, insert and sort data stream targets in an avionics system, saving both memory overhead and time.
The invention provides a multi-classification sorting method based on a time series data set, which comprises the following steps:
establishing a position mapping table and a time mapping table;
inserting first data into a sub-data set based on the position mapping table and the time mapping table, wherein the sub-data set is established according to the attribute of the first data;
sorting the data in the sub-data set based on the attribute of the first data, and updating the position mapping table and the time mapping table based on the sequence of the sorted data.
Preferably, the method further comprises:
when the database is full, inputting second data and deleting sub-data in the first data based on the position mapping table, wherein the sub-data is the data with the earliest writing time in the database;
determining a moving direction of the remaining data in the database based on the size of the sub-data set and the position of the sub-data;
moving the remaining data based on the moving direction to obtain a target position, and inserting the second data into the target position.
Preferably, the target position is a position in a target sub-data set.
Preferably, after the second data is inserted into the target position, the method further comprises:
sorting the data in the target sub-data set based on the attribute of the target sub-data set, and updating the time mapping table and the position mapping table based on the sequence of the sorted data.
Preferably, establishing the position mapping table comprises:
establishing the position mapping table based on the position relation between the first data and the sorted data.
Preferably, establishing the time mapping table comprises:
establishing the time mapping table based on the time relation between the first data and the sorted data.
Preferably, the attribute of the first data is used for classifying the first data.
Preferably, the first data is written into the database earlier than the second data is written into the database.
The beneficial technical effects of the invention are as follows:
The invention provides a multi-classification sorting method based on time series. Through the established position mapping table and time mapping table, it quickly locates and removes the earliest data, then inserts and sorts new data on top of the existing classified ordering. This reduces memory overhead, shortens sorting time, and meets the real-time requirements of multi-classification sorting of data streams in the onboard software of an avionics system.
Drawings
Fig. 1 is a flow chart provided by an embodiment of the present invention.
Detailed Description
Referring to Fig. 1, in a first embodiment, the present invention provides a multi-classification sorting method based on a time series data set, including:
Step 101, establishing a position mapping table and a time mapping table.
In this embodiment, the position mapping table is established based on the position relation between the first data and the sorted data, and the time mapping table is established based on the time relation between the first data and the sorted data.
Step 102, inserting the first data into the sub-data set based on the position mapping table and the time mapping table.
The sub-data set is established based on the attribute of the first data.
Step 103, sorting the data in the sub-data set based on the attribute of the first data, and updating the position mapping table and the time mapping table based on the sequence of the sorted data.
The attribute of the first data is used for classifying the first data.
Step 104, when the database is full, inputting second data and deleting sub-data in the first data based on the position mapping table.
The sub-data is the data written into the database earliest, and the first data is written into the database earlier than the second data.
Step 105, determining the moving direction of the remaining data in the database based on the size of the sub-data set and the position of the sub-data, moving the remaining data based on the moving direction to obtain a target position, and inserting the second data into the target position.
Step 106, sorting the data in the target sub-data set based on the attribute of the target sub-data set, and updating the time mapping table and the position mapping table based on the sequence of the sorted data.
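Step 105 does not spell out how the moving direction is chosen. One plausible reading, sketched below in Python, is that deleting the earliest entry leaves a hole, and only the entries between the hole and the target slot have to move: left if the hole lies before the target, right if it lies after. The function name `close_hole_and_insert` and the flat-list layout are assumptions made for illustration and do not come from the patent text.

```python
def close_hole_and_insert(data, hole, target, item):
    """Close the slot freed at index `hole` and place `item` at index `target`.

    `target` is the index the new item occupies in the final list.
    Returns the direction in which the remaining data moved.
    """
    if hole < target:
        # Hole is before the target: entries hole+1..target slide one slot left.
        data[hole:target] = data[hole + 1:target + 1]
        direction = "left"
    elif hole > target:
        # Hole is after the target: entries target..hole-1 slide one slot right.
        data[target + 1:hole + 1] = data[target:hole]
        direction = "right"
    else:
        # The freed slot is exactly where the new item belongs.
        direction = "none"
    data[target] = item
    return direction
```

Only the entries between the two indices are touched, so categories outside that span keep their positions and their mapping-table entries.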
In a second embodiment, the multi-classification sorting method based on a time series data set mainly comprises four modules:
Position mapping table
The avionics system is not in a position to store all of the time stream data, so the onboard software has to update the data in real time, discarding surplus data by time and sorting what remains. After sorting, however, the data set is no longer ordered by time, so the earliest data cannot be removed directly. The invention therefore builds a position mapping table to index the data, which saves memory and quickly locates each datum by its time attribute. Each entry of the table maps to the position of a datum in the sorted data set. When new data arrives, the earliest data is removed through the mapping table, and the mapping table and the data set are then updated. This is the forward deletion process.
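As a rough illustration of forward deletion, the sketch below treats the position mapping table as a map from arrival sequence number to the current index in the sorted data set. That representation, the sample values and the name `pop_oldest` are assumptions; the patent does not fix a concrete layout.

```python
from collections import OrderedDict

# Sorted data set: entries are grouped by category, not by arrival time.
data = ["a2", "a0", "b3", "b1"]

# Hypothetical position mapping table: arrival sequence number -> current
# index in `data`. The OrderedDict keeps entries in arrival order.
position_map = OrderedDict([(0, 1), (1, 3), (2, 0), (3, 2)])

def pop_oldest(data, position_map):
    """Locate the earliest-written entry through the map and free its slot."""
    seq, index = position_map.popitem(last=False)  # oldest arrival first
    removed = data[index]
    data[index] = None  # leave a hole; the caller decides how to close it
    return seq, index, removed
```

The earliest entry is found without scanning the sorted data set, which is the point of keeping the table.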
Time mapping table
To link the data to the position mapping table and update the data quickly, the invention also records the time order of every datum in the data set. Because the earliest-arriving data has to be removed, it is located quickly through the position mapping table. During sorting, the time mapping table makes it possible to locate the corresponding entry of the position mapping table directly: when the sorted position of a datum changes, the data set is updated first, then the time mapping table, then the position mapping table. This completes the reverse update.
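The reverse update can be pictured as follows. This Python sketch assumes the time mapping table is a list holding the arrival sequence number of the entry at each index, and the position mapping table is its inverse; both representations and the name `swap_entries` are illustrative assumptions.

```python
def swap_entries(data, time_map, position_map, i, j):
    """Swap two slots of the sorted data set and keep both tables consistent."""
    # 1. Update the data set itself.
    data[i], data[j] = data[j], data[i]
    # 2. Update the time mapping table (index -> arrival sequence number).
    time_map[i], time_map[j] = time_map[j], time_map[i]
    # 3. Use the time mapping table to fix the position mapping table
    #    (arrival sequence number -> index) without searching it.
    position_map[time_map[i]] = i
    position_map[time_map[j]] = j
```

Because the time mapping table names the affected arrival numbers directly, step 3 touches exactly two entries of the position mapping table.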
Sub-dataset size statistics record table
The data sources in the onboard software have known classifications. Because the classification and the sort order of the sub-data sets are known, the amount of data in each sub-classification can be recorded. When new data arrives, its category is determined from its known attributes, the new data is inserted at the head of that category, whose position is calculated from the recorded sub-data set sizes, and the position mapping table and the time mapping table are updated at the same time.
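Calculating the head of a category from the recorded sizes amounts to summing the sizes of the categories that precede it. A minimal Python sketch, in which the name `category_head` and the dictionary-based size table are assumptions:

```python
def category_head(counts, order, category):
    """Return the index of the first slot of `category`.

    `counts` maps each category to its current number of entries, and
    `order` is the fixed, known sequence of categories in the data set.
    """
    head = 0
    for name in order:
        if name == category:
            return head
        head += counts[name]
    raise KeyError(category)
```

For example, with sizes 5, 2 and 2 for categories A, AA and AAA, the head of AAA is at index 5 + 2 = 7 when counting from zero.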
Multi-classification sorting
The invention classifies the time series data and uses mapping tables to record the time and position of each datum, so that the earliest data can be located and removed quickly. New data is then inserted quickly into the data already sorted by classification, and a bubble sort is performed within the sub-classification.
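Sorting within a single sub-classification can be sketched as a bubble sort restricted to that category's index range. The function name, the `key` parameter and the `on_swap` hook (a place to update the time and position mapping tables) are assumptions added for illustration.

```python
def bubble_sort_range(data, start, end, key=lambda x: x, on_swap=None):
    """Bubble-sort data[start:end] in place, leaving other categories untouched.

    `on_swap(i, j)` is an optional hook called after each swap, for example
    to update the time and position mapping tables.
    """
    for last in range(end - 1, start, -1):
        swapped = False
        for i in range(start, last):
            if key(data[i]) > key(data[i + 1]):
                data[i], data[i + 1] = data[i + 1], data[i]
                if on_swap is not None:
                    on_swap(i, i + 1)
                swapped = True
        if not swapped:
            break  # the range is already sorted
```

Since the rest of the category is already sorted when one new entry is added at its head, the early exit keeps the work close to a single pass.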
In a third embodiment, the data set size is 10 and three classification categories, A, AA and AAA, are assumed, arranged in that order as shown in Table 1.
TABLE 1
When new data fff arrives, the following steps are performed:
1. Using the position mapping table, delete the earliest data, which is at position 6, and update the size of its sub-data category to 2.
2. Calculate the start position of the category of fff, 5+2=7, determine the data moving direction, and move the data.
3. Insert the new data fff at position 8, and update the time mapping table, the position mapping table and the sub-data set sizes.
4. Sort within the sub-category, and update the position mapping table and the time mapping table to obtain Table 2.
TABLE 2
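Tables 1 and 2 are not reproduced in this text, so the Python sketch below works through steps 1 to 3 on an assumed layout that is consistent with the numbers quoted: five entries in A, three in AA and two in AAA, with the earliest entry at position 6 and fff belonging to AAA. The entry names `a1` to `c2` and that split are hypothetical. Step 4 is left out because the sort keys are not given.

```python
counts = {"A": 5, "AA": 3, "AAA": 2}   # assumed split of the 10 slots
data = ["a1", "a2", "a3", "a4", "a5",   # category A
        "b1", "b2", "b3",               # category AA
        "c1", "c2"]                     # category AAA

# Step 1: the position mapping table points at position 6 (index 5), in AA.
hole = 5
counts["AA"] -= 1                       # AA now holds 2 entries

# Step 2: head of AAA = number of entries before it = 5 + 2 = 7 (zero-based),
# which is position 8 when counting from 1.
target = counts["A"] + counts["AA"]

# The hole lies before the target, so the entries between them move left.
data[hole:target] = data[hole + 1:target + 1]

# Step 3: insert the new entry and update the category size.
data[target] = "fff"
counts["AAA"] += 1
```

Under these assumptions only the two remaining AA entries move, and the existing AAA entries stay where they were.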
The invention performs multi-classification sorting on a time series data set. It establishes a time mapping table and a position mapping table, uses the sizes of the sub-data sets to insert new data into the data set quickly, and sorts only within the classification of the new data. The invention thus provides a sorting method for time series data sets that saves memory overhead and shortens sorting time.

Claims (3)

1. A multi-classification sorting method based on a time series data set, the method comprising:
establishing a position mapping table and a time mapping table, wherein the position mapping table records the sorted order of the data in a data set, the time mapping table records the time order in which data is written into the data set, the data set comprises a plurality of sub-data sets, and the sub-data sets are classified and ordered in the data set according to the attributes of the sub-data sets;
inserting first data into a sub-data set based on the position mapping table and the time mapping table;
sorting the data in the sub-data set based on the attribute of the first data, and updating the position mapping table and the time mapping table based on the sequence of the sorted data, wherein the attribute of the first data is different from the attribute of the sub-data set.
2. The method according to claim 1, wherein the method further comprises:
when the data set is full, inputting second data and deleting sub-data in the first data based on the position mapping table, wherein the sub-data is the data with the earliest writing time in the data set;
inserting the second data into the sorted data according to the classification of the sub-data set, performing a bubble sort within the sub-classification, and updating the position mapping table and the time mapping table based on the sequence of the sorted data.
3. The method of claim 2, wherein the first data is written to the data set earlier than the second data is written to the data set.
CN202111391842.0A 2021-11-19 2021-11-19 A multi-classification sorting method based on time series data sets Active CN114089947B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111391842.0A CN114089947B (en) 2021-11-19 2021-11-19 A multi-classification sorting method based on time series data sets

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111391842.0A CN114089947B (en) 2021-11-19 2021-11-19 A multi-classification sorting method based on time series data sets

Publications (2)

Publication Number Publication Date
CN114089947A 2022-02-25
CN114089947B 2025-02-18

Family

ID=80303347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111391842.0A Active CN114089947B (en) 2021-11-19 2021-11-19 A multi-classification sorting method based on time series data sets

Country Status (1)

Country Link
CN (1) CN114089947B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108932271A (en) * 2017-05-27 2018-12-04 大唐移动通信设备有限公司 A kind of file management method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8862588B1 (en) * 2011-11-30 2014-10-14 Google Inc. Generating an empirically-determined schema for a schemaless database
CN103514229A (en) * 2012-06-29 2014-01-15 国际商业机器公司 Method and device used for processing database data in distributed database system
WO2016141590A1 (en) * 2015-03-12 2016-09-15 华为技术有限公司 Time sequence data processing method and apparatus

Also Published As

Publication number Publication date
CN114089947A (en) 2022-02-25

Similar Documents

Publication Publication Date Title
CN104112026B (en) A kind of short message text sorting technique and system
CN106919957B (en) Method and device for processing data
CN109947904A (en) A kind of preference space S kyline inquiry processing method based on Spark environment
CN108280472A (en) A kind of density peak clustering method optimized based on local density and cluster centre
CN113705570B (en) Deep learning-based few-sample target detection method
CN105589938A (en) Image retrieval system and retrieval method based on FPGA
CN114048318A (en) Clustering method, system, device and storage medium based on density radius
CN112766170B (en) Self-adaptive segmentation detection method and device based on cluster unmanned aerial vehicle image
US10963440B2 (en) Fast incremental column store data loading
TWI769665B (en) Target data updating method, electronic equipment and computer readable storage medium
CN109685122B (en) A clustering method for semi-supervised tourist portrait data based on density peaks and gravitational influence
CN102799681B (en) Top-k query method oriented to any data segment
CN111125396A (en) An Image Retrieval Method with Single Model and Multi-branch Structure
CN112597871B (en) Unsupervised vehicle re-identification method, system and storage medium based on two-stage clustering
CN116701771B (en) A digital library retrieval and resource sharing system based on cloud computing
CN114089947B (en) A multi-classification sorting method based on time series data sets
CN111708853B (en) Taxi hot spot region extraction method based on characteristic density peak clustering
CN103761286A (en) Method for retrieving service resources on basis of user interest
CN105631465A (en) Density peak-based high-efficiency hierarchical clustering method
WO2007114939A2 (en) A fast generalized 2-dimensional heap for hausdorff and earth mover's distance
CN108549913A (en) Improvement K-means clustering algorithms based on density radius
CN112115281A (en) Data retrieval method, device and storage medium
CN107392249A (en) A kind of density peak clustering method of k nearest neighbor similarity optimization
CN108268620A (en) A kind of Document Classification Method based on hadoop data minings
CN119356647A (en) Index construction method, index library and query method for massive seismic data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant