
CN108345668B - Time sequence matrix thermodynamic diagram visualization method aiming at category comparison - Google Patents


Info

Publication number
CN108345668B
Authority
CN
China
Prior art keywords
data
dimension
classification
time
time sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810131948.9A
Other languages
Chinese (zh)
Other versions
CN108345668A (en
Inventor
陈红倩
温玉琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dragon Totem Technology Hefei Co ltd
Original Assignee
Beijing Technology and Business University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Technology and Business University filed Critical Beijing Technology and Business University
Priority to CN201810131948.9A priority Critical patent/CN108345668B/en
Publication of CN108345668A publication Critical patent/CN108345668A/en
Application granted granted Critical
Publication of CN108345668B publication Critical patent/CN108345668B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474Sequence data queries, e.g. querying versioned data
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26Visual data mining; Browsing structured data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a time series matrix heat map visualization method for category comparison, and belongs to the technical field of computer graphics and visualization. The method comprises the following steps: selecting a classification dimension, a time dimension and a presentation dimension from all dimensions of the data set, and classifying the data set according to the classification dimension; mapping the time values in the time dimension to time series values; establishing a matrix layout with the data categories as rows and the time series values as columns, so that each category occupies one row and each time series value occupies one column; mapping each data record in the data set to a pixel block in the matrix layout, with the ordinate representing the data category, the abscissa representing time, and the presentation dimension data mapped to color; and presenting the average for each data category and for each time series value as a bar chart. The invention comprehensively shows the classification characteristics of the data and the time series characteristics of each category, and improves the efficiency of visual analysis.

Description

Time series matrix heat map visualization method for category comparison

Technical field

The invention relates to a time series matrix heat map visualization method for category comparison, and belongs to the technical field of computer graphics and visualization.

Technical background

In current data analysis practice, classifying data and comparing the resulting categories is a common form of analysis, and analyzing how the time series patterns of the different categories change is an important prerequisite for data mining and in-depth analysis.

Visualization has proved to be an effective means of, for example, setting parameters before a data mining algorithm is applied. By mapping and visually presenting the data, it draws on human intelligence to improve the accuracy and effectiveness of data mining.

Heat maps are often used in visualization methods for tasks such as numerical comparison, but they are rarely applied to the visualization and visual analysis of time series data, and combining a heat map with a matrix layout for time series analysis is rarer still. The visualization method proposed by the invention uses a time series matrix heat map to show the temporal pattern of each category of data, and on that basis compares the patterns of the different categories in order to analyze the overall time series characteristics of the data set.

Summary of the invention

The purpose of the invention is to provide a time series matrix heat map visualization method for category comparison that meets the needs of analyzing time-varying data.

This purpose is achieved through the following technical solution.

The time series matrix heat map visualization method for category comparison proposed by the invention comprises the following steps:

Step 1: Select the following kinds of data dimensions from all dimensions of the data set:

(1) the dimension used as the basis for classifying all data in the data set (the classification dimension);

(2) the dimension that identifies when each data record in the data set occurred (the time dimension);

(3) the dimension used for data analysis, that is, the dimension presented during visual analysis (the presentation dimension).

Step 2: For the classification dimension selected in step 1, divide all data records in the data set into categories according to a chosen classification rule, and denote the number of categories by D.
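
As a minimal sketch of step 2, assuming the classification rule is a plain lookup from one record field to a category (the text leaves the rule open), the records can be grouped and D counted as follows. The field name `title`, the lookup table `rule` and the sample records are hypothetical placeholders, not data from the patent.

```python
from collections import defaultdict

def classify(records, rule, key="title"):
    """Group records by category using a lookup rule (hypothetical helper).

    Returns a dict mapping each category to the list of its records.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rule[rec[key]]].append(rec)
    return dict(groups)

# Illustrative data only; titles and categories are placeholders.
rule = {"Drama A": "family", "Drama B": "youth", "Drama C": "family"}
records = [
    {"title": "Drama A", "value": 1.2},
    {"title": "Drama B", "value": 0.8},
    {"title": "Drama C", "value": 1.5},
]
groups = classify(records, rule)
D = len(groups)  # number of categories
```

Any other rule (thresholds, clustering, manual labels) would fit the same shape, as long as every record ends up in exactly one category.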

Step 3: For the time dimension selected in step 1, map the time values to time series values, and denote the total number of time series values by T.

The time dimension may be expressed in months, days, hours, minutes and so on. The time dimension in the data is converted to time series values as follows:

Step 3.1: If the time axis is defined by month and day, the time series value is computed by formula (1):

$$ DV_i = 31 \times mon_i + d_i \tag{1} $$

where $mon_i$ and $d_i$ are the month and day of the i-th data record, and $DV_i$ is the time series value it maps to.

Step 3.2: If the time axis is defined by hour and minute, the time series value is computed by formula (2):

$$ TV_i = 60 \times h_i + m_i \tag{2} $$

where $h_i$ and $m_i$ are the hour and minute of the i-th data record, and $TV_i$ is the time series value it maps to.
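
Formulas (1) and (2) translate directly into code. The sketch below is illustrative; the function names are not from the patent.

```python
def day_value(month, day):
    """Formula (1): time series value for a month/day time axis."""
    return 31 * month + day

def minute_value(hour, minute):
    """Formula (2): time series value for an hour/minute time axis."""
    return 60 * hour + minute
```

Because no month has more than 31 days, formula (1) gives distinct values to distinct dates, but the values are not contiguous across the end of a shorter month: February 28 maps to 90 and March 1 maps to 94.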

Step 4: For the presentation dimension selected in step 1, examine the basic distribution of its data and, if necessary, choose a suitable transformation function (such as an exponential or logarithmic function) so that the transformed data are distributed as evenly as possible. For each presentation dimension, denote the maximum and minimum of the data to be presented by $Z_{max}$ and $Z_{min}$.
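
A minimal sketch of step 4 follows, assuming the optional transformation is applied to every value before the extremes are taken. The helper name and the choice of `math.log1p` are assumptions made here: the text only says "exponential or logarithmic function", and log(1 + x) is used because it stays defined when a value is 0. The sample values are illustrative.

```python
import math

def presentation_range(values, transform=None):
    """Apply an optional transform, then return (values, z_min, z_max)."""
    if transform is not None:
        values = [transform(v) for v in values]
    else:
        values = list(values)
    return values, min(values), max(values)

# Illustrative values only.
raw = [0.0, 0.5, 1.7, 3.19]
scaled, z_min, z_max = presentation_range(raw, transform=math.log1p)
```

Whether a transformation is needed at all depends on how skewed the data are; with no `transform` argument the function simply returns the raw extremes.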

Step 5: Establish the layout structure of the visualization result.

In the layout of the visualization result, the categories of the data records are arranged vertically, one row per category in the data set, and the time series values of the data records are arranged horizontally, one column per time series value in the data set.

In this layout, each data record in the data set is mapped to one "pixel block" in the visualization result. The visual elements of all data records, positioned by each record's time series value and category, form a matrix, which the invention calls the "matrix layout". The region of the matrix layout in which the pixel blocks of all data records are drawn is called the "pixel block drawing area".

The height in pixels occupied by each category in the data set is computed by formula (3):

$$ d = (H - H_1 - H_2) / D \tag{3} $$

where $d$ is the height in pixels of each category; $H$ is the total height in pixels of the visualization result, which can be set as required; $H_1$ and $H_2$ are the heights in pixels that lie outside the pixel block drawing area; and $D$ is the number of categories obtained in step 2.

The width in pixels occupied by each time series value in the data set is computed by formula (4):

$$ t = (W - W_1 - W_2) / T \tag{4} $$

where $t$ is the width that each time series value maps to; $W$ is the total width in pixels of the visualization result, which can be set as required; $W_1$ and $W_2$ are the widths in pixels that lie outside the pixel block drawing area; and $T$ is the number of time series values.
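
Formulas (3) and (4) can be sketched as two small functions. The function and parameter names are illustrative; the check values are the ones used in the embodiment (H = 300, W = 1000, margins of 50, D = 8, T = 244).

```python
def cell_height(total_height, h1, h2, n_categories):
    """Formula (3): d = (H - H1 - H2) / D."""
    return (total_height - h1 - h2) / n_categories

def cell_width(total_width, w1, w2, n_time_values):
    """Formula (4): t = (W - W1 - W2) / T."""
    return (total_width - w1 - w2) / n_time_values

d = cell_height(300, 50, 50, 8)     # 25.0 pixels per category row
t = cell_width(1000, 50, 50, 244)   # about 3.69 pixels per time column
```

The width is generally fractional, so a renderer has to either draw with sub-pixel coordinates or round, which the text does not specify.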

Step 6: Set up the color map for the presentation dimension data of the data records.

The invention borrows from the "heat map" visualization method and maps the data to be analyzed to colors; any set of colors that distinguishes the different values may be used.

Step 7: For each data record in the data set, compute the position and color code of its "pixel block" in the visualization result. The position and color code of the pixel block for one data record are computed as follows:

Step 7.1: Using the category that step 2 assigned to the record from its classification dimension, look up the ordinate of that category in the matrix layout and take it as the ordinate of the pixel block.

The pixel blocks of all data records in the same category have the same ordinate; conversely, the pixel blocks of data records in different categories have different ordinates.

Step 7.2: Using the time series value that step 3 computed from the record's time dimension, look up the abscissa of that time series value in the matrix layout and take it as the abscissa of the pixel block.

The pixel blocks of all data records with the same time series value have the same abscissa; conversely, the pixel blocks of data records with different time series values have different abscissas.
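
Steps 7.1 and 7.2 amount to two lookups followed by a scale and offset. The sketch below rests on several assumptions that the text does not state: the origin is the top-left corner, $W_1$ and $H_1$ are the left and top margins, categories are numbered in the order given, and each distinct time series value gets the column index of its rank in sorted order. All names and sample values are hypothetical.

```python
def build_axes(categories, time_values):
    """Map each category to a row index and each distinct time value to a column index."""
    rows = {c: i for i, c in enumerate(categories)}
    cols = {v: j for j, v in enumerate(sorted(set(time_values)))}
    return rows, cols

def block_origin(category, time_value, rows, cols, d, t, w1=0.0, h1=0.0):
    """Top-left corner (x, y) of the pixel block for one record (assumed convention)."""
    x = w1 + cols[time_value] * t   # abscissa from the time series value
    y = h1 + rows[category] * d     # ordinate from the category
    return x, y

# Illustrative values only.
rows, cols = build_axes(["family", "youth"], [157, 158, 157, 160])
x, y = block_origin("youth", 158, rows, cols, d=25, t=3.0, w1=50, h1=50)
```

Ranking the distinct values, rather than using the raw time series value as a column index, keeps the matrix free of empty columns for values that never occur.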

Step 7.3: From the presentation dimension data of the record, compute the color code of its pixel block by formula (5):

[Formula (5): image BSA0000159153790000041, not reproduced in this text]

where $CLR_p$ is the color number of the p-th pixel block, $TAM_p$ is the value of the presentation dimension data in that data record, and $maxTAM$ and $minTAM$ are the maximum and minimum of the presentation dimension data over all data records.

Step 8: To increase the amount of information in the final visualization result, map the average of the presentation dimension data over the data records behind the pixel blocks of each row of the matrix layout to a bar chart (the classification dimension bar chart), placed to the left of the matrix layout. This bar chart can be used to compare the presentation dimension data across categories.

Step 9: Likewise, map the average of the presentation dimension data over the data records behind the pixel blocks of each column of the matrix layout to a bar chart (the time dimension bar chart), placed below the matrix layout. This bar chart can be used to compare the presentation dimension data across time series values.
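
The row and column averages behind the two bar charts are ordinary grouped means. A minimal sketch, with hypothetical field names (`category`, `tv`, `value`) and illustrative records:

```python
from collections import defaultdict

def group_means(records, key, value="value"):
    """Average the presentation value of the records grouped by `key`."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec[key]] += rec[value]
        counts[rec[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}

# Illustrative records only.
records = [
    {"category": "family", "tv": 157, "value": 1.0},
    {"category": "family", "tv": 158, "value": 2.0},
    {"category": "youth", "tv": 157, "value": 3.0},
]
row_means = group_means(records, "category")  # classification dimension bars
col_means = group_means(records, "tv")        # time dimension bars
```

Each bar's length is then proportional to its mean; the text does not specify the scaling, so that choice is left to the renderer.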

Step 10: Add classification dimension labels and time dimension labels to the overall layout. Classification dimension labels, giving each category's name and related information, are added to the left of the classification dimension bar chart; time dimension labels are added below the time dimension bar chart.

This completes the time series matrix heat map visualization method for category comparison.

Beneficial effects

The time series matrix heat map visualization method for category comparison proposed here presents the time series characteristics of every category of data together. Compared with existing methods for comparative analysis of data categories, it is more intuitive, more comprehensive and easier to use.

Description of drawings

Figure 1 is a flow chart of the technical solution of the invention;

Figure 2 is a schematic diagram of the principle of the matrix layout method proposed by the invention;

Figure 3 shows the time series matrix layout heat map obtained by applying the proposed matrix layout heat map to a television drama ratings data set.

Detailed description of the embodiments

The embodiment of the time series matrix heat map visualization method for category comparison uses the 2015 ratings data of Hunan Satellite TV as an example. The data set contains 4000 data records with the following dimensions: broadcast month, broadcast day, broadcast start hour, broadcast start minute, drama title, total number of episodes, episode number broadcast, and rating.

Step 1: Select the following kinds of data dimensions from all dimensions of the data set:

(1) the dimension used as the basis for classifying all data in the data set (the classification dimension). In this embodiment the drama title is selected as the classification dimension.

(2) the dimension that identifies when each data record in the data set occurred (the time dimension). In this embodiment the broadcast month and broadcast day are selected as the time dimension.

(3) the dimension used for data analysis, that is, the dimension presented during visual analysis (the presentation dimension). In this embodiment the rating is selected as the presentation dimension.

Step 2: For the classification dimension identified in step 1, divide all data records in the data set into categories according to a chosen classification rule, and denote the number of categories by D.

In this embodiment, the data set is divided into 8 categories according to the genre of the drama, as listed in Table 1.

Table 1: Genre categories of the dramas

Genre No.    Genre
1            Republic of China
2            Military
3            Family
4            Romance
5            Youth
6            Martial arts
7            Anti-Japanese War
8            Column drama

Step 3: For the time dimension identified in step 1, map the time values to time series values, and denote the total number of time series values by T.

In this embodiment the time dimension consists of the broadcast month and broadcast day, so the time axis is defined by month and day. For example, if the first data record is a drama broadcast on May 2, the time series value it maps to on the x axis is

$$ DV_i = 31 \times mon_i + d_i = 31 \times 5 + 2 = 157 \tag{6} $$

where $mon_i$ and $d_i$ are the month and day of the i-th data record, and $DV_i$ is the time series value it maps to.

Step 4: For the presentation dimension of step 1, namely the rating, examine the basic distribution of its data and, if necessary, choose a suitable transformation function (such as an exponential or logarithmic function) so that the transformed data are distributed as evenly as possible.

In this embodiment, the maximum rating in the whole data set is $Z_{max} = 3.19$ and the minimum is $Z_{min} = 0$.

Step 5: Establish the layout structure of the visualization result.

In the layout of the visualization result, the categories of the data records are arranged vertically: the drama genres obtained in step 2 are stacked one row per genre. The time series values of the data records are arranged horizontally: the time series values obtained in step 3 for all dramas are laid out one column per broadcast date in the data set.

In this layout each data record in the data set, that is, each drama rating, is mapped to one "pixel block" in the visualization result. The visual elements of all rating records, positioned by each record's time series value and genre, form a matrix.

In this embodiment the total height of the visualization result is set to $H = 300$ pixels, the heights outside the pixel block drawing area are $H_1 = 50$ pixels and $H_2 = 50$ pixels, and step 2 yielded $D = 8$ genre categories. The height in pixels occupied by each genre category is computed as in formula (7):

$$ d = (H - H_1 - H_2) / D = (300 - 50 - 50) / 8 = 25 \tag{7} $$

The total width of the visualization result is set to $W = 1000$ pixels, the widths outside the pixel block drawing area are $W_1 = 50$ pixels and $W_2 = 50$ pixels, and the number of time series values is $T = 244$. The width of each pixel block in the matrix layout is then computed as in formula (8):

$$ t = (W - W_1 - W_2) / T = (1000 - 50 - 50) / 244 \approx 3.69 \tag{8} $$

Step 6: Set up the color map for the presentation dimension data of the data records.

The invention borrows from the "heat map" visualization method and maps the data to be analyzed to colors; any set of colors that distinguishes the different values may be used. In this embodiment grayscale colors are used; their RGB values are listed in Table 2.

Table 2: Grayscale color table used for the pixel blocks

Color No.    RGB value
1            (140, 140, 140)
2            (130, 130, 130)
3            (120, 120, 120)
4            (110, 110, 110)
5            (100, 100, 100)
6            (90, 90, 90)
7            (80, 80, 80)
8            (70, 70, 70)
9            (60, 60, 60)
10           (50, 50, 50)
11           (40, 40, 40)
12           (30, 30, 30)

Step 7: For each data record in the data set, compute the position and color code of its "pixel block" in the visualization result. The position and color code of the pixel block for one data record are computed as follows:

Step 7.1: Using the category that step 2 assigned to the record from its classification dimension, look up the ordinate of that category in the matrix layout and take it as the ordinate of the pixel block.

Step 7.2: Using the time series value that step 3 computed from the record's time dimension, look up the abscissa of that time series value in the matrix layout and take it as the abscissa of the pixel block.

Step 7.3: From the presentation dimension data of the record, compute the color code of its pixel block.

Take the first data record, broadcast on May 2, as an example. Its rating is 1.7, so the presentation dimension value of the record is $TAM_p = 1.7$; the maximum rating in the whole data set is $maxTAM = 3.19$ and the minimum is $minTAM = 0$. The color code of the pixel block is then:

[Worked instance of formula (5): image BSA0000159153790000091, not reproduced in this text]

That is, the pixel block is drawn in the 7th color of Table 2.
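
Formula (5) survives only as an image in this text, so its exact expression is not known here. The sketch below uses an assumed mapping, the ceiling of the normalized value scaled to the number of colors and clamped to the valid range, which is one of several mappings that reproduce the worked result (rating 1.7 with a range of 0 to 3.19 gives color 7 out of 12). It should not be read as the patent's formula. The palette reproduces the grayscale values of Table 2.

```python
import math

# Table 2: color number -> grayscale RGB value, (140,140,140) down to (30,30,30).
GRAY_TABLE = {n: (150 - 10 * n,) * 3 for n in range(1, 13)}

def color_number(value, v_min, v_max, n_colors=12):
    """Assumed mapping: ceil(normalized value * n_colors), clamped to 1..n_colors."""
    if v_max == v_min:
        return 1  # degenerate range: every value gets the first color
    ratio = (value - v_min) / (v_max - v_min)
    return min(n_colors, max(1, math.ceil(ratio * n_colors)))

clr = color_number(1.7, 0.0, 3.19)  # 7 under the assumed mapping
rgb = GRAY_TABLE[clr]               # (80, 80, 80)
```

The clamp handles the minimum value, which would otherwise map to color 0, outside the table.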

Step 8: To increase the amount of information in the final visualization result, map the average of the presentation dimension data over the data records behind the pixel blocks of each row of the matrix layout to a bar chart (the classification dimension bar chart), placed to the left of the matrix layout. This bar chart can be used to compare the presentation dimension data across categories.

In this embodiment, the average of all ratings in each row, that is, in each genre category, is mapped to a bar chart placed to the left of the matrix and used to compare ratings across genres.

Step 9: Likewise, map the average of the presentation dimension data over the data records behind the pixel blocks of each column of the matrix layout to a bar chart (the time dimension bar chart), placed below the matrix layout. This bar chart can be used to compare the presentation dimension data across time series values.

In this embodiment, the average rating in each column, that is, for each time series value, is mapped to a bar chart placed below the matrix layout and used to compare ratings across dates.

Step 10: Add classification dimension labels and time dimension labels to the overall layout. Classification dimension labels, giving each category's name and related information, are added to the left of the classification dimension bar chart; time dimension labels are added below the time dimension bar chart.

In this embodiment, the genre categories are added to the left of the classification dimension bar chart and the broadcast months are added below the time dimension bar chart.

This completes the time series matrix heat map visualization method for category comparison.

Analysis of the time series matrix heat map of the Hunan Satellite TV ratings shown in Figure 3 leads to conclusions including the following:

(1) Broadcast schedule: besides long strip-shaped pixel blocks, the heat map contains many shorter pixel blocks, indicating that Hunan Satellite TV airs both weekly and daily dramas and that weekly dramas are the more common.

(2) Genre: the average ratings of the genres broadcast by Hunan Satellite TV are uneven. Republic of China, military, family, romance and youth dramas rate higher, with Republic of China dramas highest, while martial arts, Anti-Japanese War and column dramas rate lower.

(3) Broadcast month: Hunan Satellite TV shows a pronounced holiday-season pattern, with ratings peaking in the summer season of July, August and September.

(4) Broadcast volume: Hunan Satellite TV airs more youth and romance dramas. Youth, romance and martial arts dramas are mostly shown as weekly dramas, whereas the three highest-rated genres, Republic of China, military and family dramas, are all shown as daily dramas.

Claims (1)

1. A time sequence matrix thermodynamic diagram visualization method aiming at category comparison comprises the following steps:
step 1: the following data dimensions are selected from all dimensions of the data set:
(1) the classification dimension is the dimension which is used as a classification basis in the process of classifying all data in the data set;
(2) the time dimension is the dimension for identifying the occurrence time of each data record in the data set;
(3) the dimension of presentation, namely the dimension used for data analysis in the data set, and the dimension used for presentation in the visual analysis process;
step 2: aiming at the classification dimensionality selected in the step 1, dividing all data records in the whole data set into different classifications based on a certain classification rule, and recording the number of classification types as D;
and step 3: mapping the time values into time sequence values according to the time dimensions selected in the step 1, and recording the total number of the time sequence values as T;
the time dimension comprises month, day, hour and minute, and the time dimension in the data is subjected to time sequence value conversion, and the specific conversion method comprises the following steps:
step 3.1: if the time axis is defined by month and day, the time series value conversion method is as shown in formula (1):
DVi=31×moni+di(1)
wherein moni、diRespectively recording the month and day corresponding to the ith data, DViA time series value mapped thereto;
step 3.2: if the time axis is defined in terms of hours and minutes, the time-series value conversion method is as shown in formula (2):
TVi=60×hi+mi(2)
wherein h isi、miRecord corresponding hour and minute, TV, for the ith dataiA time series value mapped thereto;
step 4: obtaining the basic distribution of the data in the presentation dimension selected in step 1; if necessary, applying a logarithmic or exponential function so that the converted data are distributed as uniformly as possible; and, for each presentation dimension, denoting the maximum and minimum values of the data to be presented as $Z_{max}$ and $Z_{min}$;
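A minimal sketch of the optional conversion in step 4, assuming non-negative values and choosing `log1p` (the natural logarithm of 1 + x) as the logarithmic function. The claim does not name a specific function, so this choice and the function names are assumptions.

```python
import math


def log_transform(values):
    """Compress a skewed, non-negative distribution with log(1 + x)."""
    return [math.log1p(v) for v in values]


def value_range(values):
    """Return (Z_min, Z_max) of the data to be presented."""
    return min(values), max(values)


raw = [0, 9, 99]
converted = log_transform(raw)
z_min, z_max = value_range(converted)
```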
step 5: establishing the layout structure of the visualization result, specifically: in the layout structure, the classifications of the data records are arranged vertically, each classification in the data set occupying one row; the time sequence values of the data records are arranged horizontally, each time sequence value in the data set occupying one column;
in this layout, each data record in the data set is mapped to a pixel block in the visualization result, and the pixel blocks of all data records form a matrix according to the time sequence value and classification of each record; this layout is referred to as the matrix layout, and the area of the matrix layout in which the pixel blocks of all data records are drawn is referred to as the pixel block drawing area;
the height in pixels occupied by each classification in the data set is calculated as shown in formula (3):
$d = (H - H_1 - H_2) / D$ (3)
wherein d is the height in pixels occupied by each classification; H is the total height in pixels of the visualization result, which can be set as required; $H_1$ and $H_2$ are the heights in pixels of the regions outside the pixel block drawing area in the visualization result; D is the number of classifications obtained in step 2;
the width in pixels occupied by each time sequence value in the data set is calculated as shown in formula (4):
$t = (W - W_1 - W_2) / T$ (4)
wherein t is the width in pixels occupied by each time sequence value; W is the total width in pixels of the visualization result, which can be set as required; $W_1$ and $W_2$ are the widths in pixels of the regions outside the pixel block drawing area in the visualization result; T is the number of time sequence values;
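Formulas (3) and (4) can be checked with a short Python sketch. The function name and the sample canvas dimensions are assumptions chosen for illustration.

```python
def cell_size(H, H1, H2, D, W, W1, W2, T):
    """Return (d, t): the pixel height of one classification row, formula (3),
    and the pixel width of one time sequence column, formula (4)."""
    d = (H - H1 - H2) / D
    t = (W - W1 - W2) / T
    return d, t


# A 1000 x 600 canvas with 10 classifications and 44 time sequence values.
d, t = cell_size(H=600, H1=40, H2=60, D=10, W=1000, W1=100, W2=20, T=44)
```

The results are generally fractional; whether to round them to whole pixels is left open here, since the claim does not specify it.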
step 6: setting a color mapping table for the presentation dimension data in the data records;
using the heat map visualization technique, the data to be analyzed are mapped to colors that distinguish different numerical values;
step 7: calculating, for each data record in the data set, the position and color code of its corresponding pixel block in the visualization result, wherein the position and color code of the pixel block corresponding to one data record are calculated as follows:
step 7.1: according to the classification dimension of the data record, looking up the vertical coordinate, in the matrix layout, of the classification obtained in step 2, and using it as the vertical coordinate of the data record;
the pixel blocks corresponding to all data records in the same classification have the same vertical coordinate, and the pixel blocks corresponding to data records in different classifications have different vertical coordinates;
step 7.2: according to the time dimension of the data record, looking up the horizontal coordinate, in the matrix layout, of the time sequence value obtained in step 3, and using it as the horizontal coordinate of the data record;
the pixel blocks corresponding to all data records with the same time sequence value have the same horizontal coordinate, and the pixel blocks corresponding to data records with different time sequence values have different horizontal coordinates;
step 7.3: according to the presentation dimension data of the data record, calculating the color code of the corresponding pixel block as shown in formula (5):
[Formula (5): image FSB0000187034960000031, not reproduced in this text]
wherein $CLR_p$ is the color number of the p-th pixel block, $TAM_p$ is the value of the presentation dimension data in the corresponding data record, and maxTAM and minTAM are respectively the maximum and minimum values of the presentation dimension data across all data records;
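Formula (5) is available only as an image in this copy, so its exact expression is not reproduced. The Python sketch below shows one plausible reading, a min-max normalization of the value onto a color table, and should be read as an assumption rather than as the claimed formula. The table size `n_colors`, the margin parameters and both function names are hypothetical.

```python
def color_index(tam, min_tam, max_tam, n_colors=256):
    """Assumed min-max mapping of a value onto color numbers 0 .. n_colors - 1."""
    if max_tam == min_tam:
        return 0  # all values equal: fall back to the first color
    ratio = (tam - min_tam) / (max_tam - min_tam)
    return min(int(ratio * n_colors), n_colors - 1)


def block_origin(row, col, d, t, left_margin, top_margin):
    """Top-left pixel of the block at (row, col), assuming the drawing area
    starts at (left_margin, top_margin) and cells measure t by d pixels."""
    return left_margin + col * t, top_margin + row * d


low = color_index(0, 0, 10)
mid = color_index(5, 0, 10)
high = color_index(10, 0, 10)
origin = block_origin(row=2, col=3, d=50.0, t=20.0, left_margin=100, top_margin=40)
```

All records in the same row share the vertical coordinate and all records in the same column share the horizontal coordinate, which matches steps 7.1 and 7.2.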
step 8: in order to increase the amount of information presented in the final visualization result, mapping the average value of the presentation dimension data of the data records corresponding to the pixel blocks in each row of the matrix layout into a bar graph placed on the left side of the matrix layout, the bar graph being usable for comparing the presentation dimension data of data records in different classifications;
step 9: in order to increase the amount of information presented in the final visualization result, mapping the average value of the presentation dimension data of the data records corresponding to the pixel blocks in each column of the matrix layout into a bar graph placed below the matrix layout, the bar graph being usable for comparing the presentation dimension data of data records with different time sequence values;
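The row and column averages used for the two bar graphs in steps 8 and 9 can be sketched as follows. The record layout `(classification, time_sequence_value, presentation_value)`, the sample data and the function name are assumptions for illustration.

```python
from collections import defaultdict


def group_means(records, key_index):
    """Average the presentation value (field 2) of the records, grouped by
    the field at key_index: 0 for classification, 1 for time sequence value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        key = record[key_index]
        sums[key] += record[2]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


records = [
    ("family", 71, 2.0),
    ("family", 72, 4.0),
    ("military", 71, 6.0),
]
row_means = group_means(records, 0)     # bar graph left of the matrix (step 8)
column_means = group_means(records, 1)  # bar graph below the matrix (step 9)
```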
step 10: adding classification dimension labels and time dimension labels to the overall layout, wherein the classification dimension labels, comprising the specific name and corresponding information of each classification, are added on the left side of the matrix layout, and the time dimension labels are added below the matrix layout.
CN201810131948.9A 2018-02-09 2018-02-09 Time sequence matrix thermodynamic diagram visualization method aiming at category comparison Active CN108345668B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810131948.9A CN108345668B (en) 2018-02-09 2018-02-09 Time sequence matrix thermodynamic diagram visualization method aiming at category comparison

Publications (2)

Publication Number Publication Date
CN108345668A CN108345668A (en) 2018-07-31
CN108345668B true CN108345668B (en) 2020-06-26

Family

ID=62959146

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810131948.9A Active CN108345668B (en) 2018-02-09 2018-02-09 Time sequence matrix thermodynamic diagram visualization method aiming at category comparison

Country Status (1)

Country Link
CN (1) CN108345668B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109949395A (en) * 2019-03-15 2019-06-28 智慧足迹数据科技有限公司 Thermodynamic chart rendering method and device
CN110164516B (en) * 2019-05-24 2021-09-24 山东大学齐鲁医院 A method and system for drawing a time distribution diagram of an inspection document
CN110164551B (en) * 2019-05-24 2022-03-29 山东大学齐鲁医院 Intelligent diagnosis and treatment auxiliary system for blood diseases
CN110502570B (en) * 2019-08-26 2022-03-01 东北大学秦皇岛分校 Three-dimensional visualization method for matrix thermodynamic diagram
CN110824280B (en) * 2019-10-08 2021-08-17 西南交通大学 Diagnosis and visualization method of switch health status based on feature similarity
CN111596979B (en) * 2020-04-08 2021-08-24 北京大学 Adaptive visual mapping adjustment method and system for pixel visualization of sequence data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106503170A (en) * 2016-10-31 2017-03-15 清华大学 A kind of based on the image base construction method for blocking dimension
CN107330454A (en) * 2017-06-20 2017-11-07 西安建筑科技大学 The non-linear visualization of magnanimity higher-dimension sequence data sort feature and quantitative analysis method
CN107636718A (en) * 2015-05-19 2018-01-26 真斯开普无形控股有限公司 Method and system for determining the status of one or more storage tanks in a particular location

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6917961B2 (en) * 2000-03-30 2005-07-12 Kettera Software, Inc. Evolving interactive dialog box for an internet web page

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
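A time sequence grouping visualization method for pesticide residue detection data; 陈红倩 et al.; Journal of System Simulation (《系统仿真学报》); 2016-10-31; Vol. 28, No. 10; pp. 2510-2518, 2526 *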

Also Published As

Publication number Publication date
CN108345668A (en) 2018-07-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240226

Address after: 230000 floor 1, building 2, phase I, e-commerce Park, Jinggang Road, Shushan Economic Development Zone, Hefei City, Anhui Province

Patentee after: Dragon totem Technology (Hefei) Co.,Ltd.

Country or region after: China

Address before: 100048 Beijing Haidian District Fucheng Road 33 Beijing University of Industry and Commerce

Patentee before: BEIJING TECHNOLOGY AND BUSINESS University

Country or region before: China